Search | arXiv e-print repository

Knowledge-enhanced Relation Graph and Task Sampling for Few-shot Molecular Property Prediction

Authors: Zeyu Wang, Tianyi Jiang, Yao Lu, Xiaoze Bao, Shanqing Yu, Bin Wei, Qi Xuan

Abstract: Recently, few-shot molecular property prediction (FSMPP) has garnered increasing attention. Despite impressive breakthroughs achieved by existing methods, they often overlook the inherent many-to-many relationships between molecules and properties, which limits their performance. For instance, similar substructures of molecules can inspire the exploration of new compounds. Additionally, the relationships between properties can be quantified, with highly related properties providing more information for exploring the target property than weakly related ones. To this end, this paper proposes a novel meta-learning FSMPP framework (KRGTS), which comprises the Knowledge-enhanced Relation Graph module and the Task Sampling module. The knowledge-enhanced relation graph module constructs the molecule-property multi-relation graph (MPMRG) to capture the many-to-many relationships between molecules and properties. The task sampling module includes a meta-training task sampler and an auxiliary task sampler, responsible for scheduling the meta-training process and sampling highly related auxiliary tasks, respectively, thereby achieving efficient meta-knowledge learning and reducing noise introduction. Empirically, extensive experiments on five datasets demonstrate the superiority of KRGTS over a variety of state-of-the-art methods. The code is available at https://github.com/Vencent-Won/KRGTS-public.

Submitted 24 May, 2024; originally announced May 2024.

arXiv:2404.14755 [pdf, other]

SkinGEN: an Explainable Dermatology Diagnosis-to-Generation Framework with Interactive Vision-Language Models

Authors: Bo Lin, Yingjing Xu, Xuanwen Bao, Zhou Zhao, Zuyong Zhang, Zhouyang Wang, Jie Zhang, Shuiguang Deng, Jianwei Yin

Abstract: With the continuous advancement of vision-language model (VLM) technology, remarkable research achievements have emerged in the dermatology field, the fourth most prevalent human disease category. However, despite these advancements, VLMs still face "hallucination" in dermatological diagnosis, and due to the inherent complexity of dermatological conditions, existing tools offer relatively limited support for user comprehension. We propose SkinGEN, a diagnosis-to-generation framework that leverages the stable diffusion (SD) method to generate reference demonstrations from diagnosis results provided by the VLM, thereby enhancing the visual explainability for users. Through extensive experiments with Low-Rank Adaptation (LoRA), we identify optimal strategies for skin condition image generation. We conduct a user study with 32 participants evaluating both the system performance and explainability. Results demonstrate that SkinGEN significantly improves users' comprehension of VLM predictions and fosters increased trust in the diagnostic process. This work paves the way for more transparent and user-centric VLM applications in dermatology and beyond.

Submitted 23 April, 2024; originally announced April 2024.

arXiv:2404.05673 [pdf, other]

CoReS: Orchestrating the Dance of Reasoning and Segmentation

Authors: Xiaoyi Bao, Siyang Sun, Shuailei Ma, Kecheng Zheng, Yuxin Guo, Guosheng Zhao, Yun Zheng, Xingang Wang

Abstract: The reasoning segmentation task, which demands a nuanced comprehension of intricate queries to accurately pinpoint object regions, is attracting increasing attention. However, Multi-modal Large Language Models (MLLM) often find it difficult to accurately localize the objects described in complex reasoning contexts. We believe that the act of reasoning segmentation should mirror the cognitive stages of human visual search, where each step is a progressive refinement of thought toward the final object. Thus we introduce the Chains of Reasoning and Segmenting (CoReS) and find this top-down visual hierarchy indeed enhances the visual search process. Specifically, we propose a dual-chain structure that generates multi-modal, chain-like outputs to aid the segmentation process. Furthermore, to steer the MLLM's outputs into this intended hierarchy, we incorporate in-context inputs as guidance. Extensive experiments demonstrate the superior performance of our CoReS, which surpasses the state-of-the-art method by 7.1% on the ReasonSeg dataset. Project: https://chain-of-reasoning-and-segmentation.github.io/.

Submitted 17 April, 2024; v1 submitted 8 April, 2024; originally announced April 2024.

arXiv:2404.00730 [pdf, other]

Long-range dipole-dipole exchange-induced atomic grating

Authors: Xuan-Qian Bao, Xue-Dong Tian, Dong-Xiao Li, Yi-Mou Liu

Abstract: We propose a theoretical scheme for dipole exchange-induced grating (DEIG) based on a hybrid system consisting of an ultra-cold rubidium ($^{87}$Rb) atomic ensemble and movable Rydberg spin atoms. The optical response of the grating appears as a superposition of three- and four-level configurations, similar to the cooperative optical nonlinear effect caused by the dipole blockade effect. However, such a Rydberg atomic grating uniquely responds to the spatial positions of the spin atoms, offering a novel approach to dynamically control electromagnetically induced gratings (EIG) in addition to the input probe intensity.

Submitted 2 April, 2024; v1 submitted 31 March, 2024; originally announced April 2024.

arXiv:2403.06845 [pdf, other]

DriveDreamer-2: LLM-Enhanced World Models for Diverse Driving Video Generation

Authors: Guosheng Zhao, Xiaofeng Wang, Zheng Zhu, Xinze Chen, Guan Huang, Xiaoyi Bao, Xingang Wang

Abstract: World models have demonstrated superiority in autonomous driving, particularly in the generation of multi-view driving videos. However, significant challenges still exist in generating customized driving videos. In this paper, we propose DriveDreamer-2, which builds upon the framework of DriveDreamer and incorporates a Large Language Model (LLM) to generate user-defined driving videos. Specifically, an LLM interface is initially incorporated to convert a user's query into agent trajectories. Subsequently, an HDMap, adhering to traffic regulations, is generated based on the trajectories. Ultimately, we propose the Unified Multi-View Model to enhance temporal and spatial coherence in the generated driving videos. DriveDreamer-2 is the first world model to generate customized driving videos; it can generate uncommon driving videos (e.g., vehicles abruptly cutting in) in a user-friendly manner. Besides, experimental results demonstrate that the generated videos enhance the training of driving perception methods (e.g., 3D detection and tracking). Furthermore, the video generation quality of DriveDreamer-2 surpasses other state-of-the-art methods, showcasing FID and FVD scores of 11.2 and 55.7, representing relative improvements of 30% and 50%.

Submitted 11 April, 2024; v1 submitted 11 March, 2024; originally announced March 2024.

Comments: Project Page: https://drivedreamer2.github.io

arXiv:2403.01203 [pdf, other]

Pseudo-Label Calibration Semi-supervised Multi-Modal Entity Alignment

Authors: Luyao Wang, Pengnian Qi, Xigang Bao, Chunlai Zhou, Biao Qin

Abstract: Multi-modal entity alignment (MMEA) aims to identify equivalent entities between two multi-modal knowledge graphs for integration. Unfortunately, prior work has attempted to improve the interaction and fusion of multi-modal information but has overlooked the influence of modal-specific noise and the usage of labeled and unlabeled data in semi-supervised settings. In this work, we introduce a Pseudo-label Calibration Multi-modal Entity Alignment (PCMEA) in a semi-supervised way. Specifically, in order to generate holistic entity representations, we first devise various embedding modules and attention mechanisms to extract visual, structural, relational, and attribute features. Different from the prior direct fusion methods, we next propose to exploit mutual information maximization to filter the modal-specific noise and to augment modal-invariant commonality. Then, we combine pseudo-label calibration with momentum-based contrastive learning to make full use of the labeled and unlabeled data, which improves the quality of the pseudo-labels and pulls aligned entities closer. Finally, extensive experiments on two MMEA datasets demonstrate the effectiveness of our PCMEA, which yields state-of-the-art performance.

Submitted 2 March, 2024; originally announced March 2024.

Comments: accepted by AAAI2024

arXiv:2401.04470 [pdf, other]

Single-Shot Readout of a Nuclear Spin in Silicon Carbide

Authors: Xiao-Yi Lai, Ren-Zhou Fang, Tao Li, Ren-Zhu Su, Jia Huang, Hao Li, Li-Xing You, Xiao-Hui Bao, Jian-Wei Pan

Abstract: Solid-state qubits with a photonic interface are very promising for quantum networks. Color centers in silicon carbide have shown excellent optical and spin coherence, even when integrated with membranes and nano-structures. Additionally, nuclear spins coupled with electron spins can serve as long-lived quantum memories. Previous pioneering work has realized the initialization of a single nuclear spin and demonstrated its entanglement with an electron spin. In this paper, we report the first realization of single-shot readout for a nuclear spin in SiC. We obtain a deterministic readout fidelity of 98.2% with a measurement duration of 1.13 ms. With a dual-step readout scheme, we obtain a readout fidelity as high as 99.5% with a success efficiency of 89.8%. Our work complements the experimental toolbox of harnessing both electron and nuclear spins in SiC for future quantum networks.

Submitted 9 January, 2024; originally announced January 2024.

Comments: 5 pages, 4 figures

arXiv:2312.11570 [pdf, other]

Understanding the Multi-modal Prompts of the Pre-trained Vision-Language Model

Authors: Shuailei Ma, Chen-Wei Xie, Ying Wei, Siyang Sun, Jiaqi Fan, Xiaoyi Bao, Yuxin Guo, Yun Zheng

Abstract: Prompt learning has emerged as an efficient alternative for fine-tuning foundational models, such as CLIP, for various downstream tasks. However, there is no work that provides a comprehensive explanation for the working mechanism of the multi-modal prompts. In this paper, we conduct a direct analysis of the multi-modal prompts by asking the following questions: $(i)$ How do the learned multi-modal prompts improve the recognition performance? $(ii)$ What do the multi-modal prompts learn? To answer these questions, we begin by isolating the component of the formula where the prompt influences the calculation of self-attention at each layer in two distinct ways, i.e., $(1)$ introducing prompt embeddings makes the $[cls]$ token focus on foreground objects, and $(2)$ the prompts learn a bias term during the update of token embeddings, allowing the model to adapt to the target domain. Subsequently, we conduct extensive visualization and statistical experiments on eleven diverse downstream recognition datasets. From the experiments, we reveal that the learned prompts improve the performance mainly through the second way, which acts as the dataset bias to improve the recognition performance of the pre-trained model on the corresponding dataset. Meanwhile, we propose bias tuning as a way to validate our finding. With a deeper understanding of the multi-modal prompt, we hope our work can inspire new and solid research in this direction.

Submitted 11 March, 2024; v1 submitted 17 December, 2023; originally announced December 2023.

Comments: We find that the statistical information in Figure 2 neglects the statistics for tSOS, so we make corrections. Additionally, we change the statistical samples to those that CLIP misidentifies but prompt tuning identifies correctly. At the same time, we also revise some of the descriptions. The changes to the supplementary materials will be updated shortly. arXiv admin note: text overlap with arXiv:2307.06948 by other authors

arXiv:2312.10898 [pdf]

Replica symmetry breaking in 1D Rayleigh scattering system: theory and validations

Authors: Yifei Qi, Longqun Ni, Zhenyu Ye, Jiaojiao Zhang, Xingyu Bao, Pan Wang, Yunjiang Rao, Ernesto P. Raposo, Anderson S. L. Gomes, Zinan Wang

Abstract: Spin glass theory, as a paradigm for describing disordered magnetic systems, constitutes a prominent subject of study within statistical physics. Replica symmetry breaking (RSB), as one of the pivotal concepts for the understanding of spin glass theory, means that, under identical conditions, disordered systems can yield distinct states with nontrivial correlations. Random fiber laser (RFL) based on Rayleigh scattering (RS) is a complex disordered system, owing to the disorder and stochasticity of RS. In this work, for the first time, we elaborate a precise theoretical model for studying the photonic phase transition via the platform of RS-based RFL, in which we clearly reveal that, apart from the pump power, the photon phase variation in RFL is also analogous to the temperature term in the spin glass phase transition, leading to a novel insight into the intrinsic mechanisms of photonic phase transition. In addition, based on this model and real-time high-fidelity detection spectral evolution, we theoretically predict and experimentally observe the mode-asymmetric characteristics of photonic phase transition in RS-based RFL. This finding contributes to a deeper understanding of the photonic RSB regime and the dynamics of RS-based RFL.

Submitted 17 December, 2023; originally announced December 2023.

Comments: 15 pages, 9 figures

arXiv:2312.06474 [pdf, other]

Relevant Intrinsic Feature Enhancement Network for Few-Shot Semantic Segmentation

Authors: Xiaoyi Bao, Jie Qin, Siyang Sun, Yun Zheng, Xingang Wang

Abstract: For few-shot semantic segmentation, the primary task is to extract class-specific intrinsic information from limited labeled data. However, the semantic ambiguity and inter-class similarity of previous methods limit the accuracy of pixel-level foreground-background classification. To alleviate these issues, we propose the Relevant Intrinsic Feature Enhancement Network (RiFeNet). To improve the semantic consistency of foreground instances, we propose an unlabeled branch as an efficient data utilization method, which teaches the model how to extract intrinsic features robust to intra-class differences. Notably, during testing, the proposed unlabeled branch is excluded without extra unlabeled data and computation. Furthermore, we extend the inter-class variability between foreground and background by proposing a novel multi-level prototype generation and interaction module. The different-grained complementarity between global and local prototypes allows for better distinction between similar categories. The qualitative and quantitative performance of RiFeNet surpasses the state-of-the-art methods on PASCAL-5i and COCO benchmarks.

Submitted 11 December, 2023; originally announced December 2023.

Comments: Accepted in AAAI 2024

arXiv:2312.05906 [pdf]

Characterization and regulation of statistical properties in Er-doped random fiber laser

Authors: Xingyu Bao, Shengtao Lin, Jiaojiao Zhang, Yongxin Liang, Anchi Wan, Yifei Qi, Zinan Wang

Abstract: Er-doped random fiber laser (ERFL) is a complex physical system, and understanding its intrinsic physical mechanisms is crucial for promoting applications. In this paper, we experimentally investigate the time-domain statistical properties of ERFL under full-bandwidth condition for the first time. We also analyze the effects of the transmission process and amplification process on the output characteristics of ERFL, on the basis of which we realize its regulation. This study guides RFL systems requiring transmission and amplification, offering fresh insights for regulating the time-domain stability.

Submitted 10 December, 2023; originally announced December 2023.

arXiv:2311.17455 [pdf, other]

Experimental Generation of Spin-Photon Entanglement in Silicon Carbide

Authors: Ren-Zhou Fang, Xiao-Yi Lai, Tao Li, Ren-Zhu Su, Bo-Wei Lu, Chao-Wei Yang, Run-Ze Liu, Yu-Kun Qiao, Cheng Li, Zhi-Gang He, Jia Huang, Hao Li, Li-Xing You, Yong-Heng Huo, Xiao-Hui Bao, Jian-Wei Pan

Abstract: A solid-state approach for quantum networks is advantageous, as it allows the integration of nanophotonics to enhance the photon emission and the utilization of weakly coupled nuclear spins for long-lived storage. Silicon carbide, specifically point defects within it, shows great promise in this regard due to its ready availability and well-established nanofabrication techniques. Despite the remarkable progress made, achieving spin-photon entanglement remains a crucial aspect to be realized. In this paper, we experimentally generate entanglement between a silicon vacancy defect in silicon carbide and a scattered single photon in the zero-phonon line. The spin state is measured by detecting photons scattered in the phonon sideband. The photonic qubit is encoded in the time-bin degree-of-freedom and measured using an unbalanced Mach-Zehnder interferometer. Photonic correlations not only reveal the quality of the entanglement but also verify the deterministic nature of the entanglement creation process. By harnessing two pairs of such spin-photon entanglement, it becomes straightforward to entangle remote quantum nodes at long distance.

Submitted 29 November, 2023; originally announced November 2023.

Comments: 8 pages in total, 4 figures in the main text, 1 figure in the supplemental material

arXiv:2310.12175 [pdf, ps, other]

Analysis on the Derivation of the Schrödinger Equation with Analogy to Electromagnetic Wave Equation

Authors: Xuefeng Bao

Abstract: The Schrödinger equation is universally accepted due to its excellent predictions aligning with observed results within its defined conditions. Nevertheless, it does not seem to possess the simplicity of fundamental laws, such as Newton's laws of motion. Various insightful attempts have been made to elucidate the rationale behind the Schrödinger equation. This paper seeks to review existing explanations and propose some perspectives on the derivation of the Schrödinger equation.

Submitted 15 October, 2023; originally announced October 2023.

arXiv:2310.04780 [pdf, other]

IPMix: Label-Preserving Data Augmentation Method for Training Robust Classifiers

Authors: Zhenglin Huang, Xiaoan Bao, Na Zhang, Qingqi Zhang, Xiaomei Tu, Biao Wu, Xi Yang

Abstract: Data augmentation has been proven effective for training high-accuracy convolutional neural network classifiers by preventing overfitting. However, building deep neural networks in real-world scenarios requires not only high accuracy on clean data but also robustness when data distributions shift. While prior methods have proposed that there is a trade-off between accuracy and robustness, we propose IPMix, a simple data augmentation approach to improve robustness without hurting clean accuracy. IPMix integrates three levels of data augmentation (image-level, patch-level, and pixel-level) into a coherent and label-preserving technique to increase the diversity of training data with limited computational overhead. To further improve the robustness, IPMix introduces structural complexity at different levels to generate more diverse images and adopts the random mixing method for multi-scale information fusion. Experiments demonstrate that IPMix outperforms state-of-the-art corruption robustness on CIFAR-C and ImageNet-C. In addition, we show that IPMix also significantly improves the other safety measures, including robustness to adversarial perturbations, calibration, prediction consistency, and anomaly detection, achieving state-of-the-art or comparable results on several benchmarks, including ImageNet-R, ImageNet-A, and ImageNet-O.

Submitted 13 March, 2024; v1 submitted 7 October, 2023; originally announced October 2023.

Comments: NeurIPS 2023

arXiv:2309.14661 [pdf, other]

Micro-Macro Modeling of Polymeric Fluids and Shear-Induced Microscopic Behaviors

Authors: Xuelian Bao, Huaxiong Huang, Zilong Song, Shixin Xu

Abstract: This article delves into the micro-macro modeling of polymeric fluids, considering various microscopic potential energies, including the classical Hookean potential, as well as newly proposed modified Morse and Elastic-plastic potentials. These proposed potentials encompass microscopic-scale bond-breaking processes. The development of a thermodynamically consistent micro-macro model is revisited, employing the energy variational method. To validate the model's predictions, we conduct numerical simulations utilizing a deterministic particle-FEM method. Our numerical findings shed light on the distinct behaviors exhibited by polymer chains at the micro-scale in comparison to the macro-scale velocity and induced shear stresses of fluids under shear flow. Notably, we observe that polymer elongation, rotation, and bond breaking contribute to the zero polymer-induced stress in the micro-macro model when employing Morse and Elastic-plastic potentials. Furthermore, at high shear rates, polymer rotation is found to induce shear-thinning behavior in the model employing the classical Hookean potential.

Submitted 26 September, 2023; originally announced September 2023.

arXiv:2309.00221 [pdf, other]

A multinode quantum network over a metropolitan area

Authors: Jian-Long Liu, Xi-Yu Luo, Yong Yu, Chao-Yang Wang, Bin Wang, Yi Hu, Jun Li, Ming-Yang Zheng, Bo Yao, Zi Yan, Da Teng, Jin-Wei Jiang, Xiao-Bing Liu, Xiu-Ping Xie, Jun Zhang, Qing-He Mao, Xiao Jiang, Qiang Zhang, Xiao-Hui Bao, Jian-Wei Pan

Abstract: Towards realizing the future quantum internet, a pivotal milestone entails the transition from two-node proof-of-principle experiments conducted in laboratories to comprehensive, multi-node setups on large scales. Here, we report on the debut implementation of a multi-node entanglement-based quantum network over a metropolitan area. We equipped three quantum nodes with atomic quantum memories and their telecom interfaces, and combined them into a scalable phase-stabilized architecture through a server node. We demonstrated heralded entanglement generation between two quantum nodes situated 12.5 km apart, and the storage of entanglement exceeding the round-trip communication time. We also showed the concurrent entanglement generation on three links. Our work provides a metropolitan-scale testbed for the evaluation and exploration of multi-node quantum network protocols and starts a new stage of quantum internet research.

Submitted 31 August, 2023; originally announced September 2023.

Comments: 21 pages in total, 4 figures and 1 table in the main text, 5 figures and 8 tables in the supplementary material

arXiv:2308.16399 [pdf, other]

Complex quantum momentum due to correlation

Authors: Matthew Albert, Xiaoyi Bao, Liang Chen

Abstract: Real numbers provide a sufficient description of classical physics and all measurable phenomena; however, complex numbers are occasionally utilized as a convenient mathematical tool to aid our calculations. On the other hand, the formalism of quantum mechanics integrates complex numbers within its fundamental principles, and whether this arises out of necessity or not is an important question that many have attempted to answer. Here, we will consider two electrons in a one-dimensional quantum well where the interaction potential between the two electrons is attractive as opposed to the usual repulsive Coulomb potential. Pairs of electrons exhibiting such effective attraction towards each other occur in other settings, namely within superconductivity. We will demonstrate that this attractive interaction leads to the necessity of complex momentum solutions, which further emphasizes the significance of complex numbers in quantum theory. The complex momentum solutions are obtained using a perturbative analysis approach in tandem with Newton's method. The probability densities arising from these complex momentum solutions allow for a comparison with the probability densities of the typical real momentum solutions occurring from the standard repulsive interaction potential.

Submitted 30 August, 2023; originally announced August 2023.

arXiv:2308.15501 [pdf, ps, other]

The AIMS Site Survey

Authors: Xingming Bao, Jian Wang, Shuai Jing, Yuanyong Deng, Dongguang Wang

Abstract: This paper reports site survey results for the Infrared System for the Accurate Measurement of Solar Magnetic Field, especially at Saishiteng Mountain, Qinghai, China. Since 2017, we have installed a weather station, a spectrometer for precipitable water vapor (PWV), and an S-DIMM, and carried out observations of weather elements, precipitable water vapor, and daytime seeing conditions for more than one year at almost all candidate sites. At Mt. Saishiteng, the median value of daytime precipitable water vapor is 5.25 mm and its median value in the winter season is 2.1 mm. The median value of the Fried parameter from daytime seeing observations at Saishiteng Mountain is 3.42 cm. Its solar direct radiation data show that the average solar observable time is 446 minutes per day and the premium time is 401 minutes per day in August 2019.

Submitted 29 August, 2023; originally announced August 2023.

Comments: 10 pages, 18 figures

arXiv:2308.12231 [pdf, other]

SPPNet: A Single-Point Prompt Network for Nuclei Image Segmentation

Authors: Qing Xu, Wenwei Kuang, Zeyu Zhang, Xueyao Bao, Haoran Chen, Wenting Duan

Abstract: Image segmentation plays an essential role in nuclei image analysis. Recently, the segment anything model has made a significant breakthrough in such tasks. However, the current model has two major issues for cell segmentation: (1) the image encoder of the segment anything model involves a large number of parameters. Retraining or even fine-tuning the model still requires expensive computational resources. (2) in point prompt mode, points are sampled from the center of the ground truth and more than one set of points is expected to achieve reliable performance, which is not efficient for practical applications. In this paper, a single-point prompt network is proposed for nuclei image segmentation, called SPPNet. We replace the original image encoder with a lightweight vision transformer. Also, an effective convolutional block is added in parallel to extract the low-level semantic information from the image and compensate for the performance degradation due to the small image encoder. We propose a new point-sampling method based on the Gaussian kernel. The proposed model is evaluated on the MoNuSeg-2018 dataset. The results demonstrate that SPPNet outperforms existing U-shape architectures and shows faster convergence in training. Compared to the segment anything model, SPPNet shows roughly 20 times faster inference, with 1/70 of the parameters and computational cost. Particularly, only one set of points is required in both the training and inference phases, which is more reasonable for clinical applications. The code for our work and more technical details can be found at https://github.com/xq141839/SPPNet.

Submitted 23 August, 2023; originally announced August 2023.

arXiv:2308.10155 [pdf, other]

Unilaterally Aggregated Contrastive Learning with Hierarchical Augmentation for Anomaly Detection

Authors: Guodong Wang, Yunhong Wang, Jie Qin, Dongming Zhang, Xiuguo Bao, Di Huang

Abstract: Anomaly detection (AD), aiming to find samples that deviate from the training distribution, is essential in safety-critical applications. Though recent self-supervised learning based attempts achieve promising results by creating virtual outliers, their training objectives are less faithful to AD which requires a concentrated inlier distribution as well as a dispersive outlier distribution. In this paper, we propose Unilaterally Aggregated Contrastive Learning with Hierarchical Augmentation (UniCon-HA), taking into account both the requirements above. Specifically, we explicitly encourage the concentration of inliers and the dispersion of virtual outliers via supervised and unsupervised contrastive losses, respectively. Considering that standard contrastive data augmentation for generating positive views may induce outliers, we additionally introduce a soft mechanism to re-weight each augmented inlier according to its deviation from the inlier distribution, to ensure a purified concentration. Moreover, to prompt a higher concentration, inspired by curriculum learning, we adopt an easy-to-hard hierarchical augmentation strategy and perform contrastive aggregation at different depths of the network based on the strengths of data augmentation. Our method is evaluated under three AD settings including unlabeled one-class, unlabeled multi-class, and labeled multi-class, demonstrating its consistent superiority over other competitors.

Submitted 20 August, 2023; originally announced August 2023.

Comments: Accepted by ICCV'2023

arXiv:2308.09678 [pdf, other]

PoSynDA: Multi-Hypothesis Pose Synthesis Domain Adaptation for Robust 3D Human Pose Estimation

Authors: Hanbing Liu, Jun-Yan He, Zhi-Qi Cheng, Wangmeng Xiang, Qize Yang, Wenhao Chai, Gaoang Wang, Xu Bao, Bin Luo, Yifeng Geng, Xuansong Xie

Abstract: Existing 3D human pose estimators face challenges in adapting to new datasets due to the lack of 2D-3D pose pairs in training sets. To overcome this issue, we propose the Multi-Hypothesis Pose Synthesis Domain Adaptation (PoSynDA) framework to bridge this data disparity gap in the target domain. Typically, PoSynDA uses a diffusion-inspired structure to simulate 3D pose distribution in the target domain. By incorporating a multi-hypothesis network, PoSynDA generates diverse pose hypotheses and aligns them with the target domain. To do this, it first utilizes target-specific source augmentation to obtain the target domain distribution data from the source domain by decoupling the scale and position parameters. The process is then further refined through the teacher-student paradigm and low-rank adaptation. With extensive comparison of benchmarks such as Human3.6M and MPI-INF-3DHP, PoSynDA demonstrates competitive performance, even comparable to the target-trained MixSTE model. This work paves the way for the practical application of 3D human pose estimation in unseen domains. The code is available at https://github.com/hbing-l/PoSynDA.

Submitted 16 October, 2023; v1 submitted 18 August, 2023; originally announced August 2023.

Comments: Accepted to ACM Multimedia 2023; 10 pages, 4 figures, 8 tables; the code is at https://github.com/hbing-l/PoSynDA

arXiv:2307.07927 [pdf, ps, other]

Normalized bound state solutions for the fractional Schrödinger equation with potential

Authors: Xin Bao, Ying Lv, Zeng-Qi Ou

Abstract: In this paper, we study the following fractional Schrödinger equation with prescribed mass \begin{equation*} \left\{ \begin{aligned} &(-Δ)^{s}u=λu+a(x)|u|^{p-2}u,\quad\text{in $\mathbb{R}^{N}$},\\ &\int_{\mathbb{R}^{N}}|u|^{2}dx=c^{2},\quad u\in H^{s}(\mathbb{R}^{N}), \end{aligned} \right. \end{equation*} where $0<s<1$, $N>2s$, $2+\frac{4s}{N}<p<2_{s}^{*}:=\frac{2N}{N-2s}$, $c>0$, $λ\in \mathbb{R}$ and $a(x)\in C^{1}(\mathbb{R}^{N},\mathbb{R}^{+})$ is a potential function. By using a minimax principle, we prove the existence of bounded state normalized solution under various conditions on $a(x)$.

Submitted 15 July, 2023; originally announced July 2023.

arXiv:2307.07379 [pdf, other]

Normalized bound state solutions of fractional Schrödinger equations with general potential

Authors: Xin Bao, Ying Lv, Zeng-Qi Ou

Abstract: In this paper, we study a class of fractional Schrödinger equation \begin{equation} \label{eq0} \left\{ \begin{aligned} &(-Δ)^{s}u=λu+a(x)|u|^{p-2}u,\\ &\int_{\mathbb{R}^{N}}|u|^{2}dx=c^{2},\ u\in H^{s}(\mathbb{R}^{N}), \end{aligned} \right. \end{equation} where $N>2s$, $s\in(0,1)$ and $p\in(2,2+4s/N), c>0$. $a(x)\in C(\mathbb{R}^{N},\mathbb{R})$ is a positive potential function. By using Fixed Point Theorem of Brouwer, barycenter function and variational method, we obtain the existence of normalized bound solutions for the problem.

Submitted 14 July, 2023; originally announced July 2023.

arXiv:2306.08925 [pdf, other]

Opinion Tree Parsing for Aspect-based Sentiment Analysis

Authors: Xiaoyi Bao, Xiaotong Jiang, Zhongqing Wang, Yue Zhang, Guodong Zhou

Abstract: Extracting sentiment elements using pre-trained generative models has recently led to large improvements in aspect-based sentiment analysis benchmarks. However, these models always need large-scale computing resources, and they also ignore explicit modeling of structure between sentiment elements. To address these challenges, we propose an opinion tree parsing model, aiming to parse all the sentiment elements from an opinion tree, which is much faster, and can explicitly reveal a more comprehensive and complete aspect-level sentiment structure. In particular, we first introduce a novel context-free opinion grammar to normalize the opinion tree structure. We then employ a neural chart-based opinion tree parser to fully explore the correlations among sentiment elements and parse them into an opinion tree structure. Extensive experiments show the superiority of our proposed model and the capacity of the opinion tree parser with the proposed context-free opinion grammar. More importantly, the results also prove that our model is much faster than previous models.

Submitted 15 June, 2023; originally announced June 2023.

arXiv:2305.16437 [pdf, other]

KeyPosS: Plug-and-Play Facial Landmark Detection through GPS-Inspired True-Range Multilateration

Authors: Xu Bao, Zhi-Qi Cheng, Jun-Yan He, Chenyang Li, Wangmeng Xiang, Jingdong Sun, Hanbing Liu, Wei Liu, Bin Luo, Yifeng Geng, Xuansong Xie

Abstract: Accurate facial landmark detection is critical for facial analysis tasks, yet prevailing heatmap and coordinate regression methods grapple with prohibitive computational costs and quantization errors. Through comprehensive theoretical analysis and experimentation, we identify and elucidate the limitations of existing techniques. To overcome these challenges, we pioneer the application of True-Range Multilateration, originally devised for GPS localization, to facial landmark detection. We propose KeyPoint Positioning System (KeyPosS) - the first framework to deduce exact landmark coordinates by triangulating distances between points of interest and anchor points predicted by a fully convolutional network. A key advantage of KeyPosS is its plug-and-play nature, enabling flexible integration into diverse decoding pipelines. Extensive experiments on four datasets demonstrate state-of-the-art performance, with KeyPosS outperforming existing methods in low-resolution settings despite minimal computational overhead. By spearheading the integration of Multilateration with facial analysis, KeyPosS marks a paradigm shift in facial landmark detection. The code is available at https://github.com/zhiqic/KeyPosS.

Submitted 23 September, 2023; v1 submitted 25 May, 2023; originally announced May 2023.

Comments: Accepted to ACM Multimedia 2023; 10 pages, 7 figures, 6 tables; the code is at https://github.com/zhiqic/KeyPosS

arXiv:2305.08360 [pdf, other]

Improving ChatGPT Prompt for Code Generation

Authors: Chao Liu, Xuanlin Bao, Hongyu Zhang, Neng Zhang, Haibo Hu, Xiaohong Zhang, Meng Yan

Abstract: Automated code generation can be a powerful technique for software development, significantly reducing developers' efforts and time required to create new code by generating it automatically based on requirements. Recently, OpenAI's language model ChatGPT has emerged as a powerful tool for generating human-like responses to a wide range of textual inputs (i.e., prompts), including those related to code generation. However, the effectiveness of ChatGPT for code generation is not well understood, and the generation performance could be heavily influenced by the choice of prompt. To answer these questions, we conducted experiments using the CodeXGlue dataset to evaluate ChatGPT's capabilities for two code generation tasks, including text-to-code and code-to-code generation. We designed prompts by leveraging the chain-of-thought strategy with multi-step optimizations. Our results showed that by carefully designing prompts to guide ChatGPT, the generation performance can be improved substantially. We also analyzed the factors that influenced the prompt design and provided insights that could guide future research.

Submitted 15 May, 2023; originally announced May 2023.

Comments: 12 pages, 1 figure

arXiv:2304.06926 [pdf, other]

Anomalous non-Hermitian skin effect: the topological inequivalence of skin modes versus point gap

Authors: Gang-Feng Guo, Xi-Xi Bao, Han-Jie Zhu, Xiao-Ming Zhao, Lin Zhuang, Lei Tan, Wu-Ming Liu

Abstract: Non-Hermitian skin effect, the localization of an extensive number of eigenstates at the ends of the system, has greatly expanded the frontier of physical laws. It has long been believed that the presence of skin modes is equivalent to the topologically nontrivial point gap of complex eigenvalues under periodic boundary conditions, and vice versa. However, we find that this concomitance can be broken, i.e., the skin modes can be present or absent whereas the point gap is topologically trivial or nontrivial, respectively, named anomalous non-Hermitian skin effect. This anomalous phenomenon arises when unidirectional hopping amplitudes emerge that lead to decoupling-like behaviors among subsystems. The emergence of the anomalous non-Hermitian skin effect is accompanied by the mutations of the open boundary energy spectrum, whose structure exhibits the multifold exceptional point and can not be recovered by continuum bands. Moreover, an experimental setup using circuits is proposed to simulate this novel quantum effect. Our results reveal the topological inequivalence between skin modes and point gap. This new effect not only can give a deeper understanding of non-Bloch theory and the critical phenomenon in non-Hermitian systems, but may also inspire new applications such as in the sensors field.

Submitted 14 April, 2023; originally announced April 2023.

arXiv:2211.16252 [pdf]

doi 10.1007/s12274-023-5542-0

Lateral quantum confinement regulates charge carrier transfer and biexciton interaction in CdSe/CdSeS core/crown nanoplatelets

Authors: Yige Yao, Xiaotian Bao, Yunke Zhu, Xinyu Sui, An Hu, Peng Bai, Shufeng Wang, Hong Yang, Xinfeng Liu, Yunan Gao

Abstract: Charge carrier dynamics essentially determine the performance of various optoelectronic applications of colloidal semiconductor nanocrystals. Among them, two-dimensional nanoplatelets provide new adjustment freedom for their unique core/crown heterostructure. Herein, we demonstrate that by fine-tuning the core size and the lateral quantum confinement, the charge carrier transfer rate from the crown to the core can be varied by one order of magnitude in CdSe/CdSeS core/alloy-crown nanoplatelets. In addition, the transfer can be affected by a carrier blocking mechanism, i.e., the filled carriers hinder further possible transfer. Furthermore, we found that the biexciton interaction is oppositely affected by quantum confinement and electron delocalization, resulting in a non-monotonic variation of the biexciton binding energy with the emission wavelength. This work provides new observations and insights into the charge carrier transfer dynamics and exciton interactions in colloidal nanoplatelets and will promote their further applications in lasing, display, sensing, etc.

Submitted 29 November, 2022; originally announced November 2022.

Comments: main and SI

arXiv:2211.06239 [pdf, other]

A monitoring framework for deployed machine learning models with supply chain examples

Authors: Bradley Eck, Duygu Kabakci-Zorlu, Yan Chen, France Savard, Xiaowei Bao

Abstract: Actively monitoring machine learning models during production operations helps ensure prediction quality and detection and remediation of unexpected or undesired conditions. Monitoring models already deployed in big data environments brings the additional challenges of adding monitoring in parallel to the existing modelling workflow and controlling resource requirements. In this paper, we describe (1) a framework for monitoring machine learning models; and, (2) its implementation for a big data supply chain application. We use our implementation to study drift in model features, predictions, and performance on three real data sets. We compare hypothesis test and information theoretic approaches to drift detection in features and predictions using the Kolmogorov-Smirnov distance and Bhattacharyya coefficient. Results showed that model performance was stable over the evaluation period. Features and predictions showed statistically significant drifts; however, these drifts were not linked to changes in model performance during the time of our study.

Submitted 11 November, 2022; originally announced November 2022.

Comments: 8 pages, 9 figures, IEEE Big Data 2022

arXiv:2210.15511 [pdf, other]

ProContEXT: Exploring Progressive Context Transformer for Tracking

Authors: Jin-Peng Lan, Zhi-Qi Cheng, Jun-Yan He, Chenyang Li, Bin Luo, Xu Bao, Wangmeng Xiang, Yifeng Geng, Xuansong Xie

Abstract: Existing Visual Object Tracking (VOT) only takes the target area in the first frame as a template. This causes tracking to inevitably fail in fast-changing and crowded scenes, as it cannot account for changes in object appearance between frames. To this end, we revamped the tracking framework with Progressive Context Encoding Transformer Tracker (ProContEXT), which coherently exploits spatial and temporal contexts to predict object motion trajectories. Specifically, ProContEXT leverages a context-aware self-attention module to encode the spatial and temporal context, refining and updating the multi-scale static and dynamic templates to progressively perform accurate tracking. It explores the complementarity between spatial and temporal context, opening a new pathway to multi-context modeling for transformer-based trackers. In addition, ProContEXT revises the token pruning technique to reduce computational complexity. Extensive experiments on popular benchmark datasets such as GOT-10k and TrackingNet demonstrate that the proposed ProContEXT achieves state-of-the-art performance.

Submitted 30 March, 2023; v1 submitted 27 October, 2022; originally announced October 2022.

Comments: Accepted at ICASSP 2023, source code is at https://github.com/zhiqic/ProContEXT

arXiv:2208.04897 [pdf, other]

Sports Video Analysis on Large-Scale Data

Authors: Dekun Wu, He Zhao, Xingce Bao, Richard P. Wildes

Abstract: This paper investigates the modeling of automated machine description on sports video, which has seen much progress recently. Nevertheless, state-of-the-art approaches fall quite short of capturing how human experts analyze sports scenes. There are several major reasons: (1) The used dataset is collected from non-official providers, which naturally creates a gap between models trained on those datasets and real-world applications; (2) previously proposed methods require extensive annotation efforts (i.e., player and ball segmentation at pixel level) on localizing useful visual features to yield acceptable results; (3) very few public datasets are available. In this paper, we propose a novel large-scale NBA dataset for Sports Video Analysis (NSVA) with a focus on captioning, to address the above challenges. We also design a unified approach to process raw videos into a stack of meaningful features with minimum labelling efforts, showing that cross modeling on such features using a transformer architecture leads to strong performance. In addition, we demonstrate the broad application of NSVA by addressing two additional tasks, namely fine-grained sports action recognition and salient player identification. Code and dataset are available at https://github.com/jackwu502/NSVA.

Submitted 9 August, 2022; originally announced August 2022.

arXiv:2208.04718 [pdf, other]

doi 10.1016/j.compbiomed.2022.106417

Improving COVID-19 CT Classification of CNNs by Learning Parameter-Efficient Representation

Authors: Yujia Xu, Hak-Keung Lam, Guangyu Jia, Jian Jiang, Junkai Liao, Xinqi Bao

Abstract: COVID-19 pandemic continues to spread rapidly over the world and causes a tremendous crisis in global human health and the economy. Its early detection and diagnosis are crucial for controlling the further spread. Many deep learning-based methods have been proposed to assist clinicians in automatic COVID-19 diagnosis based on computed tomography imaging. However, challenges still remain, including low data diversity in existing datasets, and unsatisfied detection resulting from insufficient accuracy and sensitivity of deep learning models. To enhance the data diversity, we design augmentation techniques of incremental levels and apply them to the largest open-access benchmark dataset, COVIDx CT-2A. Meanwhile, similarity regularization (SR) derived from contrastive learning is proposed in this study to enable CNNs to learn more parameter-efficient representations, thus improving the accuracy and sensitivity of CNNs. The results on seven commonly used CNNs demonstrate that CNN performance can be improved stably through applying the designed augmentation and SR techniques. In particular, DenseNet121 with SR achieves an average test accuracy of 99.44% in three trials for three-category classification, including normal, non-COVID-19 pneumonia, and COVID-19 pneumonia. And the achieved precision, sensitivity, and specificity for the COVID-19 pneumonia category are 98.40%, 99.59%, and 99.50%, respectively. These statistics suggest that our method has surpassed the existing state-of-the-art methods on the COVIDx CT-2A dataset.

Submitted 9 August, 2022; originally announced August 2022.

arXiv:2208.03128 [pdf, other]

Time-Frequency Distributions of Heart Sound Signals: A Comparative Study using Convolutional Neural Networks

Authors: Xinqi Bao, Yujia Xu, Hak-Keung Lam, Mohamed Trabelsi, Ines Chihi, Lilia Sidhom, Ernest N. Kamavuako

Abstract: Time-Frequency Distributions (TFDs) support the heart sound characterisation and classification in early cardiac screening. However, despite the frequent use of TFDs in signal analysis, no study comprehensively compared their performances on deep learning for automatic diagnosis. Furthermore, the combination of signal processing methods as inputs for Convolutional Neural Networks (CNNs) has been proved as a practical approach to increasing signal classification performance. Therefore, this study aimed to investigate the optimal use of TFD/ combined TFDs as input for CNNs. The presented results revealed that: 1) The transformation of the heart sound signal into the TF domain achieves higher classification performance than using of raw signals. Among the TFDs, the difference in the performance was slight for all the CNN models (within $1.3\%$ in average accuracy). However, Continuous wavelet transform (CWT) and Chirplet transform (CT) outperformed the rest. 2) The appropriate increase of the CNN capacity and architecture optimisation can improve the performance, while the network architecture should not be overly complicated. Based on the ResNet or SEResNet family results, the increase in the number of parameters and the depth of the structure do not improve the performance apparently. 3) Combining TFDs as CNN inputs did not significantly improve the classification results. The findings of this study provided the knowledge for selecting TFDs as CNN input and designing CNN architecture for heart sound classification.

Submitted 5 August, 2022; originally announced August 2022.

arXiv:2207.10172 [pdf, other]

Video Anomaly Detection by Solving Decoupled Spatio-Temporal Jigsaw Puzzles

Authors: Guodong Wang, Yunhong Wang, Jie Qin, Dongming Zhang, Xiuguo Bao, Di Huang

Abstract: Video Anomaly Detection (VAD) is an important topic in computer vision. Motivated by the recent advances in self-supervised learning, this paper addresses VAD by solving an intuitive yet challenging pretext task, i.e., spatio-temporal jigsaw puzzles, which is cast as a multi-label fine-grained classification problem. Our method exhibits several advantages over existing works: 1) the spatio-temporal jigsaw puzzles are decoupled in terms of spatial and temporal dimensions, responsible for capturing highly discriminative appearance and motion features, respectively; 2) full permutations are used to provide abundant jigsaw puzzles covering various difficulty levels, allowing the network to distinguish subtle spatio-temporal differences between normal and abnormal events; and 3) the pretext task is tackled in an end-to-end manner without relying on any pre-trained models. Our method outperforms state-of-the-art counterparts on three public benchmarks. Especially on ShanghaiTech Campus, the result is superior to reconstruction and prediction-based methods by a large margin.

Submitted 21 July, 2022; v1 submitted 20 July, 2022; originally announced July 2022.

Comments: Accepted by ECCV'2022; Code is available at https://github.com/gdwang08/Jigsaw-VAD

arXiv:2207.00179 [pdf, other]

Reentrant Localized Bulk and Localized-Extended Edge in Quasiperiodic Non-Hermitian Systems

Authors: Gang-Feng Guo, Xi-Xi Bao, Lei Tan

Abstract: Localization is one of the active and fundamental research topics in topological physics. Based on a generalized Su-Schrieffer-Heeger model with the quasiperiodic non-Hermitian emerging at the off-diagonal location, we propose a novel systematic method to analyze the localization behaviors for the bulk and the edge, respectively. For the bulk, it can be found that it undergoes an extended-coexisting-localized-coexisting-localized transition induced by the quasidisorder and non-Hermiticity. For the edge state, it can be broken and recovered with the increase of the quasidisorder strength, and its localized transition is synchronous exactly with the topological phase transition. In addition, the inverse participation ratio of the edge state oscillates with an increase of the disorder strength. Finally, numerical results elucidate that the derivative of the normalized participation ratio exhibits an enormous discontinuity at the localized transition point. Here, our results not only demonstrate the diversity of localization properties of bulk and edge state, but also may provide an extension of the ordinary method for investigating the localization.

Submitted 30 June, 2022; originally announced July 2022.

Comments: 7 pages, 4 figures

arXiv:2206.07325 [pdf, ps, other]

Second order stabilized semi-implicit scheme for the Cahn-Hilliard model with dynamic boundary conditions

Authors: Xiangjun Meng, Xuelian Bao, Zhengru Zhang

Abstract: We study the numerical algorithm and error analysis for the Cahn-Hilliard equation with dynamic boundary conditions. A second-order in time, linear and energy stable scheme is proposed, which is an extension of the first-order stabilized approach. The corresponding energy stability and convergence analysis of the scheme are derived theoretically. Some numerical experiments are performed to verify the effectiveness and accuracy of the second-order numerical scheme, including numerical simulations under various initial conditions and energy potential functions, and comparisons with the literature works.

Submitted 15 June, 2022; originally announced June 2022.

arXiv:2205.03569 [pdf, other]

Representation Learning for Compressed Video Action Recognition via Attentive Cross-modal Interaction with Motion Enhancement

Authors: Bing Li, Jiaxin Chen, Dongming Zhang, Xiuguo Bao, Di Huang

Abstract: Compressed video action recognition has recently drawn growing attention, since it remarkably reduces the storage and computational cost via replacing raw videos by sparsely sampled RGB frames and compressed motion cues (e.g., motion vectors and residuals). However, this task severely suffers from the coarse and noisy dynamics and the insufficient fusion of the heterogeneous RGB and motion modalities. To address the two issues above, this paper proposes a novel framework, namely Attentive Cross-modal Interaction Network with Motion Enhancement (MEACI-Net). It follows the two-stream architecture, i.e. one for the RGB modality and the other for the motion modality. Particularly, the motion stream employs a multi-scale block embedded with a denoising module to enhance representation learning. The interaction between the two streams is then strengthened by introducing the Selective Motion Complement (SMC) and Cross-Modality Augment (CMA) modules, where SMC complements the RGB modality with spatio-temporally attentive local motion features and CMA further combines the two modalities with selective feature augmentation. Extensive experiments on the UCF-101, HMDB-51 and Kinetics-400 benchmarks demonstrate the effectiveness and efficiency of MEACI-Net.

Submitted 15 June, 2022; v1 submitted 7 May, 2022; originally announced May 2022.

Comments: Accepted to IJCAI 2022

arXiv:2204.09783 [pdf, other]

TopoEmbedding, a web tool for the interactive analysis of persistent homology

Authors: Xueyi Bao, Guoxi Liu, Federico Iuricich

Abstract: Software libraries for Topological Data Analysis (TDA) offer limited support for interactive visualization. Most libraries only allow to visualize topological descriptors (e.g., persistence diagrams), and lose the connection with the original domain of data. This makes it challenging for users to interpret the results of a TDA pipeline in an exploratory context. In this paper, we present TopoEmbedding, a web-based tool that simplifies the interactive visualization and analysis of persistence-based descriptors. TopoEmbedding allows non-experts in TDA to explore similarities and differences found by TDA descriptors with simple yet effective visualization techniques.

Submitted 20 April, 2022; originally announced April 2022.

Report number: TDAatSDM/2022/10

arXiv:2203.15644 [pdf, other]

doi 10.1088/1361-648X/ac8a37

Floquet topological properties in the Non-Hermitian long-range system with complex hopping amplitudes

Authors: Gang-Feng Guo, Yan Wang, Xi-Xi Bao, Lei Tan

Abstract: Non-equilibrium phases of matter have attracted much attention in recent years, among which the Floquet phase is a hot point. In this work, based on the Periodic driving Non-Hermitian model, we reveal that the winding number calculated in the framework of the Bloch band theory has a direct connection with the number of edge states even the Non-Hermiticity is present. Further, we find that the change of the phase of the hopping amplitude can induce the topological phase transitions. Precisely speaking, the increase of the value of the phase can bring the system into the larger topological phase. Moreover, it can be unveiled that the introduction of the purely imaginary hopping term brings an extremely rich phase diagram. In addition, we can select the even topological invariant exactly from the unlimited winding numbers if we only consider the next-nearest neighbor hopping term. Here, the results obtained may be useful for understanding the Periodic driving Non-Hermitian theory.

Submitted 29 March, 2022; originally announced March 2022.

arXiv:2203.13727 [pdf, other]

doi 10.1088/1674-1056/acc3f6

Topological state transfers in cavity-magnon system

Authors: Xi-Xi Bao, Gang-Feng Guo, Lei Tan

Abstract: We propose an experimentally feasible scheme for realizing quantum state transfer via the topological edge states in a one-dimensional cavity-magnon lattice. We find that the cavity-magnon system can be mapped analytically into the generalized Su-Schrieffer-Heeger model with tunable cavity-magnon coupling. It can be shown that the edge state can be served as a quantum channel to realize the photonic and magnonic state transfers by adjusting the cavity-cavity coupling strength. Further, our scheme can realize the quantum state transfer between photonic state and magnonic state by changing the amplitude of the intracell hopping. With a numerical simulation, we quantitatively show that the photonic, magnonic and magnon-to-photon state transfers can be achieved with high fidelity in the cavity-magnon lattice. Spectacularly, the three different types of quantum state transfer schemes can be even transformed to each other in a controllable fashion. This system provides a novel way of realizing quantum state transfer and can be implemented in quantum computing platforms.

Submitted 25 March, 2022; originally announced March 2022.

arXiv:2203.08406 [pdf, ps, other]

Levenberg-Marquardt Method Based Cooperative Source Localization in SIMO Molecular Communication via Diffusion Systems

Authors: Yuqi Miao, Wence Zhang, Xu Bao

Abstract: Molecular communication underpins nano-scale communications in nanotechnology. The combination of multinanomachines to form nano-networks is one of the main enabling methods. Due to the importance of source localization in establishing nano-networks, this paper proposes a cooperative source localization method for Molecular Communication via Diffusion (MCvD) systems using multiple spherical absorption receivers. Since there is no exact mathematical expression of the channel impulse response for multiple absorbing receivers, we adopt an empirical expression and use Levenberg-Marquardt method to estimate the distance of the transmitter to each receiver, based on which the location of the transmitter is obtained using an iterative scheme where the initial point is obtained using a multi-point localization method. Particle based simulation is carried out to evaluate the performance of the proposed method. Simulation results show that the proposed method can accurately estimate the location of transmitter in short to medium communication ranges.

Submitted 16 March, 2022; originally announced March 2022.

arXiv:2201.11953 [pdf, other]

Entangling metropolitan-distance separated quantum memories

Authors: Xi-Yu Luo, Yong Yu, Jian-Long Liu, Ming-Yang Zheng, Chao-Yang Wang, Bin Wang, Jun Li, Xiao Jiang, Xiu-Ping Xie, Qiang Zhang, Xiao-Hui Bao, Jian-Wei Pan

Abstract: Quantum internet gives the promise of getting all quantum resources connected, and it will enable applications far beyond a localized scenario. A prototype is a network of quantum memories that are entangled and well separated. Previous realizations are limited in the distance. In this paper, we report the establishment of remote entanglement between two atomic quantum memories physically separated by 12.5 km directly in a metropolitan area. We create atom-photon entanglement in one node and send the photon to a second node for storage. We harness low-loss transmission through a field-deployed fiber of 20.5 km by making use of frequency down-conversion and up-conversion. The final memory-memory entanglement is verified to have a fidelity of 90% via retrieving to photons. Our experiment paves the way to study quantum network applications in a practical scenario.

Submitted 28 January, 2022; originally announced January 2022.

Comments: 9 pages in total, 4 figures in the main text, and 5 figures in the supplementary material

arXiv:2112.10970 [pdf, other]

A deterministic-particle-based scheme for micro-macro viscoelastic flows

Authors: Xuelian Bao, Chun Liu, Yiwei Wang

Abstract: In this article, we introduce a new method for discretizing micro-macro models of dilute polymeric fluids based on deterministic particles. Our approach integrates a finite element discretization for the macroscopic fluid dynamic equation with a deterministic variational particle scheme for the microscopic Fokker-Planck equation. To address challenges arising from micro-macro coupling, we employ a discrete energetic variational approach to derive a coarse-grained micro-macro model with a particle approximation first and then develop a particle-FEM discretization for the coarse-grained model. The accuracy of our method is evaluated for a Hookean dumbbell model in a Couette flow by comparing the computed velocity field with existing analytical solutions. We also use our method to study nonlinear FENE dumbbell models in different scenarios, such as extension flow, pure shear flow, and lid-driven cavity flow. Numerical examples demonstrate that the proposed deterministic particle approach can accurately capture the various key rheological phenomena in the original FENE model, including hysteresis and $δ$-function-like spike behavior in extension flows, velocity overshoot phenomenon in pure shear flow, symmetries breaking, vortex center shifting and vortices weakening in the lid-driven cavity flow, with a small number of particles.

Submitted 20 April, 2023; v1 submitted 20 December, 2021; originally announced December 2021.

arXiv:2112.09447 [pdf, other]

doi 10.1038/s41566-022-01054-3

Sequential generation of multiphoton entanglement with a Rydberg superatom

Authors: Chao-Wei Yang, Yong Yu, Jun Li, Bo Jing, Xiao-Hui Bao, Jian-Wei Pan

Abstract: Multiqubit entanglement is an indispensable resource for quantum information science. In particular, the entanglement of photons is of conceptual interest due to its implications in measurement-based quantum computing, communication, and metrology. The traditional way of spontaneous parametric down-conversion already demonstrates entanglement of up to a dozen photons but is hindered by its probabilistic nature. Here, we experimentally demonstrate an efficient approach for multi-photon generation with a Rydberg superatom, a mesoscopic atomic ensemble under Rydberg blockade. Using it as an efficient single-photon interface, we iterate the photon creation process that gives rise to a train of temporal photonic modes entangled in photon number degree. We detect the multiphoton entanglement via converting the photon number degree to a time-bin degree. Photon correlations verify entanglement up to 12 modes. The efficiency scaling factor for adding one photon is 0.27, surpassing previous results, and can be increased significantly without fundamental limitations.

Submitted 17 December, 2021; originally announced December 2021.

Comments: 11 pages, 9 figures

arXiv:2111.05794 [pdf, other]

PIMIP: An Open Source Platform for Pathology Information Management and Integration

Authors: Jialun Wu, Anyu Mao, Xinrui Bao, Haichuan Zhang, Zeyu Gao, Chunbao Wang, Tieliang Gong, Chen Li

Abstract: Digital pathology plays a crucial role in the development of artificial intelligence in the medical field. The digital pathology platform can make the pathological resources digital and networked, and realize the permanent storage of visual data and the synchronous browsing processing without the limitation of time and space. It has been widely used in various fields of pathology. However, there is still a lack of an open and universal digital pathology platform to assist doctors in the management and analysis of digital pathological sections, as well as the management and structured description of relevant patient information. Most platforms cannot integrate image viewing, annotation and analysis, and text information management. To solve the above problems, we propose a comprehensive and extensible platform PIMIP. Our PIMIP has developed the image annotation functions based on the visualization of digital pathological sections. Our annotation functions support multi-user collaborative annotation and multi-device annotation, and realize the automation of some annotation tasks. In the annotation task, we invited a professional pathologist for guidance. We introduce a machine learning module for image analysis. The data we collected included public data from local hospitals and clinical examples. Our platform is more clinical and suitable for clinical use. In addition to image data, we also structured the management and display of text information. So our platform is comprehensive. The platform framework is built in a modular way to support users to add machine learning modules independently, which makes our platform extensible.

Submitted 9 November, 2021; originally announced November 2021.

Comments: BIBM 2021 accepted, including 8 pages, 8 figures

arXiv:2110.13670 [pdf, other]

W-Net: A Two-Stage Convolutional Network for Nucleus Detection in Histopathology Image

Authors: Anyu Mao, Jialun Wu, Xinrui Bao, Zeyu Gao, Tieliang Gong, Chen Li

Abstract: Pathological diagnosis is the gold standard for cancer diagnosis, but it is labor-intensive, in which tasks such as cell detection, classification, and counting are particularly prominent. A common solution for automating these tasks is using nucleus segmentation technology. However, it is hard to train a robust nucleus segmentation model, due to several challenging problems, the nucleus adhesion, stacking, and excessive fusion with the background. Recently, some researchers proposed a series of automatic nucleus segmentation methods based on point annotation, which can significant improve the model performance. Nevertheless, the point annotation needs to be marked by experienced pathologists. In order to take advantage of segmentation methods based on point annotation, further alleviate the manual workload, and make cancer diagnosis more efficient and accurate, it is necessary to develop an automatic nucleus detection algorithm, which can automatically and efficiently locate the position of the nucleus in the pathological image and extract valuable information for pathologists. In this paper, we propose a W-shaped network for automatic nucleus detection. Different from the traditional U-Net based method, mapping the original pathology image to the target mask directly, our proposed method split the detection task into two sub-tasks. The first sub-task maps the original pathology image to the binary mask, then the binary mask is mapped to the density mask in the second sub-task. After the task is split, the task's difficulty is significantly reduced, and the network's overall performance is improved.

Submitted 26 October, 2021; originally announced October 2021.

Comments: BIBM 2021 accepted, including 8 pages, 3 figures

arXiv:2110.13652 [pdf, other]

A Precision Diagnostic Framework of Renal Cell Carcinoma on Whole-Slide Images using Deep Learning

Authors: Jialun Wu, Haichuan Zhang, Zeyu Gao, Xinrui Bao, Tieliang Gong, Chunbao Wang, Chen Li

Abstract: Diagnostic pathology, which is the basis and gold standard of cancer diagnosis, provides essential information on the prognosis of the disease and vital evidence for clinical treatment. Tumor region detection, subtype and grade classification are the fundamental diagnostic indicators for renal cell carcinoma (RCC) in whole-slide images (WSIs). However, pathological diagnosis is subjective, differences in observation and diagnosis between pathologists is common in hospitals with inadequate diagnostic capacity. The main challenge for developing deep learning based RCC diagnostic system is the lack of large-scale datasets with precise annotations. In this work, we proposed a deep learning-based framework for analyzing histopathological images of patients with renal cell carcinoma, which has the potential to achieve pathologist-level accuracy in diagnosis. A deep convolutional neural network (InceptionV3) was trained on the high-quality annotated dataset of The Cancer Genome Atlas (TCGA) whole-slide histopathological image for accurate tumor area detection, classification of RCC subtypes, and ISUP grades classification of clear cell carcinoma subtypes. These results suggest that our framework can help pathologists in the detection of cancer region and classification of subtypes and grades, which could be applied to any cancer type, providing auxiliary diagnosis and promoting clinical consensus.

Submitted 26 October, 2021; originally announced October 2021.

Comments: BIBM 2021 accepted, 9 pages including reference, 3 figures and 1 table

arXiv:2108.07535 [pdf, other]

SPMoE: Generate Multiple Pattern-Aware Outputs with Sparse Pattern Mixture of Experts

Authors: Shaobo Cui, Xintong Bao, Xuming Lin, Zhongzhou Zhao, Ji Zhang, Wei Zhou, Haiqing Chen

Abstract: Many generation tasks follow a one-to-many mapping relationship: each input could be associated with multiple outputs. Existing methods like Conditional Variational AutoEncoder(CVAE) employ a latent variable to model this one-to-many relationship. However, this high-dimensional and dense latent variable lacks explainability and usually leads to poor and uncontrollable generations. In this paper, we innovatively introduce the linguistic concept of pattern to decompose the one-to-many mapping into multiple one-to-one mappings and further propose a model named Sparse Pattern Mixture of Experts(SPMoE). Each one-to-one mapping is associated with a conditional generation pattern and is modeled with an expert in SPMoE. To ensure each language pattern can be exclusively handled with an expert model for better explainability and diversity, a sparse mechanism is employed to coordinate all the expert models in SPMoE. We assess the performance of our SPMoE on the paraphrase generation task and the experiment results prove that SPMoE can achieve a good balance in terms of quality, pattern-level diversity, and corpus-level diversity.

Submitted 17 August, 2021; v1 submitted 17 August, 2021; originally announced August 2021.

arXiv:2108.02768 [pdf, other]

Learning to Elect

Authors: Cem Anil, Xuchan Bao

Abstract: Voting systems have a wide range of applications including recommender systems, web search, product design and elections. Limited by the lack of general-purpose analytical tools, it is difficult to hand-engineer desirable voting rules for each use case. For this reason, it is appealing to automatically discover voting rules geared towards each scenario. In this paper, we show that set-input neural network architectures such as Set Transformers, fully-connected graph networks and DeepSets are both theoretically and empirically well-suited for learning voting rules. In particular, we show that these network models can not only mimic a number of existing voting rules to compelling accuracy -- both position-based (such as Plurality and Borda) and comparison-based (such as Kemeny, Copeland and Maximin) -- but also discover near-optimal voting rules that maximize different social welfare functions. Furthermore, the learned voting rules generalize well to different voter utility distributions and election sizes unseen during training.

Submitted 1 October, 2021; v1 submitted 5 August, 2021; originally announced August 2021.

arXiv:2108.02436 [pdf, other]

doi 10.1103/PhysRevLett.128.060502

Deterministic Time-Bin Entanglement between a Single Photon and an Atomic Ensemble

Authors: Peng-Fei Sun, Yong Yu, Zi-Ye An, Jun Li, Chao-Wei Yang, Xiao-Hui Bao, Jian-Wei Pan

Abstract: Hybrid matter-photon entanglement is the building block for quantum networks. It is very favorable if the entanglement can be prepared with a high probability. In this paper, we report the deterministic creation of entanglement between an atomic ensemble and a single photon by harnessing Rydberg blockade. We design a scheme that creates entanglement between a single photon's temporal modes and the Rydberg levels that host a collective excitation, using a process of cyclical retrieving and patching. The hybrid entanglement is tested via retrieving the atomic excitation as a second photon and performing correlation measurements, which suggest an entanglement fidelity of 87.8%. Our source of matter-photon entanglement will enable the entangling of remote quantum memories with much higher efficiency.

Submitted 5 August, 2021; originally announced August 2021.

Comments: 6 pages, 4 figures

Showing 1–50 of 132 results for author: Ba, X