Search | arXiv e-print repository

SCAAT: Improving Neural Network Interpretability via Saliency Constrained Adaptive Adversarial Training

Authors: Rui Xu, Wenkang Qin, Peixiang Huang, Hao Wang, Lin Luo

Abstract: Deep Neural Networks (DNNs) are expected to provide explanation for users to understand their black-box predictions. Saliency map is a common form of explanation illustrating the heatmap of feature attributions, but it suffers from noise in distinguishing important features. In this paper, we propose a model-agnostic learning method called Saliency Constrained Adaptive Adversarial Training (SCAAT) to improve the quality of such DNN interpretability. By constructing adversarial samples under the guidance of saliency map, SCAAT effectively eliminates most noise and makes saliency maps sparser and more faithful without any modification to the model architecture. We apply SCAAT to multiple DNNs and evaluate the quality of the generated saliency maps on various natural and pathological image datasets. Evaluations on different domains and metrics show that SCAAT significantly improves the interpretability of DNNs by providing more faithful saliency maps without sacrificing their predictive power.

Submitted 10 November, 2023; v1 submitted 8 November, 2023; originally announced November 2023.

arXiv:2311.03774 [pdf, other]

Meta-Adapter: An Online Few-shot Learner for Vision-Language Model

Authors: Cheng Cheng, Lin Song, Ruoyi Xue, Hang Wang, Hongbin Sun, Yixiao Ge, Ying Shan

Abstract: The contrastive vision-language pre-training, known as CLIP, demonstrates remarkable potential in perceiving open-world visual concepts, enabling effective zero-shot image recognition. Nevertheless, few-shot learning methods based on CLIP typically require offline fine-tuning of the parameters on few-shot samples, resulting in longer inference time and the risk of over-fitting in certain domains. To tackle these challenges, we propose the Meta-Adapter, a lightweight residual-style adapter, to refine the CLIP features guided by the few-shot samples in an online manner. With a few training samples, our method can enable effective few-shot learning capabilities and generalize to unseen data or tasks without additional fine-tuning, achieving competitive performance and high efficiency. Without bells and whistles, our approach outperforms the state-of-the-art online few-shot learning method by an average of 3.6% on eight image classification datasets with higher inference speed. Furthermore, our model is simple and flexible, serving as a plug-and-play module directly applicable to downstream tasks. Without further fine-tuning, Meta-Adapter obtains notable performance improvements in open-vocabulary object detection and segmentation tasks.

Submitted 11 January, 2024; v1 submitted 7 November, 2023; originally announced November 2023.

Comments: Accepted by NeurIPS 2023

arXiv:2311.02893 [pdf]

Topological electronic structure and spin texture of quasi-one-dimensional higher-order topological insulator Bi4Br4

Authors: W. X. Zhao, M. Yang, R. Z. Xu, X. Du, Y. D. Li, K. Y. Zhai, C. Peng, D. Pei, H. Gao, Y. W. Li, L. X. Xu, J. F. Han, Y. Huang, Z. K. Liu, Y. G. Yao, J. C. Zhuang, Y. Du, J. J. Zhou, Y. L. Chen, L. X. Yang

Abstract: The notion of topological insulators (TIs), characterized by an insulating bulk and conducting topological surface states, can be extended to higher-order topological insulators (HOTIs) hosting gapless modes localized at the boundaries of two or more dimensions lower than the insulating bulk [1-5]. In this work, by performing high-resolution angle-resolved photoemission spectroscopy (ARPES) measurements with submicron spatial and spin resolutions, we systematically investigate the electronic structure and spin texture of quasi-one-dimensional (1D) HOTI candidate Bi4Br4. In contrast to the bulk-state-dominant spectra on the (001) surface, we observe gapped surface states on the (100) surface, whose dispersion and spin-polarization agree well with our ab initio calculations. Moreover, we reveal in-gap states connecting the surface valence and conduction bands, which is an explicit signature of the existence of hinge states inside the (100) surface gap. Our findings provide compelling evidence for the HOTI phase of Bi4Br4. The identification of the higher-order topological phase will lay the promising prospect of applications based on 1D spin-momentum locked current in electronic and spintronic devices.

Submitted 6 November, 2023; originally announced November 2023.

arXiv:2311.00530 [pdf, other]

Advances in Embodied Navigation Using Large Language Models: A Survey

Authors: Jinzhou Lin, Han Gao, Xuxiang Feng, Rongtao Xu, Changwei Wang, Man Zhang, Li Guo, Shibiao Xu

Abstract: In recent years, the rapid advancement of Large Language Models (LLMs) such as the Generative Pre-trained Transformer (GPT) has attracted increasing attention due to their potential in a variety of practical applications. The application of LLMs with Embodied Intelligence has emerged as a significant area of focus. Among the myriad applications of LLMs, navigation tasks are particularly noteworthy because they demand a deep understanding of the environment and quick, accurate decision-making. LLMs can augment embodied intelligence systems with sophisticated environmental perception and decision-making support, leveraging their robust language and image-processing capabilities. This article offers an exhaustive summary of the symbiosis between LLMs and embodied intelligence with a focus on navigation. It reviews state-of-the-art models, research methodologies, and assesses the advantages and disadvantages of existing embodied navigation models and datasets. Finally, the article elucidates the role of LLMs in embodied intelligence, based on current research, and forecasts future directions in the field. A comprehensive list of studies in this survey is available at https://github.com/Rongtao-Xu/Awesome-LLM-EN.

Submitted 7 June, 2024; v1 submitted 1 November, 2023; originally announced November 2023.

arXiv:2311.00287 [pdf, other]

Knowledge-Infused Prompting: Assessing and Advancing Clinical Text Data Generation with Large Language Models

Authors: Ran Xu, Hejie Cui, Yue Yu, Xuan Kan, Wenqi Shi, Yuchen Zhuang, Wei Jin, Joyce Ho, Carl Yang

Abstract: Clinical natural language processing requires methods that can address domain-specific challenges, such as complex medical terminology and clinical contexts. Recently, large language models (LLMs) have shown promise in this domain. Yet, their direct deployment can lead to privacy issues and is constrained by resources. To address this challenge, we delve into synthetic clinical text generation using LLMs for clinical NLP tasks. We propose an innovative, resource-efficient approach, ClinGen, which infuses knowledge into the process. Our model involves clinical knowledge extraction and context-informed LLM prompting. Both clinical topics and writing styles are drawn from external domain-specific knowledge graphs and LLMs to guide data generation. Our extensive empirical study across 7 clinical NLP tasks and 16 datasets reveals that ClinGen consistently enhances performance across various tasks, effectively aligning the distribution of real datasets and significantly enriching the diversity of generated training instances. We will publish our code and all the generated data in https://github.com/ritaranx/ClinGen.

Submitted 1 November, 2023; originally announced November 2023.

arXiv:2310.20607 [pdf, other]

What a Whole Slide Image Can Tell? Subtype-guided Masked Transformer for Pathological Image Captioning

Authors: Wenkang Qin, Rui Xu, Peixiang Huang, Xiaomin Wu, Heyu Zhang, Lin Luo

Abstract: Pathological captioning of Whole Slide Images (WSIs), though essential in computer-aided pathological diagnosis, has rarely been studied due to the limitations in datasets and model training efficacy. In this paper, we propose a new paradigm, Subtype-guided Masked Transformer (SGMT), for pathological captioning based on Transformers, which treats a WSI as a sequence of sparse patches and generates an overall caption sentence from the sequence. An accompanying subtype prediction is introduced into SGMT to guide the training process and enhance the captioning accuracy. We also present an Asymmetric Masked Mechanism approach to tackle the large size constraint of pathological image captioning, where the numbers of sequencing patches in SGMT are sampled differently in the training and inferring phases, respectively. Experiments on the PatchGastricADC22 dataset demonstrate that our approach effectively adapts to the task with a transformer-based model and achieves better performance than traditional RNN-based methods. Our codes are to be made available for further research and development.

Submitted 31 October, 2023; originally announced October 2023.

arXiv:2310.20427 [pdf, other]

Assessing and Enhancing Robustness of Deep Learning Models with Corruption Emulation in Digital Pathology

Authors: Peixiang Huang, Songtao Zhang, Yulu Gan, Rui Xu, Rongqi Zhu, Wenkang Qin, Limei Guo, Shan Jiang, Lin Luo

Abstract: Deep learning in digital pathology brings intelligence and automation as substantial enhancements to pathological analysis, the gold standard of clinical diagnosis. However, multiple steps from tissue preparation to slide imaging introduce various image corruptions, making it difficult for deep neural network (DNN) models to achieve stable diagnostic results for clinical use. In order to assess and further enhance the robustness of the models, we analyze the physical causes of the full-stack corruptions throughout the pathological life-cycle and propose an Omni-Corruption Emulation (OmniCE) method to reproduce 21 types of corruptions quantified with 5-level severity. We then construct three OmniCE-corrupted benchmark datasets at both patch level and slide level and assess the robustness of popular DNNs in classification and segmentation tasks. Further, we explore using the OmniCE-corrupted datasets as augmentation data for training, and experiments verify that the generalization ability of the models is significantly enhanced.

Submitted 31 October, 2023; originally announced October 2023.

arXiv:2310.18804 [pdf, other]

Open Visual Knowledge Extraction via Relation-Oriented Multimodality Model Prompting

Authors: Hejie Cui, Xinyu Fang, Zihan Zhang, Ran Xu, Xuan Kan, Xin Liu, Yue Yu, Manling Li, Yangqiu Song, Carl Yang

Abstract: Images contain rich relational knowledge that can help machines understand the world. Existing methods on visual knowledge extraction often rely on the pre-defined format (e.g., sub-verb-obj tuples) or vocabulary (e.g., relation types), restricting the expressiveness of the extracted knowledge. In this work, we take a first exploration to a new paradigm of open visual knowledge extraction. To achieve this, we present OpenVik which consists of an open relational region detector to detect regions potentially containing relational knowledge and a visual knowledge generator that generates format-free knowledge by prompting the large multimodality model with the detected region of interest. We also explore two data enhancement techniques for diversifying the generated format-free visual knowledge. Extensive knowledge quality evaluations highlight the correctness and uniqueness of the extracted open visual knowledge by OpenVik. Moreover, integrating our extracted knowledge across various visual reasoning applications shows consistent improvements, indicating the real-world applicability of OpenVik.

Submitted 28 October, 2023; originally announced October 2023.

Comments: Accepted to NeurIPS 2023

arXiv:2310.18596 [pdf, other]

How Hard is Takeover in DPoS Blockchains? Understanding the Security of Coin-based Voting Governance

Authors: Chao Li, Balaji Palanisamy, Runhua Xu, Li Duan, Jiqiang Liu, Wei Wang

Abstract: Delegated-Proof-of-Stake (DPoS) blockchains, such as EOSIO, Steem and TRON, are governed by a committee of block producers elected via a coin-based voting system. We recently witnessed the first de facto blockchain takeover that happened between Steem and TRON. Within one hour of this incident, TRON founder took over the entire Steem committee, forcing the original Steem community to leave the blockchain that they maintained for years. This is a historical event in the evolution of blockchains and Web 3.0. Despite its significant disruptive impact, little is known about how vulnerable DPoS blockchains are in general to takeovers and the ways in which we can improve their resistance to takeovers. In this paper, we demonstrate that the resistance of a DPoS blockchain to takeovers is governed by both the theoretical design and the actual use of its underlying coin-based voting governance system. When voters actively cooperate to resist potential takeovers, our theoretical analysis reveals that the current active resistance of DPoS blockchains is far below the theoretical upper bound. However, in practice, voter preferences could be significantly different. This paper presents the first large-scale empirical study of the passive takeover resistance of EOSIO, Steem and TRON. Our study identifies the diversity in voter preferences and characterizes the impact of this diversity on takeover resistance. Through both theoretical and empirical analyses, our study provides novel insights into the security of coin-based voting governance and suggests potential ways to improve the takeover resistance of any blockchain that implements this governance model.

Submitted 28 October, 2023; originally announced October 2023.

Comments: This work has been accepted by ACM CCS 2023

arXiv:2310.17976 [pdf, other]

InCharacter: Evaluating Personality Fidelity in Role-Playing Agents through Psychological Interviews

Authors: Xintao Wang, Yunze Xiao, Jen-tse Huang, Siyu Yuan, Rui Xu, Haoran Guo, Quan Tu, Yaying Fei, Ziang Leng, Wei Wang, Jiangjie Chen, Cheng Li, Yanghua Xiao

Abstract: Role-playing agents (RPAs), powered by large language models, have emerged as a flourishing field of applications. However, a key challenge lies in assessing whether RPAs accurately reproduce the personas of target characters, namely their character fidelity. Existing methods mainly focus on the knowledge and linguistic patterns of characters. This paper, instead, introduces a novel perspective to evaluate the personality fidelity of RPAs with psychological scales. Overcoming drawbacks of previous self-report assessments on RPAs, we propose InCharacter, namely Interviewing Character agents for personality tests. Experiments include various types of RPAs and LLMs, covering 32 distinct characters on 14 widely used psychological scales. The results validate the effectiveness of InCharacter in measuring RPA personalities. Then, with InCharacter, we show that state-of-the-art RPAs exhibit personalities highly aligned with the human-perceived personalities of the characters, achieving an accuracy up to 80.7%.

Submitted 7 June, 2024; v1 submitted 27 October, 2023; originally announced October 2023.

Comments: ACL 2024

arXiv:2310.17082 [pdf, ps, other]

Does or did the supernova remnant Cassiopeia A operate as a PeVatron?

Authors: Zhen Cao, F. Aharonian, Q. An, Axikegu, Y. X. Bai, Y. W. Bao, D. Bastieri, X. J. Bi, Y. J. Bi, J. T. Cai, Q. Cao, W. Y. Cao, Zhe Cao, J. Chang, J. F. Chang, A. M. Chen, E. S. Chen, Liang Chen, Lin Chen, Long Chen, M. J. Chen, M. L. Chen, Q. H. Chen, S. H. Chen, S. Z. Chen, et al. (255 additional authors not shown)

Abstract: For decades, supernova remnants (SNRs) have been considered the prime sources of Galactic Cosmic rays (CRs). But whether SNRs can accelerate CR protons to PeV energies and thus dominate CR flux up to the knee is currently under intensive theoretical and phenomenological debate. The direct test of the ability of SNRs to operate as CR PeVatrons can be provided by ultrahigh-energy (UHE; $E_\gamma \geq 100$ TeV) $\gamma$-rays. In this context, the historical SNR Cassiopeia A (Cas A) is considered one of the most promising targets for UHE observations. This paper presents the observation of Cas A and its vicinity by the LHAASO KM2A detector. The exceptional sensitivity of LHAASO KM2A in the UHE band, combined with the young age of Cas A, enabled us to derive stringent model-independent limits on the energy budget of UHE protons and nuclei accelerated by Cas A at any epoch after the explosion. The results challenge the prevailing paradigm that Cas A-type SNRs are major suppliers of PeV CRs in the Milky Way.

Submitted 25 October, 2023; originally announced October 2023.

Comments: 11 pages, 3 figures, Accepted by the APJL

arXiv:2310.16421 [pdf, other]

Graph Agent: Explicit Reasoning Agent for Graphs

Authors: Qinyong Wang, Zhenxiang Gao, Rong Xu

Abstract: Graph embedding methods such as Graph Neural Networks (GNNs) and Graph Transformers have contributed to the development of graph reasoning algorithms for various tasks on knowledge graphs. However, the lack of interpretability and explainability of graph embedding methods has limited their applicability in scenarios requiring explicit reasoning. In this paper, we introduce the Graph Agent (GA), an intelligent agent methodology of leveraging large language models (LLMs), inductive-deductive reasoning modules, and long-term memory for knowledge graph reasoning tasks. GA integrates aspects of symbolic reasoning and existing graph embedding methods to provide an innovative approach for complex graph reasoning tasks. By converting graph structures into textual data, GA enables LLMs to process, reason, and provide predictions alongside human-interpretable explanations. The effectiveness of the GA was evaluated on node classification and link prediction tasks. Results showed that GA reached state-of-the-art performance, demonstrating accuracy of 90.65%, 95.48%, and 89.32% on Cora, PubMed, and PrimeKG datasets, respectively. Compared to existing GNN and transformer models, GA offered the advantages of explicit reasoning ability, no training requirement, and easy adaptation to various graph reasoning tasks.

Submitted 25 October, 2023; originally announced October 2023.

arXiv:2310.16414 [pdf]

V2C MXene-modified g-C3N4 for enhanced visible-light photocatalytic activity

Authors: Ruizheng Xu, Guiyu Wei, Zhemin Xie, Sijie Diao, Jianfeng Wen, Tao Tang, Li Jiang, Ming Li, Guanghui Hu

Abstract: Increasing the efficiency of charge transfer and separation efficiency of photogenerated carriers are still the main challenges in the field of semiconductor-based photocatalysts. Herein, we synthesized g-C3N4@V2C MXene photocatalyst by modifying g-C3N4 using V2C MXene. The prepared photocatalyst exhibited outstanding photocatalytic performance under visible light. The degradation efficiency of methyl orange by g-C3N4@V2C MXene photocatalyst was as high as 94.5%, which is 1.56 times higher than that by g-C3N4. This was attributed to the V2C MXene inhibiting the rapid recombination of photogenerated carriers and facilitating rapid transfer of photogenerated electrons (e−) from g-C3N4 to MXene. Moreover, g-C3N4@V2C MXene photocatalyst showed good cycling stability. The photocatalytic performance was higher than 85% after three cycles. Experiments to capture free radicals revealed that superoxide radicals (·O2−) are the main contributors to the photocatalytic activity. Thus, the proposed g-C3N4@V2C MXene photocatalyst is a promising visible-light catalyst.

Submitted 25 October, 2023; originally announced October 2023.

Comments: 20 pages, 9 figures

arXiv:2310.16133 [pdf, other]

The renormalization of the shell-model GT operator starting from effective field theory for nuclear systems

Authors: L. Coraggio, N. Itaco, G. De Gregorio, A. Gargano, Z. H. Cheng, Y. Z. Ma, F. R. Xu, M. Viviani

Abstract: For the first time, we approach in this work the problem of the renormalization of the Gamow-Teller decay operator for nuclear shell-model calculations by way of many-body perturbation theory, starting from a nuclear Hamiltonian and electroweak currents derived consistently by way of the chiral perturbation theory. These are the inputs we need to construct microscopically the effective shell-model Hamiltonians and decay operators. The goal is to assess the role of both electroweak currents and many-body correlations as the origins of the well-known problem of the quenching of the axial coupling constant gA. To this end, the calculation of observables related to the Gamow-Teller transitions has been performed for several nuclear systems outside the 40Ca and 56Ni closed cores and compared with the available data.

Submitted 22 December, 2023; v1 submitted 24 October, 2023; originally announced October 2023.

Comments: 15 pages, 13 figures, 3 tables, to be published in Physical Review C

arXiv:2310.15694 [pdf, other]

COPR: Continual Learning Human Preference through Optimal Policy Regularization

Authors: Han Zhang, Lin Gui, Yuanzhao Zhai, Hui Wang, Yu Lei, Ruifeng Xu

Abstract: The technique of Reinforcement Learning from Human Feedback (RLHF) is a commonly employed method to improve pre-trained Language Models (LM), enhancing their ability to conform to human preferences. Nevertheless, the current RLHF-based LMs necessitate full retraining each time novel queries or feedback are introduced, which becomes a challenging task because human preferences can vary between different domains or tasks. Retraining LMs poses practical difficulties in many real-world situations due to the significant time and computational resources required, along with concerns related to data privacy. To address this limitation, we propose a new method called Continual Optimal Policy Regularization (COPR), in which we compute the distribution of optimal policy bypassing the partition function and then regularize the current policy based on the historically optimal distribution to mitigate Catastrophic Forgetting (CF). COPR involves a single learning phase and doesn't necessitate complex reinforcement learning. Importantly, it shares the capability with RLHF to learn from unlabeled data by maintaining a scoring module, similar to reward model, making it flexible for continually learning without human feedback. Our experimental results show that COPR outperforms strong Continuous Learning (CL) baselines when it comes to consistently aligning with human preferences on incremental tasks and domains.

Submitted 26 March, 2024; v1 submitted 24 October, 2023; originally announced October 2023.

arXiv:2310.15423 [pdf]

doi 10.1038/s41566-023-01300-2

Electric quadrupole second harmonic generation revealing dual magnetic orders in a magnetic Weyl semimetal

Authors: Youngjun Ahn, Xiaoyu Guo, Rui Xue, Kejian Qu, Kai Sun, David Mandrus, Liuyan Zhao

Abstract: Broken symmetries and electronic topology are nicely manifested together in the second order nonlinear optical responses from topologically nontrivial materials. While second order nonlinear optical effects from the electric dipole (ED) contribution have been extensively explored in polar Weyl semimetals (WSMs) with broken spatial inversion (SI) symmetry, they are rarely studied in centrosymmetric magnetic WSMs with broken time reversal (TR) symmetry due to complete suppression of the ED contribution. Here, we report experimental demonstration of optical second harmonic generation (SHG) in a magnetic WSM Co$_{3}$Sn$_{2}$S$_{2}$ from the electric quadrupole (EQ) contribution. By tracking the temperature dependence of the rotation anisotropy (RA) of SHG, we capture two magnetic phase transitions, with both the SHG intensity increasing and its RA pattern rotating at $T_{C,1}$=175K and $T_{C,2}$=120K subsequently. The fitted critical exponents for the SHG intensity and RA orientation near $T_{C,1}$ and $T_{C,2}$ suggest that the magnetic phase at $T_{C,1}$ is a 3D Ising-type out-of-plane ferromagnetism while the other at $T_{C,2}$ is a 3D XY-type all-in-all-out in-plane antiferromagnetism. Our results show the success of detection and exploration of EQ SHG in a centrosymmetric magnetic WSM, and hence open the pathway towards the future investigation of its tie to the band topology.

Submitted 23 October, 2023; originally announced October 2023.

Comments: 19 pages, 4 figures

arXiv:2310.14448 [pdf, other]

Semiparametrically Efficient Score for the Survival Odds Ratio

Authors: Denise Rava, Jelena Bradic, Ronghui Xu

Abstract: We consider a general proportional odds model for survival data under binary treatment, where the functional form of the covariates is left unspecified. We derive the efficient score for the conditional survival odds ratio given the covariates using modern semiparametric theory. The efficient score may be useful in the development of doubly robust estimators, although computational challenges remain.

Submitted 14 May, 2024; v1 submitted 22 October, 2023; originally announced October 2023.

arXiv:2310.14323 [pdf, other]

Strangelets at finite temperature: nucleon emission rates, interface and shell effects

Authors: Hao-Song You, Huai-Min Chen, Jian-Feng Xu, Cheng-Jun Xia, Guang-Xiong Peng, Ren-Xin Xu

Abstract: We investigate the properties of strangelets at finite temperature $T$, where an equivparticle model is adopted with both the linear confinement and leading-order perturbative interactions accounted for using density-dependent quark masses. The shell effects are examined by solving the Dirac equations for quarks in the mean-field approximation, which diminish with temperature as the occupation probability of each single-particle levels fixed by the Fermi-Dirac statistics, i.e., shell dampening. Consequently, instead of decreasing with temperature, the surface tension extracted from a liquid-drop formula increases with $T$ until reaching its peak at $T\approx 20$-40 MeV with vanishing shell corrections, where the formula roughly reproduces the free energy per baryon of all strangelets. The curvature term, nevertheless, decreases with $T$ despite the presence of shell effects. The neutron and proton emission rates are fixed microscopically according to the external nucleon gas densities that are in equilibrium with strangelets, which generally increase with $T$ ($\lesssim 50$ MeV) for stable strangelets but decrease for those that are unstable against nucleon emission at $T=0$. The energy, free energy, entropy, charge-to-mass ratio, strangeness per baryon, and root-mean-square radius of $β$-stable strangelets obtained with various parameter sets are presented as well. The results indicated in this work are useful for understanding the products of binary compact star mergers and heavy-ion collisions.

Submitted 22 October, 2023; originally announced October 2023.

arXiv:2310.13882 [pdf]

NMR Spectra Denoising with Vandermonde Constraints

Authors: Di Guo, Runmin Xu, **yu Wu, Mei** Lin, Xiaofeng Du, Xiaobo Qu

Abstract: Nuclear magnetic resonance (NMR) spectroscopy serves as an important tool to analyze chemicals and proteins in bioengineering. However, NMR signals are easily contaminated by noise during the data acquisition, which can affect subsequent quantitative analysis. Therefore, denoising NMR signals has been a long-time concern. In this work, we propose an optimization model-based iterative denoising method, CHORD-V, by treating the time-domain NMR signal as damped exponentials and maintaining the exponential signal form with a Vandermonde factorization. Results on both synthetic and realistic NMR data show that CHORD-V has a superior denoising performance over typical Cadzow and rQRd methods, and the state-of-the-art CHORD method. CHORD-V restores low-intensity spectral peaks more accurately, especially when the noise is relatively high.

Submitted 20 October, 2023; originally announced October 2023.

Comments: 10 pages, 9 figures

arXiv:2310.12848 [pdf, other]

Neural Degradation Representation Learning for All-In-One Image Restoration

Authors: Mingde Yao, Ruikang Xu, Yuanshen Guan, Jie Huang, Zhiwei Xiong

Abstract: Existing methods have demonstrated effective performance on a single degradation type. In practical applications, however, the degradation is often unknown, and the mismatch between the model and the degradation will result in a severe performance drop. In this paper, we propose an all-in-one image restoration network that tackles multiple degradations. Due to the heterogeneous nature of different types of degradations, it is difficult to process multiple degradations in a single network. To this end, we propose to learn a neural degradation representation (NDR) that captures the underlying characteristics of various degradations. The learned NDR decomposes different types of degradations adaptively, similar to a neural dictionary that represents basic degradation components. Subsequently, we develop a degradation query module and a degradation injection module to effectively recognize and utilize the specific degradation based on NDR, enabling the all-in-one restoration ability for multiple degradations. Moreover, we propose a bidirectional optimization strategy to effectively drive NDR to learn the degradation representation by optimizing the degradation and restoration processes alternately. Comprehensive experiments on representative types of degradations (including noise, haze, rain, and downsampling) demonstrate the effectiveness and generalization capability of our method.

Submitted 19 October, 2023; originally announced October 2023.

arXiv:2310.09833 [pdf, other]

Robust Multi-Agent Reinforcement Learning by Mutual Information Regularization

Authors: Simin Li, Ruixiao Xu, **gqiao Xiu, Yuwei Zheng, Pu Feng, Yaodong Yang, Xianglong Liu

Abstract: In multi-agent reinforcement learning (MARL), ensuring robustness against unpredictable or worst-case actions by allies is crucial for real-world deployment. Existing robust MARL methods either approximate or enumerate all possible threat scenarios against worst-case adversaries, leading to computational intensity and reduced robustness. In contrast, human learning efficiently acquires robust behaviors in daily life without preparing for every possible threat. Inspired by this, we frame robust MARL as an inference problem, with worst-case robustness implicitly optimized under all threat scenarios via off-policy evaluation. Within this framework, we demonstrate that Mutual Information Regularization as Robust Regularization (MIR3) during routine training is guaranteed to maximize a lower bound on robustness, without the need for adversaries. Further insights show that MIR3 acts as an information bottleneck, preventing agents from over-reacting to others and aligning policies with robust action priors. In the presence of worst-case adversaries, our MIR3 significantly surpasses baseline methods in robustness and training efficiency while maintaining cooperative performance in StarCraft II and robot swarm control. When deploying the robot swarm control algorithm in the real world, our method also outperforms the best baseline by 14.29%.

Submitted 21 May, 2024; v1 submitted 15 October, 2023; originally announced October 2023.

Comments: arXiv admin note: text overlap with arXiv:2310.00339

arXiv:2310.09824 [pdf, other]

Overconstrained Robotic Limb with Energy-Efficient, Omni-directional Locomotion

Authors: Ronghan Xu, Jiayi Yin, Shihao Feng, Bangchao Huang, Haoran Sun, Jia Pan, Fang Wan, Chaoyang Song

Abstract: This paper studies the design, modeling, and control of a novel quadruped, featuring overconstrained robotic limbs employing the Bennett linkage for motion and power transmission. The modular limb design allows the robot to morph into reptile- or mammal-inspired forms. In contrast to the prevailing focus on planar limbs, this research delves into the classical overconstrained linkages, which have strong theoretical foundations in advanced kinematics but limited engineering applications. The study showcases the morphological superiority of overconstrained robotic limbs that can transform into planar or spherical limbs, exemplifying the Bennett linkage. By conducting kinematic and dynamic modeling, we apply model predictive control to simulate a range of locomotion tasks, revealing that overconstrained limbs outperform planar designs in omni-directional tasks like forward trotting, lateral trotting, and turning on the spot when considering foothold distances. These findings highlight the biological distinctions in limb design between reptiles and mammals and represent the first documented instance of overconstrained robotic limbs outperforming planar designs in dynamic locomotion.

Submitted 3 February, 2024; v1 submitted 15 October, 2023; originally announced October 2023.

Comments: 19 pages, 13 figures, 2 tables

arXiv:2310.08845 [pdf, other]

doi 10.1126/sciadv.adj2778

Very high energy gamma-ray emission beyond 10 TeV from GRB 221009A

Authors: Zhen Cao, F. Aharonian, Q. An, A. Axikegu, Y. X. Bai, Y. W. Bao, D. Bastieri, X. J. Bi, Y. J. Bi, J. T. Cai, Q. Cao, W. Y. Cao, Zhe Cao, J. Chang, J. F. Chang, A. M. Chen, E. S. Chen, Liang Chen, Lin Chen, Long Chen, M. J. Chen, M. L. Chen, Q. H. Chen, S. H. Chen, S. Z. Chen , et al. (255 additional authors not shown)

Abstract: The highest energy gamma-rays from gamma-ray bursts (GRBs) have important implications for their radiation mechanism. Here we report for the first time the detection of gamma-rays up to 13 TeV from the brightest GRB 221009A by the Large High Altitude Air-shower Observatory (LHAASO). The LHAASO-KM2A detector registered more than 140 gamma-rays with energies above 3 TeV during 230$-$900s after the trigger. The intrinsic energy spectrum of gamma-rays can be described by a power-law after correcting for extragalactic background light (EBL) absorption. Such a hard spectrum challenges the synchrotron self-Compton (SSC) scenario of relativistic electrons for the afterglow emission above several TeV. Observations of gamma-rays up to 13 TeV from a source with a measured redshift of z=0.151 hints more transparency in intergalactic space than previously expected. Alternatively, one may invoke new physics such as Lorentz Invariance Violation (LIV) or an axion origin of very high energy (VHE) signals.

Submitted 22 November, 2023; v1 submitted 13 October, 2023; originally announced October 2023.

Comments: 49 pages, 11 figures

Journal ref: Science Advances, 9, eadj2778 (2023) 15 November 2023

arXiv:2310.08583 [pdf, other]

doi 10.1145/3610548.3618176

Discovering Fatigued Movements for Virtual Character Animation

Authors: Noshaba Cheema, Rui Xu, Nam Hee Kim, Perttu Hämäläinen, Vladislav Golyanik, Marc Habermann, Christian Theobalt, Philipp Slusallek

Abstract: Virtual character animation and movement synthesis have advanced rapidly during recent years, especially through a combination of extensive motion capture datasets and machine learning. A remaining challenge is interactively simulating characters that fatigue when performing extended motions, which is indispensable for the realism of generated animations. However, capturing such movements is problematic, as performing movements like backflips with fatigued variations up to exhaustion raises capture cost and risk of injury. Surprisingly, little research has been done on faithful fatigue modeling. To address this, we propose a deep reinforcement learning-based approach, which -- for the first time in literature -- generates control policies for full-body physically simulated agents aware of cumulative fatigue. For this, we first leverage Generative Adversarial Imitation Learning (GAIL) to learn an expert policy for the skill; Second, we learn a fatigue policy by limiting the generated constant torque bounds based on endurance time to non-linear, state- and time-dependent limits in the joint-actuation space using a Three-Compartment Controller (3CC) model. Our results demonstrate that agents can adapt to different fatigue and rest rates interactively, and discover realistic recovery strategies without the need for any captured data of fatigued movement.

Submitted 12 October, 2023; originally announced October 2023.

Comments: 16 pages, 22 figures. To be published in ACM SIGGRAPH Asia Conference Papers 2023. ACM ISBN 979-8-4007-0315-7/23/12

ACM Class: I.3.7

Journal ref: ACM SIGGRAPH Asia Conference Papers 2023

arXiv:2310.08117 [pdf, other]

doi 10.1145/3581783.3611948

DUSA: Decoupled Unsupervised Sim2Real Adaptation for Vehicle-to-Everything Collaborative Perception

Authors: Xianghao Kong, Wentao Jiang, **rang Jia, Yifeng Shi, Runsheng Xu, Si Liu

Abstract: Vehicle-to-Everything (V2X) collaborative perception is crucial for autonomous driving. However, achieving high-precision V2X perception requires a significant amount of annotated real-world data, which can always be expensive and hard to acquire. Simulated data have raised much attention since they can be massively produced at an extremely low cost. Nevertheless, the significant domain gap between simulated and real-world data, including differences in sensor type, reflectance patterns, and road surroundings, often leads to poor performance of models trained on simulated data when evaluated on real-world data. In addition, there remains a domain gap between real-world collaborative agents, e.g. different types of sensors may be installed on autonomous vehicles and roadside infrastructures with different extrinsics, further increasing the difficulty of sim2real generalization. To take full advantage of simulated data, we present a new unsupervised sim2real domain adaptation method for V2X collaborative detection named Decoupled Unsupervised Sim2Real Adaptation (DUSA). Our new method decouples the V2X collaborative sim2real domain adaptation problem into two sub-problems: sim2real adaptation and inter-agent adaptation. For sim2real adaptation, we design a Location-adaptive Sim2Real Adapter (LSA) module to adaptively aggregate features from critical locations of the feature map and align the features between simulated data and real-world data via a sim/real discriminator on the aggregated global feature. For inter-agent adaptation, we further devise a Confidence-aware Inter-agent Adapter (CIA) module to align the fine-grained features from heterogeneous agents under the guidance of agent-wise confidence maps. Experiments demonstrate the effectiveness of the proposed DUSA approach on unsupervised sim2real adaptation from the simulated V2XSet dataset to the real-world DAIR-V2X-C dataset.

Submitted 12 October, 2023; originally announced October 2023.

Comments: ACM MM 2023

arXiv:2310.07247 [pdf, other]

Optimizing the Placement of Roadside LiDARs for Autonomous Driving

Authors: Wentao Jiang, Hao Xiang, Xinyu Cai, Runsheng Xu, Jiaqi Ma, Yikang Li, Gim Hee Lee, Si Liu

Abstract: Multi-agent cooperative perception is an increasingly popular topic in the field of autonomous driving, where roadside LiDARs play an essential role. However, how to optimize the placement of roadside LiDARs is a crucial but often overlooked problem. This paper proposes an approach to optimize the placement of roadside LiDARs by selecting optimized positions within the scene for better perception performance. To efficiently obtain the best combination of locations, a greedy algorithm based on perceptual gain is proposed, which selects the location that can maximize the perceptual gain sequentially. We define perceptual gain as the increased perceptual capability when a new LiDAR is placed. To obtain the perception capability, we propose a perception predictor that learns to evaluate LiDAR placement using only a single point cloud frame. A dataset named Roadside-Opt is created using the CARLA simulator to facilitate research on the roadside LiDAR placement problem.

Submitted 11 October, 2023; originally announced October 2023.

arXiv:2310.07180 [pdf, other]

Integrated Sensing and Communication enabled Multiple Base Stations Cooperative Sensing Towards 6G

Authors: Zhiqing Wei, Wangjun Jiang, Zhiyong Feng, Huici Wu, Ning Zhang, Kaifeng Han, Ruizhong Xu, ** Zhang

Abstract: Driven by the intelligent applications of sixth-generation (6G) mobile communication systems such as smart city and autonomous driving, which connect the physical and cyber space, the integrated sensing and communication (ISAC) brings a revolutionary change to the base stations (BSs) of 6G by integrating radar sensing and communication in the same hardware and wireless resource. However, with the requirements of long-range and accurate sensing in the applications of smart city and autonomous driving, the ISAC enabled single BS still has a limitation in the sensing range and accuracy. With the networked infrastructures of mobile communication systems, multi-BS cooperative sensing is a natural choice satisfying the requirement of long-range and accurate sensing. In this article, the framework of multi-BS cooperative sensing is proposed, breaking through the limitation of single-BS sensing. The enabling technologies, including unified ISAC performance metrics, ISAC signal design and optimization, interference management, cooperative sensing algorithms, are introduced in details. The performance evaluation results are provided to verify the effectiveness of multi-BS cooperative sensing schemes. With ISAC enabled multi-BS cooperative sensing (ISAC-MCS), the intelligent infrastructures connecting physical and cyber space can be established, ushering the era of 6G promoting the intelligence of everything.

Submitted 24 November, 2023; v1 submitted 11 October, 2023; originally announced October 2023.

Comments: 11 pages 6 figures

Journal ref: IEEE NetWork 2023

arXiv:2310.06660 [pdf, other]

doi 10.1103/PhysRevD.108.083009

Identifying axion conversion in compact star magnetospheres with radio-wave polarization signatures

Authors: Z. H. Xue, K. J. Lee, X. D. Gao, R. X. Xu

Abstract: The axion is well motivated in physics. It solves the strong charge conjugation-parity reversal problem CP in fundamental physics and the dark matter problem in astronomy. Its interaction with the electromagnetic field has been expected but never detected experimentally. Such particles may convert to radio waves in the environment with a strong magnetic field. Inspired by the idea, various research groups have been working on theoretical modeling and radio data analysis to search for the signature of radio signals generated by the axion conversion in the magnetosphere of compact stars, where the surface magnetic field as strong as $10^{13}$-$10^{14}$ G is expected. In this work, we calculate the observational properties of the axion-induced radio signals (AIRSs) in the neutron star magnetosphere, where both the total intensity and polarization properties of radio emission are derived. Based on the ray tracing method, assuming 100% linear polarization of radio waves generated in each conversion, we compute the polarization emission profile concerning different viewing angles. We note that plasma and general relativistic effects are important for the polarization properties of AIRSs. Our work suggests that AIRSs can be identified by the narrow bandwidth and distinct polarization features.

Submitted 10 October, 2023; originally announced October 2023.

Comments: 15 pages, 7 figures. Published in Physical Review D

Journal ref: Phys. Rev. D 108, 083009 (2023)

arXiv:2310.05092 [pdf, other]

Benchmarking Large Language Models with Augmented Instructions for Fine-grained Information Extraction

Authors: Jun Gao, Huan Zhao, Yice Zhang, Wei Wang, Changlong Yu, Ruifeng Xu

Abstract: Information Extraction (IE) is an essential task in Natural Language Processing. Traditional methods have relied on coarse-grained extraction with simple instructions. However, with the emergence of Large Language Models (LLMs), there is a need to adapt IE techniques to leverage the capabilities of these models. This paper introduces a fine-grained IE benchmark dataset tailored for LLMs, employing augmented instructions for each information type, which includes task descriptions, extraction rules, output formats, and examples. Through extensive evaluations, we observe that encoder-decoder models, particularly T5 and FLAN-T5, perform well in generalizing to unseen information types, while ChatGPT exhibits greater adaptability to new task forms. Our results also indicate that performance is not solely dictated by model scale, and highlight the significance of architecture, data diversity, and learning techniques. This work paves the way for a more refined and versatile utilization of LLMs in Information Extraction.

Submitted 8 October, 2023; originally announced October 2023.

arXiv:2310.05022 [pdf, other]

Fully Spiking Neural Network for Legged Robots

Authors: Xiaoyang Jiang, Qiang Zhang, **gkai Sun, Jiahang Cao, **gtong Ma, Ren**g Xu

Abstract: In recent years, legged robots based on deep reinforcement learning have made remarkable progress. Quadruped robots have demonstrated the ability to complete challenging tasks in complex environments and have been deployed in real-world scenarios to assist humans. Simultaneously, bipedal and humanoid robots have achieved breakthroughs in various demanding tasks. Current reinforcement learning methods can utilize diverse robot bodies and historical information to perform actions. However, prior research has not emphasized the speed and energy consumption of network inference, as well as the biological significance of the neural networks themselves. Most of the networks employed are traditional artificial neural networks that utilize multilayer perceptrons (MLP). In this paper, we successfully apply a novel Spiking Neural Network (SNN) to process legged robots, achieving outstanding results across a range of simulated terrains. SNN holds a natural advantage over traditional neural networks in terms of inference speed and energy consumption, and their pulse-form processing of body perception signals offers improved biological interpretability. Applying more biomimetic neural networks to legged robots can further reduce the heat dissipation and structural burden caused by the high power consumption of neural networks. To the best of our knowledge, this is the first work to implement SNN in legged robots.

Submitted 23 March, 2024; v1 submitted 8 October, 2023; originally announced October 2023.

arXiv:2310.01806 [pdf]

Improvement and Enhancement of YOLOv5 Small Target Recognition Based on Multi-module Optimization

Authors: Qingyang Li, Yuchen Li, Hongyi Duan, JiaLiang Kang, Jianan Zhang, Xueqian Gan, Ruotong Xu

Abstract: In this paper, the limitations of YOLOv5s model on small target detection task are deeply studied and improved. The performance of the model is successfully enhanced by introducing GhostNet-based convolutional module, RepGFPN-based Neck module optimization, CA and Transformer's attention mechanism, and loss function improvement using NWD. The experimental results validate the positive impact of these improvement strategies on model precision, recall and mAP. In particular, the improved model shows significant superiority in dealing with complex backgrounds and tiny targets in real-world application tests. This study provides an effective optimization strategy for the YOLOv5s model on small target detection, and lays a solid foundation for future related research and applications.

Submitted 3 October, 2023; originally announced October 2023.

Comments: 8 pages 10 figures

arXiv:2310.01248

Improving Emotional Expression and Cohesion in Image-Based Playlist Description and Music Topics: A Continuous Parameterization Approach

Authors: Yuelyu Ji, Yuheng Song, Wei Wang, Ruoyi Xu, Zhongqian Xie, Huiyun Liu

Abstract: Text generation in image-based platforms, particularly for music-related content, requires precise control over text styles and the incorporation of emotional expression. However, existing approaches often need help to control the proportion of external factors in generated text and rely on discrete inputs, lacking continuous control conditions for desired text generation. This study proposes Continuous Parameterization for Controlled Text Generation (CPCTG) to overcome these limitations. Our approach leverages a Language Model (LM) as a style learner, integrating Semantic Cohesion (SC) and Emotional Expression Proportion (EEP) considerations. By enhancing the reward method and manipulating the CPCTG level, our experiments on playlist description and music topic generation tasks demonstrate significant improvements in ROUGE scores, indicating enhanced relevance and coherence in the generated text.

Submitted 12 October, 2023; v1 submitted 2 October, 2023; originally announced October 2023.

Comments: Because I find some important formulation needs to change

arXiv:2310.00574 [pdf, other]

YFlows: Systematic Dataflow Exploration and Code Generation for Efficient Neural Network Inference using SIMD Architectures on CPUs

Authors: Cyrus Zhou, Zack Hassman, Ruize Xu, Dhirpal Shah, Vaugnn Richard, Yan**g Li

Abstract: We address the challenges associated with deploying neural networks on CPUs, with a particular focus on minimizing inference time while maintaining accuracy. Our novel approach is to use the dataflow (i.e., computation order) of a neural network to explore data reuse opportunities using heuristic-guided analysis and a code generation framework, which enables exploration of various Single Instruction, Multiple Data (SIMD) implementations to achieve optimized neural network execution. Our results demonstrate that the dataflow that keeps outputs in SIMD registers while also maximizing both input and weight reuse consistently yields the best performance for a wide variety of inference workloads, achieving up to 3x speedup for 8-bit neural networks, and up to 4.8x speedup for binary neural networks, respectively, over the optimized implementations of neural networks today.

Submitted 23 November, 2023; v1 submitted 1 October, 2023; originally announced October 2023.

ACM Class: B.8.2

arXiv:2309.16992 [pdf, other]

Segment Anything Model is a Good Teacher for Local Feature Learning

Authors: Jingqian Wu, Rongtao Xu, Zach Wood-Doughty, Changwei Wang, Shibiao Xu, Edmund Y. Lam

Abstract: Local feature detection and description play an important role in many computer vision tasks, which are designed to detect and describe keypoints in "any scene" and "any downstream task". Data-driven local feature learning methods need to rely on pixel-level correspondence for training, which is challenging to acquire at scale, thus hindering further improvements in performance. In this paper, we propose SAMFeat to introduce SAM (segment anything model), a fundamental model trained on 11 million images, as a teacher to guide local feature learning and thus inspire higher performance on limited datasets. To do so, first, we construct an auxiliary task of Attention-weighted Semantic Relation Distillation (ASRD), which distills feature relations with category-agnostic semantic information learned by the SAM encoder into a local feature learning network, to improve local feature description using semantic discrimination. Second, we develop a technique called Weakly Supervised Contrastive Learning Based on Semantic Grouping (WSC), which utilizes semantic groupings derived from SAM as weakly supervised signals, to optimize the metric space of local descriptors. Third, we design an Edge Attention Guidance (EAG) to further improve the accuracy of local feature detection and description by prompting the network to pay more attention to the edge region guided by SAM. SAMFeat's performance on various tasks such as image matching on HPatches, and long-term visual localization on Aachen Day-Night showcases its superiority over previous local features. The release code is available at https://github.com/vignywang/SAMFeat.

Submitted 17 June, 2024; v1 submitted 29 September, 2023; originally announced September 2023.

arXiv:2309.14982 [pdf, other]

doi 10.1103/PhysRevLett.132.171001

Experimental Limits on Solar Reflected Dark Matter with a New Approach on Accelerated-Dark-Matter-Electron Analysis in Semiconductors

Authors: Z. Y. Zhang, L. T. Yang, Q. Yue, K. J. Kang, Y. J. Li, H. P. An, Greeshma C., J. P. Chang, Y. H. Chen, J. P. Cheng, W. H. Dai, Z. Deng, C. H. Fang, X. P. Geng, H. Gong, Q. J. Guo, T. Guo, X. Y. Guo, L. He, S. M. He, J. W. Hu, H. X. Huang, T. C. Huang, L. Jiang, S. Karmakar, et al. (59 additional authors not shown)

Abstract: Recently a dark matter-electron (DM-electron) paradigm has drawn much attention. Models beyond the standard halo model describing DM accelerated by high energy celestial bodies are under intense examination as well. In this Letter, a velocity components analysis (VCA) method dedicated to swift analysis of accelerated DM-electron interactions via semiconductor detectors is proposed and the first HPGe detector-based accelerated DM-electron analysis is realized. Utilizing the method, the first germanium based constraint on sub-GeV solar reflected DM-electron interaction is presented with the 205.4 kg$\cdot$day dataset from the CDEX-10 experiment. In the heavy mediator scenario, our result excels in the mass range of 5$-$15 keV/$c^2$, achieving a 3 orders of magnitude improvement compared with previous semiconductor experiments. In the light mediator scenario, the strongest laboratory constraint for DM lighter than 0.1 MeV/$c^2$ is presented. The result proves the feasibility and demonstrates the vast potential of the VCA technique in future accelerated DM-electron analyses with semiconductor detectors.

Submitted 24 April, 2024; v1 submitted 26 September, 2023; originally announced September 2023.

Comments: 7 pages, 4 figures. Version updated to match PRL version

Journal ref: Phys. Rev. Lett. 132, 171001 (2024)

arXiv:2309.14770 [pdf, other]

KERMIT: Knowledge Graph Completion of Enhanced Relation Modeling with Inverse Transformation

Authors: Haotian Li, Lingzhi Wang, Yuliang Wei, Richard Yi Da Xu, Bailing Wang

Abstract: Knowledge graph completion is a task that revolves around filling in missing triples based on the information available in a knowledge graph. Among the current studies, text-based methods complete the task by utilizing textual descriptions of triples. However, this modeling approach may encounter limitations, particularly when the description fails to accurately and adequately express the intended meaning. To overcome these challenges, we propose the augmentation of data through two additional mechanisms. Firstly, we employ ChatGPT as an external knowledge base to generate coherent descriptions to bridge the semantic gap between the queries and answers. Secondly, we leverage inverse relations to create a symmetric graph, thereby creating extra labeling and providing supplementary information for link prediction. This approach offers additional insights into the relationships between entities. Through these efforts, we have observed significant improvements in knowledge graph completion, as these mechanisms enhance the richness and diversity of the available data, leading to more accurate results.

Submitted 26 September, 2023; originally announced September 2023.

arXiv:2309.14114 [pdf, other]

doi 10.1103/PhysRevD.108.123031

Hybrid Strangeon Stars

Authors: Chen Zhang, Yong Gao, Cheng-Jun Xia, Renxin Xu

Abstract: It was conjectured that the basic units of the ground state of bulk strong matter may be strange-clusters called strangeons, and they can form self-bound strangeon stars that are highly compact. Strangeon stars can develop a strange quark matter (SQM) core at high densities, particularly in the color-flavor-locking phase, yielding a branch of hybrid strangeon stars. We explore the stellar structure and astrophysical implications of hybrid strangeon stars. We find that hybrid strangeon stars can meet various astrophysical constraints on pulsar masses, radii, and tidal deformabilities. Finally, we show that the strangeon-SQM mixed phase is not preferred if the charge-neutrality condition is imposed at the strangeon-SQM transition region.

Submitted 8 January, 2024; v1 submitted 25 September, 2023; originally announced September 2023.

Comments: 11 pages, 5 figures. Published version

Journal ref: Phys.Rev.D 108 (2023) 12, 123031

arXiv:2309.13604 [pdf, other]

Distribution-Aware Continual Test-Time Adaptation for Semantic Segmentation

Authors: Jiayi Ni, Senqiao Yang, Ran Xu, Jiaming Liu, Xiaoqi Li, Wenyu Jiao, Zehui Chen, Yi Liu, Shanghang Zhang

Abstract: Since autonomous driving systems usually face dynamic and ever-changing environments, continual test-time adaptation (CTTA) has been proposed as a strategy for transferring deployed models to continually changing target domains. However, the pursuit of long-term adaptation often introduces catastrophic forgetting and error accumulation problems, which impede the practical implementation of CTTA in the real world. Recently, existing CTTA methods mainly focus on utilizing a majority of parameters to fit target domain knowledge through self-training. Unfortunately, these approaches often amplify the challenge of error accumulation due to noisy pseudo-labels, and pose practical limitations stemming from the heavy computational costs associated with entire model updates. In this paper, we propose a distribution-aware tuning (DAT) method to make the semantic segmentation CTTA efficient and practical in real-world applications. DAT adaptively selects and updates two small groups of trainable parameters based on data distribution during the continual adaptation process, including domain-specific parameters (DSP) and task-relevant parameters (TRP). Specifically, DSP exhibits sensitivity to outputs with substantial distribution shifts, effectively mitigating the problem of error accumulation. In contrast, TRP are allocated to positions that are responsive to outputs with minor distribution shifts, which are fine-tuned to avoid the catastrophic forgetting problem. In addition, since CTTA is a temporal task, we introduce the Parameter Accumulation Update (PAU) strategy to collect the updated DSP and TRP in target domain sequences. We conduct extensive experiments on two widely-used semantic segmentation CTTA benchmarks, achieving promising performance compared to previous state-of-the-art methods.

Submitted 29 March, 2024; v1 submitted 24 September, 2023; originally announced September 2023.

arXiv:2309.13302 [pdf, other]

Gaining the Sparse Rewards by Exploring Lottery Tickets in Spiking Neural Network

Authors: Hao Cheng, Jiahang Cao, Erjia Xiao, Mengshu Sun, Renjing Xu

Abstract: Deploying energy-efficient deep learning algorithms on computational-limited devices, such as robots, is still a pressing issue for real-world applications. Spiking Neural Networks (SNNs), a novel brain-inspired algorithm, offer a promising solution due to their low-latency and low-energy properties over traditional Artificial Neural Networks (ANNs). Despite their advantages, the dense structure of deep SNNs can still result in extra energy consumption. The Lottery Ticket Hypothesis (LTH) posits that within dense neural networks, there exist winning Lottery Tickets (LTs), namely sub-networks, that can be obtained without compromising performance. Inspired by this, this paper delves into the spiking-based LTs (SLTs), examining their unique properties and potential for extreme efficiency. Then, two significant sparse \textbf{\textit{Rewards}} are gained through comprehensive explorations and meticulous experiments on SLTs across various dense structures. Moreover, a sparse algorithm tailored for spiking transformer structure, which incorporates convolution operations into the Patch Embedding Projection (ConvPEP) module, has been proposed to achieve Multi-level Sparsity (MultiSp). MultiSp refers to (1) Patch number sparsity; (2) ConvPEP weights sparsity and binarization; and (3) ConvPEP activation layer binarization. Extensive experiments demonstrate that our method achieves extreme sparsity with only a slight performance decrease, paving the way for deploying energy-efficient neural networks in robotics and beyond.

Submitted 27 March, 2024; v1 submitted 23 September, 2023; originally announced September 2023.

Comments: This paper is under submission

arXiv:2309.13245 [pdf, other]

RBFormer: Improve Adversarial Robustness of Transformer by Robust Bias

Authors: Hao Cheng, Jinhao Duan, Hui Li, Lyutianyang Zhang, Jiahang Cao, ** Wang, Jize Zhang, Kaidi Xu, Renjing Xu

Abstract: Recently, there has been a surge of interest and attention in Transformer-based structures, such as Vision Transformer (ViT) and Vision Multilayer Perceptron (VMLP). Compared with the previous convolution-based structures, the Transformer-based structure under investigation showcases a comparable or superior performance under its distinctive attention-based input token mixer strategy. Introducing adversarial examples as a robustness consideration has had a profound and detrimental impact on the performance of well-established convolution-based structures. This inherent vulnerability to adversarial attacks has also been demonstrated in Transformer-based structures. In this paper, our emphasis lies on investigating the intrinsic robustness of the structure rather than introducing novel defense measures against adversarial attacks. To address the susceptibility to robustness issues, we employ a rational structure design approach to mitigate such vulnerabilities. Specifically, we enhance the adversarial robustness of the structure by increasing the proportion of high-frequency structural robust biases. As a result, we introduce a novel structure called Robust Bias Transformer-based Structure (RBFormer) that shows robust superiority compared to several existing baseline structures. Through a series of extensive experiments, RBFormer outperforms the original structures by a significant margin, achieving an impressive improvement of +16.12% and +5.04% across different evaluation criteria on CIFAR-10 and ImageNet-1k, respectively.

Submitted 22 September, 2023; originally announced September 2023.

Comments: BMVC 2023

arXiv:2309.13079 [pdf, other]

MiChao-HuaFen 1.0: A Specialized Pre-trained Corpus Dataset for Domain-specific Large Models

Authors: Yidong Liu, FuKai Shang, Fang Wang, Rui Xu, Jun Wang, Wei Li, Yao Li, Conghui He

Abstract: With the advancement of deep learning technologies, general-purpose large models such as GPT-4 have demonstrated exceptional capabilities across various domains. Nevertheless, there remains a demand for high-quality, domain-specific outputs in areas like healthcare, law, and finance. This paper first evaluates the existing large models for specialized domains and discusses their limitations. To cater to the specific needs of certain domains, we introduce the ``MiChao-HuaFen 1.0'' pre-trained corpus dataset, tailored for the news and governmental sectors. The dataset, sourced from publicly available internet data from 2022, underwent multiple rounds of cleansing and processing to ensure high quality and reliable origins, with provisions for consistent and stable updates. This dataset not only supports the pre-training of large models for Chinese vertical domains but also aids in propelling deep learning research and applications in related fields.

Submitted 26 September, 2023; v1 submitted 21 September, 2023; originally announced September 2023.

Comments: 4 pages, 2 figures

arXiv:2309.12171 [pdf, other]

A multi-zone view on the multi-wavelength emission of blazars

Authors: Ruo-Yu Liu, Rui Xue, Ze-Rui Wang, Hong-Bin Tan, Markus Böttcher

Abstract: In this work, a time-dependent modeling is developed to study the emission properties of blazars in the low state. Motivated by various observations, we speculate and assume that numerous discrete radiation zones throughout the jet of a blazar contribute to the broadband emission. We model the temporal evolution of the electron spectrum in each emission zone taking into account the injection, cooling and escape of relativistic electrons. By doing so, we are able to calculate the multi-wavelength emission of each radiation zone. The observed emission of a blazar is then the superposition of the emission from all discrete radiation zones. We revisit the multi-wavelength spectral energy distributions, light curves and polarisation under the model, and discuss its potential to reproduce the flat radio spectra, the core-shift phenomena, the minute-scale gamma-ray variability, and the large polarisation-angle swings, which are difficult to explain under the conventional one-zone models simultaneously.

Submitted 21 September, 2023; originally announced September 2023.

Comments: 20 pages, 16 figures, 1 table; accepted by MNRAS

arXiv:2309.11359 [pdf, other]

Prompt, Plan, Perform: LLM-based Humanoid Control via Quantized Imitation Learning

Authors: Jingkai Sun, Qiang Zhang, Yiqun Duan, Xiaoyang Jiang, Chong Cheng, Renjing Xu

Abstract: In recent years, reinforcement learning and imitation learning have shown great potential for controlling humanoid robots' motion. However, these methods typically create simulation environments and rewards for specific tasks, resulting in the requirements of multiple policies and limited capabilities for tackling complex and unknown tasks. To overcome these issues, we present a novel approach that combines adversarial imitation learning with large language models (LLMs). This innovative method enables the agent to learn reusable skills with a single policy and solve zero-shot tasks under the guidance of LLMs. In particular, we utilize the LLM as a strategic planner for applying previously learned skills to novel tasks through the comprehension of task-specific prompts. This empowers the robot to perform the specified actions in a sequence. To improve our model, we incorporate codebook-based vector quantization, allowing the agent to generate suitable actions in response to unseen textual commands from LLMs. Furthermore, we design general reward functions that consider the distinct motion features of humanoid robots, ensuring the agent imitates the motion data while maintaining goal orientation without additional guiding direction approaches or policies. To the best of our knowledge, this is the first framework that controls humanoid robots using a single learning policy network and LLM as a planner. Extensive experiments demonstrate that our method exhibits efficient and adaptive ability in complicated motion tasks.

Submitted 20 September, 2023; originally announced September 2023.

arXiv:2309.11015

3D-U-SAM Network For Few-shot Tooth Segmentation in CBCT Images

Authors: Yifu Zhang, Zuozhu Liu, Yang Feng, Renjing Xu

Abstract: Accurate representation of tooth position is extremely important in treatment. 3D dental image segmentation is a widely used method; however, labelled 3D dental datasets are a scarce resource, leading to the problem of small samples that this task faces in many cases. To this end, we address this problem with a pretrained SAM and propose a novel 3D-U-SAM network for 3D dental image segmentation. Specifically, in order to solve the problem of using 2D pre-trained weights on 3D datasets, we adopted a convolution approximation method; in order to retain more details, we designed skip connections to fuse features at all levels with reference to U-Net. The effectiveness of the proposed method is demonstrated in ablation experiments, comparison experiments, and sample size experiments.

Submitted 27 February, 2024; v1 submitted 19 September, 2023; originally announced September 2023.

Comments: The paper needs to be updated

arXiv:2309.10929 [pdf, other]

Specializing Small Language Models towards Complex Style Transfer via Latent Attribute Pre-Training

Authors: Ruiqi Xu, Yongfeng Huang, Xin Chen, Lin Zhang

Abstract: In this work, we introduce the concept of complex text style transfer tasks, and construct complex text datasets based on two widely applicable scenarios. Our dataset is the first large-scale data set of its kind, with 700 rephrased sentences and 1,000 sentences from the game Genshin Impact. While large language models (LLMs) have shown promise in complex text style transfer, they have drawbacks such as data privacy concerns, network instability, and high deployment costs. To address these issues, we explore the effectiveness of small models (less than T5-3B) with implicit style pre-training through contrastive learning. We also propose a method for automated evaluation of text generation quality based on alignment with human evaluations using ChatGPT. Finally, we compare our approach with existing methods and show that our model achieves state-of-the-art performance among few-shot text style transfer models.

Submitted 19 September, 2023; originally announced September 2023.

arXiv:2309.09847 [pdf, other]

doi 10.1093/mnras/stad3204

Quasi-periodic oscillations during magnetar giant flares in the strangeon star model

Authors: Hong-Bo Li, Yacheng Kang, Zexin Hu, Lijing Shao, Cheng-Jun Xia, Ren-Xin Xu

Abstract: Soft gamma-ray repeaters (SGRs) are widely understood as slowly rotating isolated neutron stars. Their generally large spin-down rates, high magnetic fields, and strong outburst energies render them different from ordinary pulsars. In a few giant flares (GFs) and short bursts of SGRs, high-confidence quasi-periodic oscillations (QPOs) were observed. Although remaining an open question, many theoretical studies suggest that the torsional oscillations caused by starquakes could explain QPOs. Motivated by this scenario, we systematically investigate torsional oscillation frequencies based on the strangeon-star (SS) model with various values of harmonic indices and overtones. To characterize the strong-repulsive interaction at short distances and the non-relativistic nature of strangeons, a phenomenological Lennard-Jones model is adopted. We show that, attributing to the large shear modulus of SSs, our results explain well the high-frequency QPOs ($\gtrsim 150\,\mathrm{Hz}$) during the GFs. The low-frequency QPOs ($\lesssim 150\,\mathrm{Hz}$) can also be interpreted when the ocean-crust interface modes are included. We also discuss possible effects of the magnetic field on the torsional mode frequencies. Considering realistic models with general-relativistic corrections and magnetic fields, we further calculate torsional oscillation frequencies for quark stars. We show that it would be difficult for quark stars to explain all QPOs in GFs. Our work advances the understanding of the nature of QPOs and magnetar asteroseismology.

Submitted 28 October, 2023; v1 submitted 18 September, 2023; originally announced September 2023.

Comments: 9 pages, 7 figures; accepted by MNRAS

Journal ref: MNRAS 527 (2024) 855

arXiv:2309.09729 [pdf]

Disassembling one-dimensional chains in molybdenum oxides

Authors: Xian Du, Yidian Li, Wenxuan Zhao, Runzhe Xu, Kaiyi Zhai, Yulin Chen, Lexian Yang

Abstract: The dimensionality of quantum materials strongly affects their physical properties. Although many emergent phenomena, such as charge-density wave and Luttinger liquid behavior, are well understood in one-dimensional (1D) systems, the generalization to explore them in higher dimensional systems is still a challenging task. In this study, we aim to bridge this gap by systematically investigating the crystal and electronic structures of molybdenum-oxide family compounds, where the contexture of 1D chains facilitates rich emergent properties. While the quasi-1D chains in these materials share general similarities, such as the motifs made up of MoO6 octahedrons, they exhibit vast complexity and remarkable tunability. We disassemble the 1D chains in molybdenum oxides with different dimensions and construct effective models to excellently fit their low-energy electronic structures obtained by ab initio calculations. Furthermore, we discuss the implications of such chains on other physical properties of the materials and the practical significance of the effective models. Our work establishes the molybdenum oxides as simple and tunable model systems for studying and manipulating the dimensionality in quantum systems.

Submitted 18 September, 2023; originally announced September 2023.

Comments: 25 pages, 5 figures

arXiv:2309.09297 [pdf, other]

Chasing Day and Night: Towards Robust and Efficient All-Day Object Detection Guided by an Event Camera

Authors: Jiahang Cao, Xu Zheng, Yuanhuiyi Lyu, Jiaxu Wang, Renjing Xu, Lin Wang

Abstract: The ability to detect objects in all lighting (i.e., normal-, over-, and under-exposed) conditions is crucial for real-world applications, such as self-driving. Traditional RGB-based detectors often fail under such varying lighting conditions. Therefore, recent works utilize novel event cameras to supplement or guide the RGB modality; however, these methods typically adopt asymmetric network structures that rely predominantly on the RGB modality, resulting in limited robustness for all-day detection. In this paper, we propose EOLO, a novel object detection framework that achieves robust and efficient all-day detection by fusing both RGB and event modalities. Our EOLO framework is built based on a lightweight spiking neural network (SNN) to efficiently leverage the asynchronous property of events. Buttressed by it, we first introduce an Event Temporal Attention (ETA) module to learn the high temporal information from events while preserving crucial edge information. Secondly, as different modalities exhibit varying levels of importance under diverse lighting conditions, we propose a novel Symmetric RGB-Event Fusion (SREF) module to effectively fuse RGB-Event features without relying on a specific modality, thus ensuring a balanced and adaptive fusion for all-day detection. In addition, to compensate for the lack of paired RGB-Event datasets for all-day training and evaluation, we propose an event synthesis approach based on the randomized optical flow that allows for directly generating the event frame from a single exposure image. We further build two new datasets, E-MSCOCO and E-VOC, based on the popular benchmarks MSCOCO and PASCAL VOC. Extensive experiments demonstrate that our EOLO outperforms the state-of-the-art detectors, e.g., RENet, by a substantial margin (+3.74% mAP50) in all lighting conditions. Our code and datasets will be available at https://vlislab22.github.io/EOLO/

Submitted 18 March, 2024; v1 submitted 17 September, 2023; originally announced September 2023.

Comments: Accepted by ICRA 2024

arXiv:2309.09198 [pdf, other]

A Benchmark for Text Expansion: Datasets, Metrics, and Baselines

Authors: Yi Chen, Haiyun Jiang, Wei Bi, Rui Wang, Longyue Wang, Shuming Shi, Ruifeng Xu

Abstract: This work presents a new task of Text Expansion (TE), which aims to insert fine-grained modifiers into proper locations of the plain text to concretize or vivify human writings. Different from existing insertion-based writing assistance tasks, TE requires the model to be more flexible in both locating and generation, and also more cautious in keeping basic semantics. We leverage four complementary approaches to construct a dataset with 12 million automatically generated instances and 2K human-annotated references for both English and Chinese. To facilitate automatic evaluation, we design various metrics from multiple perspectives. In particular, we propose Info-Gain to effectively measure the informativeness of expansions, which is an important quality dimension in TE. On top of a pre-trained text-infilling model, we build both pipelined and joint Locate&Infill models, which demonstrate the superiority over the Text2Text baselines, especially in expansion informativeness. Experiments verify the feasibility of the TE task and point out potential directions for future research toward better automatic text expansion.

Submitted 17 September, 2023; originally announced September 2023.

arXiv:2309.04826 [pdf, other]

doi 10.1093/mnras/stad2769

The FAST Galactic Plane Pulsar Snapshot survey: IV. Discovery of five fast radio bursts

Authors: D. J. Zhou, J. L. Han, W. C. Jing, P. F. Wang, C. Wang, T. Wang, W. -Y. Wang, R. Luo, J. Xu, R. X. Xu, H. G. Wang

Abstract: We report five new fast radio bursts (FRBs) discovered from the Galactic Plane Pulsar Snapshot (GPPS) survey by the Five-hundred-meter Aperture Spherical radio Telescope (FAST): FRB\,20210126, FRB\,20210208, FRB\,20210705, FRB\,20211005 and FRB\,20220306. To date, no repeating bursts from these FRB sources have been detected in the follow-up monitoring observations, leading to their classification as potential one-off events. We obtain the basic parameters for these bursts, including position, dispersion measure (DM), pulse width, spectral index, scattering time-scale, etc. The fluences and flux densities are generally lower in comparison to the values observed in one-off bursts discovered by other telescopes. Among the observed bursts, polarization data for 4 bursts were recorded during observations. Consequently, we obtain polarization profiles and Faraday rotation measures (RMs) for these bursts.

Submitted 11 October, 2023; v1 submitted 9 September, 2023; originally announced September 2023.

Comments: 7 pages, 5 figures, 2 tables. Published in MNRAS, Volume 526, Issue 2, December 2023

Showing 251–300 of 1,572 results for author: Xu, R