-
SCAAT: Improving Neural Network Interpretability via Saliency Constrained Adaptive Adversarial Training
Authors:
Rui Xu,
Wenkang Qin,
Peixiang Huang,
Hao Wang,
Lin Luo
Abstract:
Deep Neural Networks (DNNs) are expected to provide explanation for users to understand their black-box predictions. Saliency map is a common form of explanation illustrating the heatmap of feature attributions, but it suffers from noise in distinguishing important features. In this paper, we propose a model-agnostic learning method called Saliency Constrained Adaptive Adversarial Training (SCAAT) to improve the quality of such DNN interpretability. By constructing adversarial samples under the guidance of saliency map, SCAAT effectively eliminates most noise and makes saliency maps sparser and more faithful without any modification to the model architecture. We apply SCAAT to multiple DNNs and evaluate the quality of the generated saliency maps on various natural and pathological image datasets. Evaluations on different domains and metrics show that SCAAT significantly improves the interpretability of DNNs by providing more faithful saliency maps without sacrificing their predictive power.
Submitted 10 November, 2023; v1 submitted 8 November, 2023;
originally announced November 2023.
-
Meta-Adapter: An Online Few-shot Learner for Vision-Language Model
Authors:
Cheng Cheng,
Lin Song,
Ruoyi Xue,
Hang Wang,
Hongbin Sun,
Yixiao Ge,
Ying Shan
Abstract:
The contrastive vision-language pre-training, known as CLIP, demonstrates remarkable potential in perceiving open-world visual concepts, enabling effective zero-shot image recognition. Nevertheless, few-shot learning methods based on CLIP typically require offline fine-tuning of the parameters on few-shot samples, resulting in longer inference time and the risk of over-fitting in certain domains. To tackle these challenges, we propose the Meta-Adapter, a lightweight residual-style adapter, to refine the CLIP features guided by the few-shot samples in an online manner. With a few training samples, our method can enable effective few-shot learning capabilities and generalize to unseen data or tasks without additional fine-tuning, achieving competitive performance and high efficiency. Without bells and whistles, our approach outperforms the state-of-the-art online few-shot learning method by an average of 3.6% on eight image classification datasets with higher inference speed. Furthermore, our model is simple and flexible, serving as a plug-and-play module directly applicable to downstream tasks. Without further fine-tuning, Meta-Adapter obtains notable performance improvements in open-vocabulary object detection and segmentation tasks.
Submitted 11 January, 2024; v1 submitted 7 November, 2023;
originally announced November 2023.
-
Topological electronic structure and spin texture of quasi-one-dimensional higher-order topological insulator Bi4Br4
Authors:
W. X. Zhao,
M. Yang,
R. Z. Xu,
X. Du,
Y. D. Li,
K. Y. Zhai,
C. Peng,
D. Pei,
H. Gao,
Y. W. Li,
L. X. Xu,
J. F. Han,
Y. Huang,
Z. K. Liu,
Y. G. Yao,
J. C. Zhuang,
Y. Du,
J. J. Zhou,
Y. L. Chen,
L. X. Yang
Abstract:
The notion of topological insulators (TIs), characterized by an insulating bulk and conducting topological surface states, can be extended to higher-order topological insulators (HOTIs) hosting gapless modes localized at the boundaries of two or more dimensions lower than the insulating bulk [1-5]. In this work, by performing high-resolution angle-resolved photoemission spectroscopy (ARPES) measurements with submicron spatial and spin resolutions, we systematically investigate the electronic structure and spin texture of quasi-one-dimensional (1D) HOTI candidate Bi4Br4. In contrast to the bulk-state-dominant spectra on the (001) surface, we observe gapped surface states on the (100) surface, whose dispersion and spin-polarization agree well with our ab initio calculations. Moreover, we reveal in-gap states connecting the surface valence and conduction bands, which is an explicit signature of the existence of hinge states inside the (100) surface gap. Our findings provide compelling evidence for the HOTI phase of Bi4Br4. The identification of the higher-order topological phase will lay the promising prospect of applications based on 1D spin-momentum locked current in electronic and spintronic devices.
Submitted 6 November, 2023;
originally announced November 2023.
-
Advances in Embodied Navigation Using Large Language Models: A Survey
Authors:
**zhou Lin,
Han Gao,
Xuxiang Feng,
Rongtao Xu,
Changwei Wang,
Man Zhang,
Li Guo,
Shibiao Xu
Abstract:
In recent years, the rapid advancement of Large Language Models (LLMs) such as the Generative Pre-trained Transformer (GPT) has attracted increasing attention due to their potential in a variety of practical applications. The application of LLMs with Embodied Intelligence has emerged as a significant area of focus. Among the myriad applications of LLMs, navigation tasks are particularly noteworthy because they demand a deep understanding of the environment and quick, accurate decision-making. LLMs can augment embodied intelligence systems with sophisticated environmental perception and decision-making support, leveraging their robust language and image-processing capabilities. This article offers an exhaustive summary of the symbiosis between LLMs and embodied intelligence with a focus on navigation. It reviews state-of-the-art models, research methodologies, and assesses the advantages and disadvantages of existing embodied navigation models and datasets. Finally, the article elucidates the role of LLMs in embodied intelligence, based on current research, and forecasts future directions in the field. A comprehensive list of studies in this survey is available at https://github.com/Rongtao-Xu/Awesome-LLM-EN.
Submitted 7 June, 2024; v1 submitted 1 November, 2023;
originally announced November 2023.
-
Knowledge-Infused Prompting: Assessing and Advancing Clinical Text Data Generation with Large Language Models
Authors:
Ran Xu,
Hejie Cui,
Yue Yu,
Xuan Kan,
Wenqi Shi,
Yuchen Zhuang,
Wei **,
Joyce Ho,
Carl Yang
Abstract:
Clinical natural language processing requires methods that can address domain-specific challenges, such as complex medical terminology and clinical contexts. Recently, large language models (LLMs) have shown promise in this domain. Yet, their direct deployment can lead to privacy issues and is constrained by resources. To address this challenge, we delve into synthetic clinical text generation using LLMs for clinical NLP tasks. We propose an innovative, resource-efficient approach, ClinGen, which infuses knowledge into the process. Our model involves clinical knowledge extraction and context-informed LLM prompting. Both clinical topics and writing styles are drawn from external domain-specific knowledge graphs and LLMs to guide data generation. Our extensive empirical study across 7 clinical NLP tasks and 16 datasets reveals that ClinGen consistently enhances performance across various tasks, effectively aligning the distribution of real datasets and significantly enriching the diversity of generated training instances. We will publish our code and all the generated data at https://github.com/ritaranx/ClinGen.
Submitted 1 November, 2023;
originally announced November 2023.
-
What a Whole Slide Image Can Tell? Subtype-guided Masked Transformer for Pathological Image Captioning
Authors:
Wenkang Qin,
Rui Xu,
Peixiang Huang,
Xiaomin Wu,
Heyu Zhang,
Lin Luo
Abstract:
Pathological captioning of Whole Slide Images (WSIs), though essential in computer-aided pathological diagnosis, has rarely been studied due to the limitations in datasets and model training efficacy. In this paper, we propose a new paradigm, Subtype-guided Masked Transformer (SGMT), for pathological captioning based on Transformers, which treats a WSI as a sequence of sparse patches and generates an overall caption sentence from the sequence. An accompanying subtype prediction is introduced into SGMT to guide the training process and enhance the captioning accuracy. We also present an Asymmetric Masked Mechanism approach to tackle the large size constraint of pathological image captioning, where the numbers of sequencing patches in SGMT are sampled differently in the training and inference phases. Experiments on the PatchGastricADC22 dataset demonstrate that our approach effectively adapts to the task with a transformer-based model and achieves better performance than traditional RNN-based methods. Our codes are to be made available for further research and development.
Submitted 31 October, 2023;
originally announced October 2023.
-
Assessing and Enhancing Robustness of Deep Learning Models with Corruption Emulation in Digital Pathology
Authors:
Peixiang Huang,
Songtao Zhang,
Yulu Gan,
Rui Xu,
Rongqi Zhu,
Wenkang Qin,
Limei Guo,
Shan Jiang,
Lin Luo
Abstract:
Deep learning in digital pathology brings intelligence and automation as substantial enhancements to pathological analysis, the gold standard of clinical diagnosis. However, multiple steps from tissue preparation to slide imaging introduce various image corruptions, making it difficult for deep neural network (DNN) models to achieve stable diagnostic results for clinical use. In order to assess and further enhance the robustness of the models, we analyze the physical causes of the full-stack corruptions throughout the pathological life-cycle and propose an Omni-Corruption Emulation (OmniCE) method to reproduce 21 types of corruptions quantified with 5-level severity. We then construct three OmniCE-corrupted benchmark datasets at both patch level and slide level and assess the robustness of popular DNNs in classification and segmentation tasks. Further, we explore using the OmniCE-corrupted datasets as augmentation data for training, and experiments verify that the generalization ability of the models is significantly enhanced.
Submitted 31 October, 2023;
originally announced October 2023.
-
Open Visual Knowledge Extraction via Relation-Oriented Multimodality Model Prompting
Authors:
Hejie Cui,
Xinyu Fang,
Zihan Zhang,
Ran Xu,
Xuan Kan,
Xin Liu,
Yue Yu,
Manling Li,
Yangqiu Song,
Carl Yang
Abstract:
Images contain rich relational knowledge that can help machines understand the world. Existing methods on visual knowledge extraction often rely on the pre-defined format (e.g., sub-verb-obj tuples) or vocabulary (e.g., relation types), restricting the expressiveness of the extracted knowledge. In this work, we take a first exploration to a new paradigm of open visual knowledge extraction. To achieve this, we present OpenVik which consists of an open relational region detector to detect regions potentially containing relational knowledge and a visual knowledge generator that generates format-free knowledge by prompting the large multimodality model with the detected region of interest. We also explore two data enhancement techniques for diversifying the generated format-free visual knowledge. Extensive knowledge quality evaluations highlight the correctness and uniqueness of the extracted open visual knowledge by OpenVik. Moreover, integrating our extracted knowledge across various visual reasoning applications shows consistent improvements, indicating the real-world applicability of OpenVik.
Submitted 28 October, 2023;
originally announced October 2023.
-
How Hard is Takeover in DPoS Blockchains? Understanding the Security of Coin-based Voting Governance
Authors:
Chao Li,
Balaji Palanisamy,
Runhua Xu,
Li Duan,
Jiqiang Liu,
Wei Wang
Abstract:
Delegated-Proof-of-Stake (DPoS) blockchains, such as EOSIO, Steem and TRON, are governed by a committee of block producers elected via a coin-based voting system. We recently witnessed the first de facto blockchain takeover that happened between Steem and TRON. Within one hour of this incident, TRON founder took over the entire Steem committee, forcing the original Steem community to leave the blockchain that they maintained for years. This is a historical event in the evolution of blockchains and Web 3.0. Despite its significant disruptive impact, little is known about how vulnerable DPoS blockchains are in general to takeovers and the ways in which we can improve their resistance to takeovers.
In this paper, we demonstrate that the resistance of a DPoS blockchain to takeovers is governed by both the theoretical design and the actual use of its underlying coin-based voting governance system. When voters actively cooperate to resist potential takeovers, our theoretical analysis reveals that the current active resistance of DPoS blockchains is far below the theoretical upper bound. However in practice, voter preferences could be significantly different. This paper presents the first large-scale empirical study of the passive takeover resistance of EOSIO, Steem and TRON. Our study identifies the diversity in voter preferences and characterizes the impact of this diversity on takeover resistance. Through both theoretical and empirical analyses, our study provides novel insights into the security of coin-based voting governance and suggests potential ways to improve the takeover resistance of any blockchain that implements this governance model.
Submitted 28 October, 2023;
originally announced October 2023.
-
InCharacter: Evaluating Personality Fidelity in Role-Playing Agents through Psychological Interviews
Authors:
Xintao Wang,
Yunze Xiao,
Jen-tse Huang,
Siyu Yuan,
Rui Xu,
Haoran Guo,
Quan Tu,
Yaying Fei,
Ziang Leng,
Wei Wang,
Jiangjie Chen,
Cheng Li,
Yanghua Xiao
Abstract:
Role-playing agents (RPAs), powered by large language models, have emerged as a flourishing field of applications. However, a key challenge lies in assessing whether RPAs accurately reproduce the personas of target characters, namely their character fidelity. Existing methods mainly focus on the knowledge and linguistic patterns of characters. This paper, instead, introduces a novel perspective to evaluate the personality fidelity of RPAs with psychological scales. Overcoming drawbacks of previous self-report assessments on RPAs, we propose InCharacter, namely Interviewing Character agents for personality tests. Experiments include various types of RPAs and LLMs, covering 32 distinct characters on 14 widely used psychological scales. The results validate the effectiveness of InCharacter in measuring RPA personalities. Then, with InCharacter, we show that state-of-the-art RPAs exhibit personalities highly aligned with the human-perceived personalities of the characters, achieving an accuracy up to 80.7%.
Submitted 7 June, 2024; v1 submitted 27 October, 2023;
originally announced October 2023.
-
Does or did the supernova remnant Cassiopeia A operate as a PeVatron?
Authors:
Zhen Cao,
F. Aharonian,
Q. An,
Axikegu,
Y. X. Bai,
Y. W. Bao,
D. Bastieri,
X. J. Bi,
Y. J. Bi,
J. T. Cai,
Q. Cao,
W. Y. Cao,
Zhe Cao,
J. Chang,
J. F. Chang,
A. M. Chen,
E. S. Chen,
Liang Chen,
Lin Chen,
Long Chen,
M. J. Chen,
M. L. Chen,
Q. H. Chen,
S. H. Chen,
S. Z. Chen
, et al. (255 additional authors not shown)
Abstract:
For decades, supernova remnants (SNRs) have been considered the prime sources of Galactic Cosmic rays (CRs). But whether SNRs can accelerate CR protons to PeV energies and thus dominate CR flux up to the knee is currently under intensive theoretical and phenomenological debate. The direct test of the ability of SNRs to operate as CR PeVatrons can be provided by ultrahigh-energy (UHE; $E_γ\geq 100$~TeV) $γ$-rays. In this context, the historical SNR Cassiopeia A (Cas A) is considered one of the most promising targets for UHE observations. This paper presents the observation of Cas A and its vicinity by the LHAASO KM2A detector. The exceptional sensitivity of LHAASO KM2A in the UHE band, combined with the young age of Cas A, enabled us to derive stringent model-independent limits on the energy budget of UHE protons and nuclei accelerated by Cas A at any epoch after the explosion. The results challenge the prevailing paradigm that Cas A-type SNRs are major suppliers of PeV CRs in the Milky Way.
Submitted 25 October, 2023;
originally announced October 2023.
-
Graph Agent: Explicit Reasoning Agent for Graphs
Authors:
Qinyong Wang,
Zhenxiang Gao,
Rong Xu
Abstract:
Graph embedding methods such as Graph Neural Networks (GNNs) and Graph Transformers have contributed to the development of graph reasoning algorithms for various tasks on knowledge graphs. However, the lack of interpretability and explainability of graph embedding methods has limited their applicability in scenarios requiring explicit reasoning. In this paper, we introduce the Graph Agent (GA), an intelligent agent methodology of leveraging large language models (LLMs), inductive-deductive reasoning modules, and long-term memory for knowledge graph reasoning tasks. GA integrates aspects of symbolic reasoning and existing graph embedding methods to provide an innovative approach for complex graph reasoning tasks. By converting graph structures into textual data, GA enables LLMs to process, reason, and provide predictions alongside human-interpretable explanations. The effectiveness of the GA was evaluated on node classification and link prediction tasks. Results showed that GA reached state-of-the-art performance, demonstrating accuracy of 90.65%, 95.48%, and 89.32% on Cora, PubMed, and PrimeKG datasets, respectively. Compared to existing GNN and transformer models, GA offered the advantages of explicit reasoning ability, no training requirement, and easy adaptation to various graph reasoning tasks.
Submitted 25 October, 2023;
originally announced October 2023.
-
V2C MXene-modified g-C3N4 for enhanced visible-light photocatalytic activity
Authors:
Ruizheng Xu,
Guiyu Wei,
Zhemin Xie,
Sijie Diao,
Jianfeng Wen,
Tao Tang,
Li Jiang,
Ming Li,
Guanghui Hu
Abstract:
Increasing the efficiency of charge transfer and separation efficiency of photogenerated carriers are still the main challenges in the field of semiconductor-based photocatalysts. Herein, we synthesized g-C3N4@V2C MXene photocatalyst by modifying g-C3N4 using V2C MXene. The prepared photocatalyst exhibited outstanding photocatalytic performance under visible light. The degradation efficiency of methyl orange by g-C3N4@V2C MXene photocatalyst was as high as 94.5%, which is 1.56 times higher than that by g-C3N4. This was attributed to the V2C MXene inhibiting the rapid recombination of photogenerated carriers and facilitating rapid transfer of photogenerated electrons (e−) from g-C3N4 to MXene. Moreover, g-C3N4@V2C MXene photocatalyst showed good cycling stability. The photocatalytic performance was higher than 85% after three cycles. Experiments to capture free radicals revealed that superoxide radicals (O2−) are the main contributors to the photocatalytic activity. Thus, the proposed g-C3N4@V2C MXene photocatalyst is a promising visible-light catalyst.
Submitted 25 October, 2023;
originally announced October 2023.
-
The renormalization of the shell-model GT operator starting from effective field theory for nuclear systems
Authors:
L. Coraggio,
N. Itaco,
G. De Gregorio,
A. Gargano,
Z. H. Cheng,
Y. Z. Ma,
F. R. Xu,
M. Viviani
Abstract:
For the first time, we approach in this work the problem of the renormalization of the Gamow-Teller decay operator for nuclear shell-model calculations by way of many-body perturbation theory, starting from a nuclear Hamiltonian and electroweak currents derived consistently by way of the chiral perturbation theory. These are the inputs we need to construct microscopically the effective shell-model Hamiltonians and decay operators. The goal is to assess the role of both electroweak currents and many-body correlations as the origins of the well-known problem of the quenching of the axial coupling constant gA. To this end, the calculation of observables related to the Gamow-Teller transitions has been performed for several nuclear systems outside the 40Ca and 56Ni closed cores and compared with the available data.
Submitted 22 December, 2023; v1 submitted 24 October, 2023;
originally announced October 2023.
-
COPR: Continual Learning Human Preference through Optimal Policy Regularization
Authors:
Han Zhang,
Lin Gui,
Yuanzhao Zhai,
Hui Wang,
Yu Lei,
Ruifeng Xu
Abstract:
The technique of Reinforcement Learning from Human Feedback (RLHF) is a commonly employed method to improve pre-trained Language Models (LM), enhancing their ability to conform to human preferences. Nevertheless, the current RLHF-based LMs necessitate full retraining each time novel queries or feedback are introduced, which becomes a challenging task because human preferences can vary between different domains or tasks. Retraining LMs poses practical difficulties in many real-world situations due to the significant time and computational resources required, along with concerns related to data privacy. To address this limitation, we propose a new method called Continual Optimal Policy Regularization (COPR), in which we compute the distribution of optimal policy bypassing the partition function and then regularize the current policy based on the historically optimal distribution to mitigate Catastrophic Forgetting (CF). COPR involves a single learning phase and doesn't necessitate complex reinforcement learning. Importantly, it shares the capability with RLHF to learn from unlabeled data by maintaining a scoring module, similar to reward model, making it flexible for continually learning without human feedback. Our experimental results show that COPR outperforms strong Continuous Learning (CL) baselines when it comes to consistently aligning with human preferences on incremental tasks and domains.
Submitted 26 March, 2024; v1 submitted 24 October, 2023;
originally announced October 2023.
-
Electric quadrupole second harmonic generation revealing dual magnetic orders in a magnetic Weyl semimetal
Authors:
Youngjun Ahn,
Xiaoyu Guo,
Rui Xue,
Kejian Qu,
Kai Sun,
David Mandrus,
Liuyan Zhao
Abstract:
Broken symmetries and electronic topology are nicely manifested together in the second order nonlinear optical responses from topologically nontrivial materials. While second order nonlinear optical effects from the electric dipole (ED) contribution have been extensively explored in polar Weyl semimetals (WSMs) with broken spatial inversion (SI) symmetry, they are rarely studied in centrosymmetric magnetic WSMs with broken time reversal (TR) symmetry due to complete suppression of the ED contribution. Here, we report experimental demonstration of optical second harmonic generation (SHG) in a magnetic WSM Co$_{3}$Sn$_{2}$S$_{2}$ from the electric quadrupole (EQ) contribution. By tracking the temperature dependence of the rotation anisotropy (RA) of SHG, we capture two magnetic phase transitions, with both the SHG intensity increasing and its RA pattern rotating at $T_{C,1}$=175K and $T_{C,2}$=120K subsequently. The fitted critical exponents for the SHG intensity and RA orientation near $T_{C,1}$ and $T_{C,2}$ suggest that the magnetic phase at $T_{C,1}$ is a 3D Ising-type out-of-plane ferromagnetism while the other at $T_{C,2}$ is a 3D XY-type all-in-all-out in-plane antiferromagnetism. Our results show the success of detection and exploration of EQ SHG in a centrosymmetric magnetic WSM, and hence open the pathway towards the future investigation of its tie to the band topology.
Submitted 23 October, 2023;
originally announced October 2023.
-
Semiparametrically Efficient Score for the Survival Odds Ratio
Authors:
Denise Rava,
Jelena Bradic,
Ronghui Xu
Abstract:
We consider a general proportional odds model for survival data under binary treatment, where the functional form of the covariates is left unspecified. We derive the efficient score for the conditional survival odds ratio given the covariates using modern semiparametric theory. The efficient score may be useful in the development of doubly robust estimators, although computational challenges remain.
Submitted 14 May, 2024; v1 submitted 22 October, 2023;
originally announced October 2023.
-
Strangelets at finite temperature: nucleon emission rates, interface and shell effects
Authors:
Hao-Song You,
Huai-Min Chen,
Jian-Feng Xu,
Cheng-Jun Xia,
Guang-Xiong Peng,
Ren-Xin Xu
Abstract:
We investigate the properties of strangelets at finite temperature $T$, where an equivparticle model is adopted with both the linear confinement and leading-order perturbative interactions accounted for using density-dependent quark masses. The shell effects are examined by solving the Dirac equations for quarks in the mean-field approximation, which diminish with temperature as the occupation probability of each single-particle level is fixed by the Fermi-Dirac statistics, i.e., shell dampening. Consequently, instead of decreasing with temperature, the surface tension extracted from a liquid-drop formula increases with $T$ until reaching its peak at $T\approx 20$-40 MeV with vanishing shell corrections, where the formula roughly reproduces the free energy per baryon of all strangelets. The curvature term, nevertheless, decreases with $T$ despite the presence of shell effects. The neutron and proton emission rates are fixed microscopically according to the external nucleon gas densities that are in equilibrium with strangelets, which generally increase with $T$ ($\lesssim 50$ MeV) for stable strangelets but decrease for those that are unstable against nucleon emission at $T=0$. The energy, free energy, entropy, charge-to-mass ratio, strangeness per baryon, and root-mean-square radius of $β$-stable strangelets obtained with various parameter sets are presented as well. The results indicated in this work are useful for understanding the products of binary compact star mergers and heavy-ion collisions.
Submitted 22 October, 2023;
originally announced October 2023.
-
NMR Spectra Denoising with Vandermonde Constraints
Authors:
Di Guo,
Runmin Xu,
Jinyu Wu,
Meijin Lin,
Xiaofeng Du,
Xiaobo Qu
Abstract:
Nuclear magnetic resonance (NMR) spectroscopy serves as an important tool to analyze chemicals and proteins in bioengineering. However, NMR signals are easily contaminated by noise during the data acquisition, which can affect subsequent quantitative analysis. Therefore, denoising NMR signals has been a long-time concern. In this work, we propose an optimization model-based iterative denoising method, CHORD-V, by treating the time-domain NMR signal as damped exponentials and maintaining the exponential signal form with a Vandermonde factorization. Results on both synthetic and realistic NMR data show that CHORD-V has a superior denoising performance over typical Cadzow and rQRd methods, and the state-of-the-art CHORD method. CHORD-V restores low-intensity spectral peaks more accurately, especially when the noise is relatively high.
Submitted 20 October, 2023;
originally announced October 2023.
-
Neural Degradation Representation Learning for All-In-One Image Restoration
Authors:
Mingde Yao,
Ruikang Xu,
Yuanshen Guan,
Jie Huang,
Zhiwei Xiong
Abstract:
Existing methods have demonstrated effective performance on a single degradation type. In practical applications, however, the degradation is often unknown, and the mismatch between the model and the degradation will result in a severe performance drop. In this paper, we propose an all-in-one image restoration network that tackles multiple degradations. Due to the heterogeneous nature of different types of degradations, it is difficult to process multiple degradations in a single network. To this end, we propose to learn a neural degradation representation (NDR) that captures the underlying characteristics of various degradations. The learned NDR decomposes different types of degradations adaptively, similar to a neural dictionary that represents basic degradation components. Subsequently, we develop a degradation query module and a degradation injection module to effectively recognize and utilize the specific degradation based on NDR, enabling the all-in-one restoration ability for multiple degradations. Moreover, we propose a bidirectional optimization strategy to effectively drive NDR to learn the degradation representation by optimizing the degradation and restoration processes alternately. Comprehensive experiments on representative types of degradations (including noise, haze, rain, and downsampling) demonstrate the effectiveness and generalization capability of our method.
Submitted 19 October, 2023;
originally announced October 2023.
-
Robust Multi-Agent Reinforcement Learning by Mutual Information Regularization
Authors:
Simin Li,
Ruixiao Xu,
Jingqiao Xiu,
Yuwei Zheng,
Pu Feng,
Yaodong Yang,
Xianglong Liu
Abstract:
In multi-agent reinforcement learning (MARL), ensuring robustness against unpredictable or worst-case actions by allies is crucial for real-world deployment. Existing robust MARL methods either approximate or enumerate all possible threat scenarios against worst-case adversaries, leading to computational intensity and reduced robustness. In contrast, human learning efficiently acquires robust behaviors in daily life without preparing for every possible threat. Inspired by this, we frame robust MARL as an inference problem, with worst-case robustness implicitly optimized under all threat scenarios via off-policy evaluation. Within this framework, we demonstrate that Mutual Information Regularization as Robust Regularization (MIR3) during routine training is guaranteed to maximize a lower bound on robustness, without the need for adversaries. Further insights show that MIR3 acts as an information bottleneck, preventing agents from over-reacting to others and aligning policies with robust action priors. In the presence of worst-case adversaries, our MIR3 significantly surpasses baseline methods in robustness and training efficiency while maintaining cooperative performance in StarCraft II and robot swarm control. When deploying the robot swarm control algorithm in the real world, our method also outperforms the best baseline by 14.29%.
Submitted 21 May, 2024; v1 submitted 15 October, 2023;
originally announced October 2023.
-
Overconstrained Robotic Limb with Energy-Efficient, Omni-directional Locomotion
Authors:
Ronghan Xu,
Jiayi Yin,
Shihao Feng,
Bangchao Huang,
Haoran Sun,
Jia Pan,
Fang Wan,
Chaoyang Song
Abstract:
This paper studies the design, modeling, and control of a novel quadruped, featuring overconstrained robotic limbs employing the Bennett linkage for motion and power transmission. The modular limb design allows the robot to morph into reptile- or mammal-inspired forms. In contrast to the prevailing focus on planar limbs, this research delves into the classical overconstrained linkages, which have strong theoretical foundations in advanced kinematics but limited engineering applications. The study showcases the morphological superiority of overconstrained robotic limbs that can transform into planar or spherical limbs, exemplifying the Bennett linkage. By conducting kinematic and dynamic modeling, we apply model predictive control to simulate a range of locomotion tasks, revealing that overconstrained limbs outperform planar designs in omni-directional tasks like forward trotting, lateral trotting, and turning on the spot when considering foothold distances. These findings highlight the biological distinctions in limb design between reptiles and mammals and represent the first documented instance of overconstrained robotic limbs outperforming planar designs in dynamic locomotion.
Submitted 3 February, 2024; v1 submitted 15 October, 2023;
originally announced October 2023.
-
Very high energy gamma-ray emission beyond 10 TeV from GRB 221009A
Authors:
Zhen Cao,
F. Aharonian,
Q. An,
A. Axikegu,
Y. X. Bai,
Y. W. Bao,
D. Bastieri,
X. J. Bi,
Y. J. Bi,
J. T. Cai,
Q. Cao,
W. Y. Cao,
Zhe Cao,
J. Chang,
J. F. Chang,
A. M. Chen,
E. S. Chen,
Liang Chen,
Lin Chen,
Long Chen,
M. J. Chen,
M. L. Chen,
Q. H. Chen,
S. H. Chen,
S. Z. Chen
, et al. (255 additional authors not shown)
Abstract:
The highest energy gamma-rays from gamma-ray bursts (GRBs) have important implications for their radiation mechanism. Here we report for the first time the detection of gamma-rays up to 13 TeV from the brightest GRB 221009A by the Large High Altitude Air-shower Observatory (LHAASO). The LHAASO-KM2A detector registered more than 140 gamma-rays with energies above 3 TeV during 230$-$900 s after the trigger. The intrinsic energy spectrum of gamma-rays can be described by a power-law after correcting for extragalactic background light (EBL) absorption. Such a hard spectrum challenges the synchrotron self-Compton (SSC) scenario of relativistic electrons for the afterglow emission above several TeV. Observations of gamma-rays up to 13 TeV from a source with a measured redshift of z=0.151 hint at more transparency in intergalactic space than previously expected. Alternatively, one may invoke new physics such as Lorentz Invariance Violation (LIV) or an axion origin of very high energy (VHE) signals.
Submitted 22 November, 2023; v1 submitted 13 October, 2023;
originally announced October 2023.
-
Discovering Fatigued Movements for Virtual Character Animation
Authors:
Noshaba Cheema,
Rui Xu,
Nam Hee Kim,
Perttu Hämäläinen,
Vladislav Golyanik,
Marc Habermann,
Christian Theobalt,
Philipp Slusallek
Abstract:
Virtual character animation and movement synthesis have advanced rapidly during recent years, especially through a combination of extensive motion capture datasets and machine learning. A remaining challenge is interactively simulating characters that fatigue when performing extended motions, which is indispensable for the realism of generated animations. However, capturing such movements is problematic, as performing movements like backflips with fatigued variations up to exhaustion raises capture cost and risk of injury. Surprisingly, little research has been done on faithful fatigue modeling. To address this, we propose a deep reinforcement learning-based approach, which -- for the first time in literature -- generates control policies for full-body physically simulated agents aware of cumulative fatigue. For this, we first leverage Generative Adversarial Imitation Learning (GAIL) to learn an expert policy for the skill; Second, we learn a fatigue policy by limiting the generated constant torque bounds based on endurance time to non-linear, state- and time-dependent limits in the joint-actuation space using a Three-Compartment Controller (3CC) model. Our results demonstrate that agents can adapt to different fatigue and rest rates interactively, and discover realistic recovery strategies without the need for any captured data of fatigued movement.
Submitted 12 October, 2023;
originally announced October 2023.
-
DUSA: Decoupled Unsupervised Sim2Real Adaptation for Vehicle-to-Everything Collaborative Perception
Authors:
Xianghao Kong,
Wentao Jiang,
Jinrang Jia,
Yifeng Shi,
Runsheng Xu,
Si Liu
Abstract:
Vehicle-to-Everything (V2X) collaborative perception is crucial for autonomous driving. However, achieving high-precision V2X perception requires a significant amount of annotated real-world data, which can always be expensive and hard to acquire. Simulated data have raised much attention since they can be massively produced at an extremely low cost. Nevertheless, the significant domain gap between simulated and real-world data, including differences in sensor type, reflectance patterns, and road surroundings, often leads to poor performance of models trained on simulated data when evaluated on real-world data. In addition, there remains a domain gap between real-world collaborative agents, e.g. different types of sensors may be installed on autonomous vehicles and roadside infrastructures with different extrinsics, further increasing the difficulty of sim2real generalization. To take full advantage of simulated data, we present a new unsupervised sim2real domain adaptation method for V2X collaborative detection named Decoupled Unsupervised Sim2Real Adaptation (DUSA). Our new method decouples the V2X collaborative sim2real domain adaptation problem into two sub-problems: sim2real adaptation and inter-agent adaptation. For sim2real adaptation, we design a Location-adaptive Sim2Real Adapter (LSA) module to adaptively aggregate features from critical locations of the feature map and align the features between simulated data and real-world data via a sim/real discriminator on the aggregated global feature. For inter-agent adaptation, we further devise a Confidence-aware Inter-agent Adapter (CIA) module to align the fine-grained features from heterogeneous agents under the guidance of agent-wise confidence maps. Experiments demonstrate the effectiveness of the proposed DUSA approach on unsupervised sim2real adaptation from the simulated V2XSet dataset to the real-world DAIR-V2X-C dataset.
Submitted 12 October, 2023;
originally announced October 2023.
-
Optimizing the Placement of Roadside LiDARs for Autonomous Driving
Authors:
Wentao Jiang,
Hao Xiang,
Xinyu Cai,
Runsheng Xu,
Jiaqi Ma,
Yikang Li,
Gim Hee Lee,
Si Liu
Abstract:
Multi-agent cooperative perception is an increasingly popular topic in the field of autonomous driving, where roadside LiDARs play an essential role. However, how to optimize the placement of roadside LiDARs is a crucial but often overlooked problem. This paper proposes an approach to optimize the placement of roadside LiDARs by selecting optimized positions within the scene for better perception performance. To efficiently obtain the best combination of locations, a greedy algorithm based on perceptual gain is proposed, which selects the location that can maximize the perceptual gain sequentially. We define perceptual gain as the increased perceptual capability when a new LiDAR is placed. To obtain the perception capability, we propose a perception predictor that learns to evaluate LiDAR placement using only a single point cloud frame. A dataset named Roadside-Opt is created using the CARLA simulator to facilitate research on the roadside LiDAR placement problem.
Submitted 11 October, 2023;
originally announced October 2023.
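The sequential selection described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the learned perception predictor is replaced by a hypothetical toy score (`toy_capability`, the number of grid cells covered), and the site names and `COVERAGE` table are invented for the example.

```python
from typing import Callable, FrozenSet, Iterable, List


def greedy_placement(
    candidates: Iterable[str],
    k: int,
    capability: Callable[[FrozenSet[str]], float],
) -> List[str]:
    """Pick up to k sites, each time adding the site with the largest perceptual gain."""
    chosen: List[str] = []
    remaining = list(candidates)
    for _ in range(k):
        if not remaining:
            break
        base = capability(frozenset(chosen))
        # Perceptual gain = capability with the new site minus capability without it.
        best = max(
            remaining,
            key=lambda site: capability(frozenset(chosen + [site])) - base,
        )
        chosen.append(best)
        remaining.remove(best)
    return chosen


# Hypothetical stand-in for a perception predictor: each site covers some cells,
# and capability is the number of distinct cells covered by all placed sites.
COVERAGE = {"A": {1, 2, 3}, "B": {3, 4}, "C": {5, 6}, "D": {1, 2}}


def toy_capability(placed: FrozenSet[str]) -> float:
    covered = set()
    for site in placed:
        covered |= COVERAGE[site]
    return float(len(covered))


if __name__ == "__main__":
    print(greedy_placement(COVERAGE, 2, toy_capability))
```

A greedy pass evaluates the score once per remaining candidate in each round, which is far cheaper than scoring every combination of k sites, at the cost of no general optimality guarantee.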
-
Integrated Sensing and Communication enabled Multiple Base Stations Cooperative Sensing Towards 6G
Authors:
Zhiqing Wei,
Wangjun Jiang,
Zhiyong Feng,
Huici Wu,
Ning Zhang,
Kaifeng Han,
Ruizhong Xu,
Ping Zhang
Abstract:
Driven by the intelligent applications of sixth-generation (6G) mobile communication systems such as smart cities and autonomous driving, which connect the physical and cyber space, integrated sensing and communication (ISAC) brings a revolutionary change to the base stations (BSs) of 6G by integrating radar sensing and communication in the same hardware and wireless resource. However, with the requirements of long-range and accurate sensing in the applications of smart cities and autonomous driving, an ISAC-enabled single BS is still limited in sensing range and accuracy. With the networked infrastructures of mobile communication systems, multi-BS cooperative sensing is a natural choice for satisfying the requirement of long-range and accurate sensing. In this article, the framework of multi-BS cooperative sensing is proposed, breaking through the limitation of single-BS sensing. The enabling technologies, including unified ISAC performance metrics, ISAC signal design and optimization, interference management, and cooperative sensing algorithms, are introduced in detail. The performance evaluation results are provided to verify the effectiveness of multi-BS cooperative sensing schemes. With ISAC-enabled multi-BS cooperative sensing (ISAC-MCS), the intelligent infrastructures connecting physical and cyber space can be established, ushering in the era of 6G and promoting the intelligence of everything.
Submitted 24 November, 2023; v1 submitted 11 October, 2023;
originally announced October 2023.
-
Identifying axion conversion in compact star magnetospheres with radio-wave polarization signatures
Authors:
Z. H. Xue,
K. J. Lee,
X. D. Gao,
R. X. Xu
Abstract:
The axion is well motivated in physics. It solves the strong charge conjugation-parity reversal problem CP in fundamental physics and the dark matter problem in astronomy. Its interaction with the electromagnetic field has been expected but never detected experimentally. Such particles may convert to radio waves in the environment with a strong magnetic field. Inspired by the idea, various research groups have been working on theoretical modeling and radio data analysis to search for the signature of radio signals generated by the axion conversion in the magnetosphere of compact stars, where the surface magnetic field as strong as $10^{13}$-$10^{14}$ G is expected. In this work, we calculate the observational properties of the axion-induced radio signals (AIRSs) in the neutron star magnetosphere, where both the total intensity and polarization properties of radio emission are derived. Based on the ray tracing method, assuming 100% linear polarization of radio waves generated in each conversion, we compute the polarization emission profile concerning different viewing angles. We note that plasma and general relativistic effects are important for the polarization properties of AIRSs. Our work suggests that AIRSs can be identified by the narrow bandwidth and distinct polarization features.
Submitted 10 October, 2023;
originally announced October 2023.
-
Benchmarking Large Language Models with Augmented Instructions for Fine-grained Information Extraction
Authors:
Jun Gao,
Huan Zhao,
Yice Zhang,
Wei Wang,
Changlong Yu,
Ruifeng Xu
Abstract:
Information Extraction (IE) is an essential task in Natural Language Processing. Traditional methods have relied on coarse-grained extraction with simple instructions. However, with the emergence of Large Language Models (LLMs), there is a need to adapt IE techniques to leverage the capabilities of these models. This paper introduces a fine-grained IE benchmark dataset tailored for LLMs, employing augmented instructions for each information type, which includes task descriptions, extraction rules, output formats, and examples. Through extensive evaluations, we observe that encoder-decoder models, particularly T5 and FLAN-T5, perform well in generalizing to unseen information types, while ChatGPT exhibits greater adaptability to new task forms. Our results also indicate that performance is not solely dictated by model scale, and highlight the significance of architecture, data diversity, and learning techniques. This work paves the way for a more refined and versatile utilization of LLMs in Information Extraction.
Submitted 8 October, 2023;
originally announced October 2023.
-
Fully Spiking Neural Network for Legged Robots
Authors:
Xiaoyang Jiang,
Qiang Zhang,
Jingkai Sun,
Jiahang Cao,
Jingtong Ma,
Renjing Xu
Abstract:
In recent years, legged robots based on deep reinforcement learning have made remarkable progress. Quadruped robots have demonstrated the ability to complete challenging tasks in complex environments and have been deployed in real-world scenarios to assist humans. Simultaneously, bipedal and humanoid robots have achieved breakthroughs in various demanding tasks. Current reinforcement learning methods can utilize diverse robot bodies and historical information to perform actions. However, prior research has not emphasized the speed and energy consumption of network inference, as well as the biological significance of the neural networks themselves. Most of the networks employed are traditional artificial neural networks that utilize multilayer perceptrons (MLP). In this paper, we successfully apply a novel Spiking Neural Network (SNN) to process legged robots, achieving outstanding results across a range of simulated terrains. SNN holds a natural advantage over traditional neural networks in terms of inference speed and energy consumption, and their pulse-form processing of body perception signals offers improved biological interpretability. Applying more biomimetic neural networks to legged robots can further reduce the heat dissipation and structural burden caused by the high power consumption of neural networks. To the best of our knowledge, this is the first work to implement SNN in legged robots.
Submitted 23 March, 2024; v1 submitted 8 October, 2023;
originally announced October 2023.
-
Improvement and Enhancement of YOLOv5 Small Target Recognition Based on Multi-module Optimization
Authors:
Qingyang Li,
Yuchen Li,
Hongyi Duan,
JiaLiang Kang,
Jianan Zhang,
Xueqian Gan,
Ruotong Xu
Abstract:
In this paper, the limitations of YOLOv5s model on small target detection task are deeply studied and improved. The performance of the model is successfully enhanced by introducing GhostNet-based convolutional module, RepGFPN-based Neck module optimization, CA and Transformer's attention mechanism, and loss function improvement using NWD. The experimental results validate the positive impact of these improvement strategies on model precision, recall and mAP. In particular, the improved model shows significant superiority in dealing with complex backgrounds and tiny targets in real-world application tests. This study provides an effective optimization strategy for the YOLOv5s model on small target detection, and lays a solid foundation for future related research and applications.
Submitted 3 October, 2023;
originally announced October 2023.
-
Improving Emotional Expression and Cohesion in Image-Based Playlist Description and Music Topics: A Continuous Parameterization Approach
Authors:
Yuelyu Ji,
Yuheng Song,
Wei Wang,
Ruoyi Xu,
Zhongqian Xie,
Huiyun Liu
Abstract:
Text generation in image-based platforms, particularly for music-related content, requires precise control over text styles and the incorporation of emotional expression. However, existing approaches often struggle to control the proportion of external factors in generated text and rely on discrete inputs, lacking continuous control conditions for desired text generation. This study proposes Continuous Parameterization for Controlled Text Generation (CPCTG) to overcome these limitations. Our approach leverages a Language Model (LM) as a style learner, integrating Semantic Cohesion (SC) and Emotional Expression Proportion (EEP) considerations. By enhancing the reward method and manipulating the CPCTG level, our experiments on playlist description and music topic generation tasks demonstrate significant improvements in ROUGE scores, indicating enhanced relevance and coherence in the generated text.
Submitted 12 October, 2023; v1 submitted 2 October, 2023;
originally announced October 2023.
-
YFlows: Systematic Dataflow Exploration and Code Generation for Efficient Neural Network Inference using SIMD Architectures on CPUs
Authors:
Cyrus Zhou,
Zack Hassman,
Ruize Xu,
Dhirpal Shah,
Vaugnn Richard,
Yanjing Li
Abstract:
We address the challenges associated with deploying neural networks on CPUs, with a particular focus on minimizing inference time while maintaining accuracy. Our novel approach is to use the dataflow (i.e., computation order) of a neural network to explore data reuse opportunities using heuristic-guided analysis and a code generation framework, which enables exploration of various Single Instruction, Multiple Data (SIMD) implementations to achieve optimized neural network execution. Our results demonstrate that the dataflow that keeps outputs in SIMD registers while also maximizing both input and weight reuse consistently yields the best performance for a wide variety of inference workloads, achieving up to 3x speedup for 8-bit neural networks, and up to 4.8x speedup for binary neural networks, respectively, over the optimized implementations of neural networks today.
Submitted 23 November, 2023; v1 submitted 1 October, 2023;
originally announced October 2023.
-
Segment Anything Model is a Good Teacher for Local Feature Learning
Authors:
Jingqian Wu,
Rongtao Xu,
Zach Wood-Doughty,
Changwei Wang,
Shibiao Xu,
Edmund Y. Lam
Abstract:
Local feature detection and description play an important role in many computer vision tasks, which are designed to detect and describe keypoints in "any scene" and "any downstream task". Data-driven local feature learning methods need to rely on pixel-level correspondence for training, which is challenging to acquire at scale, thus hindering further improvements in performance. In this paper, we propose SAMFeat to introduce SAM (segment anything model), a fundamental model trained on 11 million images, as a teacher to guide local feature learning and thus inspire higher performance on limited datasets. To do so, first, we construct an auxiliary task of Attention-weighted Semantic Relation Distillation (ASRD), which distills feature relations with category-agnostic semantic information learned by the SAM encoder into a local feature learning network, to improve local feature description using semantic discrimination. Second, we develop a technique called Weakly Supervised Contrastive Learning Based on Semantic Grouping (WSC), which utilizes semantic groupings derived from SAM as weakly supervised signals, to optimize the metric space of local descriptors. Third, we design an Edge Attention Guidance (EAG) to further improve the accuracy of local feature detection and description by prompting the network to pay more attention to the edge region guided by SAM. SAMFeat's performance on various tasks, such as image matching on HPatches and long-term visual localization on Aachen Day-Night, showcases its superiority over previous local features. The released code is available at https://github.com/vignywang/SAMFeat.
Submitted 17 June, 2024; v1 submitted 29 September, 2023;
originally announced September 2023.
-
Experimental Limits on Solar Reflected Dark Matter with a New Approach on Accelerated-Dark-Matter-Electron Analysis in Semiconductors
Authors:
Z. Y. Zhang,
L. T. Yang,
Q. Yue,
K. J. Kang,
Y. J. Li,
H. P. An,
Greeshma C.,
J. P. Chang,
Y. H. Chen,
J. P. Cheng,
W. H. Dai,
Z. Deng,
C. H. Fang,
X. P. Geng,
H. Gong,
Q. J. Guo,
T. Guo,
X. Y. Guo,
L. He,
S. M. He,
J. W. Hu,
H. X. Huang,
T. C. Huang,
L. Jiang,
S. Karmakar
, et al. (59 additional authors not shown)
Abstract:
Recently a dark matter-electron (DM-electron) paradigm has drawn much attention. Models beyond the standard halo model describing DM accelerated by high energy celestial bodies are under intense examination as well. In this Letter, a velocity components analysis (VCA) method dedicated to swift analysis of accelerated DM-electron interactions via semiconductor detectors is proposed and the first HP…
▽ More
Recently a dark matter-electron (DM-electron) paradigm has drawn much attention. Models beyond the standard halo model describing DM accelerated by high energy celestial bodies are under intense examination as well. In this Letter, a velocity components analysis (VCA) method dedicated to swift analysis of accelerated DM-electron interactions via semiconductor detectors is proposed and the first HPGe detector-based accelerated DM-electron analysis is realized. Utilizing the method, the first germanium based constraint on sub-GeV solar reflected DM-electron interaction is presented with the 205.4 kg$\cdot$day dataset from the CDEX-10 experiment. In the heavy mediator scenario, our result excels in the mass range of 5$-$15 keV/$c^2$, achieving a 3 orders of magnitude improvement comparing with previous semiconductor experiments. In the light mediator scenario, the strongest laboratory constraint for DM lighter than 0.1 MeV/$c^2$ is presented. The result proves the feasibility and demonstrates the vast potential of the VCA technique in future accelerated DM-electron analyses with semiconductor detectors.
Submitted 24 April, 2024; v1 submitted 26 September, 2023;
originally announced September 2023.
-
KERMIT: Knowledge Graph Completion of Enhanced Relation Modeling with Inverse Transformation
Authors:
Haotian Li,
Lingzhi Wang,
Yuliang Wei,
Richard Yi Da Xu,
Bailing Wang
Abstract:
Knowledge graph completion is a task that revolves around filling in missing triples based on the information available in a knowledge graph. Among the current studies, text-based methods complete the task by utilizing textual descriptions of triples. However, this modeling approach may encounter limitations, particularly when the description fails to accurately and adequately express the intended meaning. To overcome these challenges, we propose the augmentation of data through two additional mechanisms. Firstly, we employ ChatGPT as an external knowledge base to generate coherent descriptions to bridge the semantic gap between the queries and answers. Secondly, we leverage inverse relations to create a symmetric graph, thereby creating extra labeling and providing supplementary information for link prediction. This approach offers additional insights into the relationships between entities. Through these efforts, we have observed significant improvements in knowledge graph completion, as these mechanisms enhance the richness and diversity of the available data, leading to more accurate results.
Submitted 26 September, 2023;
originally announced September 2023.
-
Hybrid Strangeon Stars
Authors:
Chen Zhang,
Yong Gao,
Cheng-Jun Xia,
Renxin Xu
Abstract:
It was conjectured that the basic units of the ground state of bulk strong matter may be strange-clusters called strangeons, and they can form self-bound strangeon stars that are highly compact. Strangeon stars can develop a strange quark matter (SQM) core at high densities, particularly in the color-flavor-locking phase, yielding a branch of hybrid strangeon stars. We explore the stellar structure and astrophysical implications of hybrid strangeon stars. We find that hybrid strangeon stars can meet various astrophysical constraints on pulsar masses, radii, and tidal deformabilities. Finally, we show that the strangeon-SQM mixed phase is not preferred if the charge-neutrality condition is imposed at the strangeon-SQM transition region.
Submitted 8 January, 2024; v1 submitted 25 September, 2023;
originally announced September 2023.
-
Distribution-Aware Continual Test-Time Adaptation for Semantic Segmentation
Authors:
Jiayi Ni,
Senqiao Yang,
Ran Xu,
Jiaming Liu,
Xiaoqi Li,
Wenyu Jiao,
Zehui Chen,
Yi Liu,
Shanghang Zhang
Abstract:
Since autonomous driving systems usually face dynamic and ever-changing environments, continual test-time adaptation (CTTA) has been proposed as a strategy for transferring deployed models to continually changing target domains. However, the pursuit of long-term adaptation often introduces catastrophic forgetting and error accumulation problems, which impede the practical implementation of CTTA in the real world. Recently, existing CTTA methods mainly focus on utilizing a majority of parameters to fit target domain knowledge through self-training. Unfortunately, these approaches often amplify the challenge of error accumulation due to noisy pseudo-labels, and pose practical limitations stemming from the heavy computational costs associated with entire model updates. In this paper, we propose a distribution-aware tuning (DAT) method to make the semantic segmentation CTTA efficient and practical in real-world applications. DAT adaptively selects and updates two small groups of trainable parameters based on data distribution during the continual adaptation process, including domain-specific parameters (DSP) and task-relevant parameters (TRP). Specifically, DSP exhibits sensitivity to outputs with substantial distribution shifts, effectively mitigating the problem of error accumulation. In contrast, TRP are allocated to positions that are responsive to outputs with minor distribution shifts, which are fine-tuned to avoid the catastrophic forgetting problem. In addition, since CTTA is a temporal task, we introduce the Parameter Accumulation Update (PAU) strategy to collect the updated DSP and TRP in target domain sequences. We conduct extensive experiments on two widely-used semantic segmentation CTTA benchmarks, achieving promising performance compared to previous state-of-the-art methods.
Submitted 29 March, 2024; v1 submitted 24 September, 2023;
originally announced September 2023.
-
Gaining the Sparse Rewards by Exploring Lottery Tickets in Spiking Neural Network
Authors:
Hao Cheng,
Jiahang Cao,
Erjia Xiao,
Mengshu Sun,
Renjing Xu
Abstract:
Deploying energy-efficient deep learning algorithms on computationally limited devices, such as robots, is still a pressing issue for real-world applications. Spiking Neural Networks (SNNs), a novel brain-inspired algorithm, offer a promising solution due to their low-latency and low-energy properties over traditional Artificial Neural Networks (ANNs). Despite their advantages, the dense structure of deep SNNs can still result in extra energy consumption. The Lottery Ticket Hypothesis (LTH) posits that within dense neural networks, there exist winning Lottery Tickets (LTs), namely sub-networks, that can be obtained without compromising performance. Inspired by this, this paper delves into the spiking-based LTs (SLTs), examining their unique properties and potential for extreme efficiency. Then, two significant sparse Rewards are gained through comprehensive explorations and meticulous experiments on SLTs across various dense structures. Moreover, a sparse algorithm tailored for spiking transformer structure, which incorporates convolution operations into the Patch Embedding Projection (ConvPEP) module, has been proposed to achieve Multi-level Sparsity (MultiSp). MultiSp refers to (1) Patch number sparsity; (2) ConvPEP weights sparsity and binarization; and (3) ConvPEP activation layer binarization. Extensive experiments demonstrate that our method achieves extreme sparsity with only a slight performance decrease, paving the way for deploying energy-efficient neural networks in robotics and beyond.
Submitted 27 March, 2024; v1 submitted 23 September, 2023;
originally announced September 2023.
-
RBFormer: Improve Adversarial Robustness of Transformer by Robust Bias
Authors:
Hao Cheng,
Jinhao Duan,
Hui Li,
Lyutianyang Zhang,
Jiahang Cao,
** Wang,
Jize Zhang,
Kaidi Xu,
Renjing Xu
Abstract:
Recently, there has been a surge of interest and attention in Transformer-based structures, such as Vision Transformer (ViT) and Vision Multilayer Perceptron (VMLP). Compared with the previous convolution-based structures, the Transformer-based structure under investigation showcases a comparable or superior performance under its distinctive attention-based input token mixer strategy. Introducing adversarial examples as a robustness consideration has had a profound and detrimental impact on the performance of well-established convolution-based structures. This inherent vulnerability to adversarial attacks has also been demonstrated in Transformer-based structures. In this paper, our emphasis lies on investigating the intrinsic robustness of the structure rather than introducing novel defense measures against adversarial attacks. To address the susceptibility to robustness issues, we employ a rational structure design approach to mitigate such vulnerabilities. Specifically, we enhance the adversarial robustness of the structure by increasing the proportion of high-frequency structural robust biases. As a result, we introduce a novel structure called Robust Bias Transformer-based Structure (RBFormer) that shows robust superiority compared to several existing baseline structures. Through a series of extensive experiments, RBFormer outperforms the original structures by a significant margin, achieving an impressive improvement of +16.12% and +5.04% across different evaluation criteria on CIFAR-10 and ImageNet-1k, respectively.
Submitted 22 September, 2023;
originally announced September 2023.
-
MiChao-HuaFen 1.0: A Specialized Pre-trained Corpus Dataset for Domain-specific Large Models
Authors:
Yidong Liu,
FuKai Shang,
Fang Wang,
Rui Xu,
Jun Wang,
Wei Li,
Yao Li,
Conghui He
Abstract:
With the advancement of deep learning technologies, general-purpose large models such as GPT-4 have demonstrated exceptional capabilities across various domains. Nevertheless, there remains a demand for high-quality, domain-specific outputs in areas like healthcare, law, and finance. This paper first evaluates the existing large models for specialized domains and discusses their limitations. To cater to the specific needs of certain domains, we introduce the "MiChao-HuaFen 1.0" pre-trained corpus dataset, tailored for the news and governmental sectors. The dataset, sourced from publicly available internet data from 2022, underwent multiple rounds of cleansing and processing to ensure high quality and reliable origins, with provisions for consistent and stable updates. This dataset not only supports the pre-training of large models for Chinese vertical domains but also aids in propelling deep learning research and applications in related fields.
Submitted 26 September, 2023; v1 submitted 21 September, 2023;
originally announced September 2023.
-
A multi-zone view on the multi-wavelength emission of blazars
Authors:
Ruo-Yu Liu,
Rui Xue,
Ze-Rui Wang,
Hong-Bin Tan,
Markus Böttcher
Abstract:
In this work, a time-dependent modeling is developed to study the emission properties of blazars in the low state. Motivated by various observations, we speculate and assume that numerous discrete radiation zones throughout the jet of a blazar contribute to the broadband emission. We model the temporal evolution of the electron spectrum in each emission zone taking into account the injection, cooling and escape of relativistic electrons. By doing so, we are able to calculate the multi-wavelength emission of each radiation zone. The observed emission of a blazar is then the superposition of the emission from all discrete radiation zones. We revisit the multi-wavelength spectral energy distributions, light curves and polarisation under the model, and discuss its potential to reproduce the flat radio spectra, the core-shift phenomena, the minute-scale gamma-ray variability, and the large polarisation-angle swings, which are difficult to explain under the conventional one-zone models simultaneously.
Submitted 21 September, 2023;
originally announced September 2023.
-
Prompt, Plan, Perform: LLM-based Humanoid Control via Quantized Imitation Learning
Authors:
Jingkai Sun,
Qiang Zhang,
Yiqun Duan,
Xiaoyang Jiang,
Chong Cheng,
Renjing Xu
Abstract:
In recent years, reinforcement learning and imitation learning have shown great potential for controlling humanoid robots' motion. However, these methods typically create simulation environments and rewards for specific tasks, resulting in the requirements of multiple policies and limited capabilities for tackling complex and unknown tasks. To overcome these issues, we present a novel approach that combines adversarial imitation learning with large language models (LLMs). This innovative method enables the agent to learn reusable skills with a single policy and solve zero-shot tasks under the guidance of LLMs. In particular, we utilize the LLM as a strategic planner for applying previously learned skills to novel tasks through the comprehension of task-specific prompts. This empowers the robot to perform the specified actions in a sequence. To improve our model, we incorporate codebook-based vector quantization, allowing the agent to generate suitable actions in response to unseen textual commands from LLMs. Furthermore, we design general reward functions that consider the distinct motion features of humanoid robots, ensuring the agent imitates the motion data while maintaining goal orientation without additional guiding direction approaches or policies. To the best of our knowledge, this is the first framework that controls humanoid robots using a single learning policy network and LLM as a planner. Extensive experiments demonstrate that our method exhibits efficient and adaptive ability in complicated motion tasks.
Submitted 20 September, 2023;
originally announced September 2023.
-
3D-U-SAM Network For Few-shot Tooth Segmentation in CBCT Images
Authors:
Yifu Zhang,
Zuozhu Liu,
Yang Feng,
Renjing Xu
Abstract:
Accurate representation of tooth position is extremely important in treatment. 3D dental image segmentation is a widely used method; however, labelled 3D dental datasets are a scarce resource, so this task often faces the problem of small sample sizes. To this end, we address this problem with a pretrained SAM and propose a novel 3D-U-SAM network for 3D dental image segmentation. Specifically, in order to solve the problem of using 2D pre-trained weights on 3D datasets, we adopted a convolution approximation method; in order to retain more details, we designed skip connections to fuse features at all levels with reference to U-Net. The effectiveness of the proposed method is demonstrated in ablation experiments, comparison experiments, and sample size experiments.
Submitted 27 February, 2024; v1 submitted 19 September, 2023;
originally announced September 2023.
-
Specializing Small Language Models towards Complex Style Transfer via Latent Attribute Pre-Training
Authors:
Ruiqi Xu,
Yongfeng Huang,
Xin Chen,
Lin Zhang
Abstract:
In this work, we introduce the concept of complex text style transfer tasks, and construct complex text datasets based on two widely applicable scenarios. Our dataset is the first large-scale dataset of its kind, with 700 rephrased sentences and 1,000 sentences from the game Genshin Impact. While large language models (LLMs) have shown promise in complex text style transfer, they have drawbacks such as data privacy concerns, network instability, and high deployment costs. To address these issues, we explore the effectiveness of small models (smaller than T5-3B) with implicit style pre-training through contrastive learning. We also propose a method for automated evaluation of text generation quality based on alignment with human evaluations using ChatGPT. Finally, we compare our approach with existing methods and show that our model achieves state-of-the-art performance among few-shot text style transfer models.
Submitted 19 September, 2023;
originally announced September 2023.
-
Quasi-periodic oscillations during magnetar giant flares in the strangeon star model
Authors:
Hong-Bo Li,
Yacheng Kang,
Zexin Hu,
Lijing Shao,
Cheng-Jun Xia,
Ren-Xin Xu
Abstract:
Soft gamma-ray repeaters (SGRs) are widely understood as slowly rotating isolated neutron stars. Their generally large spin-down rates, high magnetic fields, and strong outburst energies render them different from ordinary pulsars. In a few giant flares (GFs) and short bursts of SGRs, high-confidence quasi-periodic oscillations (QPOs) were observed. Although their origin remains an open question, many theoretical studies suggest that the torsional oscillations caused by starquakes could explain QPOs. Motivated by this scenario, we systematically investigate torsional oscillation frequencies based on the strangeon-star (SS) model with various values of harmonic indices and overtones. To characterize the strong-repulsive interaction at short distances and the non-relativistic nature of strangeons, a phenomenological Lennard-Jones model is adopted. We show that, owing to the large shear modulus of SSs, our results explain well the high-frequency QPOs ($\gtrsim 150\,\mathrm{Hz}$) during the GFs. The low-frequency QPOs ($\lesssim 150\,\mathrm{Hz}$) can also be interpreted when the ocean-crust interface modes are included. We also discuss possible effects of the magnetic field on the torsional mode frequencies. Considering realistic models with general-relativistic corrections and magnetic fields, we further calculate torsional oscillation frequencies for quark stars. We show that it would be difficult for quark stars to explain all QPOs in GFs. Our work advances the understanding of the nature of QPOs and magnetar asteroseismology.
Submitted 28 October, 2023; v1 submitted 18 September, 2023;
originally announced September 2023.
-
Disassembling one-dimensional chains in molybdenum oxides
Authors:
Xian Du,
Yidian Li,
Wenxuan Zhao,
Runzhe Xu,
Kaiyi Zhai,
Yulin Chen,
Lexian Yang
Abstract:
The dimensionality of quantum materials strongly affects their physical properties. Although many emergent phenomena, such as charge-density wave and Luttinger liquid behavior, are well understood in one-dimensional (1D) systems, the generalization to explore them in higher dimensional systems is still a challenging task. In this study, we aim to bridge this gap by systematically investigating the crystal and electronic structures of molybdenum-oxide family compounds, where the contexture of 1D chains facilitates rich emergent properties. While the quasi-1D chains in these materials share general similarities, such as the motifs made up of MoO6 octahedrons, they exhibit vast complexity and remarkable tunability. We disassemble the 1D chains in molybdenum oxides with different dimensions and construct effective models to excellently fit their low-energy electronic structures obtained by ab initio calculations. Furthermore, we discuss the implications of such chains on other physical properties of the materials and the practical significance of the effective models. Our work establishes the molybdenum oxides as simple and tunable model systems for studying and manipulating the dimensionality in quantum systems.
Submitted 18 September, 2023;
originally announced September 2023.
-
Chasing Day and Night: Towards Robust and Efficient All-Day Object Detection Guided by an Event Camera
Authors:
Jiahang Cao,
Xu Zheng,
Yuanhuiyi Lyu,
Jiaxu Wang,
Renjing Xu,
Lin Wang
Abstract:
The ability to detect objects in all lighting (i.e., normal-, over-, and under-exposed) conditions is crucial for real-world applications, such as self-driving. Traditional RGB-based detectors often fail under such varying lighting conditions. Therefore, recent works utilize novel event cameras to supplement or guide the RGB modality; however, these methods typically adopt asymmetric network structures that rely predominantly on the RGB modality, resulting in limited robustness for all-day detection. In this paper, we propose EOLO, a novel object detection framework that achieves robust and efficient all-day detection by fusing both RGB and event modalities. Our EOLO framework is built based on a lightweight spiking neural network (SNN) to efficiently leverage the asynchronous property of events. Buttressed by it, we first introduce an Event Temporal Attention (ETA) module to learn the high temporal information from events while preserving crucial edge information. Secondly, as different modalities exhibit varying levels of importance under diverse lighting conditions, we propose a novel Symmetric RGB-Event Fusion (SREF) module to effectively fuse RGB-Event features without relying on a specific modality, thus ensuring a balanced and adaptive fusion for all-day detection. In addition, to compensate for the lack of paired RGB-Event datasets for all-day training and evaluation, we propose an event synthesis approach based on the randomized optical flow that allows for directly generating the event frame from a single exposure image. We further build two new datasets, E-MSCOCO and E-VOC, based on the popular benchmarks MSCOCO and PASCAL VOC. Extensive experiments demonstrate that our EOLO outperforms the state-of-the-art detectors, e.g., RENet, by a substantial margin (+3.74% mAP50) in all lighting conditions. Our code and datasets will be available at https://vlislab22.github.io/EOLO/
Submitted 18 March, 2024; v1 submitted 17 September, 2023;
originally announced September 2023.
-
A Benchmark for Text Expansion: Datasets, Metrics, and Baselines
Authors:
Yi Chen,
Haiyun Jiang,
Wei Bi,
Rui Wang,
Longyue Wang,
Shuming Shi,
Ruifeng Xu
Abstract:
This work presents a new task of Text Expansion (TE), which aims to insert fine-grained modifiers into proper locations of the plain text to concretize or vivify human writings. Different from existing insertion-based writing assistance tasks, TE requires the model to be more flexible in both locating and generation, and also more cautious in keeping basic semantics. We leverage four complementary approaches to construct a dataset with 12 million automatically generated instances and 2K human-annotated references for both English and Chinese. To facilitate automatic evaluation, we design various metrics from multiple perspectives. In particular, we propose Info-Gain to effectively measure the informativeness of expansions, which is an important quality dimension in TE. On top of a pre-trained text-infilling model, we build both pipelined and joint Locate&Infill models, which demonstrate the superiority over the Text2Text baselines, especially in expansion informativeness. Experiments verify the feasibility of the TE task and point out potential directions for future research toward better automatic text expansion.
Submitted 17 September, 2023;
originally announced September 2023.
-
The FAST Galactic Plane Pulsar Snapshot survey: IV. Discovery of five fast radio bursts
Authors:
D. J. Zhou,
J. L. Han,
W. C. Jing,
P. F. Wang,
C. Wang,
T. Wang,
W. -Y. Wang,
R. Luo,
J. Xu,
R. X. Xu,
H. G. Wang
Abstract:
We report five new fast radio bursts (FRBs) discovered from the Galactic Plane Pulsar Snapshot (GPPS) survey by the Five-hundred-meter Aperture Spherical radio Telescope (FAST): FRB\,20210126, FRB\,20210208, FRB\,20210705, FRB\,20211005 and FRB\,20220306. To date, no repeating bursts from these FRB sources have been detected in the follow-up monitoring observations, leading to their classification as potential one-off events. We obtain the basic parameters for these bursts, including position, dispersion measure (DM), pulse width, spectral index, scattering time-scale, etc. The fluences and flux densities are generally lower in comparison to the values observed in one-off bursts discovered by other telescopes. Among the observed bursts, polarization data for 4 bursts were recorded during observations. Consequently, we obtain polarization profiles and Faraday rotation measures (RMs) for these bursts.
Submitted 11 October, 2023; v1 submitted 9 September, 2023;
originally announced September 2023.