Showing 1–50 of 11,408 results for author: Xin

  1. arXiv:2407.09174  [pdf, other]

    cs.CV cs.AI

    DART: An Automated End-to-End Object Detection Pipeline with Data Diversification, Open-Vocabulary Bounding Box Annotation, Pseudo-Label Review, and Model Training

    Authors: Chen Xin, Andreas Hartel, Enkelejda Kasneci

    Abstract: Swift and accurate detection of specified objects is crucial for many industrial applications, such as safety monitoring on construction sites. However, traditional approaches rely heavily on arduous manual annotation and data collection, which struggle to adapt to ever-changing environments and novel target objects. To address these limitations, this paper presents DART, an automated end-to-end p…

    Submitted 12 July, 2024; originally announced July 2024.

  2. arXiv:2407.09018  [pdf, other]

    cs.SE

    AUITestAgent: Automatic Requirements Oriented GUI Function Testing

    Authors: Yongxiang Hu, Xuan Wang, Yingchuan Wang, Yu Zhang, Shiyu Guo, Chaoyi Chen, Xin Wang, Yangfan Zhou

    Abstract: The Graphical User Interface (GUI) is how users interact with mobile apps. To ensure it functions properly, testing engineers have to make sure it functions as intended, based on test requirements that are typically written in natural language. While widely adopted manual testing and script-based methods are effective, they demand substantial effort due to the vast number of GUI pages and rapid it…

    Submitted 12 July, 2024; originally announced July 2024.

  3. arXiv:2407.08995  [pdf, other]

    cs.CL

    Self-Prompt Tuning: Enable Autonomous Role-Playing in LLMs

    Authors: Aobo Kong, Shiwan Zhao, Hao Chen, Qicheng Li, Yong Qin, Ruiqi Sun, Xin Zhou, Jiaming Zhou, Haoqin Sun

    Abstract: Recent advancements in LLMs have showcased their remarkable role-playing capabilities, able to accurately simulate the dialogue styles and cognitive processes of various roles based on different instructions and contexts. Studies indicate that assigning LLMs the roles of experts, a strategy known as role-play prompting, can enhance their performance in the corresponding domains. However, the promp…

    Submitted 12 July, 2024; originally announced July 2024.

  4. arXiv:2407.08958  [pdf, other]

    cs.SE

    Towards Practical and Useful Automated Program Repair for Debugging

    Authors: Qi Xin, Haojun Wu, Steven P. Reiss, Jifeng Xuan

    Abstract: Current automated program repair (APR) techniques are far from being practical and useful enough to be considered for realistic debugging. They rely on unrealistic assumptions, including the requirement of a comprehensive suite of test cases as the correctness criterion and frequent program re-execution for patch validation; they are not fast; and their ability to repair the commonly arising com…

    Submitted 11 July, 2024; originally announced July 2024.

  5. arXiv:2407.08745  [pdf, other]

    cs.NE cs.AI

    Evolutionary Computation for the Design and Enrichment of General-Purpose Artificial Intelligence Systems: Survey and Prospects

    Authors: Javier Poyatos, Javier Del Ser, Salvador Garcia, Hisao Ishibuchi, Daniel Molina, Isaac Triguero, Bing Xue, Xin Yao, Francisco Herrera

    Abstract: In Artificial Intelligence, there is an increasing demand for adaptive models capable of dealing with a diverse spectrum of learning tasks, surpassing the limitations of systems devised to cope with a single task. The recent emergence of General-Purpose Artificial Intelligence Systems (GPAIS) poses model configuration and adaptability challenges at far greater complexity scales than the optimal de…

    Submitted 3 June, 2024; originally announced July 2024.

  6. arXiv:2407.08555  [pdf, other]

    eess.IV cs.CV

    SLoRD: Structural Low-Rank Descriptors for Shape Consistency in Vertebrae Segmentation

    Authors: Xin You, Yixin Lou, Minghui Zhang, Chuyan Zhang, Jie Yang, Yun Gu

    Abstract: Automatic and precise segmentation of vertebrae from CT images is crucial for various clinical applications. However, due to a lack of explicit and strict constraints, existing methods, especially single-stage methods, still suffer from the challenge of intra-vertebrae segmentation inconsistency, which refers to multiple label predictions inside a singular vertebra. For multi-stage methods, ver…

    Submitted 11 July, 2024; originally announced July 2024.

    Comments: Under review

  7. arXiv:2407.08528  [pdf, other]

    eess.IV cs.CV cs.MM

    Enhancing octree-based context models for point cloud geometry compression with attention-based child node number prediction

    Authors: Chang Sun, Hui Yuan, Xiaolong Mao, Xin Lu, Raouf Hamzaoui

    Abstract: In point cloud geometry compression, most octree-based context models use the cross-entropy between the one-hot encoding of node occupancy and the probability distribution predicted by the context model as the loss. This approach converts the problem of predicting the number (a regression problem) and the position (a classification problem) of occupied child nodes into a 255-dimensional classificati…

    Submitted 11 July, 2024; originally announced July 2024.

    Comments: 2 figures and 2 tables

    Journal ref: IEEE Signal Processing Letters, 2024

  8. arXiv:2407.08520  [pdf, other]

    eess.IV cs.CV cs.MM

    Enhancing context models for point cloud geometry compression with context feature residuals and multi-loss

    Authors: Chang Sun, Hui Yuan, Shuai Li, Xin Lu, Raouf Hamzaoui

    Abstract: In point cloud geometry compression, context models usually use the one-hot encoding of node occupancy as the label, and the cross-entropy between the one-hot encoding and the probability distribution predicted by the context model as the loss function. However, this approach has two main weaknesses. First, the differences between contexts of different nodes are not significant, making it difficul…

    Submitted 11 July, 2024; originally announced July 2024.

    Comments: 11 pages, 8 figures

    Journal ref: IEEE Journal on Emerging and Selected Topics in Circuits and Systems, vol. 14, no. 2, pp. 224-234, Jun. 2024

  9. arXiv:2407.08427  [pdf, other]

    astro-ph.CO hep-th

    Constraining Holographic Dark Energy and Analyzing Cosmological Tensions

    Authors: Xin Tang, Yin-Zhe Ma, Wei-Ming Dai, Hong-Jian He

    Abstract: We investigate cosmological constraints on the holographic dark energy (HDE) using the state-of-the-art cosmological datasets: Planck CMB angular power spectra and weak lensing power spectra, Atacama Cosmology Telescope (ACT) temperature power spectra, baryon acoustic oscillation (BAO) and redshift-space distortion (RSD) measurements from six-degree-field galaxy survey and Sloan Digital Sky Survey…

    Submitted 11 July, 2024; originally announced July 2024.

    Comments: 11 pages, 7 figures, 5 tables

    Journal ref: Physics of the Dark Universe 46 (2024) 101568

  10. Chromosomal Structural Abnormality Diagnosis by Homologous Similarity

    Authors: Juren Li, Fanzhe Fu, Ran Wei, Yifei Sun, Zeyu Lai, Ning Song, Xin Chen, Yang Yang

    Abstract: Pathogenic chromosome abnormalities are very common among the general population. While numerical chromosome abnormalities can be quickly and precisely detected, structural chromosome abnormalities are far more complex and typically require considerable efforts by human experts for identification. This paper focuses on investigating the modeling of chromosome features and the identification of chr…

    Submitted 11 July, 2024; originally announced July 2024.

  11. arXiv:2407.08183  [pdf, other]

    astro-ph.SR

    The white-light superflares from cool stars in GWAC triggers

    Authors: Guang-Wei Li, Liang Wang, Hai-Long Yuan, Li-** Xin, **g Wang, Chao Wu, Hua-Li Li, Hasitieer Haerken, Wei-Hua Wang, Hong-Bo Cai, Xu-Hui Han, Yang Xu, Lei Huang, Xiao-Meng Lu, Jian-Ying Bai, Xiang-Yu Wang, Zi-Gao Dai, En-Wei Liang, Jian-Yan Wei

    Abstract: M-type stars are the ones that flare most frequently, but how high their maximum flare energy can reach is still unknown. We present 163 flares from 162 individual M2 through L1-type stars that triggered the GWAC, with flare energies ranging from $10^{32.2}$ to $10^{36.4}$ erg. The flare amplitudes range from $\triangle G = 0.84$ to $\sim 10$ mag. Flare energy increases with stellar surface temper…

    Submitted 11 July, 2024; originally announced July 2024.

    Comments: 18 pages, 11 figures, 4 tables

  12. arXiv:2407.08164  [pdf, other]

    cs.AI cs.MA cs.RO

    Hierarchical Consensus-Based Multi-Agent Reinforcement Learning for Multi-Robot Cooperation Tasks

    Authors: Pu Feng, Junkang Liang, Size Wang, Xin Yu, Rongye Shi, Wenjun Wu

    Abstract: In multi-agent reinforcement learning (MARL), the Centralized Training with Decentralized Execution (CTDE) framework is pivotal but struggles due to a gap: global state guidance in training versus reliance on local observations in execution, lacking global signals. Inspired by human societal consensus mechanisms, we introduce the Hierarchical Consensus-based Multi-Agent Reinforcement Learning (HC-…

    Submitted 10 July, 2024; originally announced July 2024.

    Comments: 8 pages, 10 figures. Accepted for presentation at the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2024)

  13. arXiv:2407.07805  [pdf, other]

    cs.CV

    SUMix: Mixup with Semantic and Uncertain Information

    Authors: Huafeng Qin, Xin **, Hongyu Zhu, Hongchao Liao, Mounîm A. El-Yacoubi, Xinbo Gao

    Abstract: Mixup data augmentation approaches have been applied for various tasks of deep learning to improve the generalization ability of deep neural networks. Some existing approaches, such as CutMix and SaliencyMix, randomly replace a patch in one image with patches from another to generate the mixed image. Similarly, the corresponding labels are linearly combined by a fixed ratio $λ$ by l. The objects in two i…

    Submitted 10 July, 2024; originally announced July 2024.

    Comments: Accepted by ECCV2024 [Camera Ready] (16 pages, 5 figures) with the source code at https://github.com/**Xins/SUMix

  14. arXiv:2407.07760  [pdf, other]

    cs.CV cs.AI

    Learning Spatial-Semantic Features for Robust Video Object Segmentation

    Authors: Xin Li, Deshui Miao, Zhenyu He, Yaowei Wang, Huchuan Lu, Ming-Hsuan Yang

    Abstract: Tracking and segmenting multiple similar objects with complex or separate parts in long-term videos is inherently challenging due to the ambiguity of target parts and identity confusion caused by occlusion, background clutter, and long-term variations. In this paper, we propose a robust video object segmentation framework equipped with spatial-semantic features and discriminative object queries to…

    Submitted 10 July, 2024; originally announced July 2024.

    Comments: Winner solution of the VOTS2024 Challenge

  15. arXiv:2407.07747  [pdf, other]

    cs.NI cs.AI

    HGFF: A Deep Reinforcement Learning Framework for Lifetime Maximization in Wireless Sensor Networks

    Authors: Xiaoxu Han, Xin Mu, **ghui Zhong

    Abstract: Planning the movement of the sink to maximize the lifetime in wireless sensor networks is an essential problem of great research challenge and practical value. Many existing mobile sink techniques based on mathematical programming or heuristics have demonstrated the feasibility of the task. Nevertheless, the huge computation consumption or the over-reliance on human knowledge can result in relativ…

    Submitted 11 April, 2024; originally announced July 2024.

    Comments: Preprint. Under review

  16. arXiv:2407.07744  [pdf, other]

    cs.IT cs.AI eess.SP

    Belief Information based Deep Channel Estimation for Massive MIMO Systems

    Authors: Jialong Xu, Liu Liu, Xin Wang, Lan Chen

    Abstract: In the next-generation wireless communication system, transmission rates should continue to rise to support emerging scenarios, e.g., immersive communications. From the perspective of communication system evolution, multiple-input multiple-output (MIMO) technology remains pivotal for enhancing transmission rates. However, current MIMO systems rely on inserting pilot signals to achieve accurate…

    Submitted 23 June, 2024; originally announced July 2024.

    Comments: 5 pages, 4 figures

  17. arXiv:2407.07513  [pdf, other]

    quant-ph

    High-rate quantum digital signatures network with integrated silicon photonics

    Authors: Yongqiang Du, Bing-Hong Li, Xin Hua, Xiao-Yu Cao, Zhengeng Zhao, Feng Xie, Zhenrong Zhang, Hua-Lei Yin, Xi Xiao, Ke** Wei

    Abstract: The development of quantum networks is paramount towards practical and secure communications. Quantum digital signatures (QDS) offer an information-theoretically secure solution for ensuring data integrity, authenticity, and non-repudiation, rapidly growing from proof-of-concept to robust demonstrations. However, previous QDS systems relied on expensive and bulky optical equipment, limiting large-…

    Submitted 10 July, 2024; originally announced July 2024.

    Comments: 11 pages, 6 figures

  18. arXiv:2407.07510  [pdf, other]

    cs.CR cs.CV eess.SY

    Invisible Optical Adversarial Stripes on Traffic Sign against Autonomous Vehicles

    Authors: Dongfang Guo, Yuting Wu, Yimin Dai, Pengfei Zhou, Xin Lou, Rui Tan

    Abstract: Camera-based computer vision is essential to autonomous vehicles' perception. This paper presents an attack that uses light-emitting diodes and exploits the camera's rolling shutter effect to create adversarial stripes in the captured images to mislead traffic sign recognition. The attack is stealthy because the stripes on the traffic sign are invisible to humans. For the attack to be threatening,…

    Submitted 10 July, 2024; originally announced July 2024.

    Journal ref: In Proceedings of the 22nd Annual International Conference on Mobile Systems, Applications and Services (MobiSys 2024), 534-546

  19. arXiv:2407.07472  [pdf, other]

    cs.SE cs.AI

    Rectifier: Code Translation with Corrector via LLMs

    Authors: Xin Yin, Chao Ni, Tien N. Nguyen, Shaohua Wang, Xiaohu Yang

    Abstract: Software migration is garnering increasing attention with the evolution of software and society. Early studies mainly relied on handcrafted translation rules to translate between two languages; the translation process is error-prone and time-consuming. In recent years, researchers have begun to explore the use of pre-trained large language models (LLMs) in code translation. However, code translati…

    Submitted 10 July, 2024; originally announced July 2024.

    Comments: arXiv admin note: text overlap with arXiv:2308.03109, arXiv:2302.03908 by other authors

  20. arXiv:2407.07465  [pdf, other]

    cs.CV

    Exploring the Untouched Sweeps for Conflict-Aware 3D Segmentation Pretraining

    Authors: Tianfang Sun, Zhizhong Zhang, Xin Tan, Yanyun Qu, Yuan Xie

    Abstract: LiDAR-camera 3D representation pretraining has shown significant promise for 3D perception tasks and related applications. However, two issues widely exist in this framework: 1) Only keyframes are used for training. For example, in nuScenes, a substantial quantity of unpaired LiDAR and camera frames remain unutilized, limiting the representation capabilities of the pretrained network. 2) The con…

    Submitted 10 July, 2024; originally announced July 2024.

    Comments: preprint, version 1

  21. arXiv:2407.07372  [pdf, other]

    eess.IV cs.CV

    Trustworthy Contrast-enhanced Brain MRI Synthesis

    Authors: Jiyao Liu, Yuxin Li, Shangqi Gao, Yuncheng Zhou, Xin Gao, Ningsheng Xu, Xiao-Yong Zhang, Xiahai Zhuang

    Abstract: Contrast-enhanced brain MRI (CE-MRI) is a valuable diagnostic technique but may pose health risks and incur high costs. To create safer alternatives, multi-modality medical image translation aims to synthesize CE-MRI images from other available modalities. Although existing methods can generate promising predictions, they still face two challenges, i.e., exhibiting over-confidence and lacking inte…

    Submitted 10 July, 2024; originally announced July 2024.

    Comments: 11 pages, 3 figures

  22. arXiv:2407.07336  [pdf, other]

    astro-ph.CO gr-qc hep-ph

    Testing the cosmic distance duality relation using strong gravitational lensing time delays and Type Ia supernovae

    Authors: **g-Zhao Qi, Yi-Fan Jiang, Wan-Ting Hou, Xin Zhang

    Abstract: We present a comprehensive test of the cosmic distance duality relation (DDR) using a combination of strong gravitational lensing (SGL) time delay measurements and Type Ia supernovae (SNe Ia) data. We investigate three different parameterizations of potential DDR violations. To bridge the gap between SGL and SNe Ia datasets, we implement an artificial neural network (ANN) approach to reconstruct t…

    Submitted 9 July, 2024; originally announced July 2024.

    Comments: 11 pages, 3 figures

  23. arXiv:2407.07325  [pdf, other]

    cs.CV cs.CL cs.MM eess.IV

    HiLight: Technical Report on the Motern AI Video Language Model

    Authors: Zhiting Wang, Qiangong Zhou, Kangjie Yang, Zongyang Liu, Xin Mao

    Abstract: This technical report presents the implementation of a state-of-the-art video encoder for video-text modal alignment and a video conversation framework called HiLight, which features dual visual towers. The work is divided into two main parts: 1. alignment of video and text modalities; 2. a convenient and efficient way to interact with users. Our goal is to address the task of video comprehension in t…

    Submitted 11 July, 2024; v1 submitted 9 July, 2024; originally announced July 2024.

  24. arXiv:2407.06948  [pdf, other]

    eess.SY

    Detection-Triggered Recursive Impact Mitigation against Secondary False Data Injection Attacks in Microgrids

    Authors: Mengxiang Liu, Xin Zhang, Rui Zhang, Zhuoran Zhou, Zhenyong Zhang, Ruilong Deng

    Abstract: The cybersecurity of microgrids has received widespread attention due to the frequently reported attack incidents against distributed energy resource (DER) manufacturers. Numerous impact mitigation schemes have been proposed to reduce or eliminate the impacts of false data injection attacks (FDIAs). Nevertheless, the existing methods either require at least one neighboring trustworthy agent or may…

    Submitted 9 July, 2024; originally announced July 2024.

    Comments: Submitted to IEEE Transactions on Smart Grid

  25. arXiv:2407.06544  [pdf, other]

    cs.LG

    Multiple Instance Verification

    Authors: Xin Xu, Eibe Frank, Geoffrey Holmes

    Abstract: We explore multiple-instance verification, a problem setting where a query instance is verified against a bag of target instances with heterogeneous, unknown relevancy. We show that naive adaptations of attention-based multiple instance learning (MIL) methods and standard verification methods like Siamese neural networks are unsuitable for this setting: directly combining state-of-the-art (SOTA) M…

    Submitted 9 July, 2024; originally announced July 2024.

    Comments: 30 pages

  26. arXiv:2407.06309  [pdf, other]

    cs.CY cs.AI

    Multimodal Chain-of-Thought Reasoning via ChatGPT to Protect Children from Age-Inappropriate Apps

    Authors: Chuanbo Hu, Bin Liu, Minglei Yin, Yilu Zhou, Xin Li

    Abstract: Mobile applications (Apps) could expose children to inappropriate themes such as sexual content, violence, and drug use. Maturity rating offers a quick and effective method for potential users, particularly guardians, to assess the maturity levels of apps. Determining accurate maturity ratings for mobile apps is essential to protect children's health in today's saturated digital marketplace. Exist…

    Submitted 8 July, 2024; originally announced July 2024.

  27. arXiv:2407.06293  [pdf, other]

    cs.CE physics.app-ph

    A Framework for Simulating the Path-level Residual Stress in the Laser Powder Bed Fusion Process

    Authors: Xin Liu, Xingchen Liu, Paul Witherell

    Abstract: Laser Powder Bed Fusion (LPBF) additive manufacturing has revolutionized industries with its capability to create intricate and customized components. The LPBF process uses moving heat sources to melt and solidify metal powders. The fast melting and cooling lead to residual stress, which critically affects the part quality. Currently, the computational intensity of accurately simulating the resid…

    Submitted 10 April, 2024; originally announced July 2024.

  28. arXiv:2407.05984  [pdf, other]

    eess.IV

    MBA-Net: SAM-driven Bidirectional Aggregation Network for Ovarian Tumor Segmentation

    Authors: Yifan Gao, Wei Xia, Wenkui Wang, Xin Gao

    Abstract: Accurate segmentation of ovarian tumors from medical images is crucial for early diagnosis, treatment planning, and patient management. However, the diverse morphological characteristics and heterogeneous appearances of ovarian tumors pose significant challenges to automated segmentation methods. In this paper, we propose MBA-Net, a novel architecture that integrates the powerful segmentation capa…

    Submitted 8 July, 2024; originally announced July 2024.

    Comments: MICCAI 2024

  29. arXiv:2407.05758  [pdf, other]

    eess.IV cs.AI cs.CV

    Potential of Multimodal Large Language Models for Data Mining of Medical Images and Free-text Reports

    Authors: Yutong Zhang, Yi Pan, Tianyang Zhong, Peixin Dong, Kangni Xie, Yuxiao Liu, Hanqi Jiang, Zhengliang Liu, Shijie Zhao, Tuo Zhang, Xi Jiang, Dinggang Shen, Tianming Liu, Xin Zhang

    Abstract: Medical images and radiology reports are crucial for diagnosing medical conditions, highlighting the importance of quantitative analysis for clinical decision-making. However, the diversity and cross-source heterogeneity of these data challenge the generalizability of current data-mining methods. Multimodal large language models (MLLMs) have recently transformed many domains, significantly affecti…

    Submitted 8 July, 2024; originally announced July 2024.

  30. arXiv:2407.05738  [pdf, ps, other]

    eess.SP eess.SY

    Collaborative Secret and Covert Communications for Multi-User Multi-Antenna Uplink UAV Systems: Design and Optimization

    Authors: **peng Xu, Lin Bai, Xin Xie, Lin Zhou

    Abstract: Motivated by the diverse security requirements of multiple users in UAV systems, we propose a collaborative secret and covert transmission method for multi-antenna ground users to unmanned aerial vehicle (UAV) communications. Specifically, based on the power domain non-orthogonal multiple access (NOMA), two ground users with distinct security requirements, named Bob and Carlo, superimpose their signals and…

    Submitted 8 July, 2024; originally announced July 2024.

  31. arXiv:2407.05677  [pdf, other]

    eess.IV

    PCAC-GAN: A Sparse-Tensor-Based Generative Adversarial Network for 3D Point Cloud Attribute Compression

    Authors: Xiaolong Mao, Hui Yuan, Xin Lu, Raouf Hamzaoui, Wei Gao

    Abstract: Learning-based methods have proven successful in compressing geometric information for point clouds. For attribute compression, however, they still lag behind non-learning-based methods such as the MPEG G-PCC standard. To bridge this gap, we propose a novel deep learning-based point cloud attribute compression method that uses a generative adversarial network (GAN) with sparse convolution layers.…

    Submitted 9 July, 2024; v1 submitted 8 July, 2024; originally announced July 2024.

    Comments: 14 pages, 5 figures

    MSC Class: 94J20; ACM Class: I.4.2

  32. arXiv:2407.05626  [pdf, other]

    math.NA

    A Stochastic Interacting Particle-Field Algorithm for a Haptotaxis Advection-Diffusion System Modeling Cancer Cell Invasion

    Authors: Boyi Hu, Zhongjian Wang, Jack Xin, Zhiwen Zhang

    Abstract: The investigation of tumor invasion and metastasis dynamics is crucial for advancements in cancer biology and treatment. Many mathematical models have been developed to study the invasion of host tissue by tumor cells. In this paper, we develop a novel stochastic interacting particle-field (SIPF) algorithm that accurately simulates the cancer cell invasion process within the haptotaxis advection-d…

    Submitted 8 July, 2024; originally announced July 2024.

  33. arXiv:2407.05608  [pdf, other]

    cs.SD cs.CL eess.AS

    A Benchmark for Multi-speaker Anonymization

    Authors: Xiaoxiao Miao, Ruijie Tao, Chang Zeng, Xin Wang

    Abstract: Privacy-preserving voice protection approaches primarily suppress privacy-related information derived from paralinguistic attributes while preserving the linguistic content. Existing solutions focus on single-speaker scenarios. However, they lack practicality for real-world applications, i.e., multi-speaker scenarios. In this paper, we present an initial attempt to provide a multi-speaker anonymiz…

    Submitted 8 July, 2024; originally announced July 2024.

  34. arXiv:2407.05563  [pdf, other]

    cs.CL

    LLMBox: A Comprehensive Library for Large Language Models

    Authors: Tianyi Tang, Yiwen Hu, Bingqian Li, Wenyang Luo, Zi**g Qin, Haoxiang Sun, Jiapeng Wang, Shiyi Xu, Xiaoxue Cheng, Geyang Guo, Han Peng, Bowen Zheng, Yiru Tang, Yingqian Min, Yushuo Chen, Jie Chen, Yuanqian Zhao, Luran Ding, Yuhao Wang, Zican Dong, Chunxuan Xia, Junyi Li, Kun Zhou, Wayne Xin Zhao, Ji-Rong Wen

    Abstract: To facilitate research on large language models (LLMs), this paper presents a comprehensive and unified library, LLMBox, to ease the development, use, and evaluation of LLMs. This library has three main merits: (1) a unified data interface that supports the flexible implementation of various training strategies, (2) a comprehensive evaluation that covers extensive tasks, datasets,…

    Submitted 7 July, 2024; originally announced July 2024.

    Comments: Accepted by ACL 2024 Demo

  35. arXiv:2407.05421  [pdf, other]

    eess.AS cs.SD

    ASRRL-TTS: Agile Speaker Representation Reinforcement Learning for Text-to-Speech Speaker Adaptation

    Authors: Ruibo Fu, Xin Qi, Zhengqi Wen, Jianhua Tao, Tao Wang, Chunyu Qiang, Zhiyong Wang, Yi Lu, Xiaopeng Wang, Shuchen Shi, Yukun Liu, Xuefei Liu, Shuai Zhang

    Abstract: Speaker adaptation, which involves cloning voices from unseen speakers in the Text-to-Speech task, has garnered significant interest due to its numerous applications in multi-media fields. Despite recent advancements, existing methods often struggle with inadequate speaker representation accuracy and overfitting, particularly in scenarios with limited reference speech. To address these challenges, we…

    Submitted 7 July, 2024; originally announced July 2024.

    Comments: The audio demo is available at https://7xin.github.io/ASRRL/

  36. arXiv:2407.05251  [pdf, ps, other]

    math.OA

    Almost elementary groupoid models for $C^*$-algebras

    Authors: Xin Ma, Jianchao Wu

    Abstract: The notion of almost elementariness for a locally compact Hausdorff étale groupoid $\mathcal{G}$ with a compact unit space was introduced by the authors as a sufficient condition ensuring the reduced groupoid $C^*$-algebra $C^*_r(\mathcal{G})$ is (tracially) $\mathcal{Z}$-stable and thus classifiable under additional natural assumptions. In this paper, we explore the converse direction and show tha…

    Submitted 6 July, 2024; originally announced July 2024.

  37. arXiv:2407.05041  [pdf, ps, other]

    math.NA

    Local convergence analysis of L1/finite element scheme for a constant delay reaction-subdiffusion equation with uniform time mesh

    Authors: Wei** Bu, Xin Zheng

    Abstract: The aim of this paper is to develop a refined error estimate of the L1/finite element scheme for a reaction-subdiffusion equation with constant delay $τ$ and uniform time mesh. Under the non-uniform multi-singularity assumption of the exact solution in time, the local truncation errors of the L1 scheme with uniform mesh are investigated. Then we introduce a fully discrete finite element scheme of the consi…

    Submitted 6 July, 2024; originally announced July 2024.

  38. arXiv:2407.04969  [pdf, other]

    cs.CL

    EVA-Score: Evaluation of Long-form Summarization on Informativeness through Extraction and Validation

    Authors: Yuchen Fan, Xin Zhong, Chengsi Wang, Gaoche Wu, Bowen Zhou

    Abstract: Summarization is a fundamental task in natural language processing (NLP), and since the arrival of large language models (LLMs) such as GPT-4 and Claude, increasing attention has been paid to long-form summarization, whose input sequences are much longer and therefore contain more information. The current evaluation metrics either use similarity-based metrics like ROUGE and BERTScore which rely on sim…

    Submitted 6 July, 2024; originally announced July 2024.

    Comments: 16 pages, 3 figures, submitted to EMNLP

  39. arXiv:2407.04942  [pdf, other]

    cs.RO cs.LG

    FOSP: Fine-tuning Offline Safe Policy through World Models

    Authors: Chenyang Cao, Yucheng Xin, Silang Wu, Longxiang He, Zichen Yan, Junbo Tan, Xueqian Wang

    Abstract: Model-based Reinforcement Learning (RL) has shown its high training efficiency and capability of handling high-dimensional tasks. Regarding safety issues, safe model-based RL can achieve nearly zero-cost performance and effectively manage the trade-off between performance and safety. Nevertheless, prior works still pose safety challenges due to the online exploration in real-world deployment. To a…

    Submitted 5 July, 2024; originally announced July 2024.

    Comments: 21 pages

  40. arXiv:2407.04697  [pdf, other]

    cs.CV cs.MM

    VCoME: Verbal Video Composition with Multimodal Editing Effects

    Authors: Weibo Gong, Xiaojie **, Xin Li, Dongliang He, Xinglong Wu

    Abstract: Verbal videos, featuring voice-overs or text overlays, provide valuable content but present significant challenges in composition, especially when incorporating editing effects to enhance clarity and visual appeal. In this paper, we introduce the novel task of verbal video composition with editing effects. This task aims to generate coherent and visually appealing verbal videos by integrating mult…

    Submitted 5 July, 2024; originally announced July 2024.

  41. arXiv:2407.04684  [pdf, other]

    astro-ph.HE

    Investigating the Mass of the Black Hole and Possible Wind Outflow of the Accretion Disk in the Tidal Disruption Event AT2021ehb

    Authors: Xin Xiang, Jon M. Miller, Abderahmen Zoghbi, Mark T. Reynolds, David Bogensberger, Lixin Dai, Paul A. Draghis, Jeremy J. Drake, Olivier Godet, Jimmy A. Irwin, Michael C. Miller, Brenna E. Mockler, Richard Saxton, Natalie Webb

    Abstract: Tidal disruption events (TDEs) can potentially probe low-mass black holes in host galaxies that might not adhere to bulge or stellar-dispersion relationships. At least initially, TDEs can also reveal super-Eddington accretion. X-ray spectroscopy can potentially constrain black hole masses, and reveal ionized outflows associated with super-Eddington accretion. Our analysis of XMM-Newton X-ray obser…

    Submitted 5 July, 2024; originally announced July 2024.

    Comments: 19 pages, 4 figures

  42. PROUD: PaRetO-gUided Diffusion Model for Multi-objective Generation

    Authors: Yinghua Yao, Yuangang Pan, **g Li, Ivor Tsang, Xin Yao

    Abstract: Recent advancements in the realm of deep generative models focus on generating samples that satisfy multiple desired properties. However, prevalent approaches optimize these property functions independently, thus omitting the trade-offs among them. In addition, the property optimization is often improperly integrated into the generative models, resulting in an unnecessary compromise on generation…

    Submitted 5 July, 2024; originally announced July 2024.

    Journal ref: Machine Learning 2024

  43. Exploration of Class Center for Fine-Grained Visual Classification

    Authors: Hang Yao, Qiguang Miao, Peipei Zhao, Chaoneng Li, Xin Li, Guanwen Feng, Ruyi Liu

    Abstract: Different from large-scale classification tasks, fine-grained visual classification is a challenging task due to two critical problems: 1) evident intra-class variances and subtle inter-class differences, and 2) overfitting owing to fewer training samples in datasets. Most existing methods extract key features to reduce intra-class variances, but pay no attention to subtle inter-class differences…

    Submitted 5 July, 2024; originally announced July 2024.

    Comments: Accepted by TCSVT. Code and trained models are here: https://github.com/hyao1/ECC

  44. arXiv:2407.04242  [pdf, other]

    cs.CV

    Fine-grained Context and Multi-modal Alignment for Freehand 3D Ultrasound Reconstruction

    Authors: Zhongnuo Yan, Xin Yang, Mingyuan Luo, Jiongquan Chen, Rusi Chen, Lian Liu, Dong Ni

    Abstract: Fine-grained spatio-temporal learning is crucial for freehand 3D ultrasound reconstruction. Previous works mainly resorted to coarse-grained spatial features and separate temporal dependency learning, and struggled with fine-grained spatio-temporal learning. Mining spatio-temporal information at fine-grained scales is extremely challenging due to learning difficulties in long-range dependen…

    Submitted 5 July, 2024; originally announced July 2024.

    Comments: Accepted at MICCAI 2024. This is the submitted manuscript and the preprint has not undergone peer review (when applicable) or any post-submission improvements or corrections

  45. arXiv:2407.04085  [pdf, other]

    cs.CV

    FIPGNet: Pyramid grafting network with feature interaction strategies

    Authors: Ziyi Ding, Like Xin

    Abstract: Salient object detection is designed to identify the objects in an image that attract the most visual attention. Currently, the most advanced method of salient object detection adopts a pyramid grafting network architecture. However, the pyramid grafting network architecture still fails to accurately locate salient targets. We observe that this is mainly due to the fact that curr…

    Submitted 4 July, 2024; originally announced July 2024.

    Comments: arXiv admin note: text overlap with arXiv:2309.08365 by other authors

  46. arXiv:2407.04024  [pdf, other]

    cs.CV

    Adaptive Step-size Perception Unfolding Network with Non-local Hybrid Attention for Hyperspectral Image Reconstruction

    Authors: Yanan Yang, Like Xin

    Abstract: Deep unfolding methods and transformer architecture have recently shown promising results in hyperspectral image (HSI) reconstruction. However, there still exist two issues: (1) in the data subproblem, most methods represent the step size using a learnable parameter. Nevertheless, for different spectral channels, the error between features and ground truth is unequal. (2) Transformer struggles to b…

    Submitted 4 July, 2024; originally announced July 2024.

  47. arXiv:2407.04020  [pdf, other]

    cs.CL

    LLMAEL: Large Language Models are Good Context Augmenters for Entity Linking

    Authors: Amy Xin, Yunjia Qi, Zijun Yao, Fangwei Zhu, Kaisheng Zeng, Xu Bin, Lei Hou, Juanzi Li

    Abstract: Entity Linking (EL) models are well-trained at mapping mentions to their corresponding entities according to a given context. However, EL models struggle to disambiguate long-tail entities due to their limited training data. Meanwhile, large language models (LLMs) are more robust at interpreting uncommon mentions. Yet, due to a lack of specialized training, LLMs suffer at generating correct entity…

    Submitted 4 July, 2024; originally announced July 2024.

  48. arXiv:2407.03980  [pdf, other]

    quant-ph

    Practical asynchronous measurement-device-independent quantum key distribution with advantage distillation

    Authors: Di Luo, Xin Liu, Kaibiao Qin, Zhenrong Zhang, Kejin Wei

    Abstract: The advantage distillation (AD) method has proven effective in improving the performance of quantum key distribution (QKD). In this paper, we introduce the AD method into a recently proposed asynchronous measurement-device-independent (AMDI) QKD protocol, taking finite-key effects into account. Simulation results show that the AD method significantly enhances AMDIQKD, e.g., extending the transmiss…

    Submitted 4 July, 2024; originally announced July 2024.

    Comments: 13 pages, 5 figures

  49. arXiv:2407.03939  [pdf]

    cs.CV

    SfM on-the-fly: Get better 3D from What You Capture

    Authors: Zhan Zongqian, Yu Yifei, Xia Rui, Gan Wentian, Xie Hong, Perda Giulio, Morelli Luca, Remondino Fabio, Wang Xin

    Abstract: In the last twenty years, Structure from Motion (SfM) has been a constant research hotspot in the fields of photogrammetry, computer vision, robotics etc., whereas real-time performance is just a recent topic of growing interest. This work builds upon the original on-the-fly SfM (Zhan et al., 2024) and presents an updated version with three new advancements to get better 3D from what you capture:…

    Submitted 12 July, 2024; v1 submitted 4 July, 2024; originally announced July 2024.

  50. arXiv:2407.03757  [pdf, other]

    cs.CV

    DiffRetouch: Using Diffusion to Retouch on the Shoulder of Experts

    Authors: Zheng-Peng Duan, Jiawei Zhang, Zheng Lin, Xin Jin, Dongqing Zou, Chunle Guo, Chongyi Li

    Abstract: Image retouching aims to enhance the visual quality of photos. Considering the different aesthetic preferences of users, the target of retouching is subjective. However, current retouching methods mostly adopt deterministic models, which not only neglect the style diversity in the expert-retouched results and tend to learn an average style during training, but also lack sample diversity during…

    Submitted 4 July, 2024; originally announced July 2024.