Showing 101–150 of 20,338 results for author: Wang, Y

  1. arXiv:2406.17272  [pdf, ps, other

    cs.LG

    A Comprehensive Solution to Connect Speech Encoder and Large Language Model for ASR

    Authors: Van Tung Pham, Yist Lin, Tao Han, Wei Li, Jun Zhang, Lu Lu, Yuxuan Wang

    Abstract: Recent works have shown promising results in connecting speech encoders to large language models (LLMs) for speech recognition. However, several limitations persist, including limited fine-tuning options, a lack of mechanisms to enforce speech-text alignment, and high insertion errors especially in domain mismatch conditions. This paper presents a comprehensive solution to address these issues. We… ▽ More

    Submitted 25 June, 2024; originally announced June 2024.

  2. arXiv:2406.17225  [pdf, other

    eess.IV cs.CV

    Multimodal Cross-Task Interaction for Survival Analysis in Whole Slide Pathological Images

    Authors: Songhan Jiang, Zhengyu Gan, Linghan Cai, Yifeng Wang, Yongbing Zhang

    Abstract: Survival prediction, utilizing pathological images and genomic profiles, is increasingly important in cancer analysis and prognosis. Despite significant progress, precise survival analysis still faces two main challenges: (1) The massive pixels contained in whole slide images (WSIs) complicate the process of pathological images, making it difficult to generate an effective representation of the tu… ▽ More

    Submitted 24 June, 2024; originally announced June 2024.

  3. arXiv:2406.17096  [pdf, other

    cs.LG cs.AI stat.ML

    Model-Free Robust Reinforcement Learning with Sample Complexity Analysis

    Authors: Yudan Wang, Shaofeng Zou, Yue Wang

    Abstract: Distributionally Robust Reinforcement Learning (DR-RL) aims to derive a policy optimizing the worst-case performance within a predefined uncertainty set. Despite extensive research, previous DR-RL algorithms have predominantly favored model-based approaches, with limited availability of model-free methods offering convergence guarantees or sample complexities. This paper proposes a model-free DR-R… ▽ More

    Submitted 24 June, 2024; originally announced June 2024.

    Comments: UAI 2024

  4. arXiv:2406.17006  [pdf, other

    hep-ex

    Probing the nature of the $χ_{c1}(3872)$ state using radiative decays

    Authors: LHCb collaboration, R. Aaij, A. S. W. Abdelmotteleb, C. Abellan Beteta, F. Abudinén, T. Ackernley, A. A. Adefisoye, B. Adeva, M. Adinolfi, P. Adlarson, C. Agapopoulou, C. A. Aidala, Z. Ajaltouni, S. Akar, K. Akiba, P. Albicocco, J. Albrecht, F. Alessio, M. Alexander, Z. Aliouche, P. Alvarez Cartelle, R. Amalric, S. Amato, J. L. Amey, Y. Amhis , et al. (1094 additional authors not shown)

    Abstract: The radiative decays $χ_{c1}(3872)\rightarrowψ(2S)γ$ and $χ_{c1}(3872)\rightarrow J/ψγ$ are used to probe the nature of the $χ_{c1}(3872)$ state using proton-proton collision data collected with the LHCb detector, corresponding to an integrated luminosity of 9 fb$^{-1}$. Using the $B^+\rightarrow χ_{c1}(3872)K^+$ decay, the $χ_{c1}(3872)\rightarrow ψ(2S)γ$ process is observed for the first time and… ▽ More

    Submitted 24 June, 2024; originally announced June 2024.

    Comments: 31 pages, 2 figures. All figures and tables, along with any supplementary material and additional information, are available at https://cern.ch/lhcbproject/Publications/p/LHCb-PAPER-2024-015.html (LHCb public pages)

    Report number: LHCb-PAPER-2024-015, CERN-EP-2024-157

  5. arXiv:2406.17005  [pdf, other

    cs.CV

    PVUW 2024 Challenge on Complex Video Understanding: Methods and Results

    Authors: Henghui Ding, Chang Liu, Yunchao Wei, Nikhila Ravi, Shuting He, Song Bai, Philip Torr, Deshui Miao, Xin Li, Zhenyu He, Yaowei Wang, Ming-Hsuan Yang, Zhensong Xu, Jiangtao Yao, Chengjing Wu, Ting Liu, Luoqi Liu, Xinyu Liu, Jing Zhang, Kexin Zhang, Yuting Yang, Licheng Jiao, Shuyuan Yang, Mingqi Gao, Jingnan Luo , et al. (12 additional authors not shown)

    Abstract: Pixel-level Video Understanding in the Wild Challenge (PVUW) focuses on complex video understanding. In this CVPR 2024 workshop, we add two new tracks, Complex Video Object Segmentation Track based on MOSE dataset and Motion Expression guided Video Segmentation track based on MeViS dataset. In the two new tracks, we provide additional videos and annotations that feature challenging elements, such as… ▽ More

    Submitted 24 June, 2024; originally announced June 2024.

    Comments: MOSE Challenge: https://henghuiding.github.io/MOSE/ChallengeCVPR2024, MeViS Challenge: https://henghuiding.github.io/MeViS/ChallengeCVPR2024

  6. arXiv:2406.16978  [pdf, other

    cs.LG cs.AI cs.RO

    MetaFollower: Adaptable Personalized Autonomous Car Following

    Authors: Xianda Chen, Kehua Chen, Meixin Zhu, Hao, Yang, Shaojie Shen, Xuesong Wang, Yinhai Wang

    Abstract: Car-following (CF) modeling, a fundamental component in microscopic traffic simulation, has attracted increasing interest of researchers in the past decades. In this study, we propose an adaptable personalized car-following framework -MetaFollower, by leveraging the power of meta-learning. Specifically, we first utilize Model-Agnostic Meta-Learning (MAML) to extract common driving knowledge from v… ▽ More

    Submitted 23 June, 2024; originally announced June 2024.

  7. arXiv:2406.16929  [pdf, other

    eess.SP cs.AI

    Modelling the 5G Energy Consumption using Real-world Data: Energy Fingerprint is All You Need

    Authors: Tingwei Chen, Yantao Wang, Hanzhi Chen, Zijian Zhao, Xinhao Li, Nicola Piovesan, Guangxu Zhu, Qingjiang Shi

    Abstract: The introduction of fifth-generation (5G) radio technology has revolutionized communications, bringing unprecedented automation, capacity, connectivity, and ultra-fast, reliable communications. However, this technological leap comes with a substantial increase in energy consumption, presenting a significant challenge. To improve the energy efficiency of 5G networks, it is imperative to develop sop… ▽ More

    Submitted 13 June, 2024; originally announced June 2024.

  8. arXiv:2406.16878  [pdf, ps, other

    eess.SP cs.AI cs.IT

    Benchmarking Semantic Communications for Image Transmission Over MIMO Interference Channels

    Authors: Yanhu Wang, Shuaishuai Guo, Anming Dong, Hui Zhao

    Abstract: Semantic communications offer promising prospects for enhancing data transmission efficiency. However, existing schemes have predominantly concentrated on point-to-point transmissions. In this paper, we aim to investigate the validity of this claim in interference scenarios compared to baseline approaches. Specifically, our focus is on general multiple-input multiple-output (MIMO) interference cha… ▽ More

    Submitted 10 April, 2024; originally announced June 2024.

  9. arXiv:2406.16851  [pdf, other

    cs.CL cs.AI cs.CV

    Losing Visual Needles in Image Haystacks: Vision Language Models are Easily Distracted in Short and Long Contexts

    Authors: Aditya Sharma, Michael Saxon, William Yang Wang

    Abstract: We present LoCoVQA, a dynamic benchmark generator for evaluating long-context extractive reasoning in vision language models (VLMs). LoCoVQA augments test examples for mathematical reasoning, VQA, and character recognition tasks with increasingly long visual contexts composed of both in-distribution and out-of-distribution distractor images. Across these tasks, a diverse set of VLMs rapidly lose… ▽ More

    Submitted 24 June, 2024; originally announced June 2024.

    Comments: Under review

  10. arXiv:2406.16845  [pdf, other

    cs.CL

    RaTEScore: A Metric for Radiology Report Generation

    Authors: Weike Zhao, Chaoyi Wu, Xiaoman Zhang, Ya Zhang, Yanfeng Wang, Weidi Xie

    Abstract: This paper introduces a novel, entity-aware metric, termed as Radiological Report (Text) Evaluation (RaTEScore), to assess the quality of medical reports generated by AI models. RaTEScore emphasizes crucial medical entities such as diagnostic outcomes and anatomical details, and is robust against complex medical synonyms and sensitive to negation expressions. Technically, we developed a comprehens… ▽ More

    Submitted 24 June, 2024; originally announced June 2024.

  11. arXiv:2406.16710  [pdf, other

    cs.CV

    Portrait3D: 3D Head Generation from Single In-the-wild Portrait Image

    Authors: Jinkun Hao, Junshu Tang, Jiangning Zhang, Ran Yi, Yijia Hong, Moran Li, Weijian Cao, Yating Wang, Lizhuang Ma

    Abstract: While recent works have achieved great success on one-shot 3D common object generation, high quality and fidelity 3D head generation from a single image remains a great challenge. Previous text-based methods for generating 3D heads were limited by text descriptions and image-based methods struggled to produce high-quality head geometry. To handle this challenging problem, we propose a novel framew… ▽ More

    Submitted 24 June, 2024; originally announced June 2024.

    Comments: https://jinkun-hao.github.io/Portrait3D/

  12. arXiv:2406.16578  [pdf, other

    cs.RO cs.AI

    QuadrupedGPT: Towards a Versatile Quadruped Agent in Open-ended Worlds

    Authors: Ye Wang, Yuting Mei, Sipeng Zheng, Qin Jin

    Abstract: While pets offer companionship, their limited intelligence restricts advanced reasoning and autonomous interaction with humans. Considering this, we propose QuadrupedGPT, a versatile agent designed to master a broad range of complex tasks with agility comparable to that of a pet. To achieve this goal, the primary challenges include: i) effectively leveraging multimodal observations for decision-ma… ▽ More

    Submitted 24 June, 2024; originally announced June 2024.

    Comments: Under review

  13. arXiv:2406.16522  [pdf, other

    astro-ph.SR

    Why are non-radial solar eruptions less frequent than radial ones?

    Authors: Qingjun Liu, Chaowei Jiang, Xuesheng Feng, Pingbing Zuo, Yi Wang

    Abstract: Coronal mass ejections from the Sun are not always initiated along a radial trajectory; such non-radial eruptions are well known to be caused by the asymmetry of the pre-eruption magnetic configuration, which is primarily determined by the uneven distribution of magnetic flux at the photosphere. Therefore, it is naturally expected that the non-radial eruptions should be rather common, at least as… ▽ More

    Submitted 24 June, 2024; originally announced June 2024.

    Comments: 7 pages, 5 figures, accepted by MNRAS Letters

  14. arXiv:2406.16487  [pdf, other

    cs.SE

    Decomposing God Header File via Multi-View Graph Clustering

    Authors: Yue Wang, Wenhui Chang, Yanzhen Zou, Tongwei Deng, Bing Xie

    Abstract: God Header File refers to a header file with large code size and wide file impact. Such files pose difficulties in code comprehension and slow down compilation, ultimately increasing the maintenance cost during software evolution. Although this concept is similar to God Class, existing refactoring methods for God Classes are inappropriate for God Header Files. The reason lies in the fact that the… ▽ More

    Submitted 24 June, 2024; originally announced June 2024.

    Comments: Accepted by ICSME 2024

  15. arXiv:2406.16474  [pdf, other

    astro-ph.SR astro-ph.HE gr-qc

    Detecting eclipsing double white dwarfs with electromagnetic and gravitational waves

    Authors: Hong-Ming Jin, Bo Ma, Yong Shao, Yan Wang

    Abstract: Galactic double white dwarfs are predominant sources of gravitational waves in the millihertz frequencies accessible to space-borne gravitational wave detectors. With advances in multi-messenger astronomy, an increasing number of double white dwarf systems will be discovered through both electromagnetic and gravitational wave observations. In this paper, we simulated two populations of double whit… ▽ More

    Submitted 24 June, 2024; originally announced June 2024.

    Comments: 15 pages, 11 figures and 8 tables. Submitted

  16. arXiv:2406.16473  [pdf, other

    cs.CV cs.AI

    Seeking Certainty In Uncertainty: Dual-Stage Unified Framework Solving Uncertainty in Dynamic Facial Expression Recognition

    Authors: Haoran Wang, Xinji Mai, Zeng Tao, Xuan Tong, Junxiong Lin, Yan Wang, Jiawen Yu, Boyang Wang, Shaoqi Yan, Qing Zhao, Ziheng Zhou, Shuyong Gao, Wenqiang Zhang

    Abstract: The contemporary state-of-the-art of Dynamic Facial Expression Recognition (DFER) technology facilitates remarkable progress by deriving emotional mappings of facial expressions from video content, underpinned by training on voluminous datasets. Yet, the DFER datasets encompass a substantial volume of noise data. Noise arises from low-quality captures that defy logical labeling, and instances that… ▽ More

    Submitted 24 June, 2024; originally announced June 2024.

  17. arXiv:2406.16459  [pdf, other

    cs.CV

    Suppressing Uncertainties in Degradation Estimation for Blind Super-Resolution

    Authors: Junxiong Lin, Zeng Tao, Xuan Tong, Xinji Mai, Haoran Wang, Boyang Wang, Yan Wang, Qing Zhao, Jiawen Yu, Yuxuan Lin, Shaoqi Yan, Shuyong Gao, Wenqiang Zhang

    Abstract: The problem of blind image super-resolution aims to recover high-resolution (HR) images from low-resolution (LR) images with unknown degradation modes. Most existing methods model the image degradation process using blur kernels. However, this explicit modeling approach struggles to cover the complex and varied degradation processes encountered in the real world, such as high-order combinations of… ▽ More

    Submitted 24 June, 2024; originally announced June 2024.

  18. arXiv:2406.16427  [pdf, other

    cs.CV cs.AI

    Dynamic Pseudo Label Optimization in Point-Supervised Nuclei Segmentation

    Authors: Ziyue Wang, Ye Zhang, Yifeng Wang, Linghan Cai, Yongbing Zhang

    Abstract: Deep learning has achieved impressive results in nuclei segmentation, but the massive requirement for pixel-wise labels remains a significant challenge. To alleviate the annotation burden, existing methods generate pseudo masks for model training using point labels. However, the generated masks are inevitably different from the ground truth, and these dissimilarities are not handled reasonably dur… ▽ More

    Submitted 24 June, 2024; originally announced June 2024.

    Comments: early accepted by MICCAI2024

  19. arXiv:2406.16425  [pdf, other

    cond-mat.str-el cond-mat.mtrl-sci

    Spin order and dynamics in the topological rare-earth germanide semimetals

    Authors: Yuhao Wang, Zhixuan Zhen, Jing Meng, Igor Plokhikh, Delong Wu, Dariusz J. Gawryluk, Yang Xu, Qingfeng Zhan, Ming Shi, Ekaterina Pomjakushina, Toni Shiroka, Tian Shang

    Abstract: The $RE$Al(Si,Ge) ($RE$ = rare earth) family, known to break both the inversion- and time-reversal symmetries, represents one of the most suitable platforms for investigating the interplay between correlated-electron phenomena and topologically nontrivial bands. Here, we report on systematic magnetic, transport, and muon-spin rotation and relaxation ($μ$SR) measurements on (Nd,Sm)AlGe single cryst… ▽ More

    Submitted 24 June, 2024; originally announced June 2024.

    Comments: 13 pages, 14 figures

  20. arXiv:2406.16381  [pdf, other

    eess.SP

    Polar-Coded Tensor-Based Unsourced Random Access with Soft Decoding

    Authors: Jiaqi Fang, Yan Liang, Gangle Sun, Hongwei Hou, Yafei Wang, Li You, Wenjin Wang

    Abstract: The unsourced random access (URA) has emerged as a viable scheme for supporting the massive machine-type communications (mMTC) in the sixth generation (6G) wireless networks. Notably, the tensor-based URA (TURA), with its inherent tensor structure, stands out by simultaneously enhancing performance and reducing computational complexity for the multi-user separation, especially in mMTC networks wit… ▽ More

    Submitted 24 June, 2024; originally announced June 2024.

    Comments: This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible

  21. arXiv:2406.16377  [pdf, other

    cs.CL cs.AI

    On the Transformations across Reward Model, Parameter Update, and In-Context Prompt

    Authors: Deng Cai, Huayang Li, Tingchen Fu, Siheng Li, Weiwen Xu, Shuaiyi Li, Bowen Cao, Zhisong Zhang, Xinting Huang, Leyang Cui, Yan Wang, Lemao Liu, Taro Watanabe, Shuming Shi

    Abstract: Despite the general capabilities of pre-trained large language models (LLMs), they still need further adaptation to better serve practical applications. In this paper, we demonstrate the interchangeability of three popular and distinct adaptation tools: parameter updating, reward modeling, and in-context prompting. This interchangeability establishes a triangular framework with six transformation… ▽ More

    Submitted 24 June, 2024; originally announced June 2024.

  22. arXiv:2406.16369  [pdf, ps, other

    cs.CR

    Machine Learning with Real-time and Small Footprint Anomaly Detection System for In-Vehicle Gateway

    Authors: Yi Wang, Yuanjin Zheng, Yajun Ha

    Abstract: Anomaly Detection System (ADS) is an essential part of a modern gateway Electronic Control Unit (ECU) to detect abnormal behaviors and attacks in vehicles. Among the existing attacks, the "one-time" attack is the most challenging to detect, together with the strict gateway ECU constraints of both microsecond or even nanosecond level real-time budget and limited footprint of code. To address the… ▽ More

    Submitted 24 June, 2024; originally announced June 2024.

  23. arXiv:2406.16368  [pdf, ps, other

    math.DG

    The general Kastler-Kalau-Walze type theorem for the J-twist DJ of the Dirac operator

    Authors: Siyao Liu, Yong Wang

    Abstract: In [21] and [22], we proved the Kastler-Kalau-Walze type theorem for the J-twist DJ of the Dirac operator on 3-dimensional, 4-dimensional and 6-dimensional almost product Riemannian spin manifold with boundary. In this paper, we generalize our previous conclusions and establish the proof of the general Kastler-Kalau-Walze type theorem for the J-twist DJ of the Dirac operator on even-dimensional… ▽ More

    Submitted 25 June, 2024; v1 submitted 24 June, 2024; originally announced June 2024.

    Comments: 32 pages. arXiv admin note: text overlap with arXiv:2211.06602, arXiv:2203.10467, arXiv:2312.00154, arXiv:2401.10909

  24. arXiv:2406.16338  [pdf, other

    cs.CV

    VideoHallucer: Evaluating Intrinsic and Extrinsic Hallucinations in Large Video-Language Models

    Authors: Yuxuan Wang, Yueqian Wang, Dongyan Zhao, Cihang Xie, Zilong Zheng

    Abstract: Recent advancements in Multimodal Large Language Models (MLLMs) have extended their capabilities to video understanding. Yet, these models are often plagued by "hallucinations", where irrelevant or nonsensical content is generated, deviating from the actual video context. This work introduces VideoHallucer, the first comprehensive benchmark for hallucination detection in large video-language model… ▽ More

    Submitted 24 June, 2024; originally announced June 2024.

  25. arXiv:2406.16306  [pdf, other

    cs.CL cs.LG stat.ML

    Cascade Reward Sampling for Efficient Decoding-Time Alignment

    Authors: Bolian Li, Yifan Wang, Ananth Grama, Ruqi Zhang

    Abstract: Aligning large language models (LLMs) with human preferences is critical for their deployment. Recently, decoding-time alignment has emerged as an effective plug-and-play technique that requires no fine-tuning of model parameters. However, generating text that achieves both high reward and high likelihood remains a significant challenge. Existing methods often fail to generate high-reward text or… ▽ More

    Submitted 24 June, 2024; originally announced June 2024.

  26. arXiv:2406.16303  [pdf, other

    eess.SP

    Hybrid Precoding With Low-Resolution PSs for Wideband Terahertz Communication Systems in The Face of Beam Squint

    Authors: Yang Wang, Chuang Yang, Mugen Peng

    Abstract: Terahertz (THz) communication is considered one of the most critical technologies for 6G because of its abundant bandwidth. To compensate the high propagation of THz, analog/digital hybrid precoding for THz massive multiple input multiple output (MIMO) is proposed to focus signals and extend communication range. Notably, considering hardware cost and power consumption, infinite and high-resolution… ▽ More

    Submitted 24 June, 2024; originally announced June 2024.

  27. arXiv:2406.16278  [pdf, ps, other

    math.AP

    Sharp fractional Sobolev and related inequalities on H-type groups

    Authors: Yaojun Wang, Qiaohua Yang

    Abstract: We determine the sharp constants for the fractional Sobolev inequalities associated with the conformally invariant fractional powers $\mathcal{L}_{s}(0<s<1)$ of the sublaplacian on H-type groups. From these inequalities we derive a sharp log-Sobolev inequality by considering a limiting case and a sharp Sobolev trace inequality. The latter extends to this context the result of Frank, González, Monti… ▽ More

    Submitted 27 June, 2024; v1 submitted 23 June, 2024; originally announced June 2024.

  28. arXiv:2406.16253  [pdf, other

    cs.CL

    LLMs Assist NLP Researchers: Critique Paper (Meta-)Reviewing

    Authors: Jiangshu Du, Yibo Wang, Wenting Zhao, Zhongfen Deng, Shuaiqi Liu, Renze Lou, Henry Peng Zou, Pranav Narayanan Venkit, Nan Zhang, Mukund Srinath, Haoran Ranran Zhang, Vipul Gupta, Yinghui Li, Tao Li, Fei Wang, Qin Liu, Tianlin Liu, Pengzhi Gao, Congying Xia, Chen Xing, Jiayang Cheng, Zhaowei Wang, Ying Su, Raj Sanjay Shah, Ruohao Guo , et al. (15 additional authors not shown)

    Abstract: This work is motivated by two key trends. On one hand, large language models (LLMs) have shown remarkable versatility in various generative tasks such as writing, drawing, and question answering, significantly reducing the time required for many routine tasks. On the other hand, researchers, whose work is not only time-consuming but also highly expertise-demanding, face increasing challenges as th… ▽ More

    Submitted 25 June, 2024; v1 submitted 23 June, 2024; originally announced June 2024.

  29. arXiv:2406.16150  [pdf, other

    eess.IV cs.CV

    Intensity Confusion Matters: An Intensity-Distance Guided Loss for Bronchus Segmentation

    Authors: Haifan Gong, Wenhao Huang, Huan Zhang, Yu Wang, Xiang Wan, Hong Shen, Guanbin Li, Haofeng Li

    Abstract: Automatic segmentation of the bronchial tree from CT imaging is important, as it provides structural information for disease diagnosis. Despite the merits of previous automatic bronchus segmentation methods, they have paid less attention to the issue we term Intensity Confusion, wherein the intensity values of certain background voxels approach those of the foreground voxels within br… ▽ More

    Submitted 23 June, 2024; originally announced June 2024.

    Comments: IEEE International Conference on Multimedia & Expo (ICME) 2024

  30. arXiv:2406.16144  [pdf, other

    cs.CL

    Chain-of-Probe: Examing the Necessity and Accuracy of CoT Step-by-Step

    Authors: Zezhong Wang, Xingshan Zeng, Weiwen Liu, Yufei Wang, Liangyou Li, Yasheng Wang, Lifeng Shang, Xin Jiang, Qun Liu, Kam-Fai Wong

    Abstract: Current research found the issue of Early Answering in large language models (LLMs), where the models already have an answer before generating the Chain-of-Thought (CoT). This phenomenon suggests a potential lack of necessary dependency between the predicted answer and the reasoning process. Consequently, two important questions arise: (1) Is CoT still necessary if the model already has an answer?… ▽ More

    Submitted 23 June, 2024; originally announced June 2024.

  31. arXiv:2406.15948  [pdf, other

    cs.CL

    Teaching LLMs to Abstain across Languages via Multilingual Feedback

    Authors: Shangbin Feng, Weijia Shi, Yike Wang, Wenxuan Ding, Orevaoghene Ahia, Shuyue Stella Li, Vidhisha Balachandran, Sunayana Sitaram, Yulia Tsvetkov

    Abstract: Multilingual LLMs often have knowledge disparities across languages, with larger gaps in under-resourced languages. Teaching LLMs to abstain in the face of knowledge gaps is thus a promising strategy to mitigate hallucinations in multilingual settings. However, previous studies on LLM abstention primarily focus on English; we find that directly applying existing solutions beyond English results in… ▽ More

    Submitted 22 June, 2024; originally announced June 2024.

  32. arXiv:2406.15740  [pdf, other

    astro-ph.IM physics.ins-det

    The FRB-searching pipeline of the Tianlai Cylinder Pathfinder Array

    Authors: Zijie Yu, Furen Deng, Shijie Sun, Chenhui Niu, Jixia Li, Fengquan Wu, Wei-Yang Wang, Yougang Wang, Shifan Zuo, Lin Shu, Jie Hao, Xiaohui Liu, Reza Ansari, Ue-Li Pen, Albert Stebbins, Peter Timbie, Xuelei Chen

    Abstract: This paper presents the design, calibration, and survey strategy of the Fast Radio Burst (FRB) digital backend and its real-time data processing pipeline employed in the Tianlai Cylinder Pathfinder array. The array, consisting of three parallel cylindrical reflectors and equipped with 96 dual-polarization feeds, is a radio interferometer array designed for conducting drift scans of the northern ce… ▽ More

    Submitted 22 June, 2024; originally announced June 2024.

    Comments: 27 pages, 21 figures, 7 tables, RAA accepted

  33. arXiv:2406.15713  [pdf, other

    math.OC cs.LG

    Efficient Low-rank Identification via Accelerated Iteratively Reweighted Nuclear Norm Minimization

    Authors: Hao Wang, Ye Wang, Xiangyu Yang

    Abstract: This paper considers the problem of minimizing the sum of a smooth function and the Schatten-$p$ norm of the matrix. Our contribution involves proposing accelerated iteratively reweighted nuclear norm methods designed for solving the nonconvex low-rank minimization problem. Two major novelties characterize our approach. Firstly, the proposed method possesses a rank identification property, enablin… ▽ More

    Submitted 26 June, 2024; v1 submitted 21 June, 2024; originally announced June 2024.

    Comments: Copyright may be transferred without notice, after which this version may no longer be accessible

  34. arXiv:2406.15704  [pdf, other

    cs.CV

    video-SALMONN: Speech-Enhanced Audio-Visual Large Language Models

    Authors: Guangzhi Sun, Wenyi Yu, Changli Tang, Xianzhao Chen, Tian Tan, Wei Li, Lu Lu, Zejun Ma, Yuxuan Wang, Chao Zhang

    Abstract: Speech understanding as an element of the more generic video understanding using audio-visual large language models (av-LLMs) is a crucial yet understudied aspect. This paper proposes video-SALMONN, a single end-to-end av-LLM for video processing, which can understand not only visual frame sequences, audio events and music, but speech as well. To obtain fine-grained temporal information required b… ▽ More

    Submitted 21 June, 2024; originally announced June 2024.

    Comments: Accepted at ICML 2024. arXiv admin note: substantial text overlap with arXiv:2310.05863

  35. arXiv:2406.15523  [pdf, other

    cs.LG stat.ML

    Unifying Unsupervised Graph-Level Anomaly Detection and Out-of-Distribution Detection: A Benchmark

    Authors: Yili Wang, Yixin Liu, Xu Shen, Chenyu Li, Kaize Ding, Rui Miao, Ying Wang, Shirui Pan, Xin Wang

    Abstract: To build safe and reliable graph machine learning systems, unsupervised graph-level anomaly detection (GLAD) and unsupervised graph-level out-of-distribution (OOD) detection (GLOD) have received significant attention in recent years. Though those two lines of research indeed share the same objective, they have been studied independently in the community due to distinct evaluation setups, creating… ▽ More

    Submitted 21 June, 2024; originally announced June 2024.

  36. arXiv:2406.15346  [pdf, other

    cs.LG cs.AI

    Privacy Preserved Blood Glucose Level Cross-Prediction: An Asynchronous Decentralized Federated Learning Approach

    Authors: Chengzhe Piao, Taiyu Zhu, Yu Wang, Stephanie E Baldeweg, Paul Taylor, Pantelis Georgiou, Jiahao Sun, Jun Wang, Kezhi Li

    Abstract: Newly diagnosed Type 1 Diabetes (T1D) patients often struggle to obtain effective Blood Glucose (BG) prediction models due to the lack of sufficient BG data from Continuous Glucose Monitoring (CGM), presenting a significant "cold start" problem in patient care. Utilizing population models to address this challenge is a potential solution, but collecting patient data for training population models… ▽ More

    Submitted 21 June, 2024; originally announced June 2024.

  37. arXiv:2406.15330  [pdf, other

    cs.AI cs.CL

    Gradient-Mask Tuning Elevates the Upper Limits of LLM Performance

    Authors: Haoling Li, Xin Zhang, Xiao Liu, Yeyun Gong, Yifan Wang, Yujiu Yang, Qi Chen, Peng Cheng

    Abstract: Large language models (LLMs) have revolutionized lots of fields of research. Although it is well-known that fine-tuning is essential for enhancing the capabilities of LLMs, existing research suggests that there is potential redundancy in the fine-tuning process and therefore proposes to update only a subset of parameters. However, these methods fail to leverage the task-specific information to ide… ▽ More

    Submitted 21 June, 2024; originally announced June 2024.

  38. arXiv:2406.15305  [pdf, other

    cs.CR cs.AI

    PID: Prompt-Independent Data Protection Against Latent Diffusion Models

    Authors: Ang Li, Yichuan Mo, Mingjie Li, Yisen Wang

    Abstract: The few-shot fine-tuning of Latent Diffusion Models (LDMs) has enabled them to grasp new concepts from a limited number of images. However, given the vast amount of personal images accessible online, this capability raises critical concerns about civil privacy. While several previous defense methods have been developed to prevent such misuse of LDMs, they typically assume that the textual prompts… ▽ More

    Submitted 14 June, 2024; originally announced June 2024.

    Comments: 27 pages, ICML 2024 poster

  39. arXiv:2406.15030  [pdf, ps, other

    hep-ex

    Search for the $e^+e^- \to φχ_{c1}(3872)$ process at BESIII

    Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, O. Afedulidis, X. C. Ai, R. Aliberti, A. Amoroso, Q. An, Y. Bai, O. Bakina, I. Balossino, Y. Ban, H. -R. Bao, V. Batozskaya, K. Begzsuren, N. Berger, M. Berlowski, M. Bertani, D. Bettoni, F. Bianchi, E. Bianco, A. Bortone, I. Boyko, R. A. Briere , et al. (639 additional authors not shown)

    Abstract: Based on 368.5 pb$^{-1}$ of $e^+e^-$ collision data collected at center-of-mass energies 4.914 and 4.946 GeV by the BESIII detector, the $e^+e^- \to φχ_{c1}(3872)$ process is searched for the first time. No significant signal is observed and the upper limits at the 90\% confidence level on the product of the Born cross section $σ(e^+e^- \to φχ_{c1}(3872))$ and the branching fraction… ▽ More

    Submitted 21 June, 2024; originally announced June 2024.

    Comments: 11 pages, 3 figures

  40. arXiv:2406.14979  [pdf, other

    cs.CL

    Retrieve-Plan-Generation: An Iterative Planning and Answering Framework for Knowledge-Intensive LLM Generation

    Authors: Yuanjie Lyu, Zihan Niu, Zheyong Xie, Chao Zhang, Tong Xu, Yang Wang, Enhong Chen

    Abstract: Despite the significant progress of large language models (LLMs) in various tasks, they often produce factual errors due to their limited internal knowledge. Retrieval-Augmented Generation (RAG), which enhances LLMs with external knowledge sources, offers a promising solution. However, these methods can be misled by irrelevant paragraphs in retrieved documents. Due to the inherent uncertainty in L… ▽ More

    Submitted 21 June, 2024; originally announced June 2024.

  41. arXiv:2406.14952  [pdf, other

    cs.CL

    ESC-Eval: Evaluating Emotion Support Conversations in Large Language Models

    Authors: Haiquan Zhao, Lingyu Li, Shisong Chen, Shuqi Kong, Jiaan Wang, Kexin Huang, Tianle Gu, Yixu Wang, Dandan Liang, Zhixu Li, Yan Teng, Yanghua Xiao, Yingchun Wang

    Abstract: Emotion Support Conversation (ESC) is a crucial application, which aims to reduce human stress, offer emotional guidance, and ultimately enhance human mental and physical well-being. With the advancement of Large Language Models (LLMs), many researchers have employed LLMs as the ESC models. However, the evaluation of these LLM-based ESCs remains uncertain. Inspired by the awesome development of ro…

    Submitted 24 June, 2024; v1 submitted 21 June, 2024; originally announced June 2024.

    Comments: Pre-print

  42. Learning Autonomous Race Driving with Action Mapping Reinforcement Learning

    Authors: Yuanda Wang, Xin Yuan, Changyin Sun

    Abstract: Autonomous race driving poses a complex control challenge as vehicles must be operated at the edge of their handling limits to reduce lap times while respecting physical and safety constraints. This paper presents a novel reinforcement learning (RL)-based approach, incorporating the action mapping (AM) mechanism to manage state-dependent input constraints arising from limited tire-road friction. A…

    Submitted 21 June, 2024; originally announced June 2024.

  43. arXiv:2406.14928  [pdf, other

    cs.AI cs.CL cs.HC cs.MA cs.SI

    Autonomous Agents for Collaborative Task under Information Asymmetry

    Authors: Wei Liu, Chenxi Wang, Yifei Wang, Zihao Xie, Rennai Qiu, Yufan Dang, Zhuoyun Du, Weize Chen, Cheng Yang, Chen Qian

    Abstract: Large Language Model Multi-Agent Systems (LLM-MAS) have achieved great progress in solving complex tasks. It performs communication among agents within the system to collaboratively solve tasks, under the premise of shared information. However, when agents' communication is leveraged to enhance human cooperation, a new challenge arises due to information asymmetry, since each agent can only access…

    Submitted 21 June, 2024; originally announced June 2024.

    Comments: 16 pages, 8 figures, 5 tables, Work in progress

  44. arXiv:2406.14920  [pdf, other

    physics.chem-ph physics.comp-ph

    Extending GPU-Accelerated Gaussian Integrals in the TeraChem Software Package to f Type Orbitals: Implementation and Applications

    Authors: Yuanheng Wang, Diptarka Hait, K. Grace Johnson, O. Jonathan Fajen, Rubén D. Guerrero, Todd J. Martínez

    Abstract: The increasing availability of GPUs for scientific computing has prompted interest in accelerating quantum chemical calculations through their use. The complexity of integral kernels for high angular momentum basis functions however often limits the utility of GPU implementations with large basis sets or for metal containing systems. In this work, we report implementation of $f$ function support i…

    Submitted 21 June, 2024; originally announced June 2024.

  45. arXiv:2406.14912  [pdf, other

    cs.CV

    FC3DNet: A Fully Connected Encoder-Decoder for Efficient Demoiréing

    Authors: Zhibo Du, Long Peng, Yang Wang, Yang Cao, Zheng-Jun Zha

    Abstract: Moiré patterns are commonly seen when taking photos of screens. Camera devices usually have limited hardware performance but take high-resolution photos. However, users are sensitive to the photo processing time, which presents a hardly considered challenge of efficiency for demoiréing methods. To balance the network speed and quality of results, we propose a \textbf{F}ully \textbf{C}onnected en\t…

    Submitted 21 June, 2024; originally announced June 2024.

    Comments: Accepted by ICIP2024

  46. arXiv:2406.14909  [pdf, other

    cs.LG cs.AI cs.CL

    MoA: Mixture of Sparse Attention for Automatic Large Language Model Compression

    Authors: Tianyu Fu, Haofeng Huang, Xuefei Ning, Genghan Zhang, Boju Chen, Tianqi Wu, Hongyi Wang, Zixiao Huang, Shiyao Li, Shengen Yan, Guohao Dai, Huazhong Yang, Yu Wang

    Abstract: Sparse attention can effectively mitigate the significant memory and throughput demands of Large Language Models (LLMs) in long contexts. Existing methods typically employ a uniform sparse attention mask, applying the same sparse pattern across different attention heads and input lengths. However, this uniform approach fails to capture the diverse attention patterns inherent in LLMs, ignoring thei…

    Submitted 21 June, 2024; originally announced June 2024.

    Comments: 10 pages

    ACM Class: I.2.7

  47. arXiv:2406.14896  [pdf, other

    eess.IV cs.CV

    SelfReg-UNet: Self-Regularized UNet for Medical Image Segmentation

    Authors: Wenhui Zhu, Xiwen Chen, Peijie Qiu, Mohammad Farazi, Aristeidis Sotiras, Abolfazl Razi, Yalin Wang

    Abstract: Since its introduction, UNet has been leading a variety of medical image segmentation tasks. Although numerous follow-up studies have also been dedicated to improving the performance of standard UNet, few have conducted in-depth analyses of the underlying interest pattern of UNet in medical image segmentation. In this paper, we explore the patterns learned in a UNet and observe two important facto…

    Submitted 21 June, 2024; originally announced June 2024.

    Comments: Accepted as a conference paper to 2024 MICCAI

  48. arXiv:2406.14867  [pdf, other

    cs.LG cs.AI cs.CL

    DistiLRR: Transferring Code Repair for Low-Resource Programming Languages

    Authors: Kyle Wong, Alfonso Amayuelas, Liangming Pan, William Yang Wang

    Abstract: Large language models (LLMs) have shown remarkable performance on code generation tasks. A recent application of LLMs for code generation is iterative code repair, where a model fixes an incorrect program by rationalizing about errors and generating a new program. However, code repair is primarily studied on high-resource languages like Python, and the framework's efficacy is under-explored on low…

    Submitted 21 June, 2024; originally announced June 2024.

  49. arXiv:2406.14763  [pdf, other

    cs.CL cs.AI

    A Learn-Then-Reason Model Towards Generalization in Knowledge Base Question Answering

    Authors: Lingxi Zhang, Jing Zhang, Yanling Wang, Cuiping Li, Hong Chen

    Abstract: Large-scale knowledge bases (KBs) like Freebase and Wikidata house millions of structured knowledge. Knowledge Base Question Answering (KBQA) provides a user-friendly way to access these valuable KBs via asking natural language questions. In order to improve the generalization capabilities of KBQA models, extensive research has embraced a retrieve-then-reason framework to retrieve relevant evidenc…

    Submitted 20 June, 2024; originally announced June 2024.

  50. arXiv:2406.14629  [pdf, other

    cs.CL cs.AI

    Can LLMs Learn by Teaching? A Preliminary Study

    Authors: Xuefei Ning, Zifu Wang, Shiyao Li, Zinan Lin, Peiran Yao, Tianyu Fu, Matthew B. Blaschko, Guohao Dai, Huazhong Yang, Yu Wang

    Abstract: Teaching to improve student models (e.g., knowledge distillation) is an extensively studied methodology in LLMs. However, for humans, teaching not only improves students but also improves teachers. We ask: Can LLMs also learn by teaching (LbT)? If yes, we can potentially unlock the possibility of continuously advancing the models without solely relying on human-produced data or stronger models. In…

    Submitted 20 June, 2024; originally announced June 2024.

    Comments: Under review