Showing 151–200 of 14,365 results for author: Wang, X

  1. arXiv:2406.10900  [pdf, other]

    cs.CV cs.CL

    AUTOHALLUSION: Automatic Generation of Hallucination Benchmarks for Vision-Language Models

    Authors: Xiyang Wu, Tianrui Guan, Dianqi Li, Shuaiyi Huang, Xiaoyu Liu, Xijun Wang, Ruiqi Xian, Abhinav Shrivastava, Furong Huang, Jordan Lee Boyd-Graber, Tianyi Zhou, Dinesh Manocha

    Abstract: Large vision-language models (LVLMs) hallucinate: certain context cues in an image may trigger the language module's overconfident and incorrect reasoning on abnormal or hypothetical objects. Though a few benchmarks have been developed to investigate LVLM hallucinations, they mainly rely on hand-crafted corner cases whose fail patterns may hardly generalize, and finetuning on them could undermine… ▽ More

    Submitted 16 June, 2024; originally announced June 2024.

  2. arXiv:2406.10881  [pdf, other]

    cs.CL

    Teaching Large Language Models to Express Knowledge Boundary from Their Own Signals

    Authors: Lida Chen, Zujie Liang, Xintao Wang, Jiaqing Liang, Yanghua Xiao, Feng Wei, Jinglei Chen, Zhenghong Hao, Bing Han, Wei Wang

    Abstract: Large language models (LLMs) have achieved great success, but their occasional content fabrication, or hallucination, limits their practical application. Hallucination arises because LLMs struggle to admit ignorance due to inadequate training on knowledge boundaries. We call it a limitation of LLMs that they can not accurately express their knowledge boundary, answering questions they know while a… ▽ More

    Submitted 16 June, 2024; originally announced June 2024.

  3. arXiv:2406.10836  [pdf, other]

    eess.AS cs.SD

    Revisiting and Improving Scoring Fusion for Spoofing-aware Speaker Verification Using Compositional Data Analysis

    Authors: Xin Wang, Tomi Kinnunen, Kong Aik Lee, Paul-Gauthier Noé, Junichi Yamagishi

    Abstract: Fusing outputs from automatic speaker verification (ASV) and spoofing countermeasure (CM) is expected to make an integrated system robust to zero-effort imposters and synthesized spoofing attacks. Many score-level fusion methods have been proposed, but many remain heuristic. This paper revisits score-level fusion using tools from decision theory and presents three main findings. First, fusion by s… ▽ More

    Submitted 16 June, 2024; originally announced June 2024.

    Comments: Interspeech 2024 Accepted. https://github.com/nii-yamagishilab/SpeechSPC-mini

  4. arXiv:2406.10605  [pdf, other]

    cs.LG cs.GT

    Last-iterate Convergence Separation between Extra-gradient and Optimism in Constrained Periodic Games

    Authors: Yi Feng, Ping Li, Ioannis Panageas, Xiao Wang

    Abstract: Last-iterate behaviors of learning algorithms in repeated two-player zero-sum games have been extensively studied due to their wide applications in machine learning and related tasks. Typical algorithms that exhibit the last-iterate convergence property include optimistic and extra-gradient methods. However, most existing results establish these properties under the assumption that the game is tim… ▽ More

    Submitted 15 June, 2024; originally announced June 2024.

    Comments: Accepted for UAI 2024

  5. arXiv:2406.10603  [pdf, other]

    cs.GT

    Prediction Accuracy of Learning in Games : Follow-the-Regularized-Leader meets Heisenberg

    Authors: Yi Feng, Georgios Piliouras, Xiao Wang

    Abstract: We investigate the accuracy of prediction in deterministic learning dynamics of zero-sum games with random initializations, specifically focusing on observer uncertainty and its relationship to the evolution of covariances. Zero-sum games are a prominent field of interest in machine learning due to their various applications. Concurrently, the accuracy of prediction in dynamical systems from mecha… ▽ More

    Submitted 15 June, 2024; originally announced June 2024.

    Comments: Accepted for ICML 2024

  6. arXiv:2406.10591  [pdf, other]

    eess.AS cs.AI cs.CV cs.MM cs.SD

    MINT: a Multi-modal Image and Narrative Text Dubbing Dataset for Foley Audio Content Planning and Generation

    Authors: Ruibo Fu, Shuchen Shi, Hongming Guo, Tao Wang, Chunyu Qiang, Zhengqi Wen, Jianhua Tao, Xin Qi, Yi Lu, Xiaopeng Wang, Zhiyong Wang, Yukun Liu, Xuefei Liu, Shuai Zhang, Guanjun Li

    Abstract: Foley audio, critical for enhancing the immersive experience in multimedia content, faces significant challenges in the AI-generated content (AIGC) landscape. Despite advancements in AIGC technologies for text and image generation, the foley audio dubbing remains rudimentary due to difficulties in cross-modal scene matching and content correlation. Current text-to-audio technology, which relies on… ▽ More

    Submitted 15 June, 2024; originally announced June 2024.

  7. arXiv:2406.10582  [pdf, other]

    math.NA math.PR

    Strong convergence rates for long-time approximations of SDEs with non-globally Lipschitz continuous coefficients

    Authors: Xiaoming Wu, Xiaojie Wang

    Abstract: This paper is concerned with long-time strong approximations of SDEs with non-globally Lipschitz coefficients. Under certain non-globally Lipschitz conditions, a long-time version of the fundamental strong convergence theorem is established for general one-step time discretization schemes. With the aid of the fundamental strong convergence theorem, we prove the expected strong convergence rate over inf… ▽ More

    Submitted 15 June, 2024; originally announced June 2024.

    Comments: 30 pages, 4 figures

    MSC Class: 60H35; 65C30

  8. arXiv:2406.10484  [pdf, other]

    cs.CV

    Beyond Raw Videos: Understanding Edited Videos with Large Multimodal Model

    Authors: Lu Xu, Sijie Zhu, Chunyuan Li, Chia-Wen Kuo, Fan Chen, Xinyao Wang, Guang Chen, Dawei Du, Ye Yuan, Longyin Wen

    Abstract: The emerging video LMMs (Large Multimodal Models) have achieved significant improvements on generic video understanding in the form of VQA (Visual Question Answering), where the raw videos are captured by cameras. However, a large portion of videos in real-world applications are edited videos, e.g., users usually cut and add effects/modifications to the raw video before publishing it on s… ▽ More

    Submitted 14 June, 2024; originally announced June 2024.

  9. arXiv:2406.10473  [pdf, other]

    stat.ME

    Design-based variance estimation of the Hájek effect estimator in stratified and clustered experiments

    Authors: Xinhe Wang, Ben B. Hansen

    Abstract: Randomized controlled trials (RCTs) are used to evaluate treatment effects. When individuals are grouped together, clustered RCTs are conducted. Stratification is recommended to reduce imbalance of baseline covariates between treatment and control. In practice, this can lead to comparisons between clusters of very different sizes. As a result, direct adjustment estimators that average differences… ▽ More

    Submitted 19 June, 2024; v1 submitted 14 June, 2024; originally announced June 2024.

  10. arXiv:2406.10347  [pdf, other]

    cs.NI

    A Near-Optimal Category Information Sampling in RFID Systems

    Authors: Xiujun Wang, Zhi Liu, Xiaokang Zhou, Yong Liao, Han Hu, Xiao Zheng, Jie Li

    Abstract: In many RFID-enabled applications, objects are classified into different categories, and the information associated with each object's category (called category information) is written into the attached tag, allowing the reader to access it later. The category information sampling in such RFID systems, which is to randomly choose (sample) a few tags from each category and collect their category in… ▽ More

    Submitted 18 June, 2024; v1 submitted 14 June, 2024; originally announced June 2024.

    Comments: 37 pages, 11 figures

  11. arXiv:2406.10221  [pdf, other]

    cs.CV cs.AI cs.CL

    Short Film Dataset (SFD): A Benchmark for Story-Level Video Understanding

    Authors: Ridouane Ghermi, Xi Wang, Vicky Kalogeiton, Ivan Laptev

    Abstract: Recent advances in vision-language models have significantly propelled video understanding. Existing datasets and tasks, however, have notable limitations. Most datasets are confined to short videos with limited events and narrow narratives. For example, datasets with instructional and egocentric videos often document the activities of one person in a single scene. Although some movie datasets off… ▽ More

    Submitted 14 June, 2024; originally announced June 2024.

  12. arXiv:2406.10091  [pdf, other]

    cs.CL

    Exploring the Correlation between Human and Machine Evaluation of Simultaneous Speech Translation

    Authors: Xiaoman Wang, Claudio Fantinuoli

    Abstract: Assessing the performance of interpreting services is a complex task, given the nuanced nature of spoken language translation, the strategies that interpreters apply, and the diverse expectations of users. The complexity of this task becomes even more pronounced when automated evaluation methods are applied. This is particularly true because interpreted texts exhibit less linearity between the sour… ▽ More

    Submitted 14 June, 2024; originally announced June 2024.

    Comments: Paper accepted at the European Association for Machine Translation conference 2024

  13. arXiv:2406.10026  [pdf]

    physics.optics

    Retiming dynamics of harmonically modelocked laser solitons in a self-driven optomechanical lattice

    Authors: Xiaocong Wang, Benhai Wang, Wenbin He, Xintong Zhang, Qi Huang, Zhiyuan Huang, Xin Jiang, Philip St. J. Russell, Meng Pang

    Abstract: Harmonic mode-locking, realized actively or passively, is an effective technique for increasing the repetition rate of lasers, with important applications in optical sampling, laser micro-machining and frequency metrology. It is critically important to understand how a harmonically mode-locked pulse train responds to external perturbations and noise, so as to make sure that it is stable and resist… ▽ More

    Submitted 14 June, 2024; originally announced June 2024.

  14. arXiv:2406.09923  [pdf, other]

    cs.CL cs.AI cs.LG

    CliBench: Multifaceted Evaluation of Large Language Models in Clinical Decisions on Diagnoses, Procedures, Lab Tests Orders and Prescriptions

    Authors: Mingyu Derek Ma, Chenchen Ye, Yu Yan, Xiaoxuan Wang, Peipei Ping, Timothy S Chang, Wei Wang

    Abstract: The integration of Artificial Intelligence (AI), especially Large Language Models (LLMs), into the clinical diagnosis process offers significant potential to improve the efficiency and accessibility of medical care. While LLMs have shown some promise in the medical domain, their application in clinical diagnosis remains underexplored, especially in real-world clinical practice, where highly sophis… ▽ More

    Submitted 14 June, 2024; originally announced June 2024.

    Comments: Project page: https://clibench.github.io

  15. arXiv:2406.09821  [pdf, other]

    eess.AS

    Low algorithmic delay implementation of convolutional beamformer for online joint source separation and dereverberation

    Authors: Kaien Mo, Xianrui Wang, Yichen Yang, Shoji Makino, Jingdong Chen

    Abstract: Blind-audio-source-separation (BASS) techniques, particularly those with low latency, play an important role in a wide range of real-time systems, e.g., hearing aids, in-car hands-free voice communication, real-time human-machine interaction, etc. Most existing BASS algorithms are designed to run in batch mode, and therefore large latency is unavoidable. Recently, some online algorithms were develop… ▽ More

    Submitted 14 June, 2024; originally announced June 2024.

    Comments: 4 pages, 4 figures. Accepted by EUSIPCO 2024

  16. arXiv:2406.09817  [pdf, other]

    physics.chem-ph q-bio.BM

    Efficient and Precise Force Field Optimization for Biomolecules Using DPA-2

    Authors: Junhan Chang, Duo Zhang, Yuqing Deng, Hongrui Lin, Zhirong Liu, Linfeng Zhang, Hang Zheng, Xinyan Wang

    Abstract: Molecular simulations are essential tools in computational chemistry, enabling the prediction and understanding of molecular interactions and thermodynamic properties of biomolecules. However, traditional force fields face significant challenges in accurately representing novel molecules and complex chemical environments due to the labor-intensive process of manually setting optimization parameter… ▽ More

    Submitted 14 June, 2024; originally announced June 2024.

  17. arXiv:2406.09791  [pdf, other]

    eess.SP

    Semi-Blind Multi-Tag Ambient Backscatter Communications Using Radar Signals

    Authors: Luca Venturino, Emanuele Grossi, Jeremy Johnston, Marco Lops, Xiaodong Wang

    Abstract: In this work, we consider a backscatter communication system wherein multiple asynchronous sources (tags) exploit the reverberation generated by a nearby radar transmitter as an ambient carrier to deliver a message to a common destination (reader) through a number of available subchannels. We propose a new encoding strategy wherein each tag transmits both pilot and data symbols on each subchannel… ▽ More

    Submitted 14 June, 2024; originally announced June 2024.

    Comments: Submitted to the IEEE Transactions on Wireless Communications

  18. arXiv:2406.09771  [pdf, other]

    cs.DS

    Block Coordinate Descent Methods for Optimization under J-Orthogonality Constraints with Applications

    Authors: Di He, Ganzhao Yuan, Xiao Wang, Pengxiang Xu

    Abstract: The J-orthogonal matrix, also referred to as the hyperbolic orthogonal matrix, is a class of special orthogonal matrix in hyperbolic space, notable for its advantageous properties. These matrices are integral to optimization under J-orthogonal constraints, which have widespread applications in statistical learning and data science. However, addressing these problems is generally challenging due to… ▽ More

    Submitted 14 June, 2024; originally announced June 2024.

  19. arXiv:2406.09683  [pdf, other]

    astro-ph.GA

    Interstellar Nitrogen Isotope Ratios: Measurements on tracers of C$^{14}$N and C$^{15}$N

    Authors: J. L. Chen, J. S. Zhang, C. Henkel, Y. T. Yan, H. Z. Yu, Y. X. Wang, Y. P. Zou, J. Y. Zhao, X. Y. Wang

    Abstract: The nitrogen isotope ratio 14N/15N is a powerful tool to trace Galactic stellar nucleosynthesis and constraining Galactic chemical evolution. Previous observations have found lower 14N/15N ratios in the Galactic center and higher values in the Galactic disk. This is consistent with the inside-out formation scenario of our Milky Way. However, previous studies mostly utilized double isotope ratios a… ▽ More

    Submitted 13 June, 2024; originally announced June 2024.

    Comments: 34 pages, 9 figures, 6 tables

    Journal ref: The Astrophysical Journal (2004)

  20. arXiv:2406.09644  [pdf, other]

    hep-ph nucl-th

    Bridging Electromagnetic and Gravitational Form Factors: Insights from LFHQCD

    Authors: Xiaobin Wang, Zanbin Xing, Minghui Ding, Khépani Raya, Lei Chang

    Abstract: We propose an efficacious approach to derive the generalized parton distributions for the pion and proton, based upon prior knowledge of their respective parton distribution functions (PDFs). Our method leverages on integral representations of the electromagnetic form factors derived from the light-front holographic QCD (LFHQCD) formalism, coupled with PDFs computed from continuum Schwinger functi… ▽ More

    Submitted 13 June, 2024; originally announced June 2024.

    Comments: 6 pages, 5 figures

  21. Massive Dirac Fermions and Strong Shubnikov-de Haas Oscillations in Topological Insulator Sm,Fe:Bi2Se3 Single Crystals

    Authors: Weiyao Zhao, Chi Xuan Trang, Qile Li, Lei Chen, Zengji Yue, Abdulhakim Bake, Cheng Tan, Lan Wang, Mitchell Nancarrow, Mark Edmonds, David Cortie, Xiaolin Wang

    Abstract: Topological insulators (TIs) are emergent materials with unique band structure, which allow the study of quantum effect in solids, as well as contribute to high performance quantum devices. To achieve the better performance of TI, here we present a co-doping strategy using synergistic rare-earth Sm and transition-metal Fe dopants in Bi2Se3 single crystals, which combine the advantages of both tran… ▽ More

    Submitted 13 June, 2024; originally announced June 2024.

    Comments: 5 figures

    Journal ref: Physical Review B 104, 085153 (2021)

  22. arXiv:2406.09475  [pdf, other]

    hep-ex

    Search for $X(1870)$ via the decay $J/ψ\to ωK^+ K^-η$

    Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, O. Afedulidis, X. C. Ai, R. Aliberti, A. Amoroso, Q. An, Y. Bai, O. Bakina, I. Balossino, Y. Ban, H. -R. Bao, V. Batozskaya, K. Begzsuren, N. Berger, M. Berlowski, M. Bertani, D. Bettoni, F. Bianchi, E. Bianco, A. Bortone, I. Boyko, R. A. Briere , et al. (644 additional authors not shown)

    Abstract: Using a sample of $(10087\pm 44)\times10^{6}$ $J/ψ$ events collected by the BESIII detector at the BEPCII collider, we search for the decay $X(1870)\to K^+ K^-η$ via the $J/ψ\to ωK^+ K^- η$ process for the first time. No significant $X(1870)$ signal is observed. The upper limit on the branching fraction of the decay $ J/ψ\to ωX(1870) \toωK^+ K^- η$ is determined to be $9.55\times 10^{-7}$ at the… ▽ More

    Submitted 13 June, 2024; originally announced June 2024.

  23. arXiv:2406.09305  [pdf, other]

    cs.CV

    Toffee: Efficient Million-Scale Dataset Construction for Subject-Driven Text-to-Image Generation

    Authors: Yufan Zhou, Ruiyi Zhang, Kaizhi Zheng, Nanxuan Zhao, Jiuxiang Gu, Zichao Wang, Xin Eric Wang, Tong Sun

    Abstract: In subject-driven text-to-image generation, recent works have achieved superior performance by training the model on synthetic datasets containing numerous image pairs. Trained on these datasets, generative models can produce text-aligned images for specific subject from arbitrary testing image in a zero-shot manner. They even outperform methods which require additional fine-tuning on testing imag… ▽ More

    Submitted 13 June, 2024; originally announced June 2024.

  24. arXiv:2406.09270  [pdf, other]

    astro-ph.HE

    Discovery and Extensive Follow-Up of SN 2024ggi, a nearby type IIP supernova in NGC 3621

    Authors: Ting-Wan Chen, Sheng Yang, Shubham Srivastav, Takashi J. Moriya, Stephen J. Smartt, Sofia Rest, Armin Rest, Hsing Wen Lin, Hao-Yu Miao, Yu-Chi Cheng, Amar Aryan, Chia-Yu Cheng, Morgan Fraser, Li-Ching Huang, Meng-Han Lee, Cheng-Han Lai, Yu Hsuan Liu, Aiswarya Sankar. K, Ken W. Smith, Heloise F. Stevance, Ze-Ning Wang, Joseph P. Anderson, Charlotte R. Angus, Thomas de Boer, Kenneth Chambers , et al. (23 additional authors not shown)

    Abstract: We present the discovery and early observations of the nearby Type II supernova (SN) 2024ggi in NGC 3621 at 6.64 +/- 0.3 Mpc. The SN was caught 5.8 (+1.9 -2.9) hours after its explosion by the ATLAS survey. Early-phase, high-cadence, and multi-band photometric follow-up was performed by the Kinder (Kilonova Finder) project, collecting over 1000 photometric data points within a week. The combined o… ▽ More

    Submitted 13 June, 2024; originally announced June 2024.

    Comments: 11 pages, 5 figures in manuscript, 6 pages in appendix, submitted to ApJL

  25. arXiv:2406.09215  [pdf, other]

    cs.IR cs.AI

    On Softmax Direct Preference Optimization for Recommendation

    Authors: Yuxin Chen, Junfei Tan, An Zhang, Zhengyi Yang, Leheng Sheng, Enzhi Zhang, Xiang Wang, Tat-Seng Chua

    Abstract: Recommender systems aim to predict personalized rankings based on user preference data. With the rise of Language Models (LMs), LM-based recommenders have been widely explored due to their extensive world knowledge and powerful reasoning abilities. Most of the LM-based recommenders convert historical interactions into language prompts, pairing with a positive item as the target response and fine-t… ▽ More

    Submitted 14 June, 2024; v1 submitted 13 June, 2024; originally announced June 2024.

  26. arXiv:2406.09192  [pdf, other]

    eess.SP

    Joint Power Allocation and Beamforming Design for Active IRS-Aided Directional Modulation Secure Systems

    Authors: Yifan Zhao, Xiaoyu Wang, Kaibo Zhou, Xuehui Wang, Yan Wang, Wei Gao, Ruiqi Liu, Feng Shu

    Abstract: Since the secrecy rate (SR) performance improvement obtained by secure directional modulation (DM) network is limited, an active intelligent reflective surface (IRS)-assisted DM network is considered to attain a high SR. To address the SR maximization problem, a novel method based on Lagrangian dual transform and closed-form fractional programming algorithm (LDT-CFFP) is proposed, where the soluti… ▽ More

    Submitted 25 June, 2024; v1 submitted 13 June, 2024; originally announced June 2024.

    Comments: Directional modulation, active intelligent reflective surface, Lagrangian dual transformation, fractional programming, power allocation

  27. arXiv:2406.08911  [pdf, other]

    cs.CL eess.AS

    An Initial Investigation of Language Adaptation for TTS Systems under Low-resource Scenarios

    Authors: Cheng Gong, Erica Cooper, Xin Wang, Chunyu Qiang, Mengzhe Geng, Dan Wells, Longbiao Wang, Jianwu Dang, Marc Tessier, Aidan Pine, Korin Richmond, Junichi Yamagishi

    Abstract: Self-supervised learning (SSL) representations from massively multilingual models offer a promising solution for low-resource language speech tasks. Despite advancements, language adaptation in TTS systems remains an open problem. This paper explores the language adaptation capability of ZMM-TTS, a recent SSL-based multilingual TTS system proposed in our previous work. We conducted experiments on… ▽ More

    Submitted 13 June, 2024; originally announced June 2024.

    Comments: Accepted to Interspeech 2024

  28. arXiv:2406.08834  [pdf, ps, other]

    quant-ph

    Interaction and entanglement engineering in driven giant atoms setup with coupled resonator waveguide

    Authors: Mingzhu Weng, Xin Wang, Zhihai Wang

    Abstract: We investigate the coherent interactions mediated by the coupled resonator waveguide between two types of giant atoms. We find that the effective coupling and collective dissipation can be controlled on demand by adjusting the configuration of the giant atoms. As a result, the external driving gives birth to a substantial entanglement between two giant atoms, which exhibits a Rabi splitting charac… ▽ More

    Submitted 13 June, 2024; originally announced June 2024.

    Comments: 12 pages, 10 figures, comments are welcomed

  29. arXiv:2406.08814  [pdf, other]

    cs.CV

    Skim then Focus: Integrating Contextual and Fine-grained Views for Repetitive Action Counting

    Authors: Zhengqi Zhao, Xiaohu Huang, Hao Zhou, Kun Yao, Errui Ding, Jingdong Wang, Xinggang Wang, Wenyu Liu, Bin Feng

    Abstract: The key to action counting is accurately locating each video's repetitive actions. Instead of estimating the probability of each frame belonging to an action directly, we propose a dual-branch network, i.e., SkimFocusNet, working in a two-step manner. The model draws inspiration from empirical observations indicating that humans typically engage in coarse skimming of entire sequences to grasp the… ▽ More

    Submitted 13 June, 2024; originally announced June 2024.

    Comments: 13 pages, 9 figures

  30. arXiv:2406.08810  [pdf, other]

    cs.CV

    Few-Shot Anomaly Detection via Category-Agnostic Registration Learning

    Authors: Chaoqin Huang, Haoyan Guan, Aofan Jiang, Yanfeng Wang, Michael Spratling, Xinchao Wang, Ya Zhang

    Abstract: Most existing anomaly detection methods require a dedicated model for each category. Such a paradigm, despite its promising results, is computationally expensive and inefficient, thereby failing to meet the requirements for real-world applications. Inspired by how humans detect anomalies, by comparing a query image to known normal ones, this paper proposes a novel few-shot anomaly detection (FSAD)… ▽ More

    Submitted 13 June, 2024; originally announced June 2024.

  31. arXiv:2406.08712  [pdf, other]

    physics.ins-det hep-ex

    A Novel Diamond-like Carbon based photocathode for PICOSEC Micromegas detectors

    Authors: X. Wang, R. Aleksan, Y. Angelis, J. Bortfeldt, F. Brunbauer, M. Brunoldi, E. Chatzianagnostou, J. Datta, K. Degmelt, G. Fanourakis, D. Fiorina, K. J. Floethner, M. Gallinaro, F. Garcia, I. Giomataris, K. Gnanvo, F. J. Iguaz, D. Janssens, A. Kallitsopoulou, M. Kovacic, B. Kross, P. Legou, M. Lisowska, J. Liu, I. Maniatis , et al. (26 additional authors not shown)

    Abstract: The PICOSEC Micromegas (MM) detector is a precise timing gaseous detector based on a MM detector operating in a two-stage amplification mode and a Cherenkov radiator. Prototypes equipped with cesium iodide (CsI) photocathodes have shown promising time resolutions as precise as 24 picoseconds (ps) for Minimum Ionizing Particles. However, due to the high hygroscopicity and susceptibility to ion bomb… ▽ More

    Submitted 25 June, 2024; v1 submitted 12 June, 2024; originally announced June 2024.

  32. arXiv:2406.08698  [pdf, other]

    astro-ph.HE hep-ph

    Constraints on Ultra Heavy Dark Matter Properties from Dwarf Spheroidal Galaxies with LHAASO Observations

    Authors: Zhen Cao, F. Aharonian, Q. An, Axikegu, Y. X. Bai, Y. W. Bao, D. Bastieri, X. J. Bi, Y. J. Bi, J. T. Cai, Q. Cao, W. Y. Cao, Zhe Cao, J. Chang, J. F. Chang, A. M. Chen, E. S. Chen, Liang Chen, Lin Chen, Long Chen, M. J. Chen, M. L. Chen, Q. H. Chen, S. H. Chen, S. Z. Chen , et al. (255 additional authors not shown)

    Abstract: In this work we try to search for signals generated by ultra-heavy dark matter at the Large High Altitude Air Shower Observatory (LHAASO) data. We look for possible gamma-ray by dark matter annihilation or decay from 16 dwarf spheroidal galaxies in the field of view of LHAASO. Dwarf spheroidal galaxies are among the most promising targets for indirect detection of dark matter which have low fluxes… ▽ More

    Submitted 12 June, 2024; originally announced June 2024.

    Comments: 17 pages, 12 figures, accepted by PRL

  33. arXiv:2406.08487  [pdf, other]

    cs.CV

    Beyond LLaVA-HD: Diving into High-Resolution Large Multimodal Models

    Authors: Yi-Fan Zhang, Qingsong Wen, Chaoyou Fu, Xue Wang, Zhang Zhang, Liang Wang, Rong Jin

    Abstract: Seeing clearly with high resolution is a foundation of Large Multimodal Models (LMMs), which has been proven to be vital for visual perception and reasoning. Existing works usually employ a straightforward resolution upscaling method, where the image consists of global and local branches, with the latter being the sliced image patches but resized to the same resolution as the former. This means th… ▽ More

    Submitted 13 June, 2024; v1 submitted 12 June, 2024; originally announced June 2024.

    Comments: Project page: https://github.com/yfzhang114/SliME

  34. arXiv:2406.08407  [pdf, other]

    cs.CV cs.AI cs.CL

    MMWorld: Towards Multi-discipline Multi-faceted World Model Evaluation in Videos

    Authors: Xuehai He, Weixi Feng, Kaizhi Zheng, Yujie Lu, Wanrong Zhu, Jiachen Li, Yue Fan, Jianfeng Wang, Linjie Li, Zhengyuan Yang, Kevin Lin, William Yang Wang, Lijuan Wang, Xin Eric Wang

    Abstract: Multimodal Large Language Models (MLLMs) demonstrate the emerging abilities of "world models" -- interpreting and reasoning about complex real-world dynamics. To assess these abilities, we posit videos are the ideal medium, as they encapsulate rich representations of real-world dynamics and causalities. To this end, we introduce MMWorld, a new benchmark for multi-discipline, multi-faceted multi… ▽ More

    Submitted 13 June, 2024; v1 submitted 12 June, 2024; originally announced June 2024.

  35. arXiv:2406.08393  [pdf, other]

    eess.AS cs.SD

    SCDNet: Self-supervised Learning Feature-based Speaker Change Detection

    Authors: Yue Li, Xinsheng Wang, Li Zhang, Lei Xie

    Abstract: Speaker Change Detection (SCD) is to identify boundaries among speakers in a conversation. Motivated by the success of fine-tuning wav2vec 2.0 models for the SCD task, a further investigation of self-supervised learning (SSL) features for SCD is conducted in this work. Specifically, an SCD model, named SCDNet, is proposed. With this model, various state-of-the-art SSL models, including Hubert, wav… ▽ More

    Submitted 12 June, 2024; originally announced June 2024.

  36. arXiv:2406.08334  [pdf, other]

    cs.DC cs.AI cs.LG cs.PF

    ProTrain: Efficient LLM Training via Memory-Aware Techniques

    Authors: Hanmei Yang, Jin Zhou, Yao Fu, Xiaoqun Wang, Ramine Roane, Hui Guan, Tongping Liu

    Abstract: It is extremely memory-hungry to train Large Language Models (LLM). To solve this problem, existing work exploits the combination of CPU and GPU for the training process, such as ZeRO-Offload. Such a technique largely democratizes billion-scale model training, making it possible to train with few consumer graphics cards. However, based on our observation, existing frameworks often provide coarse-g… ▽ More

    Submitted 12 June, 2024; originally announced June 2024.

  37. arXiv:2406.08305  [pdf, other]

    cs.NI eess.SP

    Large Language Model(LLM) assisted End-to-End Network Health Management based on Multi-Scale Semanticization

    Authors: Fengxiao Tang, Xiaonan Wang, Xun Yuan, Linfeng Luo, Ming Zhao, Nei Kato

    Abstract: Network device and system health management is the foundation of modern network operations and maintenance. Traditional health management methods, relying on expert identification or simple rule-based algorithms, struggle to cope with the dynamic heterogeneous networks (DHNs) environment. Moreover, current state-of-the-art distributed anomaly detection methods, which utilize specific machine learn… ▽ More

    Submitted 12 June, 2024; originally announced June 2024.

  38. arXiv:2406.08301  [pdf, other]

    nucl-ex

    Jet modification via $π^0$-hadron correlations in Au$+$Au collisions at $\sqrt{s_{_{NN}}}=200$ GeV

    Authors: PHENIX Collaboration, N. J. Abdulameer, U. Acharya, A. Adare, S. Afanasiev, C. Aidala, N. N. Ajitanand, Y. Akiba, H. Al-Bataineh, J. Alexander, M. Alfred, K. Aoki, N. Apadula, L. Aphecetche, J. Asai, H. Asano, E. T. Atomssa, R. Averbeck, T. C. Awes, B. Azmoun, V. Babintsev, M. Bai, G. Baksay, L. Baksay, A. Baldisseri , et al. (510 additional authors not shown)

    Abstract: High-momentum two-particle correlations are a useful tool for studying jet-quenching effects in the quark-gluon plasma. Angular correlations between neutral-pion triggers and charged hadrons with transverse momenta in the range 4--12~GeV/$c$ and 0.5--7~GeV/$c$, respectively, have been measured by the PHENIX experiment in 2014 for Au$+$Au collisions at $\sqrt{s_{_{NN}}}=200$~GeV. Suppression is obs… ▽ More

    Submitted 12 June, 2024; originally announced June 2024.

    Comments: 534 authors from 83 institutions, 12 pages, 7 figures. v1 is version submitted to Physical Review C. HEPdata tables for the points plotted in figures for this and previous PHENIX publications are (or will be) publicly available at http://www.phenix.bnl.gov/papers.html

  39. arXiv:2406.08251  [pdf, other]

    quant-ph physics.atom-ph physics.optics

    Light-induced fictitious magnetic fields for quantum storage in cold atomic ensembles

    Authors: Jianmin Wang, Liang Dong, Xingchang Wang, Zihan Zhou, Ying Zuo, Georgios A. Siviloglou, J. F. Chen

    Abstract: In this work, we have demonstrated that optically generated fictitious magnetic fields can be utilized to extend the lifetime of quantum memories in cold atomic ensembles. All the degrees of freedom of an AC Stark shift such as polarization, spatial profile, and temporal waveform can be readily controlled in a precise manner. Temporal fluctuations over several experimental cycles, and spatial inho… ▽ More

    Submitted 12 June, 2024; originally announced June 2024.

    Comments: 14 pages, 8 figures

  40. arXiv:2406.08225  [pdf, ps, other]

    hep-ex

    Observation of $η_{c}$(1S, 2S) and $χ_{cJ}$ decays to 2$(π^{+}π^{-})η$ via $ψ$(3686) radiative transitions

    Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, O. Afedulidis, X. C. Ai, R. Aliberti, A. Amoroso, Q. An, Y. Bai, O. Bakina, I. Balossino, Y. Ban, H. -R. Bao, V. Batozskaya, K. Begzsuren, N. Berger, M. Berlowski, M. Bertani, D. Bettoni, F. Bianchi, E. Bianco, A. Bortone, I. Boyko, R. A. Briere, et al. (636 additional authors not shown)

    Abstract: Based on $2.7 \times 10^9~ψ(3686)$ decays collected with the BESIII detector, the radiative decay $ψ(3686)\to\gamma2(π^{+}π^{-})η$ is investigated to measure properties of S- and P-wave charmonium states. The branching fraction of the decay $η_{c}(1S) \to 2(π^{+}π^{-})η$, which is found to have a strong dependence on the interference pattern between $η_c(1S)$ and non-$η_c(1S)$ processes, is measur…

    Submitted 12 June, 2024; originally announced June 2024.

  41. arXiv:2406.08190  [pdf]

    physics.soc-ph nlin.AO

    CrowdEgress: A Multi-Agent Simulation Platform for Pedestrian Crowd

    Authors: Peng Wang, Xiaoda Wang, Peter Luh, Neal Olderman, Christian Wilkie, Timo Korhonen, Gregor Jäger

    Abstract: This article introduces a simulation platform to study complex crowd behavior in social context. The agent-based model is extended based on the well-known social force model, and it mainly describes how agents interact with each other, and also with surrounding facilities such as walls, doors and exits. The simulation platform is compatible with FDS+Evac, and the input data in FDS+Evac could be impo…

    Submitted 12 June, 2024; originally announced June 2024.

    Comments: 23 pages, 18 figures

  42. arXiv:2406.08112  [pdf, other]

    cs.SD cs.AI eess.AS

    Codecfake: An Initial Dataset for Detecting LLM-based Deepfake Audio

    Authors: Yi Lu, Yuankun Xie, Ruibo Fu, Zhengqi Wen, Jianhua Tao, Zhiyong Wang, Xin Qi, Xuefei Liu, Yongwei Li, Yukun Liu, Xiaopeng Wang, Shuchen Shi

    Abstract: With the proliferation of Large Language Model (LLM) based deepfake audio, there is an urgent need for effective detection methods. Previous deepfake audio generation methods typically involve a multi-step generation process, with the final step using a vocoder to predict the waveform from handcrafted features. However, LLM-based audio is directly generated from discrete neural codecs in an end-to…

    Submitted 12 June, 2024; originally announced June 2024.

    Comments: Accepted by INTERSPEECH 2024. arXiv admin note: substantial text overlap with arXiv:2405.04880

  43. arXiv:2406.08097  [pdf, other]

    cs.LG stat.AP stat.ME

    Inductive Global and Local Manifold Approximation and Projection

    Authors: Jungeum Kim, Xiao Wang

    Abstract: Nonlinear dimensional reduction with the manifold assumption, often called manifold learning, has proven its usefulness in a wide range of high-dimensional data analysis. The significant impact of t-SNE and UMAP has catalyzed intense research interest, seeking further innovations toward visualizing not only the local but also the global structure information of the data. Moreover, there have been…

    Submitted 12 June, 2024; originally announced June 2024.

  44. arXiv:2406.08037  [pdf, other]

    cs.CV

    Adaptively Bypassing Vision Transformer Blocks for Efficient Visual Tracking

    Authors: Xiangyang Yang, Dan Zeng, Xucheng Wang, You Wu, Hengzhou Ye, Qijun Zhao, Shuiwang Li

    Abstract: Empowered by transformer-based models, visual tracking has advanced significantly. However, the slow speed of current trackers limits their applicability on devices with constrained computational resources. To address this challenge, we introduce ABTrack, an adaptive computation framework that adaptively bypasses transformer blocks for efficient visual tracking. The rationale behind ABTrack is ro…

    Submitted 1 July, 2024; v1 submitted 12 June, 2024; originally announced June 2024.

  45. arXiv:2406.07926  [pdf, other]

    cs.LG cs.AI cs.SI

    Efficient Neural Common Neighbor for Temporal Graph Link Prediction

    Authors: Xiaohui Zhang, Yanbo Wang, Xiyuan Wang, Muhan Zhang

    Abstract: Temporal graphs are ubiquitous in real-world scenarios, such as social networks, trade and transportation. Predicting dynamic links between nodes in a temporal graph is of vital importance. Traditional methods usually leverage the temporal neighborhood of interaction history to generate node embeddings first and then aggregate the source and target node embeddings to predict the link. However, such…

    Submitted 12 June, 2024; originally announced June 2024.

  46. arXiv:2406.07870  [pdf, ps, other]

    math.OC

    Event-Triggered Optimal Tracking Control for Strict-Feedback Nonlinear Systems With Non-Affine Nonlinear Faults

    Authors: Ling Wang, Xin Wang, Ziming Wang

    Abstract: This article studies the control ideas of the optimal backstepping technique, proposing an event-triggered optimal tracking control scheme for a class of strict-feedback nonlinear systems with non-affine and nonlinear faults. A simplified identifier-critic-actor framework is employed in the reinforcement learning algorithm to achieve optimal control. The identifier estimates the unknown dynamic fu…

    Submitted 12 June, 2024; originally announced June 2024.

  47. arXiv:2406.07857  [pdf, other]

    eess.SY cs.LG cs.NI

    Toward Enhanced Reinforcement Learning-Based Resource Management via Digital Twin: Opportunities, Applications, and Challenges

    Authors: Nan Cheng, Xiucheng Wang, Zan Li, Zhisheng Yin, Tom Luan, Xuemin Shen

    Abstract: This article presents a digital twin (DT)-enhanced reinforcement learning (RL) framework aimed at optimizing performance and reliability in network resource management, since the traditional RL methods face several unified challenges when applied to physical networks, including limited exploration efficiency, slow convergence, poor long-term performance, and safety concerns during the exploration…

    Submitted 15 June, 2024; v1 submitted 12 June, 2024; originally announced June 2024.

    Comments: 7 pages, 6 figures

  48. arXiv:2406.07834  [pdf]

    eess.SY

    Research on material identification of mobile phones falling to the ground

    Authors: Xuesong Wang

    Abstract: The failure mode of the phone falling has a lot to do with the ground material. At present, the research on ground material and mobile phone damage is generally carried out through experiments, which is extremely costly. This paper presents a method to identify the material of mobile phones falling on the ground. The method determines the material of the mobile phone falling to the ground accordin…

    Submitted 12 June, 2024; v1 submitted 11 June, 2024; originally announced June 2024.

  49. arXiv:2406.07816  [pdf, other]

    eess.AS cs.CL cs.SD

    Spoof Diarization: "What Spoofed When" in Partially Spoofed Audio

    Authors: Lin Zhang, Xin Wang, Erica Cooper, Mireia Diez, Federico Landini, Nicholas Evans, Junichi Yamagishi

    Abstract: This paper defines Spoof Diarization as a novel task in the Partial Spoof (PS) scenario. It aims to determine what spoofed when, which includes not only locating spoof regions but also clustering them according to different spoofing methods. As a pioneering study in spoof diarization, we focus on defining the task, establishing evaluation metrics, and proposing a benchmark model, namely the Counte…

    Submitted 11 June, 2024; originally announced June 2024.

    Comments: Accepted to Interspeech 2024

  50. arXiv:2406.07806  [pdf, other]

    astro-ph.HE astro-ph.SR

    Probing the Shock Breakout Signal of SN 2024ggi from the Transformation of Early Flash Spectroscopy

    Authors: Jujia Zhang, Luc Dessart, Xiaofeng Wang, Qian Zhai, Yi Yang, Liping Li, Han Lin, Giorgio Valerin, Yongzhi Cai, Zhen Guo, Lingzhi Wang, Zeyi Zhao, Zhenyu Wang, Shengyu Yan

    Abstract: We present early-time, hour-to-day cadence spectroscopy of the nearby type II supernova (SN II) 2024ggi, which was discovered at a phase when the SN shock just emerged from the red-supergiant (RSG) progenitor star. Over the first few days after the first light, SN 2024ggi exhibited prominent narrow emission lines formed through intense and persistent photoionization of the nearby circumstellar mat…

    Submitted 29 June, 2024; v1 submitted 11 June, 2024; originally announced June 2024.

    Comments: 10 pages and 5 figures in the main text (16 pages and 9 figures in total). Accepted for publication in ApJL