Showing 1–50 of 2,366 results for author: wu, W

  1. arXiv:2407.06112  [pdf, other]

    cs.CL

    Enhancing Language Model Rationality with Bi-Directional Deliberation Reasoning

    Authors: Yadong Zhang, Shaoguang Mao, Wenshan Wu, Yan Xia, Tao Ge, Man Lan, Furu Wei

    Abstract: This paper introduces BI-Directional DEliberation Reasoning (BIDDER), a novel reasoning approach to enhance the decision rationality of language models. Traditional reasoning methods typically rely on historical information and employ a uni-directional (left-to-right) reasoning strategy. This lack of bi-directional deliberation reasoning results in limited awareness of potential future outcomes and… ▽ More

    Submitted 8 July, 2024; originally announced July 2024.

  2. arXiv:2407.04118  [pdf, other]

    cs.CL cs.AI

    MAPO: Boosting Large Language Model Performance with Model-Adaptive Prompt Optimization

    Authors: Yuyan Chen, Zhihao Wen, Ge Fan, Zhengyu Chen, Wei Wu, Dayiheng Liu, Zhixu Li, Bang Liu, Yanghua Xiao

    Abstract: Prompt engineering, as an efficient and effective way to leverage Large Language Models (LLM), has drawn a lot of attention from the research community. The existing research primarily emphasizes the importance of adapting prompts to specific tasks, rather than specific LLMs. However, a good prompt is not solely defined by its wording, but also binds to the nature of the LLM in question. In this w… ▽ More

    Submitted 4 July, 2024; originally announced July 2024.

    Comments: Accepted to EMNLP 2023 (Findings)

  3. arXiv:2407.01079  [pdf, ps, other]

    stat.ML cs.AI cs.LG

    On Statistical Rates and Provably Efficient Criteria of Latent Diffusion Transformers (DiTs)

    Authors: Jerry Yao-Chieh Hu, Weimin Wu, Zhuoru Li, Zhao Song, Han Liu

    Abstract: We investigate the statistical and computational limits of latent Diffusion Transformers (DiTs) under the low-dimensional linear latent space assumption. Statistically, we study the universal approximation and sample complexity of the DiTs score function, as well as the distribution recovery property of the initial data. Specifically, under mild data assumptions, we deri… ▽ More

    Submitted 1 July, 2024; originally announced July 2024.

  4. arXiv:2406.19934  [pdf, other]

    cs.CL cs.AI

    From the Least to the Most: Building a Plug-and-Play Visual Reasoner via Data Synthesis

    Authors: Chuanqi Cheng, Jian Guan, Wei Wu, Rui Yan

    Abstract: We explore multi-step reasoning in vision-language models (VLMs). The problem is challenging, as reasoning data consisting of multiple steps of visual and language processing are barely available. To overcome the challenge, we first introduce a least-to-most visual reasoning paradigm, which interleaves steps of decomposing a question into sub-questions and invoking external tools for resolving sub… ▽ More

    Submitted 28 June, 2024; originally announced June 2024.

  5. arXiv:2406.19781  [pdf, other]

    cs.RO

    LCSim: A Large-Scale Controllable Traffic Simulator

    Authors: Yuheng Zhang, Tianjian Ouyang, Fudan Yu, Cong Ma, Lei Qiao, Wei Wu, Jian Yuan, Yong Li

    Abstract: With the rapid development of urban transportation and the continuous advancement in autonomous vehicles, the demand for safely and efficiently testing autonomous driving and traffic optimization algorithms arises, which needs accurate modeling of large-scale urban traffic scenarios. Existing traffic simulation systems encounter two significant limitations. Firstly, they often rely on open-source… ▽ More

    Submitted 28 June, 2024; originally announced June 2024.

    Comments: Submitted to the 38th Conference on Neural Information Processing Systems (NeurIPS 2024) Track on Datasets and Benchmarks

  6. arXiv:2406.19608  [pdf, other]

    eess.SY

    Multi-service collaboration and composition of cloud manufacturing customized production based on problem decomposition

    Authors: Hao Yue, Yingtao Wu, Min Wang, Hesuan Hu, Weimin Wu, Jihui Zhang

    Abstract: Cloud manufacturing system is a service-oriented and knowledge-based one, which can provide solutions for the large-scale customized production. The service resource allocation is the primary factor that restricts the production time and cost in the cloud manufacturing customized production (CMCP). In order to improve the efficiency and reduce the cost in CMCP, we propose a new framework which con… ▽ More

    Submitted 27 June, 2024; originally announced June 2024.

    Comments: 12 pages, 8 figures

    ACM Class: J.0

  7. Blockchain Based Zero-Knowledge Proof of Location in IoT

    Authors: Wei Wu, Erwu Liu, Xinglin Gong, Rui Wang

    Abstract: With the development of precise positioning technology, a growing number of location-based services (LBSs) facilitate people's life. Most LBSs require proof of location (PoL) to prove that the user satisfies the service requirement, which exposes the user's privacy. In this paper, we propose a zero-knowledge proof of location (zk-PoL) protocol to better protect the user's privacy. With the zk-PoL… ▽ More

    Submitted 26 June, 2024; originally announced June 2024.

    Comments: Published in ICC 2020 - 2020 IEEE International Conference on Communications (ICC)

  8. arXiv:2406.18045  [pdf, other]

    cs.CL cs.AI

    PharmaGPT: Domain-Specific Large Language Models for Bio-Pharmaceutical and Chemistry

    Authors: Linqing Chen, Weilei Wang, Zilong Bai, Peng Xu, Yan Fang, Jie Fang, Wentao Wu, Lizhi Zhou, Ruiji Zhang, Yubin Xia, Chaobo Xu, Ran Hu, Licong Xu, Qijun Cai, Haoran Hua, Jing Sun, ** Liu, Tian Qiu, Haowen Liu, Meng Hu, Xiuwen Li, Fei Gao, Yufu Wang, Lin Tie, Chaochao Wang , et al. (11 additional authors not shown)

    Abstract: Large language models (LLMs) have revolutionized Natural Language Processing (NLP) by minimizing the need for complex feature engineering. However, the application of LLMs in specialized domains like biopharmaceuticals and chemistry remains largely unexplored. These fields are characterized by intricate terminologies, specialized knowledge, and a high demand for precision, areas where general pu… ▽ More

    Submitted 3 July, 2024; v1 submitted 25 June, 2024; originally announced June 2024.

  9. arXiv:2406.17824  [pdf, other]

    hep-ph hep-ex hep-lat

    Fully heavy tetraquark resonant states with different flavors

    Authors: Wei-Lin Wu, Yao Ma, Yan-Ke Chen, Lu Meng, Shi-Lin Zhu

    Abstract: We use the quark potential model to calculate the mass spectrum of the S-wave fully heavy tetraquark systems with different flavors, including the $ bc\bar b\bar c, bb\bar c\bar c, cc\bar c\bar b $ and $ bb\bar b\bar c $ systems. We employ the Gaussian expansion method to solve the four-body Schrödinger equation, and the complex scaling method to identify resonant states. The… ▽ More

    Submitted 25 June, 2024; originally announced June 2024.

    Comments: 10 pages, 7 figures, 8 tables

  10. arXiv:2406.16066  [pdf, other]

    cs.CE

    Constructing Boundary-identical Microstructures by Guided Diffusion for Fast Multiscale Designs

    Authors: Jingxuan Feng, Lili Wang, Xiaoya Zhai, Kai Chen, Wenming Wu, Ligang Liu, Xiao-Ming Fu

    Abstract: We propose a novel method to construct large-scale boundary-identical microstructure datasets with high attribute coverage for highly efficient multiscale design. Central to our technique is using a deep generative model to generate microstructures under the two conditions, including the specified boundary and homogenized elastic tensor. We achieve the desired dataset by alternately adding microst… ▽ More

    Submitted 23 June, 2024; originally announced June 2024.

  11. arXiv:2406.15245  [pdf, other]

    cs.CL cs.LG

    Unsupervised Morphological Tree Tokenizer

    Authors: Qingyang Zhu, Xiang Hu, Pengyu Ji, Wei Wu, Kewei Tu

    Abstract: As a cornerstone in language modeling, tokenization involves segmenting text inputs into pre-defined atomic units. Conventional statistical tokenizers often disrupt constituent boundaries within words, thereby corrupting semantic information. To address this drawback, we introduce morphological structure guidance to tokenization and propose a deep model to induce character-level structures of word… ▽ More

    Submitted 21 June, 2024; originally announced June 2024.

  12. arXiv:2406.14753  [pdf, other]

    cs.LG stat.ME

    A General Control-Theoretic Approach for Reinforcement Learning: Theory and Algorithms

    Authors: Weiqin Chen, Mark S. Squillante, Chai Wah Wu, Santiago Paternain

    Abstract: We devise a control-theoretic reinforcement learning approach to support direct learning of the optimal policy. We establish theoretical properties of our approach and derive an algorithm based on a specific instance of this approach. Our empirical results demonstrate the significant benefits of our approach.

    Submitted 20 June, 2024; originally announced June 2024.

  13. arXiv:2406.14133  [pdf, other]

    physics.optics

    Beam shaping by nonlinear moiré metasurfaces

    Authors: Lun Qu, Wei Wu, Di Zhang, Chenxiong Wang, Lu Bai, Chenyang Li, Wei Cai, Mengxin Ren, Andrea Alù, Jingjun Xu

    Abstract: This paper explores the interplay of momentum transfer and nonlinear optical processes through moiré phenomena. Momentum transfer plays a crucial role in the interaction between photons and matter. Here, we study stacked metasurfaces with tailored dispersion and rotated against each other with varying twisted angles. The stacking introduces interlayer interactions, which can be controlled by the r… ▽ More

    Submitted 20 June, 2024; originally announced June 2024.

    Comments: 10 pages, 5 figures

  14. arXiv:2406.13625  [pdf]

    cs.CV cs.AI physics.med-ph

    Enhance the Image: Super Resolution using Artificial Intelligence in MRI

    Authors: Ziyu Li, Zihan Li, Haoxiang Li, Qiuyun Fan, Karla L. Miller, Wenchuan Wu, Akshay S. Chaudhari, Qiyuan Tian

    Abstract: This chapter provides an overview of deep learning techniques for improving the spatial resolution of MRI, ranging from convolutional neural networks, generative adversarial networks, to more advanced models including transformers, diffusion models, and implicit neural representations. Our exploration extends beyond the methodologies to scrutinize the impact of super-resolved images on clinical an… ▽ More

    Submitted 19 June, 2024; originally announced June 2024.

    Comments: A book chapter in Machine Learning in MRI: From methods to clinical translation. Copyright may be transferred without notice, after which this version may no longer be accessible

  15. arXiv:2406.13410  [pdf, other]

    physics.atom-ph

    Exploring atom-ion Feshbach resonances below the s-wave limit

    Authors: Fabian Thielemann, Joachim Siemund, Daniel von Schoenfeld, Wei Wu, Pascal Weckesser, Krzysztof Jachymski, Thomas Walker, Tobias Schaetz

    Abstract: Revealing the quantum properties of matter requires a high degree of experimental control accompanied by a profound theoretical understanding. At ultracold temperatures, quantities that appear continuous in everyday life, such as the motional angular momentum of two colliding particles, become quantized, leaving a measurable imprint on experimental results. Embedding a single particle within a lar… ▽ More

    Submitted 19 June, 2024; originally announced June 2024.

  16. arXiv:2406.11698  [pdf, other]

    cs.CL

    Meta Reasoning for Large Language Models

    Authors: Peizhong Gao, Ao Xie, Shaoguang Mao, Wenshan Wu, Yan Xia, Haipeng Mi, Furu Wei

    Abstract: We introduce Meta-Reasoning Prompting (MRP), a novel and efficient system prompting method for large language models (LLMs) inspired by human meta-reasoning. Traditional in-context learning-based reasoning techniques, such as Tree-of-Thoughts, show promise but lack consistent state-of-the-art performance across diverse tasks due to their specialized nature. MRP addresses this limitation by guiding… ▽ More

    Submitted 17 June, 2024; originally announced June 2024.

  17. arXiv:2406.11633  [pdf, other]

    cs.CV

    DocGenome: An Open Large-scale Scientific Document Benchmark for Training and Testing Multi-modal Large Language Models

    Authors: Renqiu Xia, Song Mao, Xiangchao Yan, Hongbin Zhou, Bo Zhang, Haoyang Peng, Jiahao Pi, Daocheng Fu, Wenjie Wu, Hancheng Ye, Shiyang Feng, Bin Wang, Chao Xu, Conghui He, Pinlong Cai, Min Dou, Botian Shi, Sheng Zhou, Yongwei Wang, Bin Wang, Junchi Yan, Fei Wu, Yu Qiao

    Abstract: Scientific documents record research findings and valuable human knowledge, comprising a vast corpus of high-quality data. Leveraging multi-modality data extracted from these documents and assessing large models' abilities to handle scientific document-oriented tasks is therefore meaningful. Despite promising advancements, large models still perform poorly on multi-page scientific document extract… ▽ More

    Submitted 17 June, 2024; originally announced June 2024.

    Comments: Homepage of DocGenome: https://unimodal4reasoning.github.io/DocGenome_page 22 pages, 11 figures

  18. arXiv:2406.11176  [pdf, other]

    cs.CL cs.AI

    Watch Every Step! LLM Agent Learning via Iterative Step-Level Process Refinement

    Authors: Weimin Xiong, Yifan Song, Xiutian Zhao, Wenhao Wu, Xun Wang, Ke Wang, Cheng Li, Wei Peng, Sujian Li

    Abstract: Large language model agents have exhibited exceptional performance across a range of complex interactive tasks. Recent approaches have utilized tuning with expert trajectories to enhance agent performance, yet they primarily concentrate on outcome rewards, which may lead to errors or suboptimal actions due to the absence of process supervision signals. In this paper, we introduce the Iterative ste… ▽ More

    Submitted 16 June, 2024; originally announced June 2024.

  19. arXiv:2406.10583  [pdf, other]

    hep-ex

    Demonstration of neutron identification in neutrino interactions in the MicroBooNE liquid argon time projection chamber

    Authors: MicroBooNE collaboration, P. Abratenko, O. Alterkait, D. Andrade Aldana, L. Arellano, J. Asaadi, A. Ashkenazi, S. Balasubramanian, B. Baller, A. Barnard, G. Barr, D. Barrow, J. Barrow, V. Basque, J. Bateman, O. Benevides Rodrigues, S. Berkman, A. Bhanderi, A. Bhat, M. Bhattacharya, M. Bishai, A. Blake, B. Bogart, T. Bolton, J. Y. Book , et al. (165 additional authors not shown)

    Abstract: A significant challenge in measurements of neutrino oscillations is reconstructing the incoming neutrino energies. While modern fully-active tracking calorimeters such as liquid argon time projection chambers in principle allow the measurement of all final state particles above some detection threshold, undetected neutrons remain a considerable source of missing energy with little to no data const… ▽ More

    Submitted 15 June, 2024; originally announced June 2024.

    Report number: FERMILAB-PUB-24-0301

  20. arXiv:2406.10123  [pdf, other]

    hep-ex physics.ins-det

    Improving neutrino energy estimation of charged-current interaction events with recurrent neural networks in MicroBooNE

    Authors: MicroBooNE collaboration, P. Abratenko, O. Alterkait, D. Andrade Aldana, L. Arellano, J. Asaadi, A. Ashkenazi, S. Balasubramanian, B. Baller, A. Barnard, G. Barr, D. Barrow, J. Barrow, V. Basque, J. Bateman, O. Benevides Rodrigues, S. Berkman, A. Bhanderi, A. Bhat, M. Bhattacharya, M. Bishai, A. Blake, B. Bogart, T. Bolton, J. Y. Book , et al. (164 additional authors not shown)

    Abstract: We present a deep learning-based method for estimating the neutrino energy of charged-current neutrino-argon interactions. We employ a recurrent neural network (RNN) architecture for neutrino energy estimation in the MicroBooNE experiment, utilizing liquid argon time projection chamber (LArTPC) detector technology. Traditional energy estimation approaches in LArTPCs, which largely rely on reconstr… ▽ More

    Submitted 14 June, 2024; originally announced June 2024.

    Report number: FERMILAB-PUB-24-0287

  21. arXiv:2406.09333  [pdf, other]

    cs.CV

    Memory-Efficient Sparse Pyramid Attention Networks for Whole Slide Image Analysis

    Authors: Weiyi Wu, Chongyang Gao, Xinwen Xu, Siting Li, Jiang Gui

    Abstract: Whole Slide Images (WSIs) are crucial for modern pathological diagnosis, yet their gigapixel-scale resolutions and sparse informative regions pose significant computational challenges. Traditional dense attention mechanisms, widely used in computer vision and natural language processing, are impractical for WSI analysis due to the substantial data scale and the redundant processing of uninformativ… ▽ More

    Submitted 13 June, 2024; originally announced June 2024.

  22. arXiv:2406.09194  [pdf, ps, other]

    stat.ML cs.IT cs.LG math.NA math.ST

    Benign overfitting in Fixed Dimension via Physics-Informed Learning with Smooth Inductive Bias

    Authors: Honam Wong, Wendao Wu, Fanghui Liu, Yiping Lu

    Abstract: Recent advances in machine learning have inspired a surge of research into reconstructing specific quantities of interest from measurements that comply with certain physical laws. These efforts focus on inverse problems that are governed by partial differential equations (PDEs). In this work, we develop an asymptotic Sobolev norm learning curve for kernel ridge(less) regression when addressing (el… ▽ More

    Submitted 16 June, 2024; v1 submitted 13 June, 2024; originally announced June 2024.

  23. arXiv:2406.09081  [pdf, ps, other]

    math.NT math.DS

    Multifractal analysis of the growth rate of digits in Schneider's $p$-adic continued fraction dynamical system

    Authors: Kunkun Song, Wanlou Wu, Yueli Yu, Sainan Zeng

    Abstract: Let $\mathbb{Z}_p$ be the ring of $p$-adic integers and $a_n(x)$ be the $n$-th digit of Schneider's $p$-adic continued fraction of $x\in p\mathbb{Z}_p$. We study the growth rate of the digits $\{a_n(x)\}_{n\geq1}$ from the viewpoint of multifractal analysis. The Hausdorff dimension of the set \[E_{\sup}(\psi)=\Big\{x\in p\mathbb{Z}_p:\ \limsup\limits_{n\to\infty}\frac{a_n(x)}{\psi(n)}=1\Big\}\] is compl… ▽ More

    Submitted 13 June, 2024; originally announced June 2024.

  24. arXiv:2406.07411  [pdf, other]

    cs.SE cs.CL

    VersiCode: Towards Version-controllable Code Generation

    Authors: Tongtong Wu, Weigang Wu, Xingyu Wang, Kang Xu, Suyu Ma, Bo Jiang, ** Yang, Zhenchang Xing, Yuan-Fang Li, Gholamreza Haffari

    Abstract: Significant research has focused on improving the performance of large language models on code-related tasks due to their practical importance. Although performance is typically evaluated using public benchmark datasets, the existing datasets do not account for the concept of version, which is crucial in professional software development. In this paper, we introduce VersiCode, the first comp… ▽ More

    Submitted 11 June, 2024; originally announced June 2024.

  25. arXiv:2406.06393  [pdf, other]

    cs.CV cs.CL q-bio.GN

    STimage-1K4M: A histopathology image-gene expression dataset for spatial transcriptomics

    Authors: Jiawen Chen, Muqing Zhou, Wenrong Wu, Jinwei Zhang, Yun Li, Didong Li

    Abstract: Recent advances in multi-modal algorithms have driven and been driven by the increasing availability of large image-text datasets, leading to significant strides in various fields, including computational pathology. However, in most existing medical image-text datasets, the text typically provides high-level summaries that may not sufficiently describe sub-tile regions within a large pathology ima… ▽ More

    Submitted 20 June, 2024; v1 submitted 10 June, 2024; originally announced June 2024.

    ACM Class: I.4.10; I.2.10

  26. arXiv:2406.05875  [pdf, other]

    physics.optics cond-mat.mes-hall cond-mat.mtrl-sci physics.app-ph

    Hybrid terahertz emitter for pulse shaping and chirality control

    Authors: Weipeng Wu, Wilder Acuna, Zhixiang Huang, Xi Wang, Lars Gundlach, Matthew F. Doty, Joshua M. O. Zide, M. Benjamin Jungfleisch

    Abstract: Terahertz (THz) radiation, spanning from 0.3 to 3x10^12 Hz, fills the crucial gap between the microwave and infrared spectral range. THz technology has found applications in various fields, from imaging and sensing to telecommunication and biosensing. However, the full potential of these applications is often hindered by the need for precise control and manipulation of the frequency and polarizati… ▽ More

    Submitted 9 June, 2024; originally announced June 2024.

  27. arXiv:2406.03882  [pdf, other]

    cs.CL cs.SD eess.AS

    Spontaneous Speech-Based Suicide Risk Detection Using Whisper and Large Language Models

    Authors: Ziyun Cui, Chang Lei, Wen Wu, Yinan Duan, Diyang Qu, Ji Wu, Runsen Chen, Chao Zhang

    Abstract: The early detection of suicide risk is important since it enables the intervention to prevent potential suicide attempts. This paper studies the automatic detection of suicide risk based on spontaneous speech from adolescents, and collects a Mandarin dataset with 15 hours of suicide speech from more than a thousand adolescents aged from ten to eighteen for our experiments. To leverage the diverse… ▽ More

    Submitted 6 June, 2024; originally announced June 2024.

    Comments: Accepted by Interspeech 2024

  28. arXiv:2406.03199  [pdf, other]

    cs.CL cs.AI cs.LG

    Bayesian WeakS-to-Strong from Text Classification to Generation

    Authors: Ziyun Cui, Ziyang Zhang, Wen Wu, Guangzhi Sun, Chao Zhang

    Abstract: Advances in large language models raise the question of how alignment techniques will adapt as models become increasingly complex and humans will only be able to supervise them weakly. Weak-to-Strong mimics such a scenario where weak model supervision attempts to harness the full capabilities of a much stronger model. This work extends Weak-to-Strong to WeakS-to-Strong by exploring an ensemble of… ▽ More

    Submitted 24 May, 2024; originally announced June 2024.

  29. arXiv:2406.02987  [pdf, other]

    cs.CV

    Enhancing Multimodal Large Language Models with Multi-instance Visual Prompt Generator for Visual Representation Enrichment

    Authors: Wenliang Zhong, Wenyi Wu, Qi Li, Rob Barton, Boxin Du, Shioulin Sam, Karim Bouyarmane, Ismail Tutar, Junzhou Huang

    Abstract: Multimodal Large Language Models (MLLMs) have achieved SOTA performance in various visual language tasks by fusing the visual representations with LLMs leveraging some visual adapters. In this paper, we first establish that adapters using query-based Transformers such as Q-former are a simplified Multi-instance Learning method without considering instance heterogeneity/correlation. We then propose… ▽ More

    Submitted 5 June, 2024; originally announced June 2024.

  30. arXiv:2406.02861  [pdf, ps, other]

    math.AP

    Diffusive Limit of the One-species Vlasov-Maxwell-Boltzmann System for Cutoff Hard Potentials

    Authors: Weijun Wu, Fujun Zhou, Weihua Gong, Yuan Xu

    Abstract: Diffusive limit of the one-species Vlasov-Maxwell-Boltzmann system in perturbation framework still remains unsolved, due to the weaker time decay rate compared with the two-species Vlasov-Maxwell-Boltzmann system. By employing the weighted energy method with two newly introduced weight functions and some novel treatments, we solve this problem for the full range of cutoff hard potentials… ▽ More

    Submitted 4 June, 2024; originally announced June 2024.

    Comments: 60 pages. arXiv admin note: text overlap with arXiv:2312.16588

    MSC Class: 35Q20; 35Q83

  31. arXiv:2406.01644  [pdf, other]

    eess.IV

    Dual-Stream Attention Network for Hyperspectral Image Unmixing

    Authors: Yufang Wang, Wenmin Wu, Lin Qi, Feng Gao

    Abstract: Hyperspectral image (HSI) contains abundant spatial and spectral information, making it highly valuable for unmixing. In this paper, we propose a Dual-Stream Attention Network (DSANet) for HSI unmixing. The endmembers and abundance of a pixel in HSI have high correlations with its adjacent pixels. Therefore, we adopt a "many to one" strategy to estimate the abundance of the central pixel. In addit… ▽ More

    Submitted 3 June, 2024; originally announced June 2024.

    Comments: Accepted by IEEE IGARSS 2024

  32. arXiv:2406.01059  [pdf, other]

    cs.CV

    VIP: Versatile Image Outpainting Empowered by Multimodal Large Language Model

    Authors: Jinze Yang, Haoran Wang, Zining Zhu, Chenglong Liu, Meng Wymond Wu, Zeke Xie, Zhong Ji, Jungong Han, Mingming Sun

    Abstract: In this paper, we focus on resolving the problem of image outpainting, which aims to extrapolate the surrounding parts given the center contents of an image. Although recent works have achieved promising performance, the lack of versatility and customization hinders their practical applications in broader scenarios. Therefore, this work presents a novel image outpainting framework that is capable… ▽ More

    Submitted 3 June, 2024; originally announced June 2024.

    Comments: 15 pages

  33. arXiv:2406.01050  [pdf, ps, other]

    math.RT

    Quiver Hecke algebras for Borcherds-Cartan datum II

    Authors: Bolun Tong, Wan Wu

    Abstract: We give the crystal structure of the Grothendieck group $G_0(R)$ of irreducible modules over the quiver Hecke algebra $R$ constructed in \cite{TW2023}. This leads to the categorification of the crystal $B(\infty)$ of the quantum Borcherds algebra $U_q(\mathscr{g})$ and its irreducible highest weight crystal $B(\lambda)$ for arbitrary Borcherds-Cartan data. Additionally, we study the cyclotomic categorifi… ▽ More

    Submitted 3 June, 2024; originally announced June 2024.

  34. arXiv:2406.01007  [pdf, other]

    hep-ex

    Measurement of Electron Antineutrino Oscillation Amplitude and Frequency via Neutron Capture on Hydrogen at Daya Bay

    Authors: Daya Bay collaboration, F. P. An, W. D. Bai, A. B. Balantekin, M. Bishai, S. Blyth, G. F. Cao, J. Cao, J. F. Chang, Y. Chang, H. S. Chen, H. Y. Chen, S. M. Chen, Y. Chen, Y. X. Chen, Z. Y. Chen, J. Cheng, J. Cheng, Y. -C. Cheng, Z. K. Cheng, J. J. Cherwinka, M. C. Chu, J. P. Cummings, O. Dalager, F. S. Deng , et al. (177 additional authors not shown)

    Abstract: This Letter reports the first measurement of the oscillation amplitude and frequency of reactor antineutrinos at Daya Bay via neutron capture on hydrogen using 1958 days of data. With over 3.6 million signal candidates, an optimized candidate selection, improved treatment of backgrounds and efficiencies, refined energy calibration, and an energy response model for the capture-on-hydrogen sensitive… ▽ More

    Submitted 3 June, 2024; originally announced June 2024.

  35. arXiv:2406.00654  [pdf, other]

    cs.CL cs.SD eess.AS

    Enhancing Zero-shot Text-to-Speech Synthesis with Human Feedback

    Authors: Chen Chen, Yuchen Hu, Wen Wu, Helin Wang, Eng Siong Chng, Chao Zhang

    Abstract: In recent years, text-to-speech (TTS) technology has witnessed impressive advancements, particularly with large-scale training datasets, showcasing human-level speech quality and impressive zero-shot capabilities on unseen speakers. However, despite human subjective evaluations, such as the mean opinion score (MOS), remaining the gold standard for assessing the quality of synthetic speech, even st… ▽ More

    Submitted 2 June, 2024; originally announced June 2024.

    Comments: 19 pages, Preprint

  36. arXiv:2405.20064  [pdf, other]

    eess.AS cs.SD

    1st Place Solution to Odyssey Emotion Recognition Challenge Task1: Tackling Class Imbalance Problem

    Authors: Mingjie Chen, Hezhao Zhang, Yuanchao Li, Jiachen Luo, Wen Wu, Ziyang Ma, Peter Bell, Catherine Lai, Joshua Reiss, Lin Wang, Philip C. Woodland, Xie Chen, Huy Phan, Thomas Hain

    Abstract: Speech emotion recognition is a challenging classification task with natural emotional speech, especially when the distribution of emotion types is imbalanced in the training and test data. In this case, it is more difficult for a model to learn to separate minority classes, resulting in those sometimes being ignored or frequently misclassified. Previous work has utilised class weighted loss for t… ▽ More

    Submitted 30 May, 2024; originally announced May 2024.

  37. arXiv:2405.19469  [pdf, other]

    astro-ph.CO

    Constraining Inflation with the BICEP/Keck CMB Polarization Experiments

    Authors: The BICEP/Keck Collaboration, P. A. R. Ade, Z. Ahmed, M. Amiri, D. Barkats, R. Basu Thakur, C. A. Bischoff, D. Beck, J. J. Bock, H. Boenish, V. Buza, J. R. Cheshire IV, J. Connors, J. Cornelison, M. Crumrine, A. Cukierman, E. V. Denison, M. Dierickx, L. Duband, M. Eiben, B. Elwood, S. Fatigoni, J. P. Filippini, M. Gao , et al. (63 additional authors not shown)

    Abstract: The BICEP/$\textit{Keck}$ (BK) series of cosmic microwave background (CMB) polarization experiments has, over the past decade and a half, produced a series of field-leading constraints on cosmic inflation via measurements of the "B-mode" polarization of the CMB. Primordial B modes are directly tied to the amplitude of primordial gravitational waves (PGW), their strength parameterized by the tensor… ▽ More

    Submitted 29 May, 2024; originally announced May 2024.

    Comments: 9 pages, 5 figures. Contribution to the 2024 Cosmology session of the 58th Rencontres de Moriond

  38. arXiv:2405.18767  [pdf, other]

    astro-ph.GA

    Kinetic temperature of massive star-forming molecular clumps measured with formaldehyde V. The massive filament DR21

    Authors: X. Zhao, X. D. Tang, C. Henkel, Y. Gong, Y. Lin, D. L. Li, Y. X. He, Y. P. Ao, X. Lu, T. Liu, Y. Sun, K. Wang, X. P. Chen, J. Esimbek, J. J. Zhou, J. W. Wu, J. J. Qiu, X. W. Zheng, J. S. Li, C. S. Luo, Q. Zhao

    Abstract: The kinetic temperature structure of the massive filament DR21 has been mapped using the IRAM 30 m telescope. This mapping employed the para-H$_2$CO triplet ($J_{\rm K_aK_c}$ = 3$_{03}$--2$_{02}$, 3$_{22}$--2$_{21}$, and 3$_{21}$--2$_{20}$) on a scale of $\sim$0.1 pc. By modeling the averaged line ratios of para-H$_{2}$CO with RADEX under non-LTE assumptions, the kinetic temperature of the dense g… ▽ More

    Submitted 29 May, 2024; originally announced May 2024.

    Comments: 16 pages, 8 figures, 3 tables. Accepted for publication by Astronomy & Astrophysics

  39. arXiv:2405.17659  [pdf, other]

    eess.IV cs.CV

    Enhancing Global Sensitivity and Uncertainty Quantification in Medical Image Reconstruction with Monte Carlo Arbitrary-Masked Mamba

    Authors: Jiahao Huang, Liutao Yang, Fanwen Wang, Yang Nan, Weiwen Wu, Chengyan Wang, Kuangyu Shi, Angelica I. Aviles-Rivero, Carola-Bibiane Schönlieb, Daoqiang Zhang, Guang Yang

    Abstract: Deep learning has been extensively applied in medical image reconstruction, where Convolutional Neural Networks (CNNs) and Vision Transformers (ViTs) represent the predominant paradigms, each possessing distinct advantages and inherent limitations: CNNs exhibit linear complexity with local sensitivity, whereas ViTs demonstrate quadratic complexity with global sensitivity. The emerging Mamba has sh… ▽ More

    Submitted 25 June, 2024; v1 submitted 27 May, 2024; originally announced May 2024.

  40. arXiv:2405.17167  [pdf]

    eess.IV cs.CV

    Partitioned Hankel-based Diffusion Models for Few-shot Low-dose CT Reconstruction

    Authors: Wenhao Zhang, Bin Huang, Shuyue Chen, Xiaoling Xu, Weiwen Wu, Qiegen Liu

    Abstract: Low-dose computed tomography (LDCT) plays a vital role in clinical applications by mitigating radiation risks. Nevertheless, reducing radiation doses significantly degrades image quality. Concurrently, common deep learning methods demand extensive data, posing concerns about privacy, cost, and time constraints. Consequently, we propose a few-shot low-dose CT reconstruction method using Partitioned… ▽ More

    Submitted 27 May, 2024; originally announced May 2024.

  41. arXiv:2405.16464  [pdf, other]

    cs.RO cs.CV

    Multi-Modal UAV Detection, Classification and Tracking Algorithm -- Technical Report for CVPR 2024 UG2 Challenge

    Authors: Tianchen Deng, Yi Zhou, Wenhua Wu, Mingrui Li, Jingwei Huang, Shuhong Liu, Yanzeng Song, Hao Zuo, Yanbo Wang, Yutao Yue, Hesheng Wang, Weidong Chen

    Abstract: This technical report presents the 1st winning model for UG2+, a task in CVPR 2024 UAV Tracking and Pose-Estimation Challenge. This challenge faces difficulties in drone detection, UAV-type classification and 2D/3D trajectory estimation in extreme weather conditions with multi-modal sensor information, including stereo vision, various Lidars, Radars, and audio arrays. Leveraging this information…

    Submitted 26 May, 2024; originally announced May 2024.

    Comments: Accepted by CVPR 2024 workshop. The 1st winning model in CVPR 2024 UG2+ challenge. The code and configuration of our method are available at https://github.com/dtc111111/Multi-Modal-UAV

  42. arXiv:2405.16028  [pdf]

    physics.bio-ph nlin.CD

    Symmetry breaking of three self-organization rules: A general theory for the origin of complexity

    Authors: Wen-Hao Wu, Ze-Zheng Li, Wen-Xu Wang

    Abstract: Complex spatiotemporal patterns in nature significantly challenge reductionism-based modern science. The lack of a paradigm beyond reductionism hinders our understanding of the emergence of complexity. The diversity of countless patterns undermines any notion of universal mechanisms. Here, however, we show that breaking the symmetry of three simple self-organization rules gives rise to nearly a…

    Submitted 24 May, 2024; originally announced May 2024.

    Comments: 97 pages, 60 figures, for full article, see https://doi.org/10.1142/S021812742430012X

    Journal ref: International Journal of Bifurcation and Chaos; Vol. 34, No. 06, 2430012 (2024)

  43. arXiv:2405.15677  [pdf, other]

    cs.RO cs.CV

    SMART: Scalable Multi-agent Real-time Simulation via Next-token Prediction

    Authors: Wei Wu, Xiaoxin Feng, Ziyan Gao, Yuheng Kan

    Abstract: Data-driven autonomous driving motion generation tasks are frequently impacted by the limitations of dataset size and the domain gap between datasets, which precludes their extensive application in real-world scenarios. To address this issue, we introduce SMART, a novel autonomous driving motion generation paradigm that models vectorized map and agent trajectory data into discrete sequence tokens.…

    Submitted 24 May, 2024; originally announced May 2024.

  44. arXiv:2405.14621  [pdf, other]

    hep-ph hep-th

    Blade: A package for block-triangular form improved Feynman integrals decomposition

    Authors: Xin Guan, Xiao Liu, Yan-Qing Ma, Wen-Hao Wu

    Abstract: In this article, we present the package Blade as the first implementation of the block-triangular form improved Feynman integral reduction method. The block-triangular form has orders of magnitude fewer equations compared to the plain integration-by-parts system, allowing for strictly block-by-block solutions. This results in faster evaluations and reduced resource consumption. We elucidate the al…

    Submitted 23 May, 2024; originally announced May 2024.

    Comments: 20 pages, 8 figures, 10 tables

  45. arXiv:2405.14342  [pdf, other]

    cs.CV

    RoGS: Large Scale Road Surface Reconstruction based on 2D Gaussian Splatting

    Authors: Zhiheng Feng, Wenhua Wu, Hesheng Wang

    Abstract: Road surface reconstruction plays a crucial role in autonomous driving, which can be used for road lane perception and autolabeling tasks. Recently, mesh-based road surface reconstruction algorithms show promising reconstruction results. However, these mesh-based methods suffer from slow speed and poor rendering quality. In contrast, the 3D Gaussian Splatting (3DGS) shows superior rendering speed…

    Submitted 23 May, 2024; v1 submitted 23 May, 2024; originally announced May 2024.

  46. arXiv:2405.14256  [pdf, other]

    cs.LG cs.AI

    ZipCache: Accurate and Efficient KV Cache Quantization with Salient Token Identification

    Authors: Yefei He, Luoming Zhang, Weijia Wu, Jing Liu, Hong Zhou, Bohan Zhuang

    Abstract: KV cache stores key and value states from previous tokens to avoid re-computation, yet it demands substantial storage space, especially for long sequences. Adaptive KV cache compression seeks to discern the saliency of tokens, preserving vital information while aggressively compressing those of less importance. However, previous methods of this approach exhibit significant performance degradation…

    Submitted 23 May, 2024; originally announced May 2024.

    Comments: 15 pages

  47. arXiv:2405.14231  [pdf, other]

    cs.CL

    From Role-Play to Drama-Interaction: An LLM Solution

    Authors: Weiqi Wu, Hongqiu Wu, Lai Jiang, Xingyuan Liu, Jiale Hong, Hai Zhao, Min Zhang

    Abstract: Drama is a form of storytelling inspired by human creativity, proceeding with a predefined storyline, carrying emotions and thoughts. This paper introduces \emph{LLM-based interactive drama}, which endows traditional drama with an unprecedented immersion, where a person is allowed to walk into it and interact with the characters and scenes. We define this new artistic genre by 6 essential elements…

    Submitted 23 May, 2024; originally announced May 2024.

    Comments: Accepted by ACL 2024 Findings

  48. arXiv:2405.13800  [pdf, other]

    cs.CV cs.AI

    Dense Connector for MLLMs

    Authors: Huanjin Yao, Wenhao Wu, Taojiannan Yang, YuXin Song, Mengxi Zhang, Haocheng Feng, Yifan Sun, Zhiheng Li, Wanli Ouyang, Jingdong Wang

    Abstract: Do we fully leverage the potential of visual encoder in Multimodal Large Language Models (MLLMs)? The recent outstanding performance of MLLMs in multimodal understanding has garnered broad attention from both academia and industry. In the current MLLM rat race, the focus seems to be predominantly on the linguistic side. We witness the rise of larger and higher-quality instruction datasets, as well…

    Submitted 22 May, 2024; originally announced May 2024.

    Comments: Technical report. 25 pages

  49. arXiv:2405.13702  [pdf]

    physics.app-ph cond-mat.mtrl-sci

    Impurity-level induced broadband photoelectric response in wide-band semiconductor SrSnO3

    Authors: Yuyang Zhang, Lisheng Wang, Weijie Wu, Zhaoyang Wang, Fei Sun, He Jiang, Bangmin Zhang, Yue Zheng

    Abstract: Broadband spectrum detectors exhibit great promise in fields such as multispectral imaging and optical communications. Despite significant progress, challenges like materials instability, complex manufacturing process and high costs still hinder further application. Here we present a method that achieves broadband spectral detection by impurity-level in SrSnO3. We report over 200 mA/W photo-responsiv…

    Submitted 22 May, 2024; originally announced May 2024.

    Comments: 5 Figures

  50. arXiv:2405.13124  [pdf, other]

    astro-ph.GA astro-ph.SR

    The Pristine survey -- XXVI. The very metal-poor Galaxy: Chemodynamics through the follow-up of the Pristine-Gaia synthetic catalogue

    Authors: Akshara Viswanathan, Zhen Yuan, Anke Ardern-Arentsen, Else Starkenburg, Nicolas F. Martin, Kris Youakim, Rodrigo A. Ibata, Federico Sestito, Tadafumi Matsuno, Carlos Allende Prieto, Freya Barwell, Manuel Bayer, Amandine Doliva-Dolinsky, Emma Fernandez-Alvar, Pablo M. Galan-de Anta, Kiran Jhass, Nicolas Longeard, Jose Maria Arroyo-Polonio, Pol Massana, Martin Montelius, Samuel Rusterucci, Judith Santos, Guillaume F. Thomas, Sara Vitali, Wenbo Wu, et al. (5 additional authors not shown)

    Abstract: The Pristine-\textit{Gaia} synthetic catalogue provides reliable photometric metallicities for $\sim$30 million FGK stars using the Pristine survey model and Gaia XP spectra. We perform the first low-to-medium-resolution spectroscopic follow-up of bright (G<15) and distant (up to 35 kpc) very and extremely metal-poor (V/EMP, [Fe/H]<-2.5) red giant branch stars from this. We use Isaac Newton Telesc…

    Submitted 21 May, 2024; originally announced May 2024.

    Comments: Submitted to A&A. 17 pages (9 figures) + 3 pages (3 figures) in Appendix. Comments are very welcome! The catalogue and 1D spectra will be made available public after acceptance and before upon reasonable request to the first author