Showing 1–50 of 1,258 results for author: Lu, W

  1. arXiv:2407.02031  [pdf, other]

    cs.DC cs.AI cs.LG

    SwiftDiffusion: Efficient Diffusion Model Serving with Add-on Modules

    Authors: Suyi Li, Lingyun Yang, Xiaoxiao Jiang, Hanfeng Lu, Zhipeng Di, Weiyi Lu, Jiawei Chen, Kan Liu, Yinghao Yu, Tao Lan, Guodong Yang, Lin Qu, Liping Zhang, Wei Wang

    Abstract: This paper documents our characterization study and practices for serving text-to-image requests with stable diffusion models in production. We first comprehensively analyze inference request traces for commercial text-to-image applications. It commences with our observation that add-on modules, i.e., ControlNets and LoRAs, that augment the base stable diffusion models, are ubiquitous in generatin…

    Submitted 2 July, 2024; originally announced July 2024.

  2. arXiv:2407.01885  [pdf, other]

    cs.CL cs.AI

    Survey on Knowledge Distillation for Large Language Models: Methods, Evaluation, and Application

    Authors: Chuanpeng Yang, Wang Lu, Yao Zhu, Yidong Wang, Qian Chen, Chenlong Gao, Bingjie Yan, Yiqiang Chen

    Abstract: Large Language Models (LLMs) have showcased exceptional capabilities in various domains, attracting significant interest from both academia and industry. Despite their impressive performance, the substantial size and computational demands of LLMs pose considerable challenges for practical deployment, particularly in environments with limited resources. The endeavor to compress language models whil…

    Submitted 1 July, 2024; originally announced July 2024.

    Comments: 28 pages

  3. arXiv:2407.01455  [pdf, other]

    cs.CL

    TimeToM: Temporal Space is the Key to Unlocking the Door of Large Language Models' Theory-of-Mind

    Authors: Guiyang Hou, Wenqi Zhang, Yongliang Shen, Linjuan Wu, Weiming Lu

    Abstract: Theory of Mind (ToM)-the cognitive ability to reason about mental states of ourselves and others, is the foundation of social interaction. Although ToM comes naturally to humans, it poses a significant challenge to even the most advanced Large Language Models (LLMs). Due to the complex logical chains in ToM reasoning, especially in higher-order ToM questions, simply utilizing reasoning methods lik…

    Submitted 1 July, 2024; originally announced July 2024.

    Comments: 16 pages, 6 figures, ACL 2024(findings)

  4. arXiv:2407.00390  [pdf, other]

    cs.CL

    Advancing Process Verification for Large Language Models via Tree-Based Preference Learning

    Authors: Mingqian He, Yongliang Shen, Wenqi Zhang, Zeqi Tan, Weiming Lu

    Abstract: Large Language Models (LLMs) have demonstrated remarkable potential in handling complex reasoning tasks by generating step-by-step rationales. Some methods have proven effective in boosting accuracy by introducing extra verifiers to assess these paths. However, existing verifiers, typically trained on binary-labeled reasoning paths, fail to fully utilize the relative merits of intermediate steps, t…

    Submitted 29 June, 2024; originally announced July 2024.

  5. arXiv:2406.19853  [pdf, other]

    cs.CL cs.AI

    YuLan: An Open-source Large Language Model

    Authors: Yutao Zhu, Kun Zhou, Kelong Mao, Wentong Chen, Yiding Sun, Zhipeng Chen, Qian Cao, Yihan Wu, Yushuo Chen, Feng Wang, Lei Zhang, Junyi Li, Xiaolei Wang, Lei Wang, Beichen Zhang, Zican Dong, Xiaoxue Cheng, Yuhan Chen, Xinyu Tang, Yupeng Hou, Qiangqiang Ren, Xincheng Pang, Shufang Xie, Wayne Xin Zhao, Zhicheng Dou, et al. (13 additional authors not shown)

    Abstract: Large language models (LLMs) have become the foundation of many applications, leveraging their extensive capabilities in processing and understanding natural language. While many open-source LLMs have been released with technical reports, the lack of training details hinders further research and development. This paper presents the development of YuLan, a series of open-source LLMs with $12$ billi…

    Submitted 28 June, 2024; originally announced June 2024.

  6. arXiv:2406.18505  [pdf, other]

    cs.LG cs.AI cs.CL cs.RO

    Mental Modeling of Reinforcement Learning Agents by Language Models

    Authors: Wenhao Lu, Xufeng Zhao, Josua Spisak, Jae Hee Lee, Stefan Wermter

    Abstract: Can emergent language models faithfully model the intelligence of decision-making agents? Though modern language models exhibit already some reasoning ability, and theoretically can potentially express any probable distribution over tokens, it remains underexplored how the world knowledge these pretrained models have memorized can be utilized to comprehend an agent's behaviour in the physical worl…

    Submitted 26 June, 2024; originally announced June 2024.

    Comments: https://lukaswill.github.io/

  7. arXiv:2406.15222  [pdf]

    eess.IV cs.AI cs.CV

    Rapid and Accurate Diagnosis of Acute Aortic Syndrome using Non-contrast CT: A Large-scale, Retrospective, Multi-center and AI-based Study

    Authors: Yujian Hu, Yilang Xiang, Yan-Jie Zhou, Yangyan He, Shifeng Yang, Xiaolong Du, Chunlan Den, Youyao Xu, Gaofeng Wang, Zhengyao Ding, Jingyong Huang, Wenjun Zhao, Xuejun Wu, Donglin Li, Qianqian Zhu, Zhenjiang Li, Chenyang Qiu, Ziheng Wu, Yunjun He, Chen Tian, Yihui Qiu, Zuodong Lin, Xiaolong Zhang, Yuan He, Zhenpeng Yuan, et al. (15 additional authors not shown)

    Abstract: Chest pain symptoms are highly prevalent in emergency departments (EDs), where acute aortic syndrome (AAS) is a catastrophic cardiovascular emergency with a high fatality rate, especially when timely and accurate treatment is not administered. However, current triage practices in the ED can cause up to approximately half of patients with AAS to have an initially missed diagnosis or be misdiagnosed…

    Submitted 24 June, 2024; v1 submitted 13 June, 2024; originally announced June 2024.

    Comments: under peer review

  8. arXiv:2406.13448  [pdf, other]

    physics.acc-ph physics.plasm-ph

    Demonstration of High-Efficiency Microwave Heating Producing Record Highly Charged Xenon Ion Beams with Superconducting ECR Ion Sources

    Authors: X. Wang, J. B. Li, V. Mironov, J. W. Guo, X. Z. Zhang, O. Tarvainen, Y. C. Feng, L. X. Li, J. D. Ma, Z. H. Zhang, W. Lu, S. Bogomolov, L. Sun, H. W. Zhao

    Abstract: Intense highly charged ion beam production is essential for high-power heavy ion accelerators. A novel movable Vlasov launcher for superconducting high charge state Electron Cyclotron Resonance (ECR) ion source has been devised that can affect the microwave power effectiveness by a factor of about 4 in terms of highly charged ion beam production. This approach based on a dedicated microwave launch…

    Submitted 25 June, 2024; v1 submitted 19 June, 2024; originally announced June 2024.

  9. arXiv:2406.13198  [pdf, other]

    quant-ph

    Single-photon triggered quantum entanglement between two qubits or at least 2000 identical qubits

    Authors: Wangjun Lu, Cuilu Zhai, Hong Tao, Yaju Song, Shiqing Tang, Lan Xu

    Abstract: This paper studies the effect of single-photon light fields on quantum entanglement between two qubits and multiple identical qubits initially in a direct state. For two qubits, we first analyze the impact of the excited state's weight on single-photon-triggered entanglement, finding that excessive weight disrupts this process. We then explore how initial coherence affects entanglement, discoverin…

    Submitted 19 June, 2024; originally announced June 2024.

    Comments: 19 pages, 11 figures

  10. arXiv:2406.10956  [pdf, other]

    cs.SD cs.LG eess.AS

    Robust Channel Learning for Large-Scale Radio Speaker Verification

    Authors: Wenhao Yang, Jianguo Wei, Wenhuan Lu, Lei Li, Xugang Lu

    Abstract: Recent research in speaker verification has increasingly focused on achieving robust and reliable recognition under challenging channel conditions and noisy environments. Identifying speakers in radio communications is particularly difficult due to inherent limitations such as constrained bandwidth and pervasive noise interference. To address this issue, we present a Channel Robust Speaker Learnin…

    Submitted 16 June, 2024; originally announced June 2024.

    Comments: 12 pages, 11 figures

  11. arXiv:2406.10505  [pdf, other]

    cs.CL

    CroPrompt: Cross-task Interactive Prompting for Zero-shot Spoken Language Understanding

    Authors: Libo Qin, Fuxuan Wei, Qiguang Chen, Jingxuan Zhou, Shijue Huang, Jiasheng Si, Wenpeng Lu, Wanxiang Che

    Abstract: Slot filling and intent detection are two highly correlated tasks in spoken language understanding (SLU). Recent SLU research attempts to explore zero-shot prompting techniques in large language models to alleviate the data scarcity problem. Nevertheless, the existing prompting work ignores the cross-task interaction information for SLU, which leads to sub-optimal performance. To solve this proble…

    Submitted 15 June, 2024; originally announced June 2024.

  12. arXiv:2406.10222  [pdf, other]

    astro-ph.IM

    Ultra-low noise laser and optical frequency comb-based timing system for the Black Hole Explorer (BHEX) mission

    Authors: Hannah Tomio, Guangning Yang, Holly F. Leopardi, Kenji Numata, Anthony W. Yu, Andrew Attar, Xiaozhen Xu, Wei Lu, Cheryl Gramling, T. K. Sridharan, Peter Kurczynski

    Abstract: In this effort, we demonstrate the performance of a highly stable time reference for the proposed Black Hole Explorer (BHEX) mission, a space-based extension to the Event Horizon Telescope (EHT) Very Long Baseline Interferometry (VLBI) project. This precision timing system is based on the use of a space-qualified, ultra-low noise laser developed as part of the Laser Interferometer Space Antenna (L…

    Submitted 14 June, 2024; originally announced June 2024.

    Comments: To be published in the proceedings of SPIE Astronomical Telescopes + Instrumentation 2024

  13. arXiv:2406.09988  [pdf, other]

    cs.AI cs.CL cs.RO

    Details Make a Difference: Object State-Sensitive Neurorobotic Task Planning

    Authors: Xiaowen Sun, Xufeng Zhao, Jae Hee Lee, Wenhao Lu, Matthias Kerzel, Stefan Wermter

    Abstract: The state of an object reflects its current status or condition and is important for a robot's task planning and manipulation. However, detecting an object's state and generating a state-sensitive plan for robots is challenging. Recently, pre-trained Large Language Models (LLMs) and Vision-Language Models (VLMs) have shown impressive capabilities in generating plans. However, to the best of our kn…

    Submitted 14 June, 2024; originally announced June 2024.

  14. arXiv:2406.09469  [pdf, other]

    cs.DB

    Conformance Testing of Relational DBMS Against SQL Specifications

    Authors: Shuang Liu, Chenglin Tian, Jun Sun, Ruifeng Wang, Wei Lu, Yongxin Zhao, Yinxing Xue, Junjie Wang, Xiaoyong Du

    Abstract: A Relational Database Management System (RDBMS) is one of the fundamental software that supports a wide range of applications, making it critical to identify bugs within these systems. There has been active research on testing RDBMS, most of which employ crash or use metamorphic relations as the oracle. Although existing approaches can detect bugs in RDBMS, they are far from comprehensively evalua…

    Submitted 13 June, 2024; originally announced June 2024.

  15. arXiv:2406.08012  [pdf, other]

    astro-ph.HE

    Interaction of an outflow with surrounding gaseous clouds as the origin of the late-time radio flares in TDEs

    Authors: Jialun Zhuang, Rong-Feng Shen, Guobin Mou, Wenbin Lu

    Abstract: Close encounter between a star and a supermassive black hole (SMBH) results in the tidal disruption of the star, known as a tidal disruption event (TDE). Recently, a few TDEs, e.g., ASASSN-15oi and AT2018hyz, have shown late-time (hundreds of days after their UV/optical peaks) radio flares with radio luminosities of $10^{38\sim39}$ erg/s. The super-Eddington fallback or accretion in a TDE may gene…

    Submitted 26 June, 2024; v1 submitted 12 June, 2024; originally announced June 2024.

    Comments: 13 pages, 13 figures. Submitted to ApJ. A new version with some modifications. Comments are welcome

  16. arXiv:2406.06594  [pdf, other]

    q-fin.CP cs.AI cs.LG

    Stock Movement Prediction with Multimodal Stable Fusion via Gated Cross-Attention Mechanism

    Authors: Chang Zong, Jian Shao, Weiming Lu, Yueting Zhuang

    Abstract: The accurate prediction of stock movements is crucial for investment strategies. Stock prices are subject to the influence of various forms of information, including financial indicators, sentiment analysis, news documents, and relational structures. Predominant analytical approaches, however, tend to address only unimodal or bimodal sources, neglecting the complexity of multimodal data. Further c…

    Submitted 5 June, 2024; originally announced June 2024.

    Comments: 29 pages, 10 figures

    MSC Class: 68T07 ACM Class: I.2.6; J.4

  17. arXiv:2406.06563  [pdf, other]

    cs.CL cs.AI

    Skywork-MoE: A Deep Dive into Training Techniques for Mixture-of-Experts Language Models

    Authors: Tianwen Wei, Bo Zhu, Liang Zhao, Cheng Cheng, Biye Li, Weiwei Lü, Peng Cheng, Jianhao Zhang, Xiaoyu Zhang, Liang Zeng, Xiaokun Wang, Yutuan Ma, Rui Hu, Shuicheng Yan, Han Fang, Yahui Zhou

    Abstract: In this technical report, we introduce the training methodologies implemented in the development of Skywork-MoE, a high-performance mixture-of-experts (MoE) large language model (LLM) with 146 billion parameters and 16 experts. It is initialized from the pre-existing dense checkpoints of our Skywork-13B model. We explore the comparative effectiveness of upcycling versus training from scratch initi…

    Submitted 2 June, 2024; originally announced June 2024.

  18. arXiv:2406.06028  [pdf, other]

    cs.CV

    ReCon1M: A Large-scale Benchmark Dataset for Relation Comprehension in Remote Sensing Imagery

    Authors: Xian Sun, Qiwei Yan, Chubo Deng, Chenglong Liu, Yi Jiang, Zhongyan Hou, Wanxuan Lu, Fanglong Yao, Xiaoyu Liu, Lingxiang Hao, Hongfeng Yu

    Abstract: Scene Graph Generation (SGG) is a high-level visual understanding and reasoning task aimed at extracting entities (such as objects) and their interrelationships from images. Significant progress has been made in the study of SGG in natural images in recent years, but its exploration in the domain of remote sensing images remains very limited. The complex characteristics of remote sensing images ne…

    Submitted 10 June, 2024; originally announced June 2024.

  19. arXiv:2406.03523  [pdf, other]

    astro-ph.GA astro-ph.HE

    Objects May Be Closer Than They Appear: Significant Host Galaxy Dispersion Measures of Fast Radio Bursts in Zoom-in Simulations

    Authors: Matthew E. Orr, Blakesley Burkhart, Wenbin Lu, Sam B. Ponnada, Cameron B. Hummels

    Abstract: We investigate the contribution of host galaxies to the overall Dispersion Measures (DMs) for Fast Radio Bursts (FRBs) using the Feedback in Realistic Environments (FIRE-2) cosmological zoom-in simulation suite. We calculate DMs from every star particle in the simulated L* galaxies by ray-tracing through their multi-phase interstellar medium (ISM), summing the line-of-sight free thermal electron c…

    Submitted 5 June, 2024; originally announced June 2024.

    Comments: 17 pages, 12 figures, submitted to ApJ Letters

  20. arXiv:2406.00692  [pdf]

    cond-mat.mtrl-sci

    Harvesting room-temperature plasticity in ceramics by mechanically seeded dislocations

    Authors: Xufei Fang, Wenjun Lu, Jiawen Zhang, Christian Minnert, Junhua Hou, Sebastian Bruns, Ulrike Kunz, Atsutomo Nakamura, Karsten Durst, Jürgen Rödel

    Abstract: The quest for room-temperature ductile ceramics has been repeatedly fueled by hopes for large-scale applications but so far has been not successful. Recent demonstrations of enhanced functional properties in ceramics through judicious dislocation imprint, however, have been sparking renewed interest in dislocation plasticity in brittle ceramics. Here, we propose a facile approach using room-temper…

    Submitted 2 June, 2024; originally announced June 2024.

  21. arXiv:2406.00623  [pdf]

    cond-mat.mtrl-sci

    Room-temperature bulk plasticity and tunable dislocation densities in KTaO3

    Authors: Xufei Fang, Jiawen Zhang, Alexander Frisch, Oliver Preuß, Chukwudalu Okafor, Martin Setvin, Wenjun Lu

    Abstract: We report room-temperature bulk plasticity mediated by dislocations in single-crystal cubic KTaO3, contrasting the conventional knowledge that single-crystal KTaO3 is susceptible to brittle cleavage. A mechanics-based combinatorial experimental approach using cyclic Brinell indentation, scratching, and uniaxial bulk compression consistently demonstrates room-temperature dislocation plasticity in K…

    Submitted 2 June, 2024; originally announced June 2024.

  22. arXiv:2405.19596  [pdf, ps, other]

    cs.IT

    The weight hierarchies of three classes of linear codes

    Authors: Wei Lu, Qingyao Wang, Xiaoqiang Wang, Dabin Zheng

    Abstract: Studying the generalized Hamming weights of linear codes is a significant research area within coding theory, as it provides valuable structural information about the codes and plays a crucial role in determining their performance in various applications. However, determining the generalized Hamming weights of linear codes, particularly their weight hierarchy, is generally a challenging task. In t…

    Submitted 29 May, 2024; originally announced May 2024.

  23. arXiv:2405.14307  [pdf, other]

    cs.LG cs.AI

    AdaGMLP: AdaBoosting GNN-to-MLP Knowledge Distillation

    Authors: Weigang Lu, Ziyu Guan, Wei Zhao, Yaming Yang

    Abstract: Graph Neural Networks (GNNs) have revolutionized graph-based machine learning, but their heavy computational demands pose challenges for latency-sensitive edge devices in practical industrial applications. In response, a new wave of methods, collectively known as GNN-to-MLP Knowledge Distillation, has emerged. They aim to transfer GNN-learned knowledge to a more efficient MLP student, which offers…

    Submitted 23 May, 2024; originally announced May 2024.

    Comments: Accepted by KDD 2024

    Journal ref: KDD 2024

  24. arXiv:2405.11343  [pdf, other]

    astro-ph.HE

    Sub-relativistic Outflow and Hours-Timescale Large-amplitude X-ray Dips during Super-Eddington Accretion onto a Low-mass Massive Black Hole in the Tidal Disruption Event AT2022lri

    Authors: Yuhan Yao, Muryel Guolo, Francesco Tombesi, Ruancun Li, Suvi Gezari, Javier A. García, Lixin Dai, Ryan Chornock, Wenbin Lu, S. R. Kulkarni, Keith C. Gendreau, Dheeraj R. Pasham, S. Bradley Cenko, Erin Kara, Raffaella Margutti, Yukta Ajay, Thomas Wevers, Tom M. Kwan, Igor Andreoni, Joshua S. Bloom, Andrew J. Drake, Matthew J. Graham, Erica Hammerstein, Russ R. Laher, Natalie LeBaron, et al. (10 additional authors not shown)

    Abstract: We present the tidal disruption event (TDE) AT2022lri, hosted in a nearby ($\approx\!144$ Mpc) quiescent galaxy with a low-mass massive black hole ($10^4\,M_\odot < M_{\rm BH} < 10^6\,M_\odot$). AT2022lri belongs to the TDE-H+He subtype. More than 1 Ms of X-ray data were collected with NICER, Swift, and XMM-Newton from 187 d to 672 d after peak. The X-ray luminosity gradually declined from…

    Submitted 18 May, 2024; originally announced May 2024.

    Comments: 35 pages, 20 figures, submitted

  25. arXiv:2405.09681  [pdf]

    physics.chem-ph

    Inactive Overhang in Silicon Anodes

    Authors: Aidin I. OBrien, Stephen E. Trask, Devashish Salpekar, Seoung-Bum Son, Alison R. Dunlop, Gabriel M. Veith, Wenquan Lu, Brian J. Ingram, Daniel P. Abraham, Andrew N. Jansen, Marco-Tulio F. Rodrigues

    Abstract: Li-ion batteries contain excess anode area to improve manufacturability and prevent Li plating. These overhang areas in graphite electrodes are active but experience decreased Li+ flux during cycling. Over time, the overhang and the anode portions directly opposite to the cathode can exchange Li+, driven by differences in local electrical potential across the electrode, which artificially inflates…

    Submitted 15 May, 2024; originally announced May 2024.

  26. arXiv:2405.09054  [pdf, other]

    cs.CV

    Dim Small Target Detection and Tracking: A Novel Method Based on Temporal Energy Selective Scaling and Trajectory Association

    Authors: Weihua Gao, Wenlong Niu, Wenlong Lu, Pengcheng Wang, Zhaoyuan Qi, Xiaodong Peng, Zhen Yang

    Abstract: The detection and tracking of small targets in passive optical remote sensing (PORS) has broad applications. However, most of the previously proposed methods seldom utilize the abundant temporal features formed by target motion, resulting in poor detection and tracking performance for low signal-to-clutter ratio (SCR) targets. In this article, we analyze the difficulty based on spatial features an…

    Submitted 14 May, 2024; originally announced May 2024.

  27. arXiv:2405.09048  [pdf]

    physics.optics

    Beam Shaping Based on Axisymmetric Aspheric Mirrors

    Authors: Zhihao Chen, Xiaonan Ning, Jiucheng Chen, Jianfei Hua, Wei Lu

    Abstract: Flat-top beam, known for its ability to generate a consistently even irradiation area, holds vast utility in many fields of scientific and industrial applications. In this paper, a reflective laser beam shaping method based on two axisymmetric aspheric mirrors (AAMs), a polarizing beam splitter (PBS) and two quarter wave plates (QWPs) is proposed to transform Gaussian beam into flat-top beam. Comp…

    Submitted 14 May, 2024; originally announced May 2024.

    Comments: 7 pages, 9 figures

  28. arXiv:2405.07687  [pdf, other]

    cs.RO

    Highly Efficient Observation Process based on FFT Filtering for Robot Swarm Collaborative Navigation in Unknown Environments

    Authors: Chenxi Li, Weining Lu, Zhihao Ma, Litong Meng, Bin Liang

    Abstract: Collaborative path planning for robot swarms in complex, unknown environments without external positioning is a challenging problem. This requires robots to find safe directions based on real-time environmental observations, and to efficiently transfer and fuse these observations within the swarm. This study presents a filtering method based on Fast Fourier Transform (FFT) to address these two iss…

    Submitted 13 May, 2024; originally announced May 2024.

    Comments: 8 pages, 8 figures, 1 table

  29. arXiv:2405.07519  [pdf, ps, other]

    math.PR

    Stability equivalence for stochastic differential equations, stochastic differential delay equations and their corresponding Euler-Maruyama methods in $G$-framework

    Authors: Wen Lu

    Abstract: In this paper, we investigate the stability equivalence problem for stochastic differential delay equations, the auxiliary stochastic differential equations and their corresponding Euler-Maruyama (EM) methods under $G$-framework. More precisely, for $p\geq 2$, we prove the equivalence of practical exponential stability in $p$-th moment sense among stochastic differential delay equations driven by…

    Submitted 13 May, 2024; originally announced May 2024.

  30. arXiv:2405.04673  [pdf, ps, other]

    math.AP

    Singularity Structures of Linear Inviscid Damping in a Channel

    Authors: Wenjie Lu

    Abstract: This paper studies singularity structures of the linear inviscid damping of two-dimensional Euler equations in a finite periodic channel. We introduce a recursive definition of singularity structures which characterize the singularities of the spectrum density function from different sources: the free part and the boundary part of the Green function. As an application, we demonstrate that the stre…

    Submitted 7 May, 2024; originally announced May 2024.

  31. arXiv:2405.01908  [pdf, other]

    math.ST stat.ML

    A Full Adagrad algorithm with O(Nd) operations

    Authors: Antoine Godichon-Baggioni, Wei Lu, Bruno Portier

    Abstract: A novel approach is given to overcome the computational challenges of the full-matrix Adaptive Gradient algorithm (Full AdaGrad) in stochastic optimization. By developing a recursive method that estimates the inverse of the square root of the covariance of the gradient, alongside a streaming variant for parameter updates, the study offers efficient and practical algorithms for large-scale applicat…

    Submitted 3 May, 2024; originally announced May 2024.

  32. arXiv:2404.19330  [pdf, other]

    cs.CV cs.AI

    G2LTraj: A Global-to-Local Generation Approach for Trajectory Prediction

    Authors: Zhanwei Zhang, Zishuo Hua, Minghao Chen, Wei Lu, Binbin Lin, Deng Cai, Wenxiao Wang

    Abstract: Predicting future trajectories of traffic agents accurately holds substantial importance in various applications such as autonomous driving. Previous methods commonly infer all future steps of an agent either recursively or simultaneously. However, the recursive strategy suffers from the accumulated error, while the simultaneous strategy overlooks the constraints among future steps, resulting in k…

    Submitted 30 April, 2024; originally announced April 2024.

    Comments: Accepted by IJCAI 2024

  33. arXiv:2404.15624  [pdf, other]

    math.NA

    A new framework of high-order unfitted finite element methods using ALE maps for moving-domain problems

    Authors: Wenhao Lu, Chuwen Ma, Weiying Zheng

    Abstract: As a sequel to our previous work [C. Ma, Q. Zhang and W. Zheng, SIAM J. Numer. Anal., 60 (2022)], [C. Ma and W. Zheng, J. Comput. Phys. 469 (2022)], this paper presents a generic framework of arbitrary Lagrangian-Eulerian unfitted finite element (ALE-UFE) methods for partial differential equations (PDEs) on time-varying domains. The ALE-UFE method has a great potential in developing high-order unf…

    Submitted 23 April, 2024; originally announced April 2024.

  34. arXiv:2404.15462  [pdf]

    physics.optics physics.app-ph

    Environmental permittivity-asymmetric BIC metasurfaces with electrical reconfigurability

    Authors: Haiyang Hu, Wenzheng Lu, Rodrigo Berte, Stefan A Maier, Andreas Tittl

    Abstract: In the rapidly evolving field of nanophotonics, achieving precise spectral and temporal light manipulation at the nanoscale remains a critical challenge. While photonic bound states in the continuum (BICs) have emerged as a powerful means of controlling light, their common reliance on geometrical symmetry breaking for obtaining tailored resonances makes them highly susceptible to fabrication imper…

    Submitted 23 April, 2024; originally announced April 2024.

    Comments: 35 pages, 4 figures, and 11 supporting figures

  35. arXiv:2404.13985  [pdf, other]

    cs.CL

    Information Re-Organization Improves Reasoning in Large Language Models

    Authors: Xiaoxia Cheng, Zeqi Tan, Wei Xue, Weiming Lu

    Abstract: Improving the reasoning capabilities of large language models (LLMs) has attracted considerable interest. Recent approaches primarily focus on improving the reasoning process to yield a more precise final answer. However, in scenarios involving contextually aware reasoning, these methods neglect the importance of first identifying logical relationships from the context before proceeding with the r…

    Submitted 24 May, 2024; v1 submitted 22 April, 2024; originally announced April 2024.

    Comments: 15 pages, 4 figures

  36. arXiv:2404.13885  [pdf, other]

    cs.CY cs.AI cs.CL cs.LG

    Surveying Attitudinal Alignment Between Large Language Models Vs. Humans Towards 17 Sustainable Development Goals

    Authors: Qingyang Wu, Ying Xu, Tingsong Xiao, Yunze Xiao, Yitong Li, Tianyang Wang, Yichi Zhang, Shanghai Zhong, Yuwei Zhang, Wei Lu, Yifan Yang

    Abstract: Large Language Models (LLMs) have emerged as potent tools for advancing the United Nations' Sustainable Development Goals (SDGs). However, the attitudinal disparities between LLMs and humans towards these goals can pose significant challenges. This study conducts a comprehensive review and analysis of the existing literature on the attitudes of LLMs towards the 17 SDGs, emphasizing the comparison…

    Submitted 22 April, 2024; originally announced April 2024.

  37. arXiv:2404.13298  [pdf, other]

    cs.IR eess.SY

    MARec: Metadata Alignment for cold-start Recommendation

    Authors: Julien Monteil, Volodymyr Vaskovych, Wentao Lu, Anirban Majumder, Anton van den Hengel

    Abstract: For many recommender systems the primary data source is a historical record of user clicks. The associated click matrix is often very sparse, however, as the number of users x products can be far larger than the number of clicks, and such sparsity is accentuated in cold-start settings. The sparsity of the click matrix is the reason matrix factorization and autoencoders techniques remain high…

    Submitted 20 April, 2024; originally announced April 2024.

  38. arXiv:2404.12597  [pdf, other]

    cs.LG math.ST stat.ML

    The phase diagram of kernel interpolation in large dimensions

    Authors: Haobo Zhang, Weihao Lu, Qian Lin

    Abstract: The generalization ability of kernel interpolation in large dimensions (i.e., $n \asymp d^{\gamma}$ for some $\gamma>0$) might be one of the most interesting problems in the recent renaissance of kernel regression, since it may help us understand the 'benign overfitting phenomenon' reported in the neural networks literature. Focusing on the inner product kernel on the sphere, we fully characterized the exact…

    Submitted 18 April, 2024; originally announced April 2024.

    Comments: 18 pages, 1 figure

  39. arXiv:2404.12014  [pdf, other]

    cs.CL cs.CR

    Enhance Robustness of Language Models Against Variation Attack through Graph Integration

    Authors: Zi Xiong, Lizhi Qing, Yangyang Kang, Jiawei Liu, Hongsong Li, Changlong Sun, Xiaozhong Liu, Wei Lu

    Abstract: The widespread use of pre-trained language models (PLMs) in natural language processing (NLP) has greatly improved performance outcomes. However, these models' vulnerability to adversarial attacks (e.g., camouflaged hints from drug dealers), particularly in the Chinese language with its rich character diversity/variation and complex structures, hatches vital apprehension. In this study, we propose…

    Submitted 18 April, 2024; originally announced April 2024.

    Comments: 12 pages, 4 figures, accepted by COLING 2024

  40. arXiv:2404.10599  [pdf, other]

    cs.NE

    Towards free-response paradigm: a theory on decision-making in spiking neural networks

    Authors: Zhichao Zhu, Yang Qi, Wenlian Lu, Zhigang Wang, Lu Cao, Jianfeng Feng

    Abstract: The energy-efficient and brain-like information processing abilities of Spiking Neural Networks (SNNs) have attracted considerable attention, establishing them as a crucial element of brain-inspired computing. One prevalent challenge encountered by SNNs is the trade-off between inference speed and accuracy, which requires sufficient time to achieve the desired level of performance. Drawing inspira…

    Submitted 16 April, 2024; originally announced April 2024.

    Comments: 27 pages, 6 figures, 3 tables

  41. arXiv:2404.07342  [pdf, ps, other]

    math.RT math.NT

    The global Gan-Gross-Prasad conjecture for Fourier-Jacobi periods on unitary groups

    Authors: Paul Boisseau, Weixiao Lu, Hang Xue

    Abstract: We prove the Gan-Gross-Prasad conjecture for Fourier-Jacobi periods on unitary groups and an Ichino-Ikeda type refinement. Our strategy is based on the comparison of relative trace formulae formulated by Liu. We develop the full coarse spectral and geometric expansions of the relative trace formulae, and compute relevant spectral terms via zeta integrals and truncated periods. We compare all geome…

    Submitted 10 April, 2024; originally announced April 2024.

  42. arXiv:2404.07108  [pdf, other]

    cs.CL cs.IR

    From Model-centered to Human-Centered: Revision Distance as a Metric for Text Evaluation in LLMs-based Applications

    Authors: Yongqiang Ma, Lizhi Qing, Jiawei Liu, Yangyang Kang, Yue Zhang, Wei Lu, Xiaozhong Liu, Qikai Cheng

    Abstract: Evaluating large language models (LLMs) is fundamental, particularly in the context of practical applications. Conventional evaluation methods, typically designed primarily for LLM development, yield numerical scores that ignore the user experience. Therefore, our study shifts the focus from model-centered to human-centered evaluation in the context of AI-powered writing assistance applications. O…

    Submitted 10 April, 2024; v1 submitted 10 April, 2024; originally announced April 2024.

    Comments: 9 pages, 2 figures, under review

  43. arXiv:2404.07097  [pdf, other]

    cs.CV

    Fast Encoder-Based 3D from Casual Videos via Point Track Processing

    Authors: Yoni Kasten, Wuyue Lu, Haggai Maron

    Abstract: This paper addresses the long-standing challenge of reconstructing 3D structures from videos with dynamic content. Current approaches to this problem were not designed to operate on casual videos recorded by standard cameras or require a long optimization time. Aiming to significantly improve the efficiency of previous approaches, we present TracksTo4D, a learning-based approach that enables inf…

    Submitted 26 June, 2024; v1 submitted 10 April, 2024; originally announced April 2024.

  44. arXiv:2404.05880  [pdf, other]

    cs.CL

    Eraser: Jailbreaking Defense in Large Language Models via Unlearning Harmful Knowledge

    Authors: Weikai Lu, Ziqian Zeng, Jianwei Wang, Zhengdong Lu, Zelin Chen, Hui** Zhuang, Cen Chen

    Abstract: Jailbreaking attacks can enable Large Language Models (LLMs) to bypass the safeguard and generate harmful content. Existing jailbreaking defense methods have failed to address the fundamental issue that harmful knowledge resides within the model, leading to potential jailbreak risks for LLMs. In this paper, we propose a novel defense method called Eraser, which mainly includes three goals: unlearn…

    Submitted 3 July, 2024; v1 submitted 8 April, 2024; originally announced April 2024.

  45. arXiv:2404.03688  [pdf, other]

    physics.ins-det hep-ex

    Beam test of a baseline vertex detector prototype for CEPC

    Authors: Shuqi Li, Tianya Wu, Xinhui Huang, Jia Zhou, Ziyue Yan, Wei Wang, Hao Zeng, Yiming Hu, Xiaoxu Zhang, Zhijun Liang, Wei Wei, Ying Zhang, Xiaomin Wei, Lei Zhang, Ming Qi, Jun Hu, **yu Fu, Hongyu Zhang, Gang Li, Linghui Wu, Mingyi Dong, Xiaoting Li, Raimon Casanova, Liang Zhang, Jianing Dong, et al. (5 additional authors not shown)

    Abstract: The Circular Electron Positron Collider (CEPC) has been proposed to enable more thorough and precise measurements of the properties of Higgs, W, and Z bosons, as well as to search for new physics. In response to the stringent performance requirements of the vertex detector for the CEPC, a baseline vertex detector prototype was tested and characterized for the first time using a 6 GeV electron beam…

    Submitted 1 April, 2024; originally announced April 2024.

  46. arXiv:2404.03608  [pdf, other]

    cs.CL cs.AI

    Sailor: Open Language Models for South-East Asia

    Authors: Longxu Dou, Qian Liu, Guangtao Zeng, Jia Guo, Jiahui Zhou, Wei Lu, Min Lin

    Abstract: We present Sailor, a family of open language models ranging from 0.5B to 7B parameters, tailored for South-East Asian (SEA) languages. These models are continually pre-trained from Qwen1.5, a great language model for multilingual use cases. From Qwen1.5, Sailor models accept 200B to 400B tokens, primarily covering the languages of English, Chinese, Vietnamese, Thai, Indonesian, Malay, and Lao. The…

    Submitted 4 April, 2024; originally announced April 2024.

    Comments: Code is available at https://github.com/sail-sg/sailor-llm

  47. arXiv:2404.02291  [pdf, other]

    cs.CR eess.SY

    Towards a New Configurable and Practical Remote Automotive Security Testing Platform

    Authors: Sekar Kulandaivel, Wenjuan Lu, Brandon Barry, Jorge Guajardo

    Abstract: In the automotive security sector, the absence of a testing platform that is configurable, practical, and user-friendly presents considerable challenges. These difficulties are compounded by the intricate design of vehicle systems, the rapid evolution of attack vectors, and the absence of standardized testing methodologies. We propose a next-generation testing platform that addresses several chall…

    Submitted 2 April, 2024; originally announced April 2024.

    Comments: 7 pages, 2 figures

  48. arXiv:2404.02018  [pdf, other]

    cs.RO cs.AI

    Large Language Models for Orchestrating Bimanual Robots

    Authors: Kun Chu, Xufeng Zhao, Cornelius Weber, Mengdi Li, Wenhao Lu, Stefan Wermter

    Abstract: Although there has been rapid progress in endowing robots with the ability to solve complex manipulation tasks, generating control policies for bimanual robots to solve tasks involving two hands is still challenging because of the difficulties in effective temporal and spatial coordination. With emergent abilities in terms of step-by-step reasoning and in-context learning, Large Language Models (L…

    Submitted 2 April, 2024; originally announced April 2024.

    Comments: The project website can be found at http://labor-agent.github.io

  49. arXiv:2404.01072  [pdf]

    cs.DL

    How biomedical papers accumulated their clinical citations: A large-scale retrospective analysis based on PubMed

    Authors: Xin Li, Xuli Tang, Wei Lu

    Abstract: This paper explored the temporal characteristics of clinical citations of biomedical papers, including how long it takes to receive its first clinical citation (the initial stage) and how long it takes to receive two or more clinical citations after its first clinical citation (the build-up stage). Over 23 million biomedical papers in PubMed between 1940 and 2013 and their clinical citations are u…

    Submitted 1 April, 2024; originally announced April 2024.

  50. arXiv:2404.00385  [pdf, other]

    cs.CV cs.AI cs.LG

    Constrained Layout Generation with Factor Graphs

    Authors: Mohammed Haroon Dupty, Yanfei Dong, Sicong Leng, Guoji Fu, Yong Liang Goh, Wei Lu, Wee Sun Lee

    Abstract: This paper addresses the challenge of object-centric layout generation under spatial constraints, seen in multiple domains including floorplan design process. The design process typically involves specifying a set of spatial constraints that include object attributes like size and inter-object relations such as relative positioning. Existing works, which typically represent objects as single nodes…

    Submitted 30 March, 2024; originally announced April 2024.

    Comments: To be published at IEEE/CVF CVPR 2024