Showing 51–100 of 1,014 results for author: Xia, C

  1. arXiv:2403.17431  [pdf, other]

    cs.CL cs.LG

    Robust and Scalable Model Editing for Large Language Models

    Authors: Yingfa Chen, Zhengyan Zhang, Xu Han, Chaojun Xiao, Zhiyuan Liu, Chen Chen, Kuai Li, Tao Yang, Maosong Sun

    Abstract: Large language models (LLMs) can make predictions using parametric knowledge--knowledge encoded in the model weights--or contextual knowledge--knowledge presented in the context. In many scenarios, a desirable behavior is that LLMs give precedence to contextual knowledge when it conflicts with the parametric knowledge, and fall back to using their parametric knowledge when the context is irrelevan…

    Submitted 26 March, 2024; originally announced March 2024.

    Comments: LREC-COLING 2024 paper, 16 pages, 4 figures

  2. arXiv:2403.17336  [pdf, other]

    cs.CR cs.CL

    Don't Listen To Me: Understanding and Exploring Jailbreak Prompts of Large Language Models

    Authors: Zhiyuan Yu, Xiaogeng Liu, Shunning Liang, Zach Cameron, Chaowei Xiao, Ning Zhang

    Abstract: Recent advancements in generative AI have enabled ubiquitous access to large language models (LLMs). Empowered by their exceptional capabilities to understand and generate human-like text, these models are being increasingly integrated into our society. At the same time, there are also concerns on the potential misuse of this powerful technology, prompting defensive measures from service providers…

    Submitted 25 March, 2024; originally announced March 2024.

    Comments: Accepted by USENIX Security 2024

  3. arXiv:2403.16586  [pdf, other]

    cond-mat.str-el cond-mat.mtrl-sci

    Intrinsic Dipole Hall effect in tMoTe$_2$ moiré: magnetoelectricity and contact-free signature of topological transitions

    Authors: Feng-Ren Fan, Cong Xiao, Wang Yao

    Abstract: We discover an intrinsic dipole Hall effect in a variety of magnetic insulating states at integer fillings of twisted MoTe$_2$ moiré superlattice, including topologically trivial and nontrivial ferro-, antiferro-, and ferri-magnetic configurations. The dipole Hall current, in linear response to in-plane electric field, generates an in-plane orbital magnetization $M_{\parallel}$ along the field, th…

    Submitted 25 March, 2024; originally announced March 2024.

    Comments: 5 pages, 4 figures

  4. arXiv:2403.12497  [pdf, other]

    astro-ph.SR physics.plasm-ph

    Formation of Polar Crown Filaments Magnetic Fields by Supergranular Helicity Injection

    Authors: Huanxin Chen, Chun Xia, Hechao Chen

    Abstract: To understand the magnetic fields of the polar crown filaments (PCFs) at high latitudes near polar regions of the Sun, we perform magnetofrictional numerical simulations on the long-term magnetic evolution of bipolar fields with roughly east-west polarity inversion lines (PILs) in a three-dimensional (3D) spherical wedge domain near polar regions. The Coriolis effect induced vortical motions at th…

    Submitted 19 March, 2024; originally announced March 2024.

    Comments: 16 pages, 12 figures

  5. arXiv:2403.10351  [pdf, other]

    cs.CL

    TriSum: Learning Summarization Ability from Large Language Models with Structured Rationale

    Authors: Pengcheng Jiang, Cao Xiao, Zifeng Wang, Parminder Bhatia, Jimeng Sun, Jiawei Han

    Abstract: The advent of large language models (LLMs) has significantly advanced natural language processing tasks like text summarization. However, their large size and computational demands, coupled with privacy concerns in data transmission, limit their use in resource-constrained and privacy-centric settings. To overcome this, we introduce TriSum, a framework for distilling LLMs' text summarization abili…

    Submitted 15 March, 2024; originally announced March 2024.

    Comments: NAACL'24

  6. arXiv:2403.09513  [pdf, other]

    cs.CR cs.AI

    AdaShield: Safeguarding Multimodal Large Language Models from Structure-based Attack via Adaptive Shield Prompting

    Authors: Yu Wang, Xiaogeng Liu, Yu Li, Muhao Chen, Chaowei Xiao

    Abstract: With the advent and widespread deployment of Multimodal Large Language Models (MLLMs), the imperative to ensure their safety has become increasingly pronounced. However, with the integration of additional modalities, MLLMs are exposed to new vulnerabilities, rendering them prone to structured-based jailbreak attacks, where semantic content (e.g., "harmful text") has been injected into the images t…

    Submitted 14 March, 2024; originally announced March 2024.

    Comments: Multimodal Large Language Models Defense, 25 Pages

  7. arXiv:2403.08361  [pdf, other]

    hep-ex hep-ph

    Search for cosmic-ray boosted sub-MeV dark matter-electron scatterings in PandaX-4T

    Authors: Xiaofeng Shang, Abdusalam Abdukerim, Zihao Bo, Wei Chen, Xun Chen, Chen Cheng, Zhaokan Cheng, Xiangyi Cui, Yingjie Fan, Deqing Fang, Lisheng Geng, Karl Giboni, Xuyuan Guo, Chencheng Han, Ke Han, Changda He, Jinrong He, Di Huang, Junting Huang, Zhou Huang, Ruquan Hou, Yu Hou, Xiangdong Ji, Yonglin Ju, Chenxiang Li, et al. (67 additional authors not shown)

    Abstract: We report the first search for the elastic scatterings between cosmic-ray boosted sub-MeV dark matter and electrons in the PandaX-4T liquid xenon experiment. Sub-MeV dark matter particles can be accelerated by scattering with electrons in the cosmic rays and produce detectable electron recoil signals in the detector. Using the commissioning data from PandaX-4T of 0.63~tonne$\cdot$year exposure, we…

    Submitted 13 March, 2024; originally announced March 2024.

    Comments: 6 pages, 3 figures

  8. arXiv:2403.08216  [pdf, other]

    cs.LG cs.CV

    PaddingFlow: Improving Normalizing Flows with Padding-Dimensional Noise

    Authors: Qinglong Meng, Chongkun Xia, Xueqian Wang

    Abstract: Normalizing flow is a generative modeling approach with efficient sampling. However, Flow-based models suffer two issues: 1) If the target distribution is manifold, due to the unmatch between the dimensions of the latent target distribution and the data distribution, flow-based models might perform badly. 2) Discrete data might make flow-based models collapse into a degenerate mixture of point mas…

    Submitted 23 April, 2024; v1 submitted 12 March, 2024; originally announced March 2024.

  9. arXiv:2403.07392  [pdf, other]

    cs.CV

    ViT-CoMer: Vision Transformer with Convolutional Multi-scale Feature Interaction for Dense Predictions

    Authors: Chunlong Xia, Xinliang Wang, Feng Lv, Xin Hao, Yifeng Shi

    Abstract: Although Vision Transformer (ViT) has achieved significant success in computer vision, it does not perform well in dense prediction tasks due to the lack of inner-patch information interaction and the limited diversity of feature scale. Most existing studies are devoted to designing vision-specific transformers to solve the above problems, which introduce additional pre-training costs. Therefore,…

    Submitted 27 March, 2024; v1 submitted 12 March, 2024; originally announced March 2024.

    Comments: CVPR2024

  10. arXiv:2403.07344  [pdf]

    cond-mat.supr-con cond-mat.mtrl-sci cond-mat.str-el

    Electronic Structure of Superconducting Infinite-Layer Lanthanum Nickelates

    Authors: Wenjie Sun, Zhicheng Jiang, Chengliang Xia, Bo Hao, Yueying Li, Shengjun Yan, Maosen Wang, Hongquan Liu, Jianyang Ding, Jiayu Liu, Zhengtai Liu, Jishan Liu, Hanghui Chen, Dawei Shen, Yuefeng Nie

    Abstract: Revealing the momentum-resolved electronic structure of infinite-layer nickelates is essential for understanding this new class of unconventional superconductors, but has been hindered by the formidable challenges in improving the sample quality. In this work, we report for the first time the angle-resolved photoemission spectroscopy of superconducting La$_{0.8}$Sr$_{0.2}$NiO$_{2}$ films prepared…

    Submitted 12 March, 2024; originally announced March 2024.

    Comments: 29 pages, 13 figures

  11. arXiv:2403.06974  [pdf, other]

    cs.CV

    Memory-based Adapters for Online 3D Scene Perception

    Authors: Xiuwei Xu, Chong Xia, Ziwei Wang, Linqing Zhao, Yueqi Duan, Jie Zhou, Jiwen Lu

    Abstract: In this paper, we propose a new framework for online 3D scene perception. Conventional 3D scene perception methods are offline, i.e., take an already reconstructed 3D scene geometry as input, which is not applicable in robotic applications where the input data is streaming RGB-D videos rather than a complete 3D scene reconstructed from pre-collected RGB-D videos. To deal with online 3D scene perce…

    Submitted 11 March, 2024; originally announced March 2024.

    Comments: Accepted to CVPR24. Link: https://xuxw98.github.io/Online3D/

  12. Investigating the Proton Structure: The FAMU experiment

    Authors: A. Vacchi, A. Adamczak, D. Bakalov, G. Baldazzi, M. Baruzzo, R. Benocci, R. Bertoni, M. Bonesini, H. Cabrera, S. Carsi, D. Cirrincione, F. Chignoli, M. Clemenza, L. Colace, M. Danailov, P. Danev, A. de Bari, C. De Vecchi, M. De Vincenzi, E. Fasci, K. S. Gadedjisso-Tossou, L. Gianfrani, A. D. Hillier, K. Ishida, P. J. C. King, et al. (24 additional authors not shown)

    Abstract: The article gives the motivations for the measurement of the hyperfine splitting (hfs) in the ground state of muonic hydrogen to explore the properties of the proton at low momentum transfer. It summarizes these proposed measurement methods and finally describes the FAMU experiment in more detail.

    Submitted 8 March, 2024; originally announced March 2024.

    Journal ref: Nuclear Physics News 33:4, 9-16, 2023

  13. arXiv:2403.04957  [pdf, other]

    cs.AI

    Automatic and Universal Prompt Injection Attacks against Large Language Models

    Authors: Xiaogeng Liu, Zhiyuan Yu, Yizhe Zhang, Ning Zhang, Chaowei Xiao

    Abstract: Large Language Models (LLMs) excel in processing and generating human language, powered by their ability to interpret and follow instructions. However, their capabilities can be exploited through prompt injection attacks. These attacks manipulate LLM-integrated applications into producing responses aligned with the attacker's injected content, deviating from the user's actual requests. The substan…

    Submitted 7 March, 2024; originally announced March 2024.

    Comments: Pre-print, code is available at https://github.com/SheltonLiu-N/Universal-Prompt-Injection

  14. arXiv:2403.04213  [pdf, ps, other]

    math.RT

    Representations of non-finitely graded Lie algebras related to Virasoro algebra

    Authors: Chunguang Xia, Tianyu Ma, Xiao Dong, Mingjing Zhang

    Abstract: In this paper, we study representations of non-finitely graded Lie algebras $\mathcal{W}(ε)$ related to Virasoro algebra, where $ε= \pm 1$. Precisely speaking, we completely classify the free $\mathcal{U}(\mathfrak h)$-modules of rank one over $\mathcal{W}(ε)$, and find that these module structures are rather different from those of other graded Lie algebras. We also determine the simplicity and is…

    Submitted 3 June, 2024; v1 submitted 6 March, 2024; originally announced March 2024.

    Comments: 18 pages

  15. arXiv:2403.04192  [pdf, other]

    cond-mat.mes-hall cond-mat.str-el

    Orbital Magneto-Nonlinear Anomalous Hall Effect in Kagome Magnet Fe$_3$Sn$_2$

    Authors: Lujunyu Wang, Jiaojiao Zhu, Haiyun Chen, Hui Wang, Jinjin Liu, Yue-Xin Huang, Bingyan Jiang, Jiaji Zhao, Hengjie Shi, Guang Tian, Haoyu Wang, Yugui Yao, Dapeng Yu, Zhiwei Wang, Cong Xiao, Shengyuan A. Yang, Xiaosong Wu

    Abstract: It has been theoretically predicted that perturbation of the Berry curvature by electromagnetic fields gives rise to intrinsic nonlinear anomalous Hall effects that are independent of scattering. Two types of nonlinear anomalous Hall effects are expected. The electric nonlinear Hall effect has recently begun to receive attention, while very few studies are concerned with the magneto-nonlinear Hall…

    Submitted 6 March, 2024; originally announced March 2024.

    Comments: 18 pages, 4 figures, featured in Physics: Viewpoint and Editors' suggestions

    Journal ref: Phys. Rev. Lett. 132, 106601 (2024)

  16. arXiv:2403.01778  [pdf, other]

    math.NA math.OC

    HOSCF: Efficient decoupling algorithms for finding the best rank-one approximation of higher-order tensors

    Authors: Chuanfu Xiao, Zeyu Li, Chao Yang

    Abstract: Best rank-one approximation is one of the most fundamental tasks in tensor computation. In order to fully exploit modern multi-core parallel computers, it is necessary to develop decoupling algorithms for computing the best rank-one approximation of higher-order tensors at large scales. In this paper, we first build a bridge between the rank-one approximation of tensors and the eigenvector-depende…

    Submitted 4 March, 2024; originally announced March 2024.

    MSC Class: 15A18; 15A69; 15A72; 65F15; 68W10

  17. arXiv:2402.18667  [pdf, other]

    cs.CL

    FOFO: A Benchmark to Evaluate LLMs' Format-Following Capability

    Authors: Congying Xia, Chen Xing, Jiangshu Du, Xinyi Yang, Yihao Feng, Ran Xu, Wenpeng Yin, Caiming Xiong

    Abstract: This paper presents FoFo, a pioneering benchmark for evaluating large language models' (LLMs) ability to follow complex, domain-specific formats, a crucial yet underexamined capability for their application as AI agents. Despite LLMs' advancements, existing benchmarks fail to assess their format-following proficiency adequately. FoFo fills this gap with a diverse range of real-world formats and in…

    Submitted 28 February, 2024; originally announced February 2024.

    Comments: The first two authors contributed equally

  18. arXiv:2402.18649  [pdf, other]

    cs.CR cs.AI

    A New Era in LLM Security: Exploring Security Concerns in Real-World LLM-based Systems

    Authors: Fangzhou Wu, Ning Zhang, Somesh Jha, Patrick McDaniel, Chaowei Xiao

    Abstract: Large Language Model (LLM) systems are inherently compositional, with individual LLM serving as the core foundation with additional layers of objects such as plugins, sandbox, and so on. Along with the great potential, there are also increasing concerns over the security of such probabilistic intelligent systems. However, existing studies on LLM security often focus on individual LLM, but without…

    Submitted 28 February, 2024; originally announced February 2024.

  19. arXiv:2402.17980  [pdf, other]

    hep-th cond-mat.quant-gas

    Emergence of Large-Scale Structures in Holographic Superfluid Turbulence

    Authors: Wei-Can Yang, Chuan-Yin Xia, Yu Tian, Makoto Tsubota, Hua-Bi Zeng

    Abstract: In two-dimensional turbulence systems, the emergence of large-scale structures holds profound physical implications, particularly as it indicates the occurrence of inverse energy cascades, thereby garnering significant attention. In this paper, we report a novel vortex clusters formation in the background of near-extreme Reissner-Nordström black hole holographic model. At temperatures nea…

    Submitted 27 February, 2024; originally announced February 2024.

    Comments: 8 pages, 6 figures

  20. arXiv:2402.17624  [pdf, other]

    cs.CV cs.GR

    CustomSketching: Sketch Concept Extraction for Sketch-based Image Synthesis and Editing

    Authors: Chufeng Xiao, Hongbo Fu

    Abstract: Personalization techniques for large text-to-image (T2I) models allow users to incorporate new concepts from reference images. However, existing methods primarily rely on textual descriptions, leading to limited control over customized images and failing to support fine-grained and local editing (e.g., shape, pose, and details). In this paper, we identify sketches as an intuitive and versatile rep…

    Submitted 27 February, 2024; originally announced February 2024.

  21. arXiv:2402.17584  [pdf, other]

    hep-ph

    Triangle singularity in the $J/ψ\to φπ^+ a_0^-(π^- η),\; φπ^- a_0^+(π^+ η)$ decays

    Authors: C. W. Xiao, J. M. Dias, L. R. Dai, W. H. Liang, E. Oset

    Abstract: We study the $J/ψ\to φπ^+ a_0(980)^- (a_0^- \to π^- η)$ decay, evaluating the double mass distribution in terms of the $π^- η$ and $π^+ a^-_0$ invariant masses. We show that the $π^- η$ mass distribution exhibits the typical cusp structure of the $a_0(980)$ seen in recent high statistics experiments, and the $π^+ a^-_0$ spectrum shows clearly a peak around…

    Submitted 24 April, 2024; v1 submitted 27 February, 2024; originally announced February 2024.

    Comments: 8 pages, 3 figures; V2: discussion added, references added, version to appear in Phys. Rev. D

  22. arXiv:2402.17166  [pdf, other]

    cond-mat.mes-hall

    Layer Coherence Origin of Intrinsic Planar Hall Effect in 2D Limit

    Authors: Huiyuan Zheng, Dawei Zhai, Cong Xiao, Wang Yao

    Abstract: The intrinsic planar Hall effect has attracted intensive interest inspired by recent experiments. Existing theories of this effect require three dimensional orbital motion, or strong spin-orbit coupling of certain forms, which do not exist in van der Waals thin films. Here, we uncover a new origin of the planar Hall effect - as an intrinsic property of layer coherent electrons - that allows its pr…

    Submitted 12 March, 2024; v1 submitted 26 February, 2024; originally announced February 2024.

    Comments: 6 pages, 5 figures

  23. arXiv:2402.16965  [pdf, other]

    cs.CR cs.AI

    WIPI: A New Web Threat for LLM-Driven Web Agents

    Authors: Fangzhou Wu, Shutong Wu, Yulong Cao, Chaowei Xiao

    Abstract: With the fast development of large language models (LLMs), LLM-driven Web Agents (Web Agents for short) have obtained tons of attention due to their superior capability where LLMs serve as the core part of making decisions like the human brain equipped with multiple web tools to actively interact with external deployed websites. As uncountable Web Agents have been released and such LLM systems are…

    Submitted 26 February, 2024; originally announced February 2024.

  24. arXiv:2402.16679  [pdf, other]

    astro-ph.SR

    Unveiling the Initiation Route of Coronal Mass Ejections through their Slow Rise Phase

    Authors: Chen Xing, Guillaume Aulanier, Xin Cheng, Chun Xia, Mingde Ding

    Abstract: Understanding the early evolution of coronal mass ejections (CMEs), in particular their initiation, is the key to forecasting solar eruptions and induced disastrous space weather. Although many initiation mechanisms have been proposed, a full understanding of CME initiation, which is identified as a slow rise of CME progenitors in kinematics before the impulsive acceleration, remains elusive. Here…

    Submitted 26 February, 2024; originally announced February 2024.

    Comments: 35 pages, 15 figures, accepted for publication in ApJ

  25. arXiv:2402.14968  [pdf, other]

    cs.CR cs.CL

    Mitigating Fine-tuning based Jailbreak Attack with Backdoor Enhanced Safety Alignment

    Authors: Jiongxiao Wang, Jiazhao Li, Yiquan Li, Xiangyu Qi, Junjie Hu, Yixuan Li, Patrick McDaniel, Muhao Chen, Bo Li, Chaowei Xiao

    Abstract: Despite the general capabilities of Large Language Models (LLM), these models still request fine-tuning or adaptation with customized data when meeting specific business demands. However, this process inevitably introduces new threats, particularly against the Fine-tuning based Jailbreak Attack (FJAttack) under the setting of Language-Model-as-a-Service (LMaaS), where the model's safety has been s…

    Submitted 20 June, 2024; v1 submitted 22 February, 2024; originally announced February 2024.

  26. arXiv:2402.14744  [pdf, other]

    cs.AI cs.CL cs.CY cs.LG

    Large Language Models as Urban Residents: An LLM Agent Framework for Personal Mobility Generation

    Authors: Jiawei Wang, Renhe Jiang, Chuang Yang, Zengqing Wu, Makoto Onizuka, Ryosuke Shibasaki, Noboru Koshizuka, Chuan Xiao

    Abstract: This paper introduces a novel approach using Large Language Models (LLMs) integrated into an agent framework for flexible and effective personal mobility generation. LLMs overcome the limitations of previous models by effectively processing semantic data and offering versatility in modeling various tasks. Our approach addresses three research questions: aligning LLMs with real-world urban mobility…

    Submitted 23 May, 2024; v1 submitted 22 February, 2024; originally announced February 2024.

    Comments: Source codes are available at https://github.com/Wangjw6/LLMob/

  27. arXiv:2402.14167  [pdf, other]

    cs.CV cs.LG

    T-Stitch: Accelerating Sampling in Pre-Trained Diffusion Models with Trajectory Stitching

    Authors: Zizheng Pan, Bohan Zhuang, De-An Huang, Weili Nie, Zhiding Yu, Chaowei Xiao, Jianfei Cai, Anima Anandkumar

    Abstract: Sampling from diffusion probabilistic models (DPMs) is often expensive for high-quality image generation and typically requires many steps with a large model. In this paper, we introduce sampling Trajectory Stitching T-Stitch, a simple yet efficient technique to improve the sampling efficiency with little or no generation degradation. Instead of solely using a large DPM for the entire sampling tra…

    Submitted 21 February, 2024; originally announced February 2024.

  28. arXiv:2402.13720  [pdf, other]

    cs.CL

    Ouroboros: Generating Longer Drafts Phrase by Phrase for Faster Speculative Decoding

    Authors: Weilin Zhao, Yuxiang Huang, Xu Han, Wang Xu, Chaojun Xiao, Xinrong Zhang, Yewei Fang, Kaihuo Zhang, Zhiyuan Liu, Maosong Sun

    Abstract: Speculative decoding is a widely used method that accelerates the generation process of large language models (LLMs) with no compromise in model performance. It achieves this goal by using an existing smaller model for drafting and then employing the target LLM to verify the draft in a low-cost parallel manner. Under such a drafting-verification framework, drafting efficiency has become a bottlene…

    Submitted 26 June, 2024; v1 submitted 21 February, 2024; originally announced February 2024.

  29. arXiv:2402.12327  [pdf, other]

    cs.AI cs.CL cs.CY cs.MA econ.GN

    Shall We Team Up: Exploring Spontaneous Cooperation of Competing LLM Agents

    Authors: Zengqing Wu, Run Peng, Shuyuan Zheng, Qianying Liu, Xu Han, Brian Inhyuk Kwon, Makoto Onizuka, Shaojie Tang, Chuan Xiao

    Abstract: Large Language Models (LLMs) have increasingly been utilized in social simulations, where they are often guided by carefully crafted instructions to stably exhibit human-like behaviors during simulations. Nevertheless, we doubt the necessity of shaping agents' behaviors for accurate social simulations. Instead, this paper emphasizes the importance of spontaneous phenomena, wherein agents deeply en…

    Submitted 2 July, 2024; v1 submitted 19 February, 2024; originally announced February 2024.

    Comments: Source codes available at https://github.com/wuzengqing001225/SABM_ShallWeTeamUp

  30. arXiv:2402.11354  [pdf, other]

    cs.LG cs.AI cs.CV cs.DB cs.DS

    Probabilistic Routing for Graph-Based Approximate Nearest Neighbor Search

    Authors: Kejing Lu, Chuan Xiao, Yoshiharu Ishikawa

    Abstract: Approximate nearest neighbor search (ANNS) in high-dimensional spaces is a pivotal challenge in the field of machine learning. In recent years, graph-based methods have emerged as the superior approach to ANNS, establishing a new state of the art. Although various optimizations for graph-based ANNS have been introduced, they predominantly rely on heuristic methods that lack formal theoretical back…

    Submitted 17 February, 2024; originally announced February 2024.

    Comments: Source code will be released at GitHub soon

  31. arXiv:2402.10196  [pdf, other]

    cs.CL cs.AI

    A Trembling House of Cards? Mapping Adversarial Attacks against Language Agents

    Authors: Lingbo Mo, Zeyi Liao, Boyuan Zheng, Yu Su, Chaowei Xiao, Huan Sun

    Abstract: Language agents powered by large language models (LLMs) have seen exploding development. Their capability of using language as a vehicle for thought and communication lends an incredible level of flexibility and versatility. People have quickly capitalized on this capability to connect LLMs to a wide range of external components and environments: databases, tools, the Internet, robotic embodiment,…

    Submitted 15 February, 2024; originally announced February 2024.

  32. arXiv:2402.08183  [pdf, other]

    cs.CL cs.CV

    Pixel Sentence Representation Learning

    Authors: Chenghao Xiao, Zhuoxu Huang, Danlu Chen, G Thomas Hudson, Yizhi Li, Haoran Duan, Chenghua Lin, Jie Fu, Jungong Han, Noura Al Moubayed

    Abstract: Pretrained language models are long known to be subpar in capturing sentence and document-level semantics. Though heavily investigated, transferring perturbation-based methods from unsupervised visual representation learning to NLP remains an unsolved problem. This is largely due to the discreteness of subword units brought by tokenization of language models, limiting small perturbations of inputs…

    Submitted 12 February, 2024; originally announced February 2024.

  33. arXiv:2402.07756  [pdf, other]

    cond-mat.mes-hall cond-mat.mtrl-sci

    Extrinsic Contribution to Nonlinear Current Induced Spin Polarization

    Authors: Ruda Guo, Yue-Xin Huang, Xiaoxin Yang, Yi Liu, Cong Xiao, Zhe Yuan

    Abstract: Nonlinear spin polarization occurring in the second order of driving electric current is the dominant source of nonequilibrium magnetization in centrosymmetric or weakly noncentrosymmetric nonmagnetic materials, and induces nonlinear spin-orbit torque in magnets. Up to now, only the intrinsic mechanism based on anomalous spin polarizability dipole, which is the spin counterpart of Berry curvature…

    Submitted 7 March, 2024; v1 submitted 12 February, 2024; originally announced February 2024.

  34. arXiv:2402.07587  [pdf, other]

    hep-ph

    Magnetized strangelets with anomalous magnetic moment and Coulomb interactions

    Authors: Huai-Min Chen, Xiao-Wei Li, Cheng-Jun Xia, Jing-Tao Wang, Guang-Xiong Peng

    Abstract: We study the magnetized strangelets in the baryon density-dependent quark mass model, including the effects of both confinement and lead-order perturbation interactions. The properties of magnetized strangelets are investigated under the field strength 2*10^17 G, where the anisotropy caused by the strong magnetic field is insignificant can be treated approximately as an isotropic system. The c…

    Submitted 12 February, 2024; originally announced February 2024.

  35. arXiv:2402.04617  [pdf, other]

    cs.CL cs.AI cs.LG

    InfLLM: Training-Free Long-Context Extrapolation for LLMs with an Efficient Context Memory

    Authors: Chaojun Xiao, Pengle Zhang, Xu Han, Guangxuan Xiao, Yankai Lin, Zhengyan Zhang, Zhiyuan Liu, Maosong Sun

    Abstract: Large language models (LLMs) have emerged as a cornerstone in real-world applications with lengthy streaming inputs (e.g., LLM-driven agents). However, existing LLMs, pre-trained on sequences with a restricted maximum length, cannot process longer sequences due to the out-of-domain and distraction issues. Common solutions often involve continual pre-training on longer sequences, which will introdu…

    Submitted 28 May, 2024; v1 submitted 7 February, 2024; originally announced February 2024.

  36. arXiv:2402.03804  [pdf, other]

    cs.LG cs.AI

    ReLU$^2$ Wins: Discovering Efficient Activation Functions for Sparse LLMs

    Authors: Zhengyan Zhang, Yixin Song, Guanghui Yu, Xu Han, Yankai Lin, Chaojun Xiao, Chenyang Song, Zhiyuan Liu, Zeyu Mi, Maosong Sun

    Abstract: Sparse computation offers a compelling solution for the inference of Large Language Models (LLMs) in low-resource scenarios by dynamically skipping the computation of inactive neurons. While traditional approaches focus on ReLU-based LLMs, leveraging zeros in activation values, we broaden the scope of sparse LLMs beyond zero activation values. We introduce a general method that defines neuron acti…

    Submitted 6 February, 2024; originally announced February 2024.

  37. arXiv:2402.02539  [pdf, other]

    hep-ph hep-ex

    $a_0(1710)$-$f_0(1710)$ mixing effect in the $D_{s}^{+} \rightarrow K_S^{0} K_S^{0} π^{+}$ decay

    Authors: Yu-Wen Peng, Wei Liang, Xiaonu Xiong, Chu-Wen Xiao

    Abstract: With the measurements of the decay $D^+_s \rightarrow K^0_S K^0_S π^+$ by the BESIII Collaboration, we investigate this three-body weak decay via the chiral unitary approach for the final state interaction, where the resonances $S(980)$ and $S(1710)$ are dynamically reproduced with the interaction of eleven coupled channels, and the $W$-external and -internal emission mechanisms are considered at…

    Submitted 8 February, 2024; v1 submitted 4 February, 2024; originally announced February 2024.

    Comments: 17 pages, 7 figures, 2 tables

  38. arXiv:2402.01920  [pdf, other]

    cs.LG cs.AI cs.CL

    Preference Poisoning Attacks on Reward Model Learning

    Authors: Junlin Wu, Jiongxiao Wang, Chaowei Xiao, Chenguang Wang, Ning Zhang, Yevgeniy Vorobeychik

    Abstract: Learning utility, or reward, models from pairwise comparisons is a fundamental component in a number of application domains. These approaches inherently entail collecting preference information from people, with feedback often provided anonymously. Since preferences are subjective, there is no gold standard to compare against; yet, reliance of high-impact systems on preference learning creates a s…

    Submitted 2 February, 2024; originally announced February 2024.

  39. arXiv:2402.01077  [pdf, ps, other]

    cs.LG cs.AI

    Recent Advances in Predictive Modeling with Electronic Health Records

    Authors: Jiaqi Wang, Junyu Luo, Muchao Ye, Xiaochen Wang, Yuan Zhong, Aofei Chang, Guanjie Huang, Ziyi Yin, Cao Xiao, Jimeng Sun, Fenglong Ma

    Abstract: The development of electronic health records (EHR) systems has enabled the collection of a vast amount of digitized patient data. However, utilizing EHR data for predictive modeling presents several challenges due to its unique characteristics. With the advancements in machine learning techniques, deep learning has demonstrated its superiority in various applications, including healthcare. This su…

    Submitted 1 February, 2024; originally announced February 2024.

  40. arXiv:2402.00532  [pdf, other]

    cond-mat.mes-hall

    Quantum Metric Nonlinear Spin-Orbit Torque Enhanced by Topological Bands

    Authors: Xukun Feng, Weikang Wu, Hui Wang, Weibo Gao, Lay Kee Ang, Y. X. Zhao, Cong Xiao, Shengyuan A. Yang

    Abstract: Effects manifesting quantum geometry have been a focus of physics research. Here, we reveal that quantum metric plays a crucial role in nonlinear electric spin response, leading to a quantum metric spin-orbit torque. We argue that enhanced quantum metric can occur at band (anti)crossings, so the nonlinear torque could be amplified in topological metals with nodal features close to Fermi level. By…

    Submitted 1 February, 2024; originally announced February 2024.

  41. arXiv:2401.17634  [pdf, ps, other]

    math.AP

    The global well-posedness and Newtonian limit for the relativistic Boltzmann equation in a periodic box

    Authors: Chuqi Cao, Jing Ouyang, Yong Wang, Changguo Xiao

    Abstract: In this paper, we study the Newtonian limit for relativistic Boltzmann equation in a periodic box $\mathbb{T}^3$. We first establish the global-in-time mild solutions of relativistic Boltzmann equation with uniform-in-$\mathfrak{c}$ estimates and time decay rate. Then we rigorously justify the global-in-time Newtonian limits from the relativistic Boltzmann solutions to the solution of Newtonian Bo…

    Submitted 31 January, 2024; originally announced January 2024.

    Comments: 56 pages, All comments are welcome

  42. arXiv:2401.15770  [pdf, other]

    cs.CL

    PILOT: Legal Case Outcome Prediction with Case Law

    Authors: Lang Cao, Zifeng Wang, Cao Xiao, Jimeng Sun

    Abstract: Machine learning shows promise in predicting the outcome of legal cases, but most research has concentrated on civil law cases rather than case law systems. We identified two unique challenges in making legal case outcome predictions with case law. First, it is crucial to identify relevant precedent cases that serve as fundamental evidence for judges during decision-making. Second, it is necessary…

    Submitted 12 April, 2024; v1 submitted 28 January, 2024; originally announced January 2024.

  43. arXiv:2401.14302  [pdf, ps, other]

    hep-ph

    Correlation function and the inverse problem in the $BD$ interaction

    Authors: Hai-Peng Li, Jing-Yu Yi, Chu-Wen Xiao, De-Liang Yao, Wei-Hong Liang, Eulogio Oset

    Abstract: We study the correlation functions of the $B^0 D^+, B^+ D^0$ system, which develops a bound state of approximately $40$ MeV, using inputs consistent with the $T_{cc}(3875)$ state. Then we address the inverse problem starting from these correlation functions to determine the scattering observables related to the system, including the existence of the bound state and its molecular nature. The import…

    Submitted 28 March, 2024; v1 submitted 25 January, 2024; originally announced January 2024.

    Comments: 16 pages, 3 figures, 7 tables; V2: version to be published in Chinese Physics C

  44. arXiv:2401.13958  [pdf]

    cond-mat.supr-con

    Unveiling a Novel Metal-to-Metal Transition in LuH2: Critically Challenging Superconductivity Claims in Lutetium Hydrides

    Authors: Dong Wang, Ningning Wang, Caoshun Zhang, Chunsheng Xia, Weicheng Guo, Xia Yin, Kejun Bu, Takeshi Nakagawa, Jianbo Zhang, Federico Gorelli, Philip Dalladay-Simpson, Thomas Meier, Xujie Lü, Liling Sun, Jinguang Cheng, Qiaoshi Zeng, Yang Ding, Ho-kwang Mao

    Abstract: Following the recent report by Dasenbrock-Gammon et al. (2023) of near-ambient superconductivity in nitrogen-doped lutetium trihydride (LuH3-δNε), significant debate has emerged surrounding the composition and interpretation of the observed sharp resistance drop. Here, we meticulously revisit these claims through comprehensive characterization and investigations. We definitively identify the repor…

    Submitted 28 January, 2024; v1 submitted 25 January, 2024; originally announced January 2024.

    Journal ref: Matter Radiat. Extremes 9, 037401 (2024)

  45. arXiv:2401.13478  [pdf, other]

    cs.IR cs.CL cs.CV cs.MM

    SciMMIR: Benchmarking Scientific Multi-modal Information Retrieval

    Authors: Siwei Wu, Yizhi Li, Kang Zhu, Ge Zhang, Yiming Liang, Kaijing Ma, Chenghao Xiao, Haoran Zhang, Bohao Yang, Wenhu Chen, Wenhao Huang, Noura Al Moubayed, Jie Fu, Chenghua Lin

    Abstract: Multi-modal information retrieval (MMIR) is a rapidly evolving field, where significant progress, particularly in image-text pairing, has been made through advanced representation learning and cross-modality alignment research. However, current benchmarks for evaluating MMIR performance in image-text pairing within the scientific domain show a notable gap, where chart and table images described in…

    Submitted 11 June, 2024; v1 submitted 24 January, 2024; originally announced January 2024.

    Comments: camera-ready version for ACL 2024 Findings

  46. arXiv:2401.13278  [pdf, other]

    cond-mat.mes-hall cond-mat.mtrl-sci

    Dynamical Chiral Nernst Effect in Twisted Van der Waals Few Layers

    Authors: Juncheng Li, Dawei Zhai, Cong Xiao, Wang Yao

    Abstract: The Nernst effect is a fundamental thermoelectric conversion phenomenon that was deemed to be possible only in systems with magnetic field or magnetization. In this work, we propose a novel dynamical chiral Nernst effect that can appear in two-dimensional van der Waals materials with chiral structural symmetry in the absence of any magnetic degree of freedom. This unconventional effect is triggere…

    Submitted 24 January, 2024; originally announced January 2024.

    Journal ref: Quantum Front 3, 11 (2024)

  47. arXiv:2401.12393  [pdf, other]

    cs.DB cs.AI

    A Learning-based Declarative Privacy-Preserving Framework for Federated Data Management

    Authors: Hong Guan, Summer Gautier, Deepti Gupta, Rajan Hari Ambrish, Yancheng Wang, Harsha Lakamsani, Dhanush Giriyan, Saajan Maslanka, Chaowei Xiao, Yingzhen Yang, Jia Zou

    Abstract: It is challenging to balance the privacy and accuracy for federated query processing over multiple private data silos. In this work, we will demonstrate an end-to-end workflow for automating an emerging privacy-preserving technique that uses a deep learning model trained using the Differentially-Private Stochastic Gradient Descent (DP-SGD) algorithm to replace portions of actual data to answer a q…

    Submitted 22 January, 2024; originally announced January 2024.

  48. arXiv:2401.12255  [pdf, other]

    cs.CR cs.AI cs.CL cs.LG

    Instructional Fingerprinting of Large Language Models

    Authors: Jiashu Xu, Fei Wang, Mingyu Derek Ma, Pang Wei Koh, Chaowei Xiao, Muhao Chen

    Abstract: The exorbitant cost of training Large language models (LLMs) from scratch makes it essential to fingerprint the models to protect intellectual property via ownership authentication and to ensure downstream users and developers comply with their license terms (e.g. restricting commercial use). In this study, we present a pilot study on LLM fingerprinting as a form of very lightweight instruction tu…

    Submitted 3 April, 2024; v1 submitted 21 January, 2024; originally announced January 2024.

    Comments: Accepted at NAACL 2024; 30 pages

  49. arXiv:2401.09819  [pdf, other]

    cs.RO cs.AI cs.LG

    PPNet: A Two-Stage Neural Network for End-to-end Path Planning

    Authors: Qinglong Meng, Chongkun Xia, Xueqian Wang, Songping Mai, Bin Liang

    Abstract: The classical path planners, such as sampling-based path planners, can provide probabilistic completeness guarantees in the sense that the probability that the planner fails to return a solution if one exists, decays to zero as the number of samples approaches infinity. However, finding a near-optimal feasible solution in a given period is challenging in many applications such as the autonomous ve…

    Submitted 23 April, 2024; v1 submitted 18 January, 2024; originally announced January 2024.

  50. arXiv:2401.09189  [pdf, other]

    cond-mat.quant-gas cond-mat.str-el gr-qc hep-th

    Interface Dynamics of Strongly interacting Binary Superfluids

    Authors: Yu-Ping An, Li Li, Chuan-Yin Xia, Hua-Bi Zeng

    Abstract: Understanding the interface dynamics in non-equilibrium quantum systems remains a challenge. We study the interface dynamics of strongly coupled immiscible binary superfluids by using holographic duality. The full nonlinear evolution of the binary superfluids with a relative velocity shows rich nonlinear patterns toward quantum turbulence, which is reminiscent of the quantum Kelvin-Helmholtz insta…

    Submitted 29 May, 2024; v1 submitted 17 January, 2024; originally announced January 2024.

    Comments: Accepted version, 10 pages, 5 figures