Skip to main content

Showing 1–50 of 2,967 results for author: Sun, L

.
  1. arXiv:2407.01505  [pdf, other

    cs.CL cs.AI

    Self-Cognition in Large Language Models: An Exploratory Study

    Authors: Dong** Chen, Jiawen Shi, Yao Wan, Pan Zhou, Neil Zhenqiang Gong, Lichao Sun

    Abstract: While Large Language Models (LLMs) have achieved remarkable success across various applications, they also raise concerns regarding self-cognition. In this paper, we perform a pioneering study to explore self-cognition in LLMs. Specifically, we first construct a pool of self-cognition instruction prompts to evaluate where an LLM exhibits self-cognition and four well-designed principles to quantify… ▽ More

    Submitted 1 July, 2024; originally announced July 2024.

    Comments: Accepted at ICML 2024 Large Language Models and Cognition Workshop

  2. arXiv:2407.01004  [pdf, other

    cs.LG stat.ME

    CURLS: Causal Rule Learning for Subgroups with Significant Treatment Effect

    Authors: Jiehui Zhou, Linxiao Yang, Xingyu Liu, Xinyue Gu, Liang Sun, Wei Chen

    Abstract: In causal inference, estimating heterogeneous treatment effects (HTE) is critical for identifying how different subgroups respond to interventions, with broad applications in fields such as precision medicine and personalized advertising. Although HTE estimation methods aim to improve accuracy, how to provide explicit subgroup descriptions remains unclear, hindering data interpretation and strateg… ▽ More

    Submitted 1 July, 2024; originally announced July 2024.

    Comments: 12 pages, 3 figures

  3. arXiv:2407.00285  [pdf, other

    physics.atom-ph hep-ex nucl-ex

    Imaging of single barium atoms in a second matrix site in solid xenon for barium tagging in a $^{136}$Xe double beta decay experiment

    Authors: M. Yvaine, D. Fairbank, J. Soderstrom, C. Taylor, J. Stanley, T. Walton, C. Chambers, A. Iverson, W. Fairbank, S. Al Kharusi, A. Amy, E. Angelico, A. Anker, I. J. Arnquist, A. Atencio, J. Bane, V. Belov, E. P. Bernard, T. Bhatta, A. Bolotnikov, J. Breslin, P. A. Breur, J. P. Brodsky, E. Brown, T. Brunner , et al. (112 additional authors not shown)

    Abstract: Neutrinoless double beta decay is one of the most sensitive probes for new physics beyond the Standard Model of particle physics. One of the isotopes under investigation is $^{136}$Xe, which would double beta decay into $^{136}$Ba. Detecting the single $^{136}$Ba daughter provides a sort of ultimate tool in the discrimination against backgrounds. Previous work demonstrated the ability to perform s… ▽ More

    Submitted 28 June, 2024; originally announced July 2024.

    Comments: 9 pages, 8 figures

  4. arXiv:2407.00136  [pdf, other

    hep-ex

    Observation of the Electromagnetic Dalitz Transition $h_c \rightarrow e^+e^-η_c$

    Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, S. Ahmed, M. Albrecht, R. Aliberti, A. Amoroso, M. R. An, Q. An, X. H. Bai, Y. Bai, O. Bakina, R. Baldini Ferroli, I. Balossino, Y. Ban, K. Begzsuren, N. Berger, M. Bertani, D. Bettoni, F. Bianchi, J. Bloms, A. Bortone, I. Boyko, R. A. Briere , et al. (495 additional authors not shown)

    Abstract: Using $(27.12\pm 0.14)\times10^8$ $ψ(3686)$ decays and data samples of $e^+e^-$ collisions with $\sqrt{s}$ from 4.130 to 4.780~GeV collected with the BESIII detector, we report the first observation of the electromagnetic Dalitz transition $h_c\to e^+e^-η_c$ with a statistical significance of $5.4σ$. We measure the ratio of the branching fractions… ▽ More

    Submitted 28 June, 2024; originally announced July 2024.

  5. arXiv:2406.19845  [pdf, other

    cs.CR

    Virtual Context: Enhancing Jailbreak Attacks with Special Token Injection

    Authors: Yuqi Zhou, Lin Lu, Hanchi Sun, Pan Zhou, Lichao Sun

    Abstract: Jailbreak attacks on large language models (LLMs) involve inducing these models to generate harmful content that violates ethics or laws, posing a significant threat to LLM security. Current jailbreak attacks face two main challenges: low success rates due to defensive measures and high resource requirements for crafting specific prompts. This paper introduces Virtual Context, which leverages spec… ▽ More

    Submitted 28 June, 2024; originally announced June 2024.

    Comments: 14 pages, 4 figures

  6. arXiv:2406.19190  [pdf, ps, other

    hep-ex

    Improved measurement of the semileptonic decay $D^+_{s}\to K^0 e^+ν_e$

    Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, O. Afedulidis, X. C. Ai, R. Aliberti, A. Amoroso, Q. An, Y. Bai, O. Bakina, I. Balossino, Y. Ban, H. -R. Bao, V. Batozskaya, K. Begzsuren, N. Berger, M. Berlowski, M. Bertani, D. Bettoni, F. Bianchi, E. Bianco, A. Bortone, I. Boyko, R. A. Briere , et al. (643 additional authors not shown)

    Abstract: Analyzing $e^+e^-$ collision data corresponding to an integrated luminosity of $7.33~\mathrm{fb}^{-1}$ collected at center-of-mass energies between 4.128 and 4.226~GeV with the BESIII detector, we measure the branching fraction of the semileptonic decay $D^+_{s}\to K^0 e^+ν_e$ to be $(2.98\pm0.23\pm0.12)\times10^{-3}$. The $D_s^+\to K^0$ hadronic form factor is determined from the differential dec… ▽ More

    Submitted 27 June, 2024; originally announced June 2024.

    Comments: 13 pages, 6 figures

  7. arXiv:2406.19043  [pdf

    eess.IV cs.AI cs.CV cs.DB

    CMRxRecon2024: A Multi-Modality, Multi-View K-Space Dataset Boosting Universal Machine Learning for Accelerated Cardiac MRI

    Authors: Zi Wang, Fanwen Wang, Chen Qin, Jun Lyu, Ouyang Cheng, Shuo Wang, Yan Li, Mengyao Yu, Haoyu Zhang, Kunyuan Guo, Zhang Shi, Qirong Li, Ziqiang Xu, Ya**g Zhang, Hao Li, Sha Hua, Binghua Chen, Longyu Sun, Mengting Sun, Qin Li, Ying-Hua Chu, Wenjia Bai, **g Qin, Xiahai Zhuang, Claudia Prieto , et al. (7 additional authors not shown)

    Abstract: Cardiac magnetic resonance imaging (MRI) has emerged as a clinically gold-standard technique for diagnosing cardiac diseases, thanks to its ability to provide diverse information with multiple modalities and anatomical views. Accelerated cardiac MRI is highly expected to achieve time-efficient and patient-friendly imaging, and then advanced image reconstruction approaches are required to recover h… ▽ More

    Submitted 27 June, 2024; originally announced June 2024.

    Comments: 19 pages, 3 figures, 2 tables

  8. arXiv:2406.18966  [pdf, other

    cs.CL

    UniGen: A Unified Framework for Textual Dataset Generation Using Large Language Models

    Authors: Siyuan Wu, Yue Huang, Chujie Gao, Dong** Chen, Qihui Zhang, Yao Wan, Tianyi Zhou, Xiangliang Zhang, Jianfeng Gao, Chaowei Xiao, Lichao Sun

    Abstract: Large Language Models (LLMs) such as GPT-4 and Llama3 have significantly impacted various fields by enabling high-quality synthetic data generation and reducing dependence on expensive human-generated datasets. Despite this, challenges remain in the areas of generalization, controllability, diversity, and truthfulness within the existing generative frameworks. To address these challenges, this pap… ▽ More

    Submitted 28 June, 2024; v1 submitted 27 June, 2024; originally announced June 2024.

  9. arXiv:2406.18944  [pdf, other

    cs.CV cs.AI cs.CR

    Investigating and Defending Shortcut Learning in Personalized Diffusion Models

    Authors: Yixin Liu, Ruoxi Chen, Lichao Sun

    Abstract: Personalized diffusion models have gained popularity for adapting pre-trained text-to-image models to generate images of specific topics with only a few images. However, recent studies find that these models are vulnerable to minor adversarial perturbation, and the fine-tuning performance is largely degraded on corrupted datasets. Such characteristics are further exploited to craft protective pert… ▽ More

    Submitted 27 June, 2024; originally announced June 2024.

    Comments: Preprint

  10. arXiv:2406.18183  [pdf, other

    hep-ex

    Measurement of the cross sections of $e^+e^-\to K^{-}\barΞ^{+}Λ/Σ^{0}$ at center-of-mass energies between 3.510 and 4.914 GeV

    Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, O. Afedulidis, X. C. Ai, R. Aliberti, A. Amoroso, Q. An, Y. Bai, O. Bakina, I. Balossino, Y. Ban, H. -R. Bao, V. Batozskaya, K. Begzsuren, N. Berger, M. Berlowski, M. Bertani, D. Bettoni, F. Bianchi, E. Bianco, A. Bortone, I. Boyko, R. A. Briere , et al. (638 additional authors not shown)

    Abstract: Using $e^+e^-$ collision data collected with the BESIII detector at the BEPCII collider at center-of-mass energies between 3.510 and 4.914GeV, corresponding to an integrated luminosity of 25 fb$^{-1}$, we measure the Born cross sections for the process $e^+e^-\to K^-\barΞ^+Λ/Σ^{0}$ at thirty-five energy points with a partial-reconstruction strategy. By fitting the dressed cross sections of… ▽ More

    Submitted 26 June, 2024; originally announced June 2024.

    Comments: 26 pages,5 tables, 4 figures

  11. arXiv:2406.18083  [pdf, other

    hep-ex

    Measurements of $K_S^0$-$K_L^0$ asymmetries in the decays $Λ_c^+ \to pK_{L,S}^0$, $pK_{L,S}^0π^+π^-$ and $pK_{L,S}^0π^0$

    Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, O. Afedulidis, X. C. Ai, R. Aliberti, A. Amoroso, Q. An, Y. Bai, O. Bakina, I. Balossino, Y. Ban, H. -R. Bao, V. Batozskaya, K. Begzsuren, N. Berger, M. Berlowski, M. Bertani, D. Bettoni, F. Bianchi, E. Bianco, A. Bortone, I. Boyko, R. A. Briere , et al. (643 additional authors not shown)

    Abstract: Using $e^+e^-$ annihilation data sets corresponding to an integrated luminosity of 4.5 $\text{fb}^{-1}$, collected with the BESIII detector at center-of-mass energies between 4.600 and 4.699 GeV, we report the first measurements of the absolute branching fractions $\mathcal{B}(Λ_c^+\to pK_{L}^{0})=(1.67 \pm 0.06 \pm 0. 04)\%$, $\mathcal{B}(Λ_c^+\to pK_{L}^{0}π^+π^-)=(1.69 \pm 0.10 \pm 0.05)\%$, an… ▽ More

    Submitted 26 June, 2024; originally announced June 2024.

    Comments: 19 pages, 2 figures

  12. arXiv:2406.18051  [pdf, other

    cs.CV

    ViT-1.58b: Mobile Vision Transformers in the 1-bit Era

    Authors: Zhengqing Yuan, Rong Zhou, Hongyi Wang, Lifang He, Yanfang Ye, Lichao Sun

    Abstract: Vision Transformers (ViTs) have achieved remarkable performance in various image classification tasks by leveraging the attention mechanism to process image patches as tokens. However, the high computational and memory demands of ViTs pose significant challenges for deployment in resource-constrained environments. This paper introduces ViT-1.58b, a novel 1.58-bit quantized ViT model designed to dr… ▽ More

    Submitted 26 June, 2024; originally announced June 2024.

  13. arXiv:2406.17675  [pdf, other

    cs.CL

    Quantifying AI Psychology: A Psychometrics Benchmark for Large Language Models

    Authors: Yuan Li, Yue Huang, Hongyi Wang, Xiangliang Zhang, James Zou, Lichao Sun

    Abstract: Large Language Models (LLMs) have demonstrated exceptional task-solving capabilities, increasingly adopting roles akin to human-like assistants. The broader integration of LLMs into society has sparked interest in whether they manifest psychological attributes, and whether these attributes are stable-inquiries that could deepen the understanding of their behaviors. Inspired by psychometrics, this… ▽ More

    Submitted 25 June, 2024; originally announced June 2024.

  14. arXiv:2406.17452  [pdf, ps, other

    hep-ex

    Study of the $f_{0}(980)$ through the decay $D_{s}^{+}\rightarrow π^{+}π^{+}π^{-}π^{0}$

    Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, O. Afedulidis, X. C. Ai, R. Aliberti, A. Amoroso, Q. An, Y. Bai, O. Bakina, I. Balossino, Y. Ban, H. -R. Bao, V. Batozskaya, K. Begzsuren, N. Berger, M. Berlowski, M. Bertani, D. Bettoni, F. Bianchi, E. Bianco, A. Bortone, I. Boyko, R. A. Briere , et al. (649 additional authors not shown)

    Abstract: We perform the first amplitude analysis of $D^+_s \to π^+π^+π^-π^0$ decays, based on data samples of electron-positron collisions recorded with the BESIII detector at center-of-mass energies between 4.128 and 4.226 GeV, corresponding to an integrated luminosity of 7.33~fb$^{-1}$. We report the observation of $D_{s}^{+} \to f_0(980)ρ(770)^{+}$ with a statistical significance greater than 10$σ$ and… ▽ More

    Submitted 25 June, 2024; originally announced June 2024.

  15. arXiv:2406.17166  [pdf, ps, other

    math.DG math.AP

    Sinh-Gordon equations on finite graphs

    Authors: Linlin Sun

    Abstract: In this paper, we focus on the sinh-Gordon equation on graphs. We introduce a uniform a priori estimate to define the topological degree for this equation with nonzero prescribed functions on finite, connected and symmetric graphs. Furthermore, we calculate this topological degree case by case and show several existence results. In particular, we prove that the classical sinh-Gordon equation with… ▽ More

    Submitted 24 June, 2024; originally announced June 2024.

  16. arXiv:2406.17006  [pdf, other

    hep-ex

    Probing the nature of the $χ_{c1}(3872)$ state using radiative decays

    Authors: LHCb collaboration, R. Aaij, A. S. W. Abdelmotteleb, C. Abellan Beteta, F. Abudinén, T. Ackernley, A. A. Adefisoye, B. Adeva, M. Adinolfi, P. Adlarson, C. Agapopoulou, C. A. Aidala, Z. Ajaltouni, S. Akar, K. Akiba, P. Albicocco, J. Albrecht, F. Alessio, M. Alexander, Z. Aliouche, P. Alvarez Cartelle, R. Amalric, S. Amato, J. L. Amey, Y. Amhis , et al. (1094 additional authors not shown)

    Abstract: The radiative decays $χ_{c1}(3872)\rightarrowψ(2S)γ$ and $χ_{c1}(3872)\rightarrow J/ψγ$ are used to probe the~nature of the~$χ_{c1}(3872)$ state using proton-proton collision data collected with the LHCb detector, corresponding to an~integrated luminosity of~9fb$^{-1}$. Using the~$B^+\rightarrow χ_{c1}(3872)K^+$decay, the $χ_{c1}(3872)\rightarrow ψ(2S)γ$ process is observed for the first time and… ▽ More

    Submitted 24 June, 2024; originally announced June 2024.

    Comments: 31 pages, 2 figures. All figures and tables, along with any supplementary material and additional information, are available at https://cern.ch/lhcbproject/Publications/p/LHCb-PAPER-2024-015.html (LHCb public pages)

    Report number: LHCb-PAPER-2024-015, CERN-EP-2025-157

  17. arXiv:2406.16271  [pdf, other

    cs.CV

    Feature-prompting GBMSeg: One-Shot Reference Guided Training-Free Prompt Engineering for Glomerular Basement Membrane Segmentation

    Authors: Xueyu Liu, Guangze Shi, Rui Wang, Yexin Lai, Jianan Zhang, Lele Sun, Quan Yang, Yongfei Wu, MIng Li, Weixia Han, Wen Zheng

    Abstract: Assessment of the glomerular basement membrane (GBM) in transmission electron microscopy (TEM) is crucial for diagnosing chronic kidney disease (CKD). The lack of domain-independent automatic segmentation tools for the GBM necessitates an AI-based solution to automate the process. In this study, we introduce GBMSeg, a training-free framework designed to automatically segment the GBM in TEM images… ▽ More

    Submitted 23 June, 2024; originally announced June 2024.

    Comments: Accepted for MICCAI2024

  18. arXiv:2406.15030  [pdf, ps, other

    hep-ex

    Search for the $e^+e^- \to φχ_{c1}(3872)$ process at BESIII

    Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, O. Afedulidis, X. C. Ai, R. Aliberti, A. Amoroso, Q. An, Y. Bai, O. Bakina, I. Balossino, Y. Ban, H. -R. Bao, V. Batozskaya, K. Begzsuren, N. Berger, M. Berlowski, M. Bertani, D. Bettoni, F. Bianchi, E. Bianco, A. Bortone, I. Boyko, R. A. Briere , et al. (639 additional authors not shown)

    Abstract: Based on 368.5 pb$^{-1}$ of $e^+e^-$ collision data collected at center-of-mass energies 4.914 and 4.946 GeV by the BESIII detector, the $e^+e^- \to φχ_{c1}(3872)$ process is searched for the first time. No significant signal is observed and the upper limits at the 90\% confidence level on the product of the Born cross section $σ(e^+e^- \to φχ_{c1}(3872))$ and the branching fraction… ▽ More

    Submitted 21 June, 2024; originally announced June 2024.

    Comments: 11 pages, 3 figures

  19. arXiv:2406.14721  [pdf, other

    cs.CL

    1+1>2: Can Large Language Models Serve as Cross-Lingual Knowledge Aggregators?

    Authors: Yue Huang, Chenrui Fan, Yuan Li, Siyuan Wu, Tianyi Zhou, Xiangliang Zhang, Lichao Sun

    Abstract: Large Language Models (LLMs) have garnered significant attention due to their remarkable ability to process information across various languages. Despite their capabilities, they exhibit inconsistencies in handling identical queries in different languages, presenting challenges for further advancement. This paper introduces a method to enhance the multilingual performance of LLMs by aggregating kn… ▽ More

    Submitted 20 June, 2024; originally announced June 2024.

  20. arXiv:2406.13662  [pdf, other

    cs.CL

    ObscurePrompt: Jailbreaking Large Language Models via Obscure Input

    Authors: Yue Huang, **gyu Tang, Dong** Chen, Bingda Tang, Yao Wan, Lichao Sun, Xiangliang Zhang

    Abstract: Recently, Large Language Models (LLMs) have garnered significant attention for their exceptional natural language processing capabilities. However, concerns about their trustworthiness remain unresolved, particularly in addressing "jailbreaking" attacks on aligned LLMs. Previous research predominantly relies on scenarios with white-box LLMs or specific and fixed prompt templates, which are often i… ▽ More

    Submitted 19 June, 2024; originally announced June 2024.

  21. arXiv:2406.13448  [pdf, other

    physics.acc-ph physics.plasm-ph

    Demonstration of High-Efficiency Microwave Heating Producing Record Highly Charged Xenon Ion Beams with Superconducting ECR Ion Sources

    Authors: X. Wang, J. B. Li, V. Mironov, J. W. Guo, X. Z. Zhang, O. Tarvainen, Y. C. Feng, L. X. Li, J. D. Ma, Z. H. Zhang, W. Lu, S. Bogomolov, L. Sun, H. W. Zhao

    Abstract: Intense highly charged ion beam production is essential for high-power heavy ion accelerators. A novel movable Vlasov launcher for superconducting high charge state Electron Cyclotron Resonance (ECR) ion source has been devised that can affect the microwave power effectiveness by a factor of about 4 in terms of highly charged ion beam production. This approach based on a dedicated microwave launch… ▽ More

    Submitted 25 June, 2024; v1 submitted 19 June, 2024; originally announced June 2024.

  22. arXiv:2406.12671  [pdf, other

    cs.CV

    GeoBench: Benchmarking and Analyzing Monocular Geometry Estimation Models

    Authors: Yongtao Ge, Guangkai Xu, Zhiyue Zhao, Libo Sun, Zheng Huang, Yanlong Sun, Hao Chen, Chunhua Shen

    Abstract: Recent advances in discriminative and generative pretraining have yielded geometry estimation models with strong generalization capabilities. While discriminative monocular geometry estimation methods rely on large-scale fine-tuning data to achieve zero-shot generalization, several generative-based paradigms show the potential of achieving impressive generalization performance on unseen scenes by… ▽ More

    Submitted 20 June, 2024; v1 submitted 18 June, 2024; originally announced June 2024.

    Comments: Code and Benchmark are available at: https://github.com/aim-uofa/GeoBench

  23. arXiv:2406.12646  [pdf, other

    eess.IV cs.AI cs.CV

    An Empirical Study on the Fairness of Foundation Models for Multi-Organ Image Segmentation

    Authors: Qin Li, Yizhe Zhang, Yan Li, Jun Lyu, Meng Liu, Longyu Sun, Mengting Sun, Qirong Li, Wenyue Mao, Xinran Wu, Ya**g Zhang, Yinghua Chu, Shuo Wang, Chengyan Wang

    Abstract: The segmentation foundation model, e.g., Segment Anything Model (SAM), has attracted increasing interest in the medical image community. Early pioneering studies primarily concentrated on assessing and improving SAM's performance from the perspectives of overall accuracy and efficiency, yet little attention was given to the fairness considerations. This oversight raises questions about the potenti… ▽ More

    Submitted 18 June, 2024; originally announced June 2024.

    Comments: Accepted to MICCAI-2024

  24. arXiv:2406.12221  [pdf, other

    cs.CL

    On-Policy Fine-grained Knowledge Feedback for Hallucination Mitigation

    Authors: Xueru Wen, Xinyu Lu, Xinyan Guan, Yaojie Lu, Hongyu Lin, Ben He, Xianpei Han, Le Sun

    Abstract: Hallucination occurs when large language models (LLMs) exhibit behavior that deviates from the boundaries of their knowledge during the response generation process. Previous learning-based methods focus on detecting knowledge boundaries and finetuning models with instance-level feedback, but they suffer from inaccurate signals due to off-policy data sampling and coarse-grained feedback. In this pa… ▽ More

    Submitted 17 June, 2024; originally announced June 2024.

  25. arXiv:2406.12111  [pdf, other

    hep-ex

    Precision measurement of the $Ξ^-_b$ baryon lifetime

    Authors: LHCb collaboration, R. Aaij, A. S. W. Abdelmotteleb, C. Abellan Beteta, F. Abudinén, T. Ackernley, A. A. Adefisoye, B. Adeva, M. Adinolfi, P. Adlarson, C. Agapopoulou, C. A. Aidala, Z. Ajaltouni, S. Akar, K. Akiba, P. Albicocco, J. Albrecht, F. Alessio, M. Alexander, Z. Aliouche, P. Alvarez Cartelle, R. Amalric, S. Amato, J. L. Amey, Y. Amhis , et al. (1064 additional authors not shown)

    Abstract: A sample of $pp$ collision data, corresponding to an integrated luminosity of 5.5 fb$^{-1}$ and collected by the LHCb experiment during Run 2, is used to measure the ratio of the lifetime of the $Ξ^-_b$ baryon to that of the $Λ^0_b$ baryon, $r_τ\equivτ_{Ξ^-_b}/τ_{Λ^0_b}$. The value ${r_τ^{\rm Run\,2}=1.076\pm0.013\pm0.006}$ is obtained, where the first uncertainty is statistical and the second sys… ▽ More

    Submitted 17 June, 2024; originally announced June 2024.

    Comments: 12 pages, 5 figures. All figures and tables, along with any supplementary material and additional information, are available at https://cern.ch/lhcbproject/Publications/p/LHCb-PAPER-2014-010.html (LHCb public pages)

    Report number: LHCb-PAPER-2024-010, CERN-EP-2024-139

  26. arXiv:2406.11519  [pdf, other

    cs.CV eess.IV

    HyperSIGMA: Hyperspectral Intelligence Comprehension Foundation Model

    Authors: Di Wang, Meiqi Hu, Yao **, Yuchun Miao, Jiaqi Yang, Yichu Xu, Xiaolei Qin, Jiaqi Ma, Lingyu Sun, Chenxing Li, Chuan Fu, Hongruixuan Chen, Chengxi Han, Naoto Yokoya, **g Zhang, Minqiang Xu, Lin Liu, Lefei Zhang, Chen Wu, Bo Du, Dacheng Tao, Liangpei Zhang

    Abstract: Foundation models (FMs) are revolutionizing the analysis and understanding of remote sensing (RS) scenes, including aerial RGB, multispectral, and SAR images. However, hyperspectral images (HSIs), which are rich in spectral information, have not seen much application of FMs, with existing methods often restricted to specific tasks and lacking generality. To fill this gap, we introduce HyperSIGMA,… ▽ More

    Submitted 17 June, 2024; originally announced June 2024.

    Comments: The code and models will be released at https://github.com/WHU-Sigma/HyperSIGMA

  27. arXiv:2406.10819  [pdf, other

    cs.CV cs.AI cs.CL

    GUI-WORLD: A Dataset for GUI-oriented Multimodal LLM-based Agents

    Authors: Dong** Chen, Yue Huang, Siyuan Wu, **gyu Tang, Liuyi Chen, Yilin Bai, Zhigang He, Chenlong Wang, Huichi Zhou, Yiqiang Li, Tianshuo Zhou, Yue Yu, Chujie Gao, Qihui Zhang, Yi Gui, Zhen Li, Yao Wan, Pan Zhou, Jianfeng Gao, Lichao Sun

    Abstract: Recently, Multimodal Large Language Models (MLLMs) have been used as agents to control keyboard and mouse inputs by directly perceiving the Graphical User Interface (GUI) and generating corresponding code. However, current agents primarily exhibit excellent understanding capabilities in static environments and are predominantly applied in relatively simple domains, such as Web or mobile interfaces… ▽ More

    Submitted 16 June, 2024; originally announced June 2024.

  28. arXiv:2406.10744   

    cs.CV

    Technique Report of CVPR 2024 PBDL Challenges

    Authors: Ying Fu, Yu Li, Shaodi You, Boxin Shi, Jose Alvarez, Coert van Gemeren, Linwei Chen, Yunhao Zou, Zichun Wang, Yichen Li, Yuze Han, Yingkai Zhang, Jianan Wang, Qinglin Liu, Wei Yu, Xiaoqian Lv, Jianing Li, Sheng** Zhang, Xiangyang Ji, Yuanpei Chen, Yuhan Zhang, Weihang Peng, Liwen Zhang, Zhe Xu, Dingyong Gou , et al. (77 additional authors not shown)

    Abstract: The intersection of physics-based vision and deep learning presents an exciting frontier for advancing computer vision technologies. By leveraging the principles of physics to inform and enhance deep learning models, we can develop more robust and accurate vision systems. Physics-based vision aims to invert the processes to recover scene properties such as shape, reflectance, light distribution, a… ▽ More

    Submitted 27 June, 2024; v1 submitted 15 June, 2024; originally announced June 2024.

    Comments: The author list and contents need to be verified by all authors

  29. arXiv:2406.09475  [pdf, other

    hep-ex

    Search for $X(1870)$ via the decay $J/ψ\to ωK^+ K^-η$

    Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, O. Afedulidis, X. C. Ai, R. Aliberti, A. Amoroso, Q. An, Y. Bai, O. Bakina, I. Balossino, Y. Ban, H. -R. Bao, V. Batozskaya, K. Begzsuren, N. Berger, M. Berlowski, M. Bertani, D. Bettoni, F. Bianchi, E. Bianco, A. Bortone, I. Boyko, R. A. Briere , et al. (644 additional authors not shown)

    Abstract: Using a sample of $(10087\pm 44)\times10^{6}$ $J/ψ$ events collected by the BESIII detector at the BEPCII collider, we search for the decay $X(1870)\to K^+ K^-η$ via the $J/ψ\to ωK^+ K^- η$ process for the first time. No significant $X(1870)$ signal is observed. The upper limit on the branching fraction of the decay $ J/ψ\to ωX(1870) \toωK^+ K^- η$ is determined to be $9.55\times 10^{-7}$ at the… ▽ More

    Submitted 13 June, 2024; originally announced June 2024.

  30. arXiv:2406.08922  [pdf, other

    cs.CL cs.AI cs.LG

    Navigating the Shadows: Unveiling Effective Disturbances for Modern AI Content Detectors

    Authors: Ying Zhou, Ben He, Le Sun

    Abstract: With the launch of ChatGPT, large language models (LLMs) have attracted global attention. In the realm of article writing, LLMs have witnessed extensive utilization, giving rise to concerns related to intellectual property protection, personal privacy, and academic integrity. In response, AI-text detection has emerged to distinguish between human and machine-generated content. However, recent rese… ▽ More

    Submitted 13 June, 2024; originally announced June 2024.

    Comments: Accepted by ACL 2024, Main Conference

  31. arXiv:2406.08372  [pdf, other

    cs.CV

    APSeg: Auto-Prompt Network for Cross-Domain Few-Shot Semantic Segmentation

    Authors: Weizhao He, Yang Zhang, Wei Zhuo, Linlin Shen, Jiaqi Yang, Songhe Deng, Liang Sun

    Abstract: Few-shot semantic segmentation (FSS) endeavors to segment unseen classes with only a few labeled samples. Current FSS methods are commonly built on the assumption that their training and application scenarios share similar domains, and their performances degrade significantly while applied to a distinct domain. To this end, we propose to leverage the cutting-edge foundation model, the Segment Anyt… ▽ More

    Submitted 12 June, 2024; v1 submitted 12 June, 2024; originally announced June 2024.

    Comments: 15 pages, 9 figures

  32. arXiv:2406.08225  [pdf, ps, other

    hep-ex

    Observation of $η_{c}$(1S, 2S) and $χ_{cJ}$ decays to 2$(π^{+}π^{-})η$ via $ψ$(3686) radiative transitions

    Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, O. Afedulidis, X. C. Ai, R. Aliberti, A. Amoroso, Q. An, Y. Bai, O. Bakina, I. Balossino, Y. Ban, H. -R. Bao, V. Batozskaya, K. Begzsuren, N. Berger, M. Berlowski, M. Bertani, D. Bettoni, F. Bianchi, E. Bianco, A. Bortone, I. Boyko, R. A. Briere , et al. (636 additional authors not shown)

    Abstract: Based on $2.7 \times 10^9~ψ(3686)$ decays collected with the BESIII detector, the radiative decay $ψ(3686)\to\gamma2(π^{+}π^{-})η$ is investigated to measure properties of S- and P-wave charmonium states. The branching fraction of the decay $η_{c}(1S) \to 2(π^{+}π^{-})η$, which is found to have a strong dependence on the interference pattern between $η_c(1S)$ and non-$η_c(1S)$ processes, is measur… ▽ More

    Submitted 12 June, 2024; originally announced June 2024.

  33. arXiv:2406.08177  [pdf, other

    eess.IV cs.CV

    One-Step Effective Diffusion Network for Real-World Image Super-Resolution

    Authors: Rongyuan Wu, Lingchen Sun, Zhiyuan Ma, Lei Zhang

    Abstract: The pre-trained text-to-image diffusion models have been increasingly employed to tackle the real-world image super-resolution (Real-ISR) problem due to their powerful generative image priors. Most of the existing methods start from random noise to reconstruct the high-quality (HQ) image under the guidance of the given low-quality (LQ) image. While promising results have been achieved, such Real-… ▽ More

    Submitted 14 June, 2024; v1 submitted 12 June, 2024; originally announced June 2024.

  34. arXiv:2406.06118  [pdf, other

    hep-ex

    Strong and weak $CP$ tests in sequential decays of polarized $Σ^0$ hyperons

    Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, O. Afedulidis, X. C. Ai, R. Aliberti, A. Amoroso, Q. An, Y. Bai, O. Bakina, I. Balossino, Y. Ban, H. -R. Bao, V. Batozskaya, K. Begzsuren, N. Berger, M. Berlowski, M. Bertani, D. Bettoni, F. Bianchi, E. Bianco, A. Bortone, I. Boyko, R. A. Briere , et al. (644 additional authors not shown)

    Abstract: The $J/ψ, ψ(3686) \to Σ^0 \barΣ^{0}$ processes and subsequent decays are studied using the world's largest $J/ψ$ and $ψ(3686)$ data samples collected with the BESIII detector. The strong-$CP$ symmetry is tested in the decays of the $Σ^0$ hyperons for the first time by measuring the decay parameters, $α_{Σ^0} = -0.0017 \pm 0.0021 \pm 0.0018$ and $\barα_{Σ^0} = 0.0021 \pm 0.0020 \pm 0.0022$. The wea… ▽ More

    Submitted 10 June, 2024; originally announced June 2024.

  35. arXiv:2406.05827  [pdf, ps, other

    hep-ex

    Measurement of the integrated luminosity of the data collected at 3.773 GeV by BESIII from 2021 to 2024

    Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, O. Afedulidis, X. C. Ai, R. Aliberti, A. Amoroso, Q. An, Y. Bai, O. Bakina, I. Balossino, Y. Ban, H. -R. Bao, V. Batozskaya, K. Begzsuren, N. Berger, M. Berlowski, M. Bertani, D. Bettoni, F. Bianchi, E. Bianco, A. Bortone, I. Boyko, R. A. Briere , et al. (634 additional authors not shown)

    Abstract: We present a measurement of the integrated luminosity of $e^+e^-$ collision data collected with the BESIII detector at the BEPCII collider at a center-of-mass energy of $E_{\rm cm} = 3.773$~GeV. The integrated luminosities of the data sets taken from December 2021 to June 2022, from November 2022 to June 2023, and from October 2023 to February 2024 are determined to be $4.995 \pm 0.019$~fb$^{-1}$,… ▽ More

    Submitted 9 June, 2024; originally announced June 2024.

  36. arXiv:2406.05791  [pdf, other

    cs.CV

    OD-DETR: Online Distillation for Stabilizing Training of Detection Transformer

    Authors: Shengjian Wu, Li Sun, Qingli Li

    Abstract: DEtection TRansformer (DETR) becomes a dominant paradigm, mainly due to its common architecture with high accuracy and no post-processing. However, DETR suffers from unstable training dynamics. It consumes more data and epochs to converge compared with CNN-based detectors. This paper aims to stabilize DETR training through the online distillation. It utilizes a teacher model, accumulated by Expone… ▽ More

    Submitted 9 June, 2024; originally announced June 2024.

    Comments: IJCAI24

  37. arXiv:2406.03387  [pdf, other

    hep-ex

    Measurement of the branching fraction ratios $R(D^{+})$ and $R(D^{*+})$ using muonic $τ$ decays

    Authors: LHCb collaboration, R. Aaij, A. S. W. Abdelmotteleb, C. Abellan Beteta, F. Abudinén, T. Ackernley, A. A. Adefisoye, B. Adeva, M. Adinolfi, P. Adlarson, C. Agapopoulou, C. A. Aidala, Z. Ajaltouni, S. Akar, K. Akiba, P. Albicocco, J. Albrecht, F. Alessio, M. Alexander, Z. Aliouche, P. Alvarez Cartelle, R. Amalric, S. Amato, J. L. Amey, Y. Amhis , et al. (1063 additional authors not shown)

    Abstract: The branching fraction ratios of $\overline{B}^0\to D^+τ^-\overlineν_τ$ and $\overline{B}^0\to D^{*+}τ^-\overlineν_τ$ decays are measured with respect to their muonic counterparts, using a data sample corresponding to an integrated luminosity of 2.0 fb$^{-1}$ collected by the LHCb experiment in proton-proton collisions at $\sqrt{s} = 13$ TeV. The reconstructed final states are formed by combining… ▽ More

    Submitted 5 June, 2024; originally announced June 2024.

    Comments: All figures and tables, along with machine-readable versions and any supplementary material and additional information, are available at https://lhcbproject.web.cern.ch/Publications/LHCbProjectPublic/LHCb-PAPER-2024-007.html (LHCb public pages)

    Report number: LHCb-PAPER-2024-007, CERN-EP-2024-125

  38. arXiv:2406.03294  [pdf, other

    cond-mat.mtrl-sci cond-mat.str-el

    Feedback loop dependent charge density wave imaging by scanning tunneling spectroscopy

    Authors: Alessandro Scarfato, Árpád Pásztor, Lihuan Sun, Ivan Maggio-Aprile, Vincent Pasquier, Tejas Parasram Singar, Andreas Ørsted, Ishita Pushkarna, Marcello Spera, Enrico Giannini, Christoph Renner

    Abstract: Scanning Tunneling Spectroscopy (STS) is a unique technique to probe the local density of states (LDOS) at the atomic scale by measuring the tunneling conductance between a sharp tip and a sample surface. However, the technique suffers of well-known limitations, the so-called set-point effect, which can potentially introduce artifacts in the measurements. We compare several STS imaging schemes app… ▽ More

    Submitted 5 June, 2024; originally announced June 2024.

    Comments: 7 pages, 6 figures

  39. arXiv:2406.03156  [pdf, other

    hep-ex

    Observation of new charmonium(-like) states in $B^+ \to D^{*\pm} D^{\mp} K^+$ decays

    Authors: LHCb collaboration, R. Aaij, A. S. W. Abdelmotteleb, C. Abellan Beteta, F. Abudinén, T. Ackernley, A. A. Adefisoye, B. Adeva, M. Adinolfi, P. Adlarson, C. Agapopoulou, C. A. Aidala, Z. Ajaltouni, S. Akar, K. Akiba, P. Albicocco, J. Albrecht, F. Alessio, M. Alexander, Z. Aliouche, P. Alvarez Cartelle, R. Amalric, S. Amato, J. L. Amey, Y. Amhis , et al. (1062 additional authors not shown)

    Abstract: A study of resonant structures in $B^{+}\rightarrow{D^{\ast+}D^{-}K^{+}}$ and $B^{+}\rightarrow{D^{\ast-}D^{+}K^{+}}$ decays is performed, using proton-proton collision data at centre-of-mass energies of $\sqrt{s}=7, 8$, and $13$ TeV recorded by the LHCb experiment, corresponding to an integrated luminosity of 9 fb$^{-1}$. A simultaneous amplitude fit is performed to the two channels with contribu… ▽ More

    Submitted 5 June, 2024; originally announced June 2024.

    Comments: All figures and tables, along with any supplementary material and additional information, are available at https://cern.ch/lhcbproject/Publications/p/LHCb-PAPER-2023-047.html (LHCb public pages)

    Report number: LHCb-PAPER-2023-047, CERN-EP-2024-096

  40. arXiv:2406.02931  [pdf, other

    hep-ex

    Measurements of the branching fractions of the $P$-wave charmonium spin-singlet state $h_c(^1P_1) \to h^+ h^-π^0/η$

    Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, O. Afedulidis, X. C. Ai, R. Aliberti, A. Amoroso, Q. An, Y. Bai, O. Bakina, I. Balossino, Y. Ban, H. -R. Bao, V. Batozskaya, K. Begzsuren, N. Berger, M. Berlowski, M. Bertani, D. Bettoni, F. Bianchi, E. Bianco, A. Bortone, I. Boyko, R. A. Briere , et al. (643 additional authors not shown)

    Abstract: Based on $(2712.4\pm 14.3)\times10^{6}$ $ψ(3686)$ events, we investigate four hadronic decay modes of the $P$-wave charmonium spin-singlet state $h_c(^1P_1) \to h^+ h^- π^0/η$ ($h=π$ or $K$) via the process $ψ(3686) \to π^{0}h_c$ at BESIII. The $h_c \to π^+ π^- π^0$ decay is observed with a significance of 9.6$σ$ after taking into account systematic uncertainties. Evidences for… ▽ More

    Submitted 5 June, 2024; originally announced June 2024.

    Comments: 9 pages, 7 figures

  41. arXiv:2406.02903  [pdf, other

    cs.CL

    Open Grounded Planning: Challenges and Benchmark Construction

    Authors: Shiguang Guo, Ziliang Deng, Hongyu Lin, Yaojie Lu, Xianpei Han, Le Sun

    Abstract: The emergence of large language models (LLMs) has increasingly drawn attention to the use of LLMs for human-like planning. Existing work on LLM-based planning either focuses on leveraging the inherent language generation capabilities of LLMs to produce free-style plans, or employs reinforcement learning approaches to learn decision-making for a limited set of actions within restricted environments… ▽ More

    Submitted 4 June, 2024; originally announced June 2024.

    Comments: Accept to ACL 2024 main conference

  42. arXiv:2406.02560  [pdf, other

    eess.AS cs.AI cs.CL cs.LG

    Less Peaky and More Accurate CTC Forced Alignment by Label Priors

    Authors: Ruizhe Huang, Xiaohui Zhang, Zhaoheng Ni, Li Sun, Moto Hira, Jeff Hwang, Vimal Manohar, Vineel Pratap, Matthew Wiesner, Shinji Watanabe, Daniel Povey, Sanjeev Khudanpur

    Abstract: Connectionist temporal classification (CTC) models are known to have peaky output distributions. Such behavior is not a problem for automatic speech recognition (ASR), but it can cause inaccurate forced alignments (FA), especially at finer granularity, e.g., phoneme level. This paper aims at alleviating the peaky behavior for CTC and improve its suitability for forced alignment generation, by leve… ▽ More

    Submitted 15 June, 2024; v1 submitted 22 April, 2024; originally announced June 2024.

    Comments: Accepted by ICASSP 2024. Github repo: https://github.com/huangruizhe/audio/tree/aligner_label_priors

  43. arXiv:2406.01882  [pdf, other

    cs.CR cs.AI cs.ET cs.SE

    HoneyGPT: Breaking the Trilemma in Terminal Honeypots with Large Language Model

    Authors: Ziyang Wang, Jianzhou You, Haining Wang, Tianwei Yuan, Shichao Lv, Yang Wang, Limin Sun

    Abstract: Honeypots, as a strategic cyber-deception mechanism designed to emulate authentic interactions and bait unauthorized entities, continue to struggle with balancing flexibility, interaction depth, and deceptive capability despite their evolution over decades. Often they also lack the capability of proactively adapting to an attacker's evolving tactics, which restricts the depth of engagement and sub… ▽ More

    Submitted 3 June, 2024; originally announced June 2024.

  44. arXiv:2406.01392  [pdf, other

    cs.CL

    Sparsity-Accelerated Training for Large Language Models

    Authors: Da Ma, Lu Chen, Pengyu Wang, Hongshen Xu, Hanqi Li, Liangtai Sun, Su Zhu, Shuai Fan, Kai Yu

    Abstract: Large language models (LLMs) have demonstrated proficiency across various natural language processing (NLP) tasks but often require additional training, such as continual pre-training and supervised fine-tuning. However, the costs associated with this, primarily due to their large parameter count, remain high. This paper proposes leveraging \emph{sparsity} in pre-trained LLMs to expedite this trai… ▽ More

    Submitted 6 June, 2024; v1 submitted 3 June, 2024; originally announced June 2024.

    Comments: Accepted to ACL 2024 Findings

  45. arXiv:2406.01332  [pdf, ps, other

    hep-ex

    Measurements of the branching fractions of semileptonic $D^{+}_s$ decays via $e^+e^-\to D_s^{*+}D_s^{*-}$

    Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, O. Afedulidis, X. C. Ai, R. Aliberti, A. Amoroso, Q. An, Y. Bai, O. Bakina, I. Balossino, Y. Ban, H. -R. Bao, V. Batozskaya, K. Begzsuren, N. Berger, M. Berlowski, M. Bertani, D. Bettoni, F. Bianchi, E. Bianco, A. Bortone, I. Boyko, R. A. Briere , et al. (638 additional authors not shown)

    Abstract: We measure the absolute branching fractions of semileptonic $D^+_s$ decays via the $e^+e^-\to D_s^{*+}D_s^{*-}$ process using $e^+e^-$ collision data corresponding to an integrated luminosity of $10.64~\mathrm{fb}^{-1}$ collected by the BESIII detector at center-of-mass energies between 4.237 and 4.699 GeV. The branching fractions are… ▽ More

    Submitted 4 June, 2024; v1 submitted 3 June, 2024; originally announced June 2024.

    Comments: 14 pages, 3 figures

  46. arXiv:2406.01252  [pdf, other

    cs.CL cs.AI stat.ML

    Towards Scalable Automated Alignment of LLMs: A Survey

    Authors: Boxi Cao, Keming Lu, Xinyu Lu, Jiawei Chen, Mengjie Ren, Hao Xiang, Peilin Liu, Yaojie Lu, Ben He, Xianpei Han, Le Sun, Hongyu Lin, Bowen Yu

    Abstract: Alignment is the most critical step in building large language models (LLMs) that meet human needs. With the rapid development of LLMs gradually surpassing human capabilities, traditional alignment methods based on human-annotation are increasingly unable to meet the scalability demands. Therefore, there is an urgent need to explore new sources of automated alignment signals and technical approach… ▽ More

    Submitted 3 June, 2024; originally announced June 2024.

  47. arXiv:2406.01007  [pdf, other

    hep-ex

    Measurement of Electron Antineutrino Oscillation Amplitude and Frequency via Neutron Capture on Hydrogen at Daya Bay

    Authors: Daya Bay collaboration, F. P. An, W. D. Bai, A. B. Balantekin, M. Bishai, S. Blyth, G. F. Cao, J. Cao, J. F. Chang, Y. Chang, H. S. Chen, H. Y. Chen, S. M. Chen, Y. Chen, Y. X. Chen, Z. Y. Chen, J. Cheng, J. Cheng, Y. -C. Cheng, Z. K. Cheng, J. J. Cherwinka, M. C. Chu, J. P. Cummings, O. Dalager, F. S. Deng , et al. (177 additional authors not shown)

    Abstract: This Letter reports the first measurement of the oscillation amplitude and frequency of reactor antineutrinos at Daya Bay via neutron capture on hydrogen using 1958 days of data. With over 3.6 million signal candidates, an optimized candidate selection, improved treatment of backgrounds and efficiencies, refined energy calibration, and an energy response model for the capture-on-hydrogen sensitive… ▽ More

    Submitted 3 June, 2024; originally announced June 2024.

  48. arXiv:2406.00380  [pdf, other

    cs.CL cs.AI

    The Best of Both Worlds: Toward an Honest and Helpful Large Language Model

    Authors: Chujie Gao, Qihui Zhang, Dong** Chen, Yue Huang, Siyuan Wu, Zhengyan Fu, Yao Wan, Xiangliang Zhang, Lichao Sun

    Abstract: Large Language Models (LLMs) have achieved remarkable success across various industries due to their exceptional generative capabilities. However, for safe and effective real-world deployments, ensuring honesty and helpfulness is critical. This paper addresses the question: Can we prioritize the helpfulness of LLMs while preserving their honesty? To begin with, we establish exhaustive principles a… ▽ More

    Submitted 1 June, 2024; originally announced June 2024.

  49. arXiv:2406.00235  [pdf, other

    hep-ex

    Amplitude analysis of the radiative decay $B^0_s\to K^+K^-γ$

    Authors: LHCb collaboration, R. Aaij, A. S. W. Abdelmotteleb, C. Abellan Beteta, F. Abudinén, T. Ackernley, A. A. Adefisoye, B. Adeva, M. Adinolfi, P. Adlarson, C. Agapopoulou, C. A. Aidala, Z. Ajaltouni, S. Akar, K. Akiba, P. Albicocco, J. Albrecht, F. Alessio, M. Alexander, Z. Aliouche, P. Alvarez Cartelle, R. Amalric, S. Amato, J. L. Amey, Y. Amhis , et al. (1061 additional authors not shown)

    Abstract: A search for radiative decay of $B^0_s$ mesons to orbitally excited $K^+K^-$ states is performed using proton proton collisions recorded by the \mbox{LHCb}\xspace experiment, corresponding to an integrated luminosity of 9~fb$^{-1}$. The dikaon spectrum in the mass range $m_{KK}<2400$~{\ensuremath{\,\text{Me\kern -0.1em V\!/}c^2}\xspace} is dominated by the $φ(1020)$ resonance that accounts for alm… ▽ More

    Submitted 31 May, 2024; originally announced June 2024.

    Comments: All figures and tables, along with any supplementary material and additional information, are available at https://cern.ch/lhcbproject/Publications/p/LHCb-PAPER-2024-002.html (LHCb public pages)

    Report number: LHCb-PAPER-2024-002, CERN-EP-2024-115

  50. arXiv:2406.00079  [pdf, other

    cs.LG

    Decision Mamba: Reinforcement Learning via Hybrid Selective Sequence Modeling

    Authors: Sili Huang, Jifeng Hu, Zhejian Yang, Liwei Yang, Tao Luo, Hechang Chen, Lichao Sun, Bo Yang

    Abstract: Recent works have shown the remarkable superiority of transformer models in reinforcement learning (RL), where the decision-making problem is formulated as sequential generation. Transformer-based agents could emerge with self-improvement in online environments by providing task contexts, such as multiple trajectories, called in-context RL. However, due to the quadratic computation complexity of a… ▽ More

    Submitted 31 May, 2024; originally announced June 2024.

    Comments: arXiv admin note: text overlap with arXiv:2405.20692. arXiv admin note: text overlap with arXiv:2405.20692; text overlap with arXiv:2305.16554, arXiv:2210.14215 by other authors