Search | arXiv e-print repository

AI-Olympics: Exploring the Generalization of Agents through Open Competitions

Authors: Chen Wang, Yan Song, Shuai Wu, Sa Wu, Ruizhi Zhang, Shu Lin, Haifeng Zhang

Abstract: Between 2021 and 2023, AI-Olympics, a series of online AI competitions was hosted by the online evaluation platform Jidi in collaboration with the IJCAI committee. In these competitions, an agent is required to accomplish diverse sports tasks in a two-dimensional continuous world, while competing against an opponent. This paper provides a brief overview of the competition series and highlights not… ▽ More Between 2021 and 2023, AI-Olympics, a series of online AI competitions was hosted by the online evaluation platform Jidi in collaboration with the IJCAI committee. In these competitions, an agent is required to accomplish diverse sports tasks in a two-dimensional continuous world, while competing against an opponent. This paper provides a brief overview of the competition series and highlights notable findings. We aim to contribute insights to the field of multi-agent decision-making and explore the generalization of agents through engineering efforts. △ Less

Submitted 23 May, 2024; originally announced May 2024.

Comments: IJCAI 2024 Demo Track Paper

arXiv:2405.14300 [pdf, other]

Automatic diagnosis of cardiac magnetic resonance images based on semi-supervised learning

Authors: Hejun Huang, Zuguo Chen, Yi Huang, Guangqiang Luo, Chaoyang Chen, Youzhi Song

Abstract: Cardiac magnetic resonance imaging (MRI) is a pivotal tool for assessing cardiac function. Precise segmentation of cardiac structures is imperative for accurate cardiac functional evaluation. This paper introduces a semi-supervised model for automatic segmentation of cardiac images and auxiliary diagnosis. By harnessing cardiac MRI images and necessitating only a small portion of annotated image d… ▽ More Cardiac magnetic resonance imaging (MRI) is a pivotal tool for assessing cardiac function. Precise segmentation of cardiac structures is imperative for accurate cardiac functional evaluation. This paper introduces a semi-supervised model for automatic segmentation of cardiac images and auxiliary diagnosis. By harnessing cardiac MRI images and necessitating only a small portion of annotated image data, the model achieves fully automated, high-precision segmentation of cardiac images, extraction of features, calculation of clinical indices, and prediction of diseases. The provided segmentation results, clinical indices, and prediction outcomes can aid physicians in diagnosis, thereby serving as auxiliary diagnostic tools. Experimental results showcase that this semi-supervised model for automatic segmentation of cardiac images and auxiliary diagnosis attains high accuracy in segmentation and correctness in prediction, demonstrating substantial practical guidance and application value. △ Less

Submitted 23 May, 2024; originally announced May 2024.

arXiv:2405.14212 [pdf, other]

Federated Domain-Specific Knowledge Transfer on Large Language Models Using Synthetic Data

Authors: Haoran Li, Xinyuan Zhao, Dadi Guo, Hanlin Gu, Ziqian Zeng, Yuxing Han, Yangqiu Song, Lixin Fan, Qiang Yang

Abstract: As large language models (LLMs) demonstrate unparalleled performance and generalization ability, LLMs are widely used and integrated into various applications. When it comes to sensitive domains, as commonly described in federated learning scenarios, directly using external LLMs on private data is strictly prohibited by stringent data security and privacy regulations. For local clients, the utiliz… ▽ More As large language models (LLMs) demonstrate unparalleled performance and generalization ability, LLMs are widely used and integrated into various applications. When it comes to sensitive domains, as commonly described in federated learning scenarios, directly using external LLMs on private data is strictly prohibited by stringent data security and privacy regulations. For local clients, the utilization of LLMs to improve the domain-specific small language models (SLMs), characterized by limited computational resources and domain-specific data, has attracted considerable research attention. By observing that LLMs can empower domain-specific SLMs, existing methods predominantly concentrate on leveraging the public data or LLMs to generate more data to transfer knowledge from LLMs to SLMs. However, due to the discrepancies between LLMs' generated data and clients' domain-specific data, these methods cannot yield substantial improvements in the domain-specific tasks. In this paper, we introduce a Federated Domain-specific Knowledge Transfer (FDKT) framework, which enables domain-specific knowledge transfer from LLMs to SLMs while preserving clients' data privacy. The core insight is to leverage LLMs to augment data based on domain-specific few-shot demonstrations, which are synthesized from private domain data using differential privacy. Such synthetic samples share similar data distribution with clients' private data and allow the server LLM to generate particular knowledge to improve clients' SLMs. The extensive experimental results demonstrate that the proposed FDKT framework consistently and greatly improves SLMs' task performance by around 5\% with a privacy budget of less than 10, compared to local training on private data. △ Less

Submitted 23 May, 2024; originally announced May 2024.

arXiv:2405.14192 [pdf, other]

IB-AdCSCNet:Adaptive Convolutional Sparse Coding Network Driven by Information Bottleneck

Authors: He Zou, Meng'en Qin, Yu Song, Xiaohui Yang

Abstract: In the realm of neural network models, the perpetual challenge remains in retaining task-relevant information while effectively discarding redundant data during propagation. In this paper, we introduce IB-AdCSCNet, a deep learning model grounded in information bottleneck theory. IB-AdCSCNet seamlessly integrates the information bottleneck trade-off strategy into deep networks by dynamically adjust… ▽ More In the realm of neural network models, the perpetual challenge remains in retaining task-relevant information while effectively discarding redundant data during propagation. In this paper, we introduce IB-AdCSCNet, a deep learning model grounded in information bottleneck theory. IB-AdCSCNet seamlessly integrates the information bottleneck trade-off strategy into deep networks by dynamically adjusting the trade-off hyperparameter $λ$ through gradient descent, updating it within the FISTA(Fast Iterative Shrinkage-Thresholding Algorithm ) framework. By optimizing the compressive excitation loss function induced by the information bottleneck principle, IB-AdCSCNet achieves an optimal balance between compression and fitting at a global level, approximating the globally optimal representation feature. This information bottleneck trade-off strategy driven by downstream tasks not only helps to learn effective features of the data, but also improves the generalization of the model. This study's contribution lies in presenting a model with consistent performance and offering a fresh perspective on merging deep learning with sparse representation theory, grounded in the information bottleneck concept. Experimental results on CIFAR-10 and CIFAR-100 datasets demonstrate that IB-AdCSCNet not only matches the performance of deep residual convolutional networks but also outperforms them when handling corrupted data. Through the inference of the IB trade-off, the model's robustness is notably enhanced. △ Less

Submitted 23 May, 2024; originally announced May 2024.

arXiv:2405.13800 [pdf, other]

Dense Connector for MLLMs

Authors: Huan** Yao, Wenhao Wu, Taojiannan Yang, YuXin Song, Mengxi Zhang, Haocheng Feng, Yifan Sun, Zhiheng Li, Wanli Ouyang, **gdong Wang

Abstract: Do we fully leverage the potential of visual encoder in Multimodal Large Language Models (MLLMs)? The recent outstanding performance of MLLMs in multimodal understanding has garnered broad attention from both academia and industry. In the current MLLM rat race, the focus seems to be predominantly on the linguistic side. We witness the rise of larger and higher-quality instruction datasets, as well… ▽ More Do we fully leverage the potential of visual encoder in Multimodal Large Language Models (MLLMs)? The recent outstanding performance of MLLMs in multimodal understanding has garnered broad attention from both academia and industry. In the current MLLM rat race, the focus seems to be predominantly on the linguistic side. We witness the rise of larger and higher-quality instruction datasets, as well as the involvement of larger-sized LLMs. Yet, scant attention has been directed towards the visual signals utilized by MLLMs, often assumed to be the final high-level features extracted by a frozen visual encoder. In this paper, we introduce the Dense Connector - a simple, effective, and plug-and-play vision-language connector that significantly enhances existing MLLMs by leveraging multi-layer visual features, with minimal additional computational overhead. Furthermore, our model, trained solely on images, showcases remarkable zero-shot capabilities in video understanding as well. Experimental results across various vision encoders, image resolutions, training dataset scales, varying sizes of LLMs (2.7B->70B), and diverse architectures of MLLMs (e.g., LLaVA and Mini-Gemini) validate the versatility and scalability of our approach, achieving state-of-the-art performance on across 19 image and video benchmarks. We hope that this work will provide valuable experience and serve as a basic module for future MLLM development. △ Less

Submitted 22 May, 2024; originally announced May 2024.

Comments: Technical report. 25 pages

arXiv:2405.13315 [pdf, other]

Study of the decays $χ_{cJ}\toΛ\barΛω$

Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, O. Afedulidis, X. C. Ai, R. Aliberti, A. Amoroso, Q. An, Y. Bai, O. Bakina, I. Balossino, Y. Ban, H. -R. Bao, V. Batozskaya, K. Begzsuren, N. Berger, M. Berlowski, M. Bertani, D. Bettoni, F. Bianchi, E. Bianco, A. Bortone, I. Boyko, R. A. Briere , et al. (638 additional authors not shown)

Abstract: Using $(27.12\pm 0.14)\times10^{8}$ $ψ(3686)$ events collected with the BESIII detector, we present the first observation of the decays $χ_{cJ}\toΛ\barΛω$, where $J=0, 1, 2$, with statistical significances of $11.7 σ, 11.2 σ$, and $11.8 σ$. The branching fractions of these decays are determined to be $\mathcal{B}(χ_{c0}\toΛ\barΛω)=({2.37 \pm 0.22 \pm 0.23}) \times 10^{-4}$,… ▽ More Using $(27.12\pm 0.14)\times10^{8}$ $ψ(3686)$ events collected with the BESIII detector, we present the first observation of the decays $χ_{cJ}\toΛ\barΛω$, where $J=0, 1, 2$, with statistical significances of $11.7 σ, 11.2 σ$, and $11.8 σ$. The branching fractions of these decays are determined to be $\mathcal{B}(χ_{c0}\toΛ\barΛω)=({2.37 \pm 0.22 \pm 0.23}) \times 10^{-4}$, $\mathcal{B}(χ_{c1}\toΛ\barΛω)=({1.01 \pm 0.10 \pm 0.11}) \times 10^{-4}$, and $\mathcal{B}(χ_{c2}\toΛ\barΛω)=({1.40 \pm 0.13 \pm 0.17}) \times 10^{-4}$, where the first uncertainties are statistical and the second are systematic. We observe no clear intermediate structures. △ Less

Submitted 21 May, 2024; originally announced May 2024.

Comments: 11 pages, 10 figures

arXiv:2405.13311 [pdf, other]

Observation of a large-scale filament eruption initiated by two small-scale erupting filaments pushing out from below

Authors: Yongliang Song, Jiangtao Su, Qingmin Zhang, Mei Zhang, Yuanyong Deng, Xianyong Bai, Suo Liu, Xiao Yang, Jie Chen, Haiqing Xu, Kaifan Ji, Ziyao Hu

Abstract: Filament eruptions often result in flares and coronal mass ejections (CMEs). Most studies attribute the filament eruptions to their instabilities or magnetic reconnection. In this study, we report a unique observation of a filament eruption whose initiation process has not been reported before. This large-scale filament, with a length of about 360 Mm crossing an active region, is forced to erupted… ▽ More Filament eruptions often result in flares and coronal mass ejections (CMEs). Most studies attribute the filament eruptions to their instabilities or magnetic reconnection. In this study, we report a unique observation of a filament eruption whose initiation process has not been reported before. This large-scale filament, with a length of about 360 Mm crossing an active region, is forced to erupted by two small-scale erupting filaments pushing out from below. This process of multi-filament eruption results in an M6.4 flare in the active region NOAA 13229 on 25th February 2023. The whole process can be divided into three stages: the eruptions of two active-region filaments F1 and F2; the interactions between the erupting F1, F2, and the large-scale filament F3; and the eruption of F3. Though this multi-filament eruption occurs near the northwest limb of the solar disk, it produces a strong halo CME that causes a significant geomagnetic disturbance. Our observations present a new filament eruption mechanism, in which the initial kinetic energy of the eruption is obtained from and transported to by other erupting structures. This event provides us a unique insight into the dynamics of multi-filament eruptions and their corresponding effects on the interplanetary space. △ Less

Submitted 21 May, 2024; originally announced May 2024.

Comments: 16 pages, 10 figures. Accepted for publication in Solar Physics

arXiv:2405.13103 [pdf, other]

Search for the lepton-flavor violating decay $B^0_s\toφμ^\pmτ^\mp$

Authors: LHCb collaboration, R. Aaij, A. S. W. Abdelmotteleb, C. Abellan Beteta, F. Abudinén, T. Ackernley, A. A. Adefisoye, B. Adeva, M. Adinolfi, P. Adlarson, C. Agapopoulou, C. A. Aidala, Z. Ajaltouni, S. Akar, K. Akiba, P. Albicocco, J. Albrecht, F. Alessio, M. Alexander, Z. Aliouche, P. Alvarez Cartelle, R. Amalric, S. Amato, J. L. Amey, Y. Amhis , et al. (1062 additional authors not shown)

Abstract: A search for the lepton-flavor violating decays $B^0_s\toφμ^\pmτ^\mp$ is presented, using a sample of proton-proton collisions at center-of-mass energies of 7, 8, and 13 TeV, collected with the LHCb detector and corresponding to a total integrated luminosity of $9\,\text{fb}^{-1}$. The $τ$ leptons are selected using decays with three charged pions. No significant excess is observed, and an upper l… ▽ More A search for the lepton-flavor violating decays $B^0_s\toφμ^\pmτ^\mp$ is presented, using a sample of proton-proton collisions at center-of-mass energies of 7, 8, and 13 TeV, collected with the LHCb detector and corresponding to a total integrated luminosity of $9\,\text{fb}^{-1}$. The $τ$ leptons are selected using decays with three charged pions. No significant excess is observed, and an upper limit on the branching fraction is determined to be ${\cal B}( B^0_s\toφμ^\pmτ^\mp) < 1.0\times 10^{-5}$ at 90% confidence level. △ Less

Submitted 21 May, 2024; originally announced May 2024.

Comments: All figures and tables, along with any supplementary material and additional information, are available at https://cern.ch/lhcbproject/Publications/p/LHCb-PAPER-2024-006.html (LHCb public pages)

Report number: LHCb-PAPER-2024-006, CERN-EP-2024-114

arXiv:2405.12809 [pdf, other]

Precision measurement of the branching fraction of \boldmath $J/ψ\rightarrow K^+K^-$ via $ψ(2S)\rightarrow π^+π^-J/ψ$

Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, X. C. Ai, R. Aliberti, A. Amoroso, M. R. An, Q. An, Y. Bai, O. Bakina, I. Balossino, Y. Ban, H. -R. Bao, V. Batozskaya, K. Begzsuren, N. Berger, M. Berlowski, M. Bertani, D. Bettoni, F. Bianchi, E. Bianco, A. Bortone, I. Boyko, R. A. Briere , et al. (604 additional authors not shown)

Abstract: Using a sample of $448.1 \times 10^6$ $ψ(2S)$ events collected with the BESIII detector, we perform a study of the decay $J/ψ\rightarrow K^+K^-$ via $ψ(2S)\rightarrow π^+π^-J/ψ$. The branching fraction of $J/ψ\rightarrow K^+K^-$ is determined to be $\mathcal{B}_{K^+K^-}=(3.072\pm 0.023({\rm stat.})\pm 0.050({\rm syst.}))\times 10^{-4}$, which is consistent with previous measurements but with sig… ▽ More Using a sample of $448.1 \times 10^6$ $ψ(2S)$ events collected with the BESIII detector, we perform a study of the decay $J/ψ\rightarrow K^+K^-$ via $ψ(2S)\rightarrow π^+π^-J/ψ$. The branching fraction of $J/ψ\rightarrow K^+K^-$ is determined to be $\mathcal{B}_{K^+K^-}=(3.072\pm 0.023({\rm stat.})\pm 0.050({\rm syst.}))\times 10^{-4}$, which is consistent with previous measurements but with significantly improved precision. △ Less

Submitted 21 May, 2024; originally announced May 2024.

Comments: to be submitted to PRD

arXiv:2405.12688 [pdf, other]

Study of $b$-hadron decays to $Λ_c^+ h^- h^{\prime -}$ final states

Authors: LHCb collaboration, R. Aaij, A. S. W. Abdelmotteleb, C. Abellan Beteta, F. Abudinén, T. Ackernley, A. A. Adefisoye, B. Adeva, M. Adinolfi, P. Adlarson, C. Agapopoulou, C. A. Aidala, Z. Ajaltouni, S. Akar, K. Akiba, P. Albicocco, J. Albrecht, F. Alessio, M. Alexander, Z. Aliouche, P. Alvarez Cartelle, R. Amalric, S. Amato, J. L. Amey, Y. Amhis , et al. (1072 additional authors not shown)

Abstract: Decays of $Ξ_b^-$ and $Ω_b^-$ baryons to $Λ_c^+ h^- h^{\prime -}$ final states, with $h^- h^{\prime -}$ being $π^-π^-$, $K^-π^-$ and $K^-K^-$ meson pairs, are searched for using data collected with the LHCb detector. The data sample studied corresponds to an integrated luminosity of $8.7\,\mathrm{fb}^{-1}$ of $pp$ collisions collected at centre-of-mass energies $\sqrt{s} = 7$, $8$ and… ▽ More Decays of $Ξ_b^-$ and $Ω_b^-$ baryons to $Λ_c^+ h^- h^{\prime -}$ final states, with $h^- h^{\prime -}$ being $π^-π^-$, $K^-π^-$ and $K^-K^-$ meson pairs, are searched for using data collected with the LHCb detector. The data sample studied corresponds to an integrated luminosity of $8.7\,\mathrm{fb}^{-1}$ of $pp$ collisions collected at centre-of-mass energies $\sqrt{s} = 7$, $8$ and $13\,\mathrm{Te\kern -0.1em V}$. The products of the relative branching fractions and fragmentation fractions for each signal mode, relative to the $B^- \to Λ_c^+ \overline{p} π^-$ mode, are measured, with $Ξ_{b}^- \toΛ_{c}^+ K^- π^-$, $Ξ_{b}^- \toΛ_{c}^+ K^- K^-$ and $Ω_{b}^- \toΛ_{c}^+ K^- K^-$ decays being observed at over $5\,σ$ significance. The $Ξ_{b}^- \toΛ_{c}^+ K^- π^-$ mode is also used to measure the $Ξ_{b}^-$ production asymmetry, which is found to be consistent with zero. In addition, the $B^- \to Λ_{c}^+ \overline{p} K^-$ decay is observed for the first time, and its branching fraction is measured relative to that of the $B^- \to Λ_{c}^+ \overline{p} π^-$ mode. △ Less

Submitted 22 May, 2024; v1 submitted 21 May, 2024; originally announced May 2024.

Comments: All figures and tables, along with any supplementary material and additional information, are available at https://cern.ch/lhcbproject/Publications/p/LHCb-PAPER-2024-013.html

Report number: CERN-EP-2024-116, LHCb-PAPER-2024-013

arXiv:2405.12575 [pdf, other]

Three-dimensional map** and electronic origin of large altermagnetic splitting near Fermi level in CrSb

Authors: Guowei Yang, Zhanghuan Li, Sai Yang, Jiyuan Li, Hao Zheng, Weifan Zhu, Saizheng Cao, Wenxuan Zhao, Jiawen Zhang, Mao Ye, Yu Song, Lun-Hui Hu, Lexian Yang, Ming Shi, Huiqiu Yuan, Yongjun Zhang, Yuanfeng Xu, Yang Liu

Abstract: Recently, a new kind of collinear magnetism, dubbed altermagnetism, has attracted considerable interests. A key characteristic of altermagnet is the momentum-dependent band and spin splitting without net magnetization. However, finding altermagnetic materials with large splitting near the Fermi level, which necessarily requires three-dimensional k-space map** and is crucial for spintronic applic… ▽ More Recently, a new kind of collinear magnetism, dubbed altermagnetism, has attracted considerable interests. A key characteristic of altermagnet is the momentum-dependent band and spin splitting without net magnetization. However, finding altermagnetic materials with large splitting near the Fermi level, which necessarily requires three-dimensional k-space map** and is crucial for spintronic applications and emergent phenomena, remains challenging. Here by employing synchrotron-based angle-resolved photoemission spectroscopy (ARPES) and model calculations, we uncover a large altermagnetic splitting, up to ~1.0 eV, near the Fermi level in CrSb. We verify its bulk-type g-wave altermagnetism through systematic three-dimensional kspace map**, which unambiguously reveals the altermagnetic symmetry and associated nodal planes. The ARPES results are well captured by density functional theory calculations. In addition, tight-binding model analysis indicate that the large altermagnetic splitting arises from strong third-nearest-neighbor hop** mediated by Sb ions, which breaks both the space-time reversal symmetry and the translational spin-rotation symmetry. The large band/spin splitting near Fermi level in metallic CrSb, together with its high TN (up to 705 K) and simple spin configuration, paves the way for exploring emergent phenomena and spintronic applications based on altermagnets. △ Less

Submitted 21 May, 2024; originally announced May 2024.

Comments: 16 pages, 4 figures and 1 table

arXiv:2405.11585 [pdf, other]

Improved measurement of the branching fraction of $h_{c}\rightarrowγη^\prime/η$ and search for $h_{c}\rightarrowγπ^0$

Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, O. Afedulidis, X. C. Ai, R. Aliberti, A. Amoroso, Q. An, Y. Bai, O. Bakina, I. Balossino, Y. Ban, H. -R. Bao, V. Batozskaya, K. Begzsuren, N. Berger, M. Berlowski, M. Bertani, D. Bettoni, F. Bianchi, E. Bianco, A. Bortone, I. Boyko, R. A. Briere , et al. (645 additional authors not shown)

Abstract: The processes $h_c\rightarrowγP(P = η^\prime,~η,~π^{0}))$ are studied with a sample of $(27.12\pm0.14)\times10^{8}$ $ψ(3686)$ events collected by the BESIII detector at the BEPCII collider. The branching fractions of $h_c\rightarrowγη^\prime$ and $h_c\rightarrowγη$ are measured to be $(1.40\pm0.11\pm0.04\pm0.10)\times10^{-3}$ and $(3.77\pm0.55\pm0.13\pm0.26)\times10^{-4}$, respectively, where the… ▽ More The processes $h_c\rightarrowγP(P = η^\prime,~η,~π^{0}))$ are studied with a sample of $(27.12\pm0.14)\times10^{8}$ $ψ(3686)$ events collected by the BESIII detector at the BEPCII collider. The branching fractions of $h_c\rightarrowγη^\prime$ and $h_c\rightarrowγη$ are measured to be $(1.40\pm0.11\pm0.04\pm0.10)\times10^{-3}$ and $(3.77\pm0.55\pm0.13\pm0.26)\times10^{-4}$, respectively, where the first uncertainties are statistical, the second systematic, and the third from the branching fraction of $ψ(3686)\rightarrowπ^{0}h_c$. The ratio $R_{h_c}=\frac{\mathscr{B}(h_c\rightarrowγη)}{\mathscr{B}(h_c\rightarrowγη^\prime)}$ is calculated to be $(27.0\pm4.4\pm1.0)\%$. The measurements are consistent with the previous results with improved precision by a factor of 2. The results are valuable for gaining a deeper understanding of $η-η^\prime$ mixing, and its manifestation within quantum chromodynamics. No significant signal is found for the decay $h_c\rightarrowγπ^{0}$, and an upper limit is placed on its branching fraction of $\mathscr{B}(h_c\rightarrowγπ^{0})<5.0\times10^{-5}$, at the 90\% confidence level. △ Less

Submitted 19 May, 2024; originally announced May 2024.

arXiv:2405.11324 [pdf, other]

Transverse polarization measurement of $Λ$ hyperons in $p$Ne collisions at $\sqrt{s_{NN}}$ = 68.4 GeV with the $\mbox{LHCb}$ detector

Authors: LHCb collaboration, R. Aaij, A. S. W. Abdelmotteleb, C. Abellan Beteta, F. Abudinén, T. Ackernley, A. A. Adefisoye, B. Adeva, M. Adinolfi, P. Adlarson, C. Agapopoulou, C. A. Aidala, Z. Ajaltouni, S. Akar, K. Akiba, P. Albicocco, J. Albrecht, F. Alessio, M. Alexander, Z. Aliouche, P. Alvarez Cartelle, R. Amalric, S. Amato, J. L. Amey, Y. Amhis , et al. (1065 additional authors not shown)

Abstract: A measurement of the transverse polarization of the $Λ$ and $\barΛ$ hyperons in $p$Ne fixed-target collisions at $\sqrt{s_{NN}}$ = 68.4 GeV is presented using data collected by the LHCb detector. The polarization is studied using the decay $Λ\rightarrow p π^-$ together with its charge conjugated process, the integrated values measured are… ▽ More A measurement of the transverse polarization of the $Λ$ and $\barΛ$ hyperons in $p$Ne fixed-target collisions at $\sqrt{s_{NN}}$ = 68.4 GeV is presented using data collected by the LHCb detector. The polarization is studied using the decay $Λ\rightarrow p π^-$ together with its charge conjugated process, the integrated values measured are $$ P_Λ = 0.029 \pm 0.019 \, (\rm{stat}) \pm 0.012 \, (\rm{syst}) \, , $$ $$ P_{\barΛ} = 0.003 \pm 0.023 \, (\rm{stat}) \pm 0.014 \,(\rm{syst}) \,. $$ Furthermore, the results are shown as a function of the Feynman~$x$~variable, transverse momentum, pseudorapidity and rapidity of the hyperons, and are compared with previous measurements. △ Less

Submitted 24 May, 2024; v1 submitted 18 May, 2024; originally announced May 2024.

Comments: All figures and tables, along with any supplementary material and additional information, are available at https://lbfence.cern.ch/alcm/public/analysis/full-details/3120 (LHCb public pages)

Report number: CERN-EP-2024-121, LHCb-PAPER-2024-009

arXiv:2405.11165 [pdf, other]

Automated Multi-level Preference for MLLMs

Authors: Mengxi Zhang, Wenhao Wu, Yu Lu, Yuxin Song, Kang Rong, Huan** Yao, Jianbo Zhao, Fanglong Liu, Yifan Sun, Haocheng Feng, **gdong Wang

Abstract: Current multimodal Large Language Models (MLLMs) suffer from ``hallucination'', occasionally generating responses that are not grounded in the input images. To tackle this challenge, one promising path is to utilize reinforcement learning from human feedback (RLHF), which steers MLLMs towards learning superior responses while avoiding inferior ones. We rethink the common practice of using binary p… ▽ More Current multimodal Large Language Models (MLLMs) suffer from ``hallucination'', occasionally generating responses that are not grounded in the input images. To tackle this challenge, one promising path is to utilize reinforcement learning from human feedback (RLHF), which steers MLLMs towards learning superior responses while avoiding inferior ones. We rethink the common practice of using binary preferences (i.e., superior, inferior), and find that adopting multi-level preferences (e.g., superior, medium, inferior) is better for two benefits: 1) It narrows the gap between adjacent levels, thereby encouraging MLLMs to discern subtle differences. 2) It further integrates cross-level comparisons (beyond adjacent-level comparisons), thus providing a broader range of comparisons with hallucination examples. To verify our viewpoint, we present the Automated Multi-level Preference (AMP) framework for MLLMs. To facilitate this framework, we first develop an automated dataset generation pipeline that provides high-quality multi-level preference datasets without any human annotators. Furthermore, we design the Multi-level Direct Preference Optimization (MDPO) algorithm to robustly conduct complex multi-level preference learning. Additionally, we propose a new hallucination benchmark, MRHal-Bench. Extensive experiments across public hallucination and general benchmarks, as well as our MRHal-Bench, demonstrate the effectiveness of our proposed method. Code is available at https://github.com/takomc/amp. △ Less

Submitted 28 May, 2024; v1 submitted 17 May, 2024; originally announced May 2024.

Comments: Preprint

arXiv:2405.10890 [pdf, other]

A Versatile Framework for Analyzing Galaxy Image Data by Implanting Human-in-the-loop on a Large Vision Model

Authors: Mingxiang Fu, Yu Song, Jiameng Lv, Liang Cao, Peng Jia, Nan Li, Xiangru Li, Jifeng Liu, A-Li Luo, Bo Qiu, Shiyin Shen, Liang** Tu, Lili Wang, Shoulin Wei, Haifeng Yang, Zhen** Yi, Zhiqiang Zou

Abstract: The exponential growth of astronomical datasets provides an unprecedented opportunity for humans to gain insight into the Universe. However, effectively analyzing this vast amount of data poses a significant challenge. Astronomers are turning to deep learning techniques to address this, but the methods are limited by their specific training sets, leading to considerable duplicate workloads too. He… ▽ More The exponential growth of astronomical datasets provides an unprecedented opportunity for humans to gain insight into the Universe. However, effectively analyzing this vast amount of data poses a significant challenge. Astronomers are turning to deep learning techniques to address this, but the methods are limited by their specific training sets, leading to considerable duplicate workloads too. Hence, as an example to present how to overcome the issue, we built a framework for general analysis of galaxy images, based on a large vision model (LVM) plus downstream tasks (DST), including galaxy morphological classification, image restoration, object detection, parameter extraction, and more. Considering the low signal-to-noise ratio of galaxy images and the imbalanced distribution of galaxy categories, we have incorporated a Human-in-the-loop (HITL) module into our large vision model, which leverages human knowledge to enhance the reliability and interpretability of processing galaxy images interactively. The proposed framework exhibits notable few-shot learning capabilities and versatile adaptability to all the abovementioned tasks on galaxy images in the DESI legacy imaging surveys. Expressly, for object detection, trained by 1000 data points, our DST upon the LVM achieves an accuracy of 96.7%, while ResNet50 plus Mask R-CNN gives an accuracy of 93.1%; for morphology classification, to obtain AUC ~0.9, LVM plus DST and HITL only requests 1/50 training sets compared to ResNet18. Expectedly, multimodal data can be integrated similarly, which opens up possibilities for conducting joint analyses with datasets spanning diverse domains in the era of multi-message astronomy. △ Less

Submitted 17 May, 2024; originally announced May 2024.

Comments: 26 pages, 10 figures, to be published on Chinese Physics C

arXiv:2405.10140 [pdf, other]

Libra: Building Decoupled Vision System on Large Language Models

Authors: Yifan Xu, Xiaoshan Yang, Yaguang Song, Changsheng Xu

Abstract: In this work, we introduce Libra, a prototype model with a decoupled vision system on a large language model (LLM). The decoupled vision system decouples inner-modal modeling and cross-modal interaction, yielding unique visual information modeling and effective cross-modal comprehension. Libra is trained through discrete auto-regressive modeling on both vision and language inputs. Specifically, we… ▽ More In this work, we introduce Libra, a prototype model with a decoupled vision system on a large language model (LLM). The decoupled vision system decouples inner-modal modeling and cross-modal interaction, yielding unique visual information modeling and effective cross-modal comprehension. Libra is trained through discrete auto-regressive modeling on both vision and language inputs. Specifically, we incorporate a routed visual expert with a cross-modal bridge module into a pretrained LLM to route the vision and language flows during attention computing to enable different attention patterns in inner-modal modeling and cross-modal interaction scenarios. Experimental results demonstrate that the dedicated design of Libra achieves a strong MLLM baseline that rivals existing works in the image-to-text scenario with merely 50 million training data, providing a new perspective for future multimodal foundation models. Code is available at https://github.com/YifanXu74/Libra. △ Less

Submitted 16 May, 2024; originally announced May 2024.

Comments: ICML2024

arXiv:2405.09820 [pdf, other]

Densely Distilling Cumulative Knowledge for Continual Learning

Authors: Zenglin Shi, Pei Liu, Tong Su, Yunpeng Wu, Kuien Liu, Yu Song, Meng Wang

Abstract: Continual learning, involving sequential training on diverse tasks, often faces catastrophic forgetting. While knowledge distillation-based approaches exhibit notable success in preventing forgetting, we pinpoint a limitation in their ability to distill the cumulative knowledge of all the previous tasks. To remedy this, we propose Dense Knowledge Distillation (DKD). DKD uses a task pool to track t… ▽ More Continual learning, involving sequential training on diverse tasks, often faces catastrophic forgetting. While knowledge distillation-based approaches exhibit notable success in preventing forgetting, we pinpoint a limitation in their ability to distill the cumulative knowledge of all the previous tasks. To remedy this, we propose Dense Knowledge Distillation (DKD). DKD uses a task pool to track the model's capabilities. It partitions the output logits of the model into dense groups, each corresponding to a task in the task pool. It then distills all tasks' knowledge using all groups. However, using all the groups can be computationally expensive, we also suggest random group selection in each optimization step. Moreover, we propose an adaptive weighting scheme, which balances the learning of new classes and the retention of old classes, based on the count and similarity of the classes. Our DKD outperforms recent state-of-the-art baselines across diverse benchmarks and scenarios. Empirical analysis underscores DKD's ability to enhance model stability, promote flatter minima for improved generalization, and remains robust across various memory budgets and task orders. Moreover, it seamlessly integrates with other CL methods to boost performance and proves versatile in offline scenarios like model compression. △ Less

Submitted 16 May, 2024; originally announced May 2024.

Comments: 12 pages; Continual Leanrning; Class-incremental Learning; Knowledge Distillation; Forgetting

arXiv:2405.09510 [pdf, other]

The Instrumental Variable Model with Categorical Instrument, Treatment and Outcome

Authors: Yilin Song, K. C. Gary Chan, Thomas S. Richardson

Abstract: Instrumental variable models are central to the inference of causal effects in many settings. We consider the instrumental variable model with discrete variables where the instrument (Z), exposure (X) and outcome (Y) take Q, K, and M levels respectively. We assume that the instrument is randomized and that there is no direct effect of Z on Y so that Y(x,z) = Y(x). We first provide a simple charact… ▽ More Instrumental variable models are central to the inference of causal effects in many settings. We consider the instrumental variable model with discrete variables where the instrument (Z), exposure (X) and outcome (Y) take Q, K, and M levels respectively. We assume that the instrument is randomized and that there is no direct effect of Z on Y so that Y(x,z) = Y(x). We first provide a simple characterization of the set of joint distributions of the potential outcomes P(Y(x=1), ..., Y(x=K)) compatible with a given observed distribution P(X, Y | Z). We then discuss the variation (in)dependence property of the marginal probability distribution of the potential outcomes P(Y(x=1)), ..., P(Y(x=K)) which has direct implications for partial identification of average causal effect contrasts such as E[Y(x=i) - Y(x=j)]. We also include simulation results on the volume of the observed distributions not compatible with the IV model as K and Q change. △ Less

Submitted 27 June, 2024; v1 submitted 15 May, 2024; originally announced May 2024.

arXiv:2405.09359 [pdf, other]

Visual Attention Based Cognitive Human-Robot Collaboration for Pedicle Screw Placement in Robot-Assisted Orthopedic Surgery

Authors: Chen Chen, Qikai Zou, Yuhang Song, Shiji Song, Xiang Li

Abstract: Current orthopedic robotic systems largely focus on navigation, aiding surgeons in positioning a guiding tube but still requiring manual drilling and screw placement. The automation of this task not only demands high precision and safety due to the intricate physical interactions between the surgical tool and bone but also poses significant risks when executed without adequate human oversight. As… ▽ More Current orthopedic robotic systems largely focus on navigation, aiding surgeons in positioning a guiding tube but still requiring manual drilling and screw placement. The automation of this task not only demands high precision and safety due to the intricate physical interactions between the surgical tool and bone but also poses significant risks when executed without adequate human oversight. As it involves continuous physical interaction, the robot should collaborate with the surgeon, understand the human intent, and always include the surgeon in the loop. To achieve this, this paper proposes a new cognitive human-robot collaboration framework, including the intuitive AR-haptic human-robot interface, the visual-attention-based surgeon model, and the shared interaction control scheme for the robot. User studies on a robotic platform for orthopedic surgery are presented to illustrate the performance of the proposed method. The results demonstrate that the proposed human-robot collaboration framework outperforms full robot and full human control in terms of safety and ergonomics. △ Less

Submitted 15 May, 2024; originally announced May 2024.

Comments: 7 pages, 8 figures, submitted to IROS 2024

arXiv:2405.09066 [pdf, other]

Search for the leptonic decays $D^{*+}\to e^+ν_e$ and $D^{*+}\to μ^+ν_μ$

Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, M. Albrecht, R. Aliberti, A. Amoroso, M. R. An, Q. An, Y. Bai, O. Bakina, R. Baldini Ferroli, I. Balossino, Y. Ban, V. Batozskaya, D. Becker, K. Begzsuren, N. Berger, M. Bertani, D. Bettoni, F. Bianchi, E. Bianco, J. Bloms, A. Bortone, I. Boyko , et al. (559 additional authors not shown)

Abstract: We present the first search for the leptonic decays $D^{*+}\to e^+ν_e$ and $D^{*+}\to μ^+ν_μ$ by analyzing a data sample of electron-positron collisions recorded with the BESIII detector at center-of-mass energies between 4.178 and 4.226 GeV, corresponding to an integrated luminosity of 6.32~fb$^{-1}$. No significant signal is observed. The upper limits on the branching fractions for… ▽ More We present the first search for the leptonic decays $D^{*+}\to e^+ν_e$ and $D^{*+}\to μ^+ν_μ$ by analyzing a data sample of electron-positron collisions recorded with the BESIII detector at center-of-mass energies between 4.178 and 4.226 GeV, corresponding to an integrated luminosity of 6.32~fb$^{-1}$. No significant signal is observed. The upper limits on the branching fractions for $D^{*+}\to e^+ν_e$ and $D^{*+}\to μ^+ν_μ$ are set to be $1.1 \times 10^{-5}$ and $4.3 \times 10^{-6}$ at 90\% confidence level, respectively. △ Less

Submitted 14 May, 2024; originally announced May 2024.

Comments: 14 pages, 7 figures

arXiv:2405.09050 [pdf, other]

3D Shape Augmentation with Content-Aware Shape Resizing

Authors: Mingxiang Chen, Jian Zhang, Boli Zhou, Yang Song

Abstract: Recent advancements in deep learning for 3D models have propelled breakthroughs in generation, detection, and scene understanding. However, the effectiveness of these algorithms hinges on large training datasets. We address the challenge by introducing Efficient 3D Seam Carving (E3SC), a novel 3D model augmentation method based on seam carving, which progressively deforms only part of the input mo… ▽ More Recent advancements in deep learning for 3D models have propelled breakthroughs in generation, detection, and scene understanding. However, the effectiveness of these algorithms hinges on large training datasets. We address the challenge by introducing Efficient 3D Seam Carving (E3SC), a novel 3D model augmentation method based on seam carving, which progressively deforms only part of the input model while ensuring the overall semantics are unchanged. Experiments show that our approach is capable of producing diverse and high-quality augmented 3D shapes across various types and styles of input models, achieving considerable improvements over previous methods. Quantitative evaluations demonstrate that our method effectively enhances the novelty and quality of shapes generated by other subsequent 3D generation algorithms. △ Less

Submitted 14 May, 2024; originally announced May 2024.

arXiv:2405.08882 [pdf, other]

Lollipop: SVM Rollups on Solana

Authors: Irvin Steve Cardenas, Yugart Song

Abstract: We present a formal specification for the implementation of Solana virtual machine (SVM) rollups deployed on top of the Solana Layer 1 (L1) blockchain. We further discuss our motivation, implementation, design decisions, limitations, and preliminary results. Overall, this paper is intended to serve as an initial introduction to building such system(s) on top of the Solana L1 blockchain, but does n… ▽ More We present a formal specification for the implementation of Solana virtual machine (SVM) rollups deployed on top of the Solana Layer 1 (L1) blockchain. We further discuss our motivation, implementation, design decisions, limitations, and preliminary results. Overall, this paper is intended to serve as an initial introduction to building such system(s) on top of the Solana L1 blockchain, but does not represent an absolute. Lastly, we comment discuss on extensions of this specification to support SVM rollups on other well-established L1 blockchains systems such as Ethereum. △ Less

Submitted 14 May, 2024; originally announced May 2024.

arXiv:2405.08570 [pdf, other]

Rethinking the adaptive relationship between Encoder Layers and Decoder Layers

Authors: Yubo Song

Abstract: This article explores the adaptive relationship between Encoder Layers and Decoder Layers using the SOTA model Helsinki-NLP/opus-mt-de-en, which translates German to English. The specific method involves introducing a bias-free fully connected layer between the Encoder and Decoder, with different initializations of the layer's weights, and observing the outcomes of fine-tuning versus retraining. F… ▽ More This article explores the adaptive relationship between Encoder Layers and Decoder Layers using the SOTA model Helsinki-NLP/opus-mt-de-en, which translates German to English. The specific method involves introducing a bias-free fully connected layer between the Encoder and Decoder, with different initializations of the layer's weights, and observing the outcomes of fine-tuning versus retraining. Four experiments were conducted in total. The results suggest that directly modifying the pre-trained model structure for fine-tuning yields suboptimal performance. However, upon observing the outcomes of the experiments with retraining, this structural adjustment shows significant potential. △ Less

Submitted 14 May, 2024; originally announced May 2024.

arXiv:2405.08419 [pdf, other]

WaterMamba: Visual State Space Model for Underwater Image Enhancement

Authors: Meisheng Guan, Haiyong Xu, Gangyi Jiang, Mei Yu, Yeyao Chen, Ting Luo, Yang Song

Abstract: Underwater imaging often suffers from low quality due to factors affecting light propagation and absorption in water. To improve image quality, some underwater image enhancement (UIE) methods based on convolutional neural networks (CNN) and Transformer have been proposed. However, CNN-based UIE methods are limited in modeling long-range dependencies, and Transformer-based methods involve a large n… ▽ More Underwater imaging often suffers from low quality due to factors affecting light propagation and absorption in water. To improve image quality, some underwater image enhancement (UIE) methods based on convolutional neural networks (CNN) and Transformer have been proposed. However, CNN-based UIE methods are limited in modeling long-range dependencies, and Transformer-based methods involve a large number of parameters and complex self-attention mechanisms, posing efficiency challenges. Considering computational complexity and severe underwater image degradation, a state space model (SSM) with linear computational complexity for UIE, named WaterMamba, is proposed. We propose spatial-channel omnidirectional selective scan (SCOSS) blocks comprising spatial-channel coordinate omnidirectional selective scan (SCCOSS) modules and a multi-scale feedforward network (MSFFN). The SCOSS block models pixel and channel information flow, addressing dependencies. The MSFFN facilitates information flow adjustment and promotes synchronized operations within SCCOSS modules. Extensive experiments showcase WaterMamba's cutting-edge performance with reduced parameters and computational resources, outperforming state-of-the-art methods on various datasets, validating its effectiveness and generalizability. The code will be released on GitHub after acceptance. △ Less

Submitted 14 May, 2024; originally announced May 2024.

Comments: arXiv admin note: substantial text overlap with arXiv:2403.06098

arXiv:2405.07741 [pdf, other]

Search for the radiative transition $χ_{c1}(3872)\toγψ_2(3823)$

Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, O. Afedulidis, X. C. Ai, R. Aliberti, A. Amoroso, M. R. An, Q. An, Y. Bai, O. Bakina, I. Balossino, Y. Ban, H. -R. Bao, V. Batozskaya, K. Begzsuren, N. Berger, M. Berlowski, M. Bertani, D. Bettoni, F. Bianchi, E. Bianco, A. Bortone, I. Boyko , et al. (635 additional authors not shown)

Abstract: Using 9.0 $\rm fb^{-1}$ of $e^+e^-$ collision data collected at center-of-mass energies from 4.178 to 4.278 GeV with the BESIII detector at the BEPCII collider, we perform the first search for the radiative transition $χ_{c1}(3872)\toγψ_2(3823)$. No $χ_{c1}(3872)\toγψ_2(3823)$ signal is observed. The upper limit on the ratio of branching fractions… ▽ More Using 9.0 $\rm fb^{-1}$ of $e^+e^-$ collision data collected at center-of-mass energies from 4.178 to 4.278 GeV with the BESIII detector at the BEPCII collider, we perform the first search for the radiative transition $χ_{c1}(3872)\toγψ_2(3823)$. No $χ_{c1}(3872)\toγψ_2(3823)$ signal is observed. The upper limit on the ratio of branching fractions $\mathcal{B}(χ_{c1}(3872)\toγψ_2(3823), ψ_2(3823)\toγχ_{c1})/\mathcal{B}(χ_{c1}(3872)\toπ^+π^- J/ψ)$ is set as 0.075 at the 90\% confidence level. Our result contradicts theoretical predictions under the assumption that the $χ_{c1}(3872)$ is the pure charmonium state $χ_{c1}(2P)$. △ Less

Submitted 13 May, 2024; originally announced May 2024.

Comments: 8 pages, 2 figures

arXiv:2405.07667 [pdf, other]

Backdoor Removal for Generative Large Language Models

Authors: Haoran Li, Yulin Chen, Zihao Zheng, Qi Hu, Chunkit Chan, Heshan Liu, Yangqiu Song

Abstract: With rapid advances, generative large language models (LLMs) dominate various Natural Language Processing (NLP) tasks from understanding to reasoning. Yet, language models' inherent vulnerabilities may be exacerbated due to increased accessibility and unrestricted model training on massive textual data from the Internet. A malicious adversary may publish poisoned data online and conduct backdoor a… ▽ More With rapid advances, generative large language models (LLMs) dominate various Natural Language Processing (NLP) tasks from understanding to reasoning. Yet, language models' inherent vulnerabilities may be exacerbated due to increased accessibility and unrestricted model training on massive textual data from the Internet. A malicious adversary may publish poisoned data online and conduct backdoor attacks on the victim LLMs pre-trained on the poisoned data. Backdoored LLMs behave innocuously for normal queries and generate harmful responses when the backdoor trigger is activated. Despite significant efforts paid to LLMs' safety issues, LLMs are still struggling against backdoor attacks. As Anthropic recently revealed, existing safety training strategies, including supervised fine-tuning (SFT) and Reinforcement Learning from Human Feedback (RLHF), fail to revoke the backdoors once the LLM is backdoored during the pre-training stage. In this paper, we present Simulate and Eliminate (SANDE) to erase the undesired backdoored map**s for generative LLMs. We initially propose Overwrite Supervised Fine-tuning (OSFT) for effective backdoor removal when the trigger is known. Then, to handle the scenarios where the trigger patterns are unknown, we integrate OSFT into our two-stage framework, SANDE. Unlike previous works that center on the identification of backdoors, our safety-enhanced LLMs are able to behave normally even when the exact triggers are activated. We conduct comprehensive experiments to show that our proposed SANDE is effective against backdoor attacks while bringing minimal harm to LLMs' powerful capability without any additional access to unbackdoored clean models. We will release the reproducible code. △ Less

Submitted 13 May, 2024; originally announced May 2024.

arXiv:2405.07497 [pdf, other]

Towards Subgraph Isomorphism Counting with Graph Kernels

Authors: Xin Liu, Weiqi Wang, Jiaxin Bai, Yangqiu Song

Abstract: Subgraph isomorphism counting is known as #P-complete and requires exponential time to find the accurate solution. Utilizing representation learning has been shown as a promising direction to represent substructures and approximate the solution. Graph kernels that implicitly capture the correlations among substructures in diverse graphs have exhibited great discriminative power in graph classifica… ▽ More Subgraph isomorphism counting is known as #P-complete and requires exponential time to find the accurate solution. Utilizing representation learning has been shown as a promising direction to represent substructures and approximate the solution. Graph kernels that implicitly capture the correlations among substructures in diverse graphs have exhibited great discriminative power in graph classification, so we pioneeringly investigate their potential in counting subgraph isomorphisms and further explore the augmentation of kernel capability through various variants, including polynomial and Gaussian kernels. Through comprehensive analysis, we enhance the graph kernels by incorporating neighborhood information. Finally, we present the results of extensive experiments to demonstrate the effectiveness of the enhanced graph kernels and discuss promising directions for future research. △ Less

Submitted 13 May, 2024; originally announced May 2024.

arXiv:2405.07220 [pdf, other]

On Discovery of Local Independence over Continuous Variables via Neural Contextual Decomposition

Authors: Inwoo Hwang, Yunhyeok Kwak, Yeon-Ji Song, Byoung-Tak Zhang, Sanghack Lee

Abstract: Conditional independence provides a way to understand causal relationships among the variables of interest. An underlying system may exhibit more fine-grained causal relationships especially between a variable and its parents, which will be called the local independence relationships. One of the most widely studied local relationships is Context-Specific Independence (CSI), which holds in a specif… ▽ More Conditional independence provides a way to understand causal relationships among the variables of interest. An underlying system may exhibit more fine-grained causal relationships especially between a variable and its parents, which will be called the local independence relationships. One of the most widely studied local relationships is Context-Specific Independence (CSI), which holds in a specific assignment of conditioned variables. However, its applicability is often limited since it does not allow continuous variables: data conditioned to the specific value of a continuous variable contains few instances, if not none, making it infeasible to test independence. In this work, we define and characterize the local independence relationship that holds in a specific set of joint assignments of parental variables, which we call context-set specific independence (CSSI). We then provide a canonical representation of CSSI and prove its fundamental properties. Based on our theoretical findings, we cast the problem of discovering multiple CSSI relationships in a system as finding a partition of the joint outcome space. Finally, we propose a novel method, coined neural contextual decomposition (NCD), which learns such partition by imposing each set to induce CSSI via modeling a conditional distribution. We empirically demonstrate that the proposed method successfully discovers the ground truth local independence relationships in both synthetic dataset and complex system reflecting the real-world physical dynamics. △ Less

Submitted 12 May, 2024; originally announced May 2024.

Comments: Conference on Causal Learning and Reasoning (CLeaR), 2023

arXiv:2405.06894 [pdf]

Dual-grating single-shot pump-probe technique

Authors: Tianchen Yu, Junyi Yang, Wenfa Zhou, Zhongguo Li, Xingzhi Wu, Yu Fang, Yong Yang, Yinglin Song

Abstract: A simple and effective single-shot pump-probe technique is reported for studying the ultrafast dynamic processes in various materials. Using only two commercial gratings, a large time window of ~ 95.58 ps is spatially encoded in a single probe pulse, and single-shot time-resolved measurements are implemented. This time window exceeds the maximum reported values for single-shot pump-probe technique… ▽ More A simple and effective single-shot pump-probe technique is reported for studying the ultrafast dynamic processes in various materials. Using only two commercial gratings, a large time window of ~ 95.58 ps is spatially encoded in a single probe pulse, and single-shot time-resolved measurements are implemented. This time window exceeds the maximum reported values for single-shot pump-probe techniques using the echelon or angle beam encoding strategy. The phase difference problem in the echelon encoding strategies is also eliminated and a customized echelon is not needed in this technique. The ultrafast dynamic processes of ZnSe and indolium squaraine at a wavelength of 650 nm were investigated using this technique. △ Less

Submitted 31 May, 2024; v1 submitted 10 May, 2024; originally announced May 2024.

Comments: 14 pages, 4 figures

arXiv:2405.06654 [pdf, other]

PROflow: An iterative refinement model for PROTAC-induced structure prediction

Authors: Bo Qiang, Wenxian Shi, Yuxuan Song, Menghua Wu

Abstract: Proteolysis targeting chimeras (PROTACs) are small molecules that trigger the breakdown of traditionally ``undruggable'' proteins by binding simultaneously to their targets and degradation-associated proteins. A key challenge in their rational design is understanding their structural basis of activity. Due to the lack of crystal structures (18 in the PDB), existing PROTAC docking methods have been… ▽ More Proteolysis targeting chimeras (PROTACs) are small molecules that trigger the breakdown of traditionally ``undruggable'' proteins by binding simultaneously to their targets and degradation-associated proteins. A key challenge in their rational design is understanding their structural basis of activity. Due to the lack of crystal structures (18 in the PDB), existing PROTAC docking methods have been forced to simplify the problem into a distance-constrained protein-protein docking task. To address the data issue, we develop a novel pseudo-data generation scheme that requires only binary protein-protein complexes. This new dataset enables PROflow, an iterative refinement model for PROTAC-induced structure prediction that models the full PROTAC flexibility during constrained protein-protein docking. PROflow outperforms the state-of-the-art across docking metrics and runtime. Its inference speed enables the large-scale screening of PROTAC designs, and computed properties of predicted structures achieve statistically significant correlations with published degradation activities. △ Less

Submitted 10 April, 2024; originally announced May 2024.

Comments: Published at the GEM workshop, ICLR 2024

arXiv:2405.06556 [pdf, other]

Search for time-dependent $CP$ violation in $D^0 \rightarrow π^+ π^- π^0$ decays

Authors: LHCb collaboration, R. Aaij, A. S. W. Abdelmotteleb, C. Abellan Beteta, F. Abudinén, T. Ackernley, A. A. Adefisoye, B. Adeva, M. Adinolfi, P. Adlarson, C. Agapopoulou, C. A. Aidala, Z. Ajaltouni, S. Akar, K. Akiba, P. Albicocco, J. Albrecht, F. Alessio, M. Alexander, Z. Aliouche, P. Alvarez Cartelle, R. Amalric, S. Amato, J. L. Amey, Y. Amhis , et al. (1062 additional authors not shown)

Abstract: A measurement of time-dependent $CP$ violation in $D^0 \rightarrow π^+ π^- π^0$ decays using a $pp$ collision data sample collected by the LHCb experiment in 2012 and from 2015 to 2018, corresponding to an integrated luminosity of 7.7$\,\mathrm{fb}^{-1}$, is presented. The initial flavour of each $D^0$ candidate is determined from the charge of the pion produced in the… ▽ More A measurement of time-dependent $CP$ violation in $D^0 \rightarrow π^+ π^- π^0$ decays using a $pp$ collision data sample collected by the LHCb experiment in 2012 and from 2015 to 2018, corresponding to an integrated luminosity of 7.7$\,\mathrm{fb}^{-1}$, is presented. The initial flavour of each $D^0$ candidate is determined from the charge of the pion produced in the $D^*(2010)^+ \rightarrow D^0 π^+$ decay. The decay $D^0 \rightarrow K^- π^+ π^0$ is used as a control channel to validate the measurement procedure. The gradient of the time-dependent $CP$ asymmetry, $ΔY$, in $D^0 \rightarrow π^+ π^- π^0$ decays is measured to be \begin{equation*} ΔY = (-1.3 \pm 6.3 \pm 2.4) \times 10^{-4}, \end{equation*} where the first uncertainty is statistical and the second is systematic, which is compatible with $CP$ conservation. △ Less

Submitted 10 May, 2024; originally announced May 2024.

Comments: All figures and tables, along with any supplementary material and additional information, are available at https://lhcbproject.web.cern.ch/Publications/p/LHCb-PAPER-2024-003.html (LHCb public pages)

Report number: LHCb-PAPER-2024-003, CERN-EP-2024-111

arXiv:2405.06393 [pdf, other]

Measurement of the ${e}^{+}{e}^{-}\to p \bar{p}π^{0}$ cross section at $\sqrt{s}=2.1000-3.0800$ GeV

Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, O. Afedulidis, X. C. Ai, R. Aliberti, A. Amoroso, Q. An, Y. Bai, O. Bakina, I. Balossino, Y. Ban, H. -R. Bao, V. Batozskaya, K. Begzsuren, N. Berger, M. Berlowski, M. Bertani, D. Bettoni, F. Bianchi, E. Bianco, A. Bortone, I. Boyko, R. A. Briere , et al. (639 additional authors not shown)

Abstract: The process $e^{+}e^{-}\to p\bar{p}π^{0}$ is studied at 20 center-of-mass energies ranging from 2.1000 to 3.0800 GeV using 636.8 pb$^{-1}$ of data collected with the BESIII detector operating at the BEPCII collider. The Born cross sections for $e^{+}e^{-}\to p\bar{p}π^{0}$ are measured with high precision. Since the lowest center-of-mass energy, 2.1000 GeV, is less than 90 MeV above the… ▽ More The process $e^{+}e^{-}\to p\bar{p}π^{0}$ is studied at 20 center-of-mass energies ranging from 2.1000 to 3.0800 GeV using 636.8 pb$^{-1}$ of data collected with the BESIII detector operating at the BEPCII collider. The Born cross sections for $e^{+}e^{-}\to p\bar{p}π^{0}$ are measured with high precision. Since the lowest center-of-mass energy, 2.1000 GeV, is less than 90 MeV above the $p\bar{p}π^0$ energy threshold, we can probe the threshold behavior for this reaction. However, no anomalous threshold enhancement is found in the cross sections for $e^{+}e^{-}\to p\bar{p}π^{0}$. △ Less

Submitted 10 May, 2024; originally announced May 2024.

arXiv:2405.05237 [pdf, other]

EVA-X: A Foundation Model for General Chest X-ray Analysis with Self-supervised Learning

Authors: **gfeng Yao, Xinggang Wang, Yuehao Song, Huangxuan Zhao, Jun Ma, Yajie Chen, Wenyu Liu, Bo Wang

Abstract: The diagnosis and treatment of chest diseases play a crucial role in maintaining human health. X-ray examination has become the most common clinical examination means due to its efficiency and cost-effectiveness. Artificial intelligence analysis methods for chest X-ray images are limited by insufficient annotation data and varying levels of annotation, resulting in weak generalization ability and… ▽ More The diagnosis and treatment of chest diseases play a crucial role in maintaining human health. X-ray examination has become the most common clinical examination means due to its efficiency and cost-effectiveness. Artificial intelligence analysis methods for chest X-ray images are limited by insufficient annotation data and varying levels of annotation, resulting in weak generalization ability and difficulty in clinical dissemination. Here we present EVA-X, an innovative foundational model based on X-ray images with broad applicability to various chest disease detection tasks. EVA-X is the first X-ray image based self-supervised learning method capable of capturing both semantic and geometric information from unlabeled images for universal X-ray image representation. Through extensive experimentation, EVA-X has demonstrated exceptional performance in chest disease analysis and localization, becoming the first model capable of spanning over 20 different chest diseases and achieving leading results in over 11 different detection tasks in the medical field. Additionally, EVA-X significantly reduces the burden of data annotation in the medical AI field, showcasing strong potential in the domain of few-shot learning. The emergence of EVA-X will greatly propel the development and application of foundational medical models, bringing about revolutionary changes in future medical research and clinical practice. Our codes and models are available at: https://github.com/hustvl/EVA-X. △ Less

Submitted 8 May, 2024; originally announced May 2024.

Comments: codes available at: https://github.com/hustvl/EVA-X

arXiv:2405.04844 [pdf, ps, other]

Full Stage Learning to Rank: A Unified Framework for Multi-Stage Systems

Authors: Kai Zheng, Haijun Zhao, Rui Huang, Beichuan Zhang, Na Mou, Yanan Niu, Yang Song, Hongning Wang, Kun Gai

Abstract: The Probability Ranking Principle (PRP) has been considered as the foundational standard in the design of information retrieval (IR) systems. The principle requires an IR module's returned list of results to be ranked with respect to the underlying user interests, so as to maximize the results' utility. Nevertheless, we point out that it is inappropriate to indiscriminately apply PRP through eve… ▽ More The Probability Ranking Principle (PRP) has been considered as the foundational standard in the design of information retrieval (IR) systems. The principle requires an IR module's returned list of results to be ranked with respect to the underlying user interests, so as to maximize the results' utility. Nevertheless, we point out that it is inappropriate to indiscriminately apply PRP through every stage of a contemporary IR system. Such systems contain multiple stages (e.g., retrieval, pre-ranking, ranking, and re-ranking stages, as examined in this paper). The \emph{selection bias} inherent in the model of each stage significantly influences the results that are ultimately presented to users. To address this issue, we propose an improved ranking principle for multi-stage systems, namely the Generalized Probability Ranking Principle (GPRP), to emphasize both the selection bias in each stage of the system pipeline as well as the underlying interest of users. We realize GPRP via a unified algorithmic framework named Full Stage Learning to Rank. Our core idea is to first estimate the selection bias in the subsequent stages and then learn a ranking model that best complies with the downstream modules' selection bias so as to deliver its top ranked results to the final ranked list in the system's output. We performed extensive experiment evaluations of our developed Full Stage Learning to Rank solution, using both simulations and online A/B tests in one of the leading short-video recommendation platforms. The algorithm is proved to be effective in both retrieval and ranking stages. Since deployed, the algorithm has brought consistent and significant performance gain to the platform. △ Less

Submitted 8 May, 2024; originally announced May 2024.

Comments: Accepted by WWW 2024

arXiv:2405.04840 [pdf, other]

Federated Adaptation for Foundation Model-based Recommendations

Authors: Chunxu Zhang, Guodong Long, Hongkuan Guo, Xiao Fang, Yang Song, Zhaojie Liu, Guorui Zhou, Zijian Zhang, Yang Liu, Bo Yang

Abstract: With the recent success of large language models, particularly foundation models with generalization abilities, applying foundation models for recommendations becomes a new paradigm to improve existing recommendation systems. It becomes a new open challenge to enable the foundation model to capture user preference changes in a timely manner with reasonable communication and computation costs while… ▽ More With the recent success of large language models, particularly foundation models with generalization abilities, applying foundation models for recommendations becomes a new paradigm to improve existing recommendation systems. It becomes a new open challenge to enable the foundation model to capture user preference changes in a timely manner with reasonable communication and computation costs while preserving privacy. This paper proposes a novel federated adaptation mechanism to enhance the foundation model-based recommendation system in a privacy-preserving manner. Specifically, each client will learn a lightweight personalized adapter using its private data. The adapter then collaborates with pre-trained foundation models to provide recommendation service efficiently with fine-grained manners. Importantly, users' private behavioral data remains secure as it is not shared with the server. This data localization-based privacy preservation is embodied via the federated learning framework. The model can ensure that shared knowledge is incorporated into all adapters while simultaneously preserving each user's personal preferences. Experimental results on four benchmark datasets demonstrate our method's superior performance. Implementation code is available to ease reproducibility. △ Less

Submitted 8 May, 2024; originally announced May 2024.

Comments: Accepted as a regular paper of IJCAI'24

arXiv:2405.04303 [pdf, other]

Progressive Quantum Algorithm for Quantum Alternating Operator Ansatz

Authors: Xiao-Hui Ni, Yan-Qi Song, Ling-Xiao Li, Su-Juan Qin, Fei Gao, Qiao-Yan Wen

Abstract: Recently, Hadfield has proposed a novel Quantum Alternating Operator Ansatz (QAOA+) to tackle Constrained Combinatorial Optimization Problems (CCOPs), and it has wide applications. However, the large requirement of multi-qubit controlled gates in QAOA+ limits its applications in solving larger-scale CCOPs. To mitigate the resources overhead of QAOA+, we introduce an approach termed Progressive Qua… ▽ More Recently, Hadfield has proposed a novel Quantum Alternating Operator Ansatz (QAOA+) to tackle Constrained Combinatorial Optimization Problems (CCOPs), and it has wide applications. However, the large requirement of multi-qubit controlled gates in QAOA+ limits its applications in solving larger-scale CCOPs. To mitigate the resources overhead of QAOA+, we introduce an approach termed Progressive Quantum Algorithm (PQA). In this paper, the concept and performance of PQA are introduced focusing on the Maximal Independent Set (MIS) problem. PQA aims to yield the solution of the target graph $G$ with fewer resources by solving the MIS problem on a desired derived subgraph that has the same MIS solution as $G$ but has a much smaller graph size. To construct such a desired subgraph, PQA gradually and regularly expands the graph size starting from a well-designed initial subgraph. After each expansion, PQA solves the MIS problem on the current subgraph using QAOA+ and estimates whether the current graph has the same MIS solution as the target graph. PQA repeats the graph expansion and solving process until reaching the stop condition. In our simulations, the performance of PQA is benchmarked on Erdős-Rényi (ER) and regular graphs. The simulation results suggest that PQA showcases higher average approximation ratio (AAR) and significant quantum resource savings compared with directly solves the original problem using QAOA+ (DS-QAOA+) at the same level depth $p$. Remarkably, the AAR obtained by PQA is $12.9305\%$ ($4.8645\%$) higher than DS-QAOA+ on ER (regular) graphs, and the average number of multi-qubit gates (qubits) consumed by PQA is 1/3 (1/2) of that of DS-QAOA+. The remarkable efficiency of PQA makes it possible to solve larger-scale CCOPs on the current quantum devices. △ Less

Submitted 7 May, 2024; originally announced May 2024.

arXiv:2405.04059 [pdf]

Three-dimensional hidden phase probed by in-plane magnetotransport in kagome metal CsV$_3$Sb$_5$ thin flakes

Authors: Xinjian Wei, Congkuan Tian, Hang Cui, Yuxin Zhai, Yongkai Li, Shaobo Liu, Yuanjun Song, Ya Feng, Miaoling Huang, Zhiwei Wang, Yi Liu, Qihua Xiong, Yugui Yao, X. C. Xie, Jian-Hao Chen

Abstract: Transition metal compounds with kagome structure have been found to exhibit a variety of exotic structural, electronic, and magnetic orders. These orders are competing with energies very close to each other, resulting in complex phase transitions. Some of the phases are easily observable, such as the charge density wave (CDW) and the superconducting phase, while others are more challenging to iden… ▽ More Transition metal compounds with kagome structure have been found to exhibit a variety of exotic structural, electronic, and magnetic orders. These orders are competing with energies very close to each other, resulting in complex phase transitions. Some of the phases are easily observable, such as the charge density wave (CDW) and the superconducting phase, while others are more challenging to identify and characterize. Here we present magneto-transport evidence of a new phase below ~35 K in the kagome topological metal CsV$_3$Sb$_5$ (CVS) thin flakes between the CDW and the superconducting transition temperatures. This phase is characterized by six-fold rotational symmetry in the in-plane magnetoresistance (MR) and is connected to the orbital current order in CVS. Furthermore, the phase is characterized by a large in-plane negative magnetoresistance, which suggests the existence of a three-dimensional, magnetic field-tunable orbital current ordered phase. Our results highlight the potential of magneto-transport to reveal the interactions between exotic quantum states of matter and to uncover the symmetry of such hidden phases. △ Less

Submitted 7 May, 2024; originally announced May 2024.

arXiv:2405.03987 [pdf, other]

Navigating Chemical Space with Latent Flows

Authors: Guanghao Wei, Yining Huang, Chenru Duan, Yue Song, Yuanqi Du

Abstract: Recent progress of deep generative models in the vision and language domain has stimulated significant interest in more structured data generation such as molecules. However, beyond generating new random molecules, efficient exploration and a comprehensive understanding of the vast chemical space are of great importance to molecular science and applications in drug design and materials discovery.… ▽ More Recent progress of deep generative models in the vision and language domain has stimulated significant interest in more structured data generation such as molecules. However, beyond generating new random molecules, efficient exploration and a comprehensive understanding of the vast chemical space are of great importance to molecular science and applications in drug design and materials discovery. In this paper, we propose a new framework, ChemFlow, to traverse chemical space through navigating the latent space learned by molecule generative models through flows. We introduce a dynamical system perspective that formulates the problem as learning a vector field that transports the mass of the molecular distribution to the region with desired molecular properties or structure diversity. Under this framework, we unify previous approaches on molecule latent space traversal and optimization and propose alternative competing methods incorporating different physical priors. We validate the efficacy of ChemFlow on molecule manipulation and single- and multi-objective molecule optimization tasks under both supervised and unsupervised molecular discovery settings. Codes and demos are publicly available on GitHub at https://github.com/garywei944/ChemFlow. △ Less

Submitted 7 May, 2024; v1 submitted 6 May, 2024; originally announced May 2024.

arXiv:2405.03408 [pdf, other]

An Image Quality Evaluation and Masking Algorithm Based On Pre-trained Deep Neural Networks

Authors: Peng Jia, Yu Song, Jiameng Lv, Runyu Ning

Abstract: With the growing amount of astronomical data, there is an increasing need for automated data processing pipelines, which can extract scientific information from observation data without human interventions. A critical aspect of these pipelines is the image quality evaluation and masking algorithm, which evaluates image qualities based on various factors such as cloud coverage, sky brightness, scat… ▽ More With the growing amount of astronomical data, there is an increasing need for automated data processing pipelines, which can extract scientific information from observation data without human interventions. A critical aspect of these pipelines is the image quality evaluation and masking algorithm, which evaluates image qualities based on various factors such as cloud coverage, sky brightness, scattering light from the optical system, point spread function size and shape, and read-out noise. Occasionally, the algorithm requires masking of areas severely affected by noise. However, the algorithm often necessitates significant human interventions, reducing data processing efficiency. In this study, we present a deep learning based image quality evaluation algorithm that uses an autoencoder to learn features of high quality astronomical images. The trained autoencoder enables automatic evaluation of image quality and masking of noise affected areas. We have evaluated the performance of our algorithm using two test cases: images with point spread functions of varying full width half magnitude, and images with complex backgrounds. In the first scenario, our algorithm could effectively identify variations of the point spread functions, which can provide valuable reference information for photometry. In the second scenario, our method could successfully mask regions affected by complex regions, which could significantly increase the photometry accuracy. Our algorithm can be employed to automatically evaluate image quality obtained by different sky surveying projects, further increasing the speed and robustness of data processing pipelines. △ Less

Submitted 6 May, 2024; originally announced May 2024.

Comments: Accepted by the AJ. The code could be downloaded from: https://nadc.china-vo.org/res/r101415/ with DOI of: 10.12149/101415

arXiv:2405.01837 [pdf, ps, other]

doi 10.1103/PhysRevA.109.052801

Anomalously reduced homogeneous broadening of two-dimensional electronic spectroscopy at high temperature by detailed balance

Authors: Ru-Qiong Deng, Cheng-Ge Liu, Yi-Xuan Yao, **g-Yi-Ran **, Hao-Yue Zhang, Yin Song, Qing Ai

Abstract: Dissipation and decoherence of quantum systems in thermal environments is important to various spectroscopies. It is generally believed that dissipation can broaden the line shape of spectroscopies, and thus stronger system-bath interaction can result in more significant homogeneous broadening of two-dimensional electronic spectroscopy (2DES). Here we show that the case can be the opposite in the… ▽ More Dissipation and decoherence of quantum systems in thermal environments is important to various spectroscopies. It is generally believed that dissipation can broaden the line shape of spectroscopies, and thus stronger system-bath interaction can result in more significant homogeneous broadening of two-dimensional electronic spectroscopy (2DES). Here we show that the case can be the opposite in the regime of electromagnetically induced transparency (EIT). We predict that assisted by EIT, the homogeneous broadening of the 2DES at a higher temperature can be significantly reduced due to the detailed balance. This anomalous effect is due to the long-lasting off-diagonal peaks in 2DES. △ Less

Submitted 3 May, 2024; originally announced May 2024.

Comments: 10 pages, 6 figures

Journal ref: Phys. Rev. A 109, 052801 (2024)

arXiv:2405.00098 [pdf, other]

Amplitude analysis and branching fraction measurement of $B^{+}\to D^{*-}D^{+}_{s}π^{+}$ decays

Authors: LHCb collaboration, R. Aaij, A. S. W. Abdelmotteleb, C. Abellan Beteta, F. Abudinén, T. Ackernley, A. A. Adefisoye, B. Adeva, M. Adinolfi, P. Adlarson, C. Agapopoulou, C. A. Aidala, Z. Ajaltouni, S. Akar, K. Akiba, P. Albicocco, J. Albrecht, F. Alessio, M. Alexander, Z. Aliouche, P. Alvarez Cartelle, R. Amalric, S. Amato, J. L. Amey, Y. Amhis , et al. (1057 additional authors not shown)

Abstract: The decays of the $B^{+}$ meson to the final state $D^{*-}D^{+}_{s}π^{+}$ are studied in proton-proton collision data collected with the LHCb detector at centre-of-mass energies of 7, 8, and 13 TeV, corresponding to a total integrated luminosity of 9 fb$^{-1}$. The ratio of branching fractions of the $B^{+}\to D^{*-}D^{+}_{s}π^{+}$ and $B^{0}\to D^{*-}D^{+}_{s}$ decays is measured to be… ▽ More The decays of the $B^{+}$ meson to the final state $D^{*-}D^{+}_{s}π^{+}$ are studied in proton-proton collision data collected with the LHCb detector at centre-of-mass energies of 7, 8, and 13 TeV, corresponding to a total integrated luminosity of 9 fb$^{-1}$. The ratio of branching fractions of the $B^{+}\to D^{*-}D^{+}_{s}π^{+}$ and $B^{0}\to D^{*-}D^{+}_{s}$ decays is measured to be $0.173\pm 0.006\pm 0.010$, where the first uncertainty is statistical and the second is systematic. Using partially reconstructed $D^{*+}_{s}\to D^{+}_{s}γ$ and $D^{+}_{s}π^{0}$ decays, the ratio of branching fractions between the $B^{+}\to D^{*-}D^{*+}_{s}π^{+}$ and $B^{+}\to D^{*-}D^{+}_{s}π^{+}$ decays is determined as $1.31\pm 0.07\pm 0.14$. An amplitude analysis of the $B^{+}\to D^{*-}D^{+}_{s}π^{+}$ decay is performed for the first time, revealing dominant contributions from known excited charm resonances decaying to the $D^{*-}π^{+}$ final state. No significant evidence of exotic contributions in the $D^{+}_{s}π^{+}$ or $D^{*-}D^{+}_{s}$ channels is found. The fit fraction of the scalar state $T_{c\bar{s} 0}^{\ast}(2900)^{++}$ observed in the $B^{+}\to D^{-}D^{+}_{s}π^{+}$ decay is determined to be less than 2.3% at a 90% confidence level. △ Less

Submitted 30 April, 2024; originally announced May 2024.

Comments: All figures and tables, along with any supplementary material and additional information, are available at https://cern.ch/lhcbproject/Publications/p/LHCb-PAPER-2024-001.html (LHCb public pages)

Report number: LHCb-PAPER-2024-001, CERN-EP-2024-110

arXiv:2404.19510 [pdf, other]

First observation of $Λ_{b}^{0} \rightarrow Σ_c^{(*)++} D^{(*)-} K^{-}$ decays

Authors: LHCb collaboration, R. Aaij, A. S. W. Abdelmotteleb, C. Abellan Beteta, F. Abudinén, T. Ackernley, A. A. Adefisoye, B. Adeva, M. Adinolfi, P. Adlarson, C. Agapopoulou, C. A. Aidala, Z. Ajaltouni, S. Akar, K. Akiba, P. Albicocco, J. Albrecht, F. Alessio, M. Alexander, Z. Aliouche, P. Alvarez Cartelle, R. Amalric, S. Amato, J. L. Amey, Y. Amhis , et al. (1067 additional authors not shown)

Abstract: The four decays, $Λ_{b}^{0} \rightarrow Σ_c^{(*)++} D^{(*)-} K^{-}$, are observed for the first time using proton-proton collision data collected with the LHCb detector at a centre-of-mass energy of $13\,\rm{TeV}$, corresponding to an integrated luminosity of $6\,\rm{fb}^{-1}$. By considering the $Λ_b^0 \rightarrow Λ_c^{+} \overline{D}^0 K^{-}$ decay as reference channel, the following branching f… ▽ More The four decays, $Λ_{b}^{0} \rightarrow Σ_c^{(*)++} D^{(*)-} K^{-}$, are observed for the first time using proton-proton collision data collected with the LHCb detector at a centre-of-mass energy of $13\,\rm{TeV}$, corresponding to an integrated luminosity of $6\,\rm{fb}^{-1}$. By considering the $Λ_b^0 \rightarrow Λ_c^{+} \overline{D}^0 K^{-}$ decay as reference channel, the following branching fraction ratios are measured to be, $$\frac{\cal{B} (Λ_{b}^{0} \rightarrow Σ_{c}^{++} \rm{D}^{-} {K}^{-})}{\cal{B}(Λ_{b}^{0} \rightarrow Λ_c^{+} \rm \overline{D}^0 {K}^{-})} = {0.282}\pm{0.016}\pm{0.016}\pm{0.005}, \frac{\cal{B}(Λ_{b}^{0} \rightarrow Σ_{c}^{*++} \rm {D}^{-} {K}^{-})}{\cal{B}(Λ_{b}^{0} \rightarrow Σ_c^{++} \rm {D}^{-} {K}^{-})} = {0.460}\pm{0.052}\pm{0.028}, \frac{\cal{B}(Λ_{b}^{0} \rightarrow Σ_{c}^{++} \rm {D}^{*-} {K}^{-})}{\cal{B}(Λ_{b}^{0} \rightarrow Σ_c^{++} \rm {D}^{-} {K}^{-})} = {2.261}\pm{0.202}\pm{0.129}\pm{0.046}, \frac{\cal{B}(Λ_{b}^{0} \rightarrow Σ_{c}^{*++} \rm D^{*-} K^{-})}{\cal{B}(Λ_{b}^{0} \rightarrow Σ_c^{++} \rm D^{-} K^{-})} = {0.896}\pm{0.137}\pm{0.066}\pm{0.018},$$ where the first uncertainties are statistical, the second are systematic, and the third are due to uncertainties in the branching fractions of intermediate particle decays. These initial observations mark the beginning of pentaquark searches in these modes, with more data set to become available following the LHCb upgrade. △ Less

Submitted 11 June, 2024; v1 submitted 30 April, 2024; originally announced April 2024.

Comments: All figures and tables, along with machine-readable versions and any supplementary material and additional information, are available at https://cern.ch/lhcbproject/Publications/p/LHCb-PAPER-2023-044.html (LHCb public pages)

Report number: LHCb-PAPER-2023-044, CERN-EP-2024-098

arXiv:2404.19205 [pdf, other]

TableVQA-Bench: A Visual Question Answering Benchmark on Multiple Table Domains

Authors: Yoonsik Kim, Moonbin Yim, Ka Yeon Song

Abstract: In this paper, we establish a benchmark for table visual question answering, referred to as the TableVQA-Bench, derived from pre-existing table question-answering (QA) and table structure recognition datasets. It is important to note that existing datasets have not incorporated images or QA pairs, which are two crucial components of TableVQA. As such, the primary objective of this paper is to obta… ▽ More In this paper, we establish a benchmark for table visual question answering, referred to as the TableVQA-Bench, derived from pre-existing table question-answering (QA) and table structure recognition datasets. It is important to note that existing datasets have not incorporated images or QA pairs, which are two crucial components of TableVQA. As such, the primary objective of this paper is to obtain these necessary components. Specifically, images are sourced either through the application of a \textit{stylesheet} or by employing the proposed table rendering system. QA pairs are generated by exploiting the large language model (LLM) where the input is a text-formatted table. Ultimately, the completed TableVQA-Bench comprises 1,500 QA pairs. We comprehensively compare the performance of various multi-modal large language models (MLLMs) on TableVQA-Bench. GPT-4V achieves the highest accuracy among commercial and open-sourced MLLMs from our experiments. Moreover, we discover that the number of vision queries plays a significant role in TableVQA performance. To further analyze the capabilities of MLLMs in comparison to their LLM backbones, we investigate by presenting image-formatted tables to MLLMs and text-formatted tables to LLMs, respectively. Our findings suggest that processing visual inputs is more challenging than text inputs, as evidenced by the lower performance of MLLMs, despite generally requiring higher computational costs than LLMs. The proposed TableVQA-Bench and evaluation codes are available at \href{https://github.com/naver-ai/tablevqabench}{https://github.com/naver-ai/tablevqabench}. △ Less

Submitted 29 April, 2024; originally announced April 2024.

Comments: Technical Report

arXiv:2404.18953 [pdf, other]

A Knowledge-driven Memetic Algorithm for the Energy-efficient Distributed Homogeneous Flow Shop Scheduling Problem

Authors: Yunbao Xu, Xuemei Jiang, Jun Li, Lining Xing, Yanjie Song

Abstract: The reduction of carbon emissions in the manufacturing industry holds significant importance in achieving the national "double carbon" target. Ensuring energy efficiency is a crucial factor to be incorporated into future generation manufacturing systems. In this study, energy consumption is considered in the distributed homogeneous flow shop scheduling problem (DHFSSP). A knowledge-driven memetic… ▽ More The reduction of carbon emissions in the manufacturing industry holds significant importance in achieving the national "double carbon" target. Ensuring energy efficiency is a crucial factor to be incorporated into future generation manufacturing systems. In this study, energy consumption is considered in the distributed homogeneous flow shop scheduling problem (DHFSSP). A knowledge-driven memetic algorithm (KDMA) is proposed to address the energy-efficient DHFSSP (EEDHFSSP). KDMA incorporates a collaborative initialization strategy to generate high-quality initial populations. Furthermore, several algorithmic improvements including update strategy, local search strategy, and carbon reduction strategy are employed to improve the search performance of the algorithm. The effectiveness of KDMA in solving EEDHFSSP is verified through extensive simulation experiments. It is evident that KDMA outperforms many state-of-the-art algorithms across various evaluation aspects. △ Less

Submitted 27 April, 2024; originally announced April 2024.

Comments: 14 pages

arXiv:2404.18423 [pdf, other]

Unsupervised Dynamics Prediction with Object-Centric Kinematics

Authors: Yeon-Ji Song, Suhyung Choi, Jaein Kim, **-Hwa Kim, Byoung-Tak Zhang

Abstract: Human perception involves discerning complex multi-object scenes into time-static object appearance (ie, size, shape, color) and time-varying object motion (ie, location, velocity, acceleration). This innate ability to unconsciously understand the environment is the motivation behind the success of dynamics modeling. Object-centric representations have emerged as a promising tool for dynamics pred… ▽ More Human perception involves discerning complex multi-object scenes into time-static object appearance (ie, size, shape, color) and time-varying object motion (ie, location, velocity, acceleration). This innate ability to unconsciously understand the environment is the motivation behind the success of dynamics modeling. Object-centric representations have emerged as a promising tool for dynamics prediction, yet they primarily focus on the objects' appearance, often overlooking other crucial attributes. In this paper, we propose Object-Centric Kinematics (OCK), a framework for dynamics prediction leveraging object-centric representations. Our model utilizes a novel component named object kinematics, which comprises low-level structured states of objects' position, velocity, and acceleration. The object kinematics are obtained via either implicit or explicit approaches, enabling comprehensive spatiotemporal object reasoning, and integrated through various transformer mechanisms, facilitating effective object-centric dynamics modeling. Our model demonstrates superior performance when handling objects and backgrounds in complex scenes characterized by a wide range of object attributes and dynamic movements. Moreover, our model demonstrates generalization capabilities across diverse synthetic environments, highlighting its potential for broad applicability in vision-related tasks. △ Less

Submitted 6 May, 2024; v1 submitted 29 April, 2024; originally announced April 2024.

Comments: 15 pages, 6 figures, 4 tables

arXiv:2404.18209 [pdf, other]

4DBInfer: A 4D Benchmarking Toolbox for Graph-Centric Predictive Modeling on Relational DBs

Authors: Minjie Wang, Quan Gan, David Wipf, Zhenkun Cai, Ning Li, Jianheng Tang, Yanlin Zhang, Zizhao Zhang, Zunyao Mao, Yakun Song, Yanbo Wang, Jiahang Li, Han Zhang, Guang Yang, Xiao Qin, Chuan Lei, Muhan Zhang, Weinan Zhang, Christos Faloutsos, Zheng Zhang

Abstract: Although RDBs store vast amounts of rich, informative data spread across interconnected tables, the progress of predictive machine learning models as applied to such tasks arguably falls well behind advances in other domains such as computer vision or natural language processing. This deficit stems, at least in part, from the lack of established/public RDB benchmarks as needed for training and eva… ▽ More Although RDBs store vast amounts of rich, informative data spread across interconnected tables, the progress of predictive machine learning models as applied to such tasks arguably falls well behind advances in other domains such as computer vision or natural language processing. This deficit stems, at least in part, from the lack of established/public RDB benchmarks as needed for training and evaluation purposes. As a result, related model development thus far often defaults to tabular approaches trained on ubiquitous single-table benchmarks, or on the relational side, graph-based alternatives such as GNNs applied to a completely different set of graph datasets devoid of tabular characteristics. To more precisely target RDBs lying at the nexus of these two complementary regimes, we explore a broad class of baseline models predicated on: (i) converting multi-table datasets into graphs using various strategies equipped with efficient subsampling, while preserving tabular characteristics; and (ii) trainable models with well-matched inductive biases that output predictions based on these input subgraphs. Then, to address the dearth of suitable public benchmarks and reduce siloed comparisons, we assemble a diverse collection of (i) large-scale RDB datasets and (ii) coincident predictive tasks. From a delivery standpoint, we operationalize the above four dimensions (4D) of exploration within a unified, scalable open-source toolbox called 4DBInfer. We conclude by presenting evaluations using 4DBInfer, the results of which highlight the importance of considering each such dimension in the design of RDB predictive models, as well as the limitations of more naive approaches such as simply joining adjacent tables. Our source code is released at https://github.com/awslabs/multi-table-benchmark . △ Less

Submitted 28 April, 2024; originally announced April 2024.

Comments: Under review

arXiv:2404.18094 [pdf, other]

doi 10.1109/TASLP.2024.3393714

USAT: A Universal Speaker-Adaptive Text-to-Speech Approach

Authors: Wenbin Wang, Yang Song, Sanjay Jha

Abstract: Conventional text-to-speech (TTS) research has predominantly focused on enhancing the quality of synthesized speech for speakers in the training dataset. The challenge of synthesizing lifelike speech for unseen, out-of-dataset speakers, especially those with limited reference data, remains a significant and unresolved problem. While zero-shot or few-shot speaker-adaptive TTS approaches have been e… ▽ More Conventional text-to-speech (TTS) research has predominantly focused on enhancing the quality of synthesized speech for speakers in the training dataset. The challenge of synthesizing lifelike speech for unseen, out-of-dataset speakers, especially those with limited reference data, remains a significant and unresolved problem. While zero-shot or few-shot speaker-adaptive TTS approaches have been explored, they have many limitations. Zero-shot approaches tend to suffer from insufficient generalization performance to reproduce the voice of speakers with heavy accents. While few-shot methods can reproduce highly varying accents, they bring a significant storage burden and the risk of overfitting and catastrophic forgetting. In addition, prior approaches only provide either zero-shot or few-shot adaptation, constraining their utility across varied real-world scenarios with different demands. Besides, most current evaluations of speaker-adaptive TTS are conducted only on datasets of native speakers, inadvertently neglecting a vast portion of non-native speakers with diverse accents. Our proposed framework unifies both zero-shot and few-shot speaker adaptation strategies, which we term as "instant" and "fine-grained" adaptations based on their merits. To alleviate the insufficient generalization performance observed in zero-shot speaker adaptation, we designed two innovative discriminators and introduced a memory mechanism for the speech decoder. To prevent catastrophic forgetting and reduce storage implications for few-shot speaker adaptation, we designed two adapters and a unique adaptation procedure. △ Less

Submitted 28 April, 2024; originally announced April 2024.

Comments: 15 pages, 13 figures. Copyright has been transferred to IEEE

Journal ref: IEEE/ACM Transactions on Audio, Speech and Language Processing, 2024

arXiv:2404.17591 [pdf, other]

doi 10.1145/3626772.3657840

Large Language Models for Next Point-of-Interest Recommendation

Authors: Peibo Li, Maarten de Rijke, Hao Xue, Shuang Ao, Yang Song, Flora D. Salim

Abstract: The next Point of Interest (POI) recommendation task is to predict users' immediate next POI visit given their historical data. Location-Based Social Network (LBSN) data, which is often used for the next POI recommendation task, comes with challenges. One frequently disregarded challenge is how to effectively use the abundant contextual information present in LBSN data. Previous methods are limite… ▽ More The next Point of Interest (POI) recommendation task is to predict users' immediate next POI visit given their historical data. Location-Based Social Network (LBSN) data, which is often used for the next POI recommendation task, comes with challenges. One frequently disregarded challenge is how to effectively use the abundant contextual information present in LBSN data. Previous methods are limited by their numerical nature and fail to address this challenge. In this paper, we propose a framework that uses pretrained Large Language Models (LLMs) to tackle this challenge. Our framework allows us to preserve heterogeneous LBSN data in its original format, hence avoiding the loss of contextual information. Furthermore, our framework is capable of comprehending the inherent meaning of contextual information due to the inclusion of commonsense knowledge. In experiments, we test our framework on three real-world LBSN datasets. Our results show that the proposed framework outperforms the state-of-the-art models in all three datasets. Our analysis demonstrates the effectiveness of the proposed framework in using contextual information as well as alleviating the commonly encountered cold-start and short trajectory problems. △ Less

Submitted 19 April, 2024; originally announced April 2024.

arXiv:2404.17099 [pdf, other]

Unleashing the Potential of Fractional Calculus in Graph Neural Networks with FROND

Authors: Qiyu Kang, Kai Zhao, Qinxu Ding, Feng Ji, Xuhao Li, Wenfei Liang, Yang Song, Wee Peng Tay

Abstract: We introduce the FRactional-Order graph Neural Dynamical network (FROND), a new continuous graph neural network (GNN) framework. Unlike traditional continuous GNNs that rely on integer-order differential equations, FROND employs the Caputo fractional derivative to leverage the non-local properties of fractional calculus. This approach enables the capture of long-term dependencies in feature update… ▽ More We introduce the FRactional-Order graph Neural Dynamical network (FROND), a new continuous graph neural network (GNN) framework. Unlike traditional continuous GNNs that rely on integer-order differential equations, FROND employs the Caputo fractional derivative to leverage the non-local properties of fractional calculus. This approach enables the capture of long-term dependencies in feature updates, moving beyond the Markovian update mechanisms in conventional integer-order models and offering enhanced capabilities in graph representation learning. We offer an interpretation of the node feature updating process in FROND from a non-Markovian random walk perspective when the feature updating is particularly governed by a diffusion process. We demonstrate analytically that oversmoothing can be mitigated in this setting. Experimentally, we validate the FROND framework by comparing the fractional adaptations of various established integer-order continuous GNNs, demonstrating their consistently improved performance and underscoring the framework's potential as an effective extension to enhance traditional continuous GNNs. The code is available at \url{https://github.com/zknus/ICLR2024-FROND}. △ Less

Submitted 25 April, 2024; originally announced April 2024.

Comments: The Twelfth International Conference on Learning Representations

arXiv:2404.16412 [pdf, ps, other]

Distributed Matrix Pencil Formulations for Prescribed-Time Leader-Following Consensus of MASs with Unknown Sensor Sensitivity

Authors: Hefu Ye, Changyun Wen, Yongduan Song

Abstract: In this paper, we address the problem of prescribed-time leader-following consensus of heterogeneous multi-agent systems (MASs) in the presence of unknown sensor sensitivity. Under a connected undirected topology, we propose a time-varying dual observer/controller design framework that makes use of regular local and inaccurate feedback to achieve consensus tracking within a prescribed time. In par… ▽ More In this paper, we address the problem of prescribed-time leader-following consensus of heterogeneous multi-agent systems (MASs) in the presence of unknown sensor sensitivity. Under a connected undirected topology, we propose a time-varying dual observer/controller design framework that makes use of regular local and inaccurate feedback to achieve consensus tracking within a prescribed time. In particular, the developed analysis framework is applicable to MASs equipped with sensors of different sensitivities. One of the design innovations involves constructing a distributed matrix pencil formulation based on worst-case sensors, yielding control parameters with sufficient robustness yet relatively low conservatism. Another novelty is the construction of the control gains, which consists of the product of a proportional coefficient obtained from the matrix pencil formulation and a classic time-varying function that grows to infinity or a novel bounded time-varying function. Furthermore, it is possible to extend the prescribed-time distributed protocol to infinite time domain by introducing the bounded time-varying gain technique without sacrificing the ultimate control accuracy, and the corresponding technical proof is comprehensive. The effectiveness of the method is demonstrated through a group of 5 single-link robot manipulators. △ Less

Submitted 25 April, 2024; originally announced April 2024.

Comments: 10 pages, 1 figure

Showing 101–150 of 3,160 results for author: Song, Y