Search | arXiv e-print repository

Task Groupings Regularization: Data-Free Meta-Learning with Heterogeneous Pre-trained Models

Authors: Yongxian Wei, Zixuan Hu, Li Shen, Zhenyi Wang, Yu Li, Chun Yuan, Dacheng Tao

Abstract: Data-Free Meta-Learning (DFML) aims to derive knowledge from a collection of pre-trained models without accessing their original data, enabling the rapid adaptation to new unseen tasks. Current methods often overlook the heterogeneity among pre-trained models, which leads to performance degradation due to task conflicts. In this paper, we empirically and theoretically identify and analyze the model heterogeneity in DFML. We find that model heterogeneity introduces a heterogeneity-homogeneity trade-off, where homogeneous models reduce task conflicts but also increase the overfitting risk. Balancing this trade-off is crucial for learning shared representations across tasks. Based on our findings, we propose Task Groupings Regularization, a novel approach that benefits from model heterogeneity by grouping and aligning conflicting tasks. Specifically, we embed pre-trained models into a task space to compute dissimilarity, and group heterogeneous models together based on this measure. Then, we introduce implicit gradient regularization within each group to mitigate potential conflicts. By encouraging a gradient direction suitable for all tasks, the meta-model captures shared representations that generalize across tasks. Comprehensive experiments showcase the superiority of our approach in multiple benchmarks, effectively tackling the model heterogeneity in challenging multi-domain and multi-architecture scenarios.

Submitted 26 May, 2024; originally announced May 2024.

arXiv:2405.15210 [pdf]

Spin chirality engineering induced giant topological Hall effect in a kagome magnet

Authors: Wei Xia, Shihao Zhang, Jian Yuan, Yurui Wei, Haonan Wang, Hong Du, Xiangqi Liu, Jiangteng Guo, Zicheng Tao, Ke Qu, Xia Wang, Xuerong Liu, Wenbo Wang, Jinguang Cheng, Yulin Chen, Jianpeng Liu, Ruidan Zhong, Xuewen Fu, Zhenzhong Yang, Yanfeng Guo

Abstract: The ferrimagnet TbMn6Sn6 has attracted vast attention, because its pristine Mn kagome lattice with strong spin-orbit coupling and out-of-plane Tb-Mn exchange supports quantum-limit Chern topological magnetism which can be described by the simple spinless Haldane model. We unveil herein that engineering the pristine kagome lattice through partial replacement of Mn by nonmagnetic Cr which tends to concentrate into the single Mn1 layer in a unit cell breaks the collinear configuration of Mn spins and reduces the D6h point group symmetry to the C2 one. The nearly isolated Tb networks result in easily polarized Tb spins even under a weak magnetic field, and simultaneously, different spin chirality of the Tb-Mn1-Mn1 and Mn1-Mn1-Mn1. Such a peculiar spin structure leads to a plateau-like topological Hall effect with a record resistivity of 19.1 μOhm cm among bulk systems. Our direct visualization of the domain-wall structure and its evolution under external magnetic field fully support the picture, thus highlighting the pivotal role of broken kagome lattice symmetry in generating the peculiar spin chirality in real space. Our results set a paradigm for exploration of exotic properties in kagome topological magnets and would be a proof-of-principle strategy for investigating the correlation between magnetism and exotic topological properties in kagome lattice.

Submitted 24 May, 2024; originally announced May 2024.

Comments: 33 pages, 4 main figures and 16 SI figures

arXiv:2405.15155 [pdf, other]

CLIP model is an Efficient Online Lifelong Learner

Authors: Leyuan Wang, Liuyu Xiang, Yujie Wei, Yunlong Wang, Zhaofeng He

Abstract: Online Lifelong Learning (OLL) addresses the challenge of learning from continuous and non-stationary data streams. Existing online lifelong learning methods based on image classification models often require preset conditions such as the total number of classes or maximum memory capacity, which hinders the realization of real never-ending learning and renders them impractical for real-world scenarios. In this work, we propose that vision-language models, such as Contrastive Language-Image Pretraining (CLIP), are more suitable candidates for online lifelong learning. We discover that maintaining symmetry between image and text is crucial during Parameter-Efficient Tuning (PET) for CLIP model in online lifelong learning. To this end, we introduce the Symmetric Image-Text (SIT) tuning strategy. We conduct extensive experiments on multiple lifelong learning benchmark datasets and elucidate the effectiveness of SIT through gradient analysis. Additionally, we assess the impact of lifelong learning on generalizability of CLIP and found that tuning the image encoder is beneficial for lifelong learning, while tuning the text encoder aids in zero-shot learning.

Submitted 23 May, 2024; originally announced May 2024.

arXiv:2405.13720 [pdf, other]

Spin-orbital excitations encoding the magnetic phase transition in the van der Waals antiferromagnet FePS$_{3}$

Authors: Yuan Wei, Yi Tseng, Hebatalla Elnaggar, Wenliang Zhang, Teguh Citra Asmara, Eugenio Paris, Gabriele Domaine, Vladimir N. Strocov, Luc Testa, Virgile Favre, Mario Di Luca, Mitali Banerjee, Andrew R. Wildes, Frank M. F. de Groot, Henrik M. Ronnow, Thorsten Schmitt

Abstract: In the rich phases of van der Waals (vdW) materials featuring intertwined electronic order and collective phenomena, characterizing elementary dynamics that entail the low-energy Hamiltonian and electronic degrees of freedom is of paramount importance. Here we performed resonant inelastic X-ray scattering (RIXS) to elaborate the spin-orbital ground and excited states of the vdW antiferromagnetic insulator FePS$_{3}$, as well as their relation to magnetism. We observed the spectral enhancement of spin-orbital multiplet transitions about $\sim$ 100 and $\sim$ 220 meV, as well as quasielastic response, when entering the zig-zag antiferromagnetic phase, where the spectral changes develop an order-parameter-like evolution with temperature. By comparing with ligand field theory calculations, we discovered the essential role of trigonal lattice distortion and negative metal-ligand charge-transfer to account for these emergent excitations. Such spectral profiles are further examined upon confinement by mechanical exfoliation. We reveal their spectral robustness down to the few atomic layer limit, in accordance with the persistent antiferromagnetic state previously reported in optical measurements. Our study demonstrates the versatile RIXS capability that resolves magneto-crystalline anisotropy and charge-transfer energetics. These provide the crucial insight to understand how the spontaneous magnetic symmetry-breaking stabilizes in the quasi-two-dimensional limit for the vdW magnet FePS$_{3}$.

Submitted 22 May, 2024; originally announced May 2024.

arXiv:2405.13103 [pdf, other]

Search for the lepton-flavor violating decay $B^0_s\toφμ^\pmτ^\mp$

Authors: LHCb collaboration, R. Aaij, A. S. W. Abdelmotteleb, C. Abellan Beteta, F. Abudinén, T. Ackernley, A. A. Adefisoye, B. Adeva, M. Adinolfi, P. Adlarson, C. Agapopoulou, C. A. Aidala, Z. Ajaltouni, S. Akar, K. Akiba, P. Albicocco, J. Albrecht, F. Alessio, M. Alexander, Z. Aliouche, P. Alvarez Cartelle, R. Amalric, S. Amato, J. L. Amey, Y. Amhis , et al. (1062 additional authors not shown)

Abstract: A search for the lepton-flavor violating decays $B^0_s\toφμ^\pmτ^\mp$ is presented, using a sample of proton-proton collisions at center-of-mass energies of 7, 8, and 13 TeV, collected with the LHCb detector and corresponding to a total integrated luminosity of $9\,\text{fb}^{-1}$. The $τ$ leptons are selected using decays with three charged pions. No significant excess is observed, and an upper limit on the branching fraction is determined to be ${\cal B}( B^0_s\toφμ^\pmτ^\mp) < 1.0\times 10^{-5}$ at 90% confidence level.

Submitted 21 May, 2024; originally announced May 2024.

Comments: All figures and tables, along with any supplementary material and additional information, are available at https://cern.ch/lhcbproject/Publications/p/LHCb-PAPER-2024-006.html (LHCb public pages)

Report number: LHCb-PAPER-2024-006, CERN-EP-2024-114

arXiv:2405.12688 [pdf, other]

Study of $b$-hadron decays to $Λ_c^+ h^- h^{\prime -}$ final states

Authors: LHCb collaboration, R. Aaij, A. S. W. Abdelmotteleb, C. Abellan Beteta, F. Abudinén, T. Ackernley, A. A. Adefisoye, B. Adeva, M. Adinolfi, P. Adlarson, C. Agapopoulou, C. A. Aidala, Z. Ajaltouni, S. Akar, K. Akiba, P. Albicocco, J. Albrecht, F. Alessio, M. Alexander, Z. Aliouche, P. Alvarez Cartelle, R. Amalric, S. Amato, J. L. Amey, Y. Amhis , et al. (1072 additional authors not shown)

Abstract: Decays of $Ξ_b^-$ and $Ω_b^-$ baryons to $Λ_c^+ h^- h^{\prime -}$ final states, with $h^- h^{\prime -}$ being $π^-π^-$, $K^-π^-$ and $K^-K^-$ meson pairs, are searched for using data collected with the LHCb detector. The data sample studied corresponds to an integrated luminosity of $8.7\,\mathrm{fb}^{-1}$ of $pp$ collisions collected at centre-of-mass energies $\sqrt{s} = 7$, $8$ and $13\,\mathrm{Te\kern -0.1em V}$. The products of the relative branching fractions and fragmentation fractions for each signal mode, relative to the $B^- \to Λ_c^+ \overline{p} π^-$ mode, are measured, with $Ξ_{b}^- \toΛ_{c}^+ K^- π^-$, $Ξ_{b}^- \toΛ_{c}^+ K^- K^-$ and $Ω_{b}^- \toΛ_{c}^+ K^- K^-$ decays being observed at over $5\,σ$ significance. The $Ξ_{b}^- \toΛ_{c}^+ K^- π^-$ mode is also used to measure the $Ξ_{b}^-$ production asymmetry, which is found to be consistent with zero. In addition, the $B^- \to Λ_{c}^+ \overline{p} K^-$ decay is observed for the first time, and its branching fraction is measured relative to that of the $B^- \to Λ_{c}^+ \overline{p} π^-$ mode.

Submitted 22 May, 2024; v1 submitted 21 May, 2024; originally announced May 2024.

Comments: All figures and tables, along with any supplementary material and additional information, are available at https://cern.ch/lhcbproject/Publications/p/LHCb-PAPER-2024-013.html

Report number: CERN-EP-2024-116, LHCb-PAPER-2024-013

arXiv:2405.11826 [pdf, other]

Data quality control system and long-term performance monitor of the LHAASO-KM2A

Authors: Zhen Cao, F. Aharonian, Axikegu, Y. X. Bai, Y. W. Bao, D. Bastieri, X. J. Bi, Y. J. Bi, W. Bian, A. V. Bukevich, Q. Cao, W. Y. Cao, Zhe Cao, J. Chang, J. F. Chang, A. M. Chen, E. S. Chen, H. X. Chen, Liang Chen, Lin Chen, Long Chen, M. J. Chen, M. L. Chen, Q. H. Chen, S. Chen , et al. (263 additional authors not shown)

Abstract: The KM2A is the largest sub-array of the Large High Altitude Air Shower Observatory (LHAASO). It consists of 5216 electromagnetic particle detectors (EDs) and 1188 muon detectors (MDs). The data recorded by the EDs and MDs are used to reconstruct primary information of cosmic ray and gamma-ray showers. This information is used for physical analysis in gamma-ray astronomy and cosmic ray physics. To ensure the reliability of the LHAASO-KM2A data, a three-level quality control system has been established. It is used to monitor the status of detector units, stability of reconstructed parameters and the performance of the array based on observations of the Crab Nebula and Moon shadow. This paper will introduce the control system and its application on the LHAASO-KM2A data collected from August 2021 to July 2023. During this period, the pointing and angular resolution of the array were stable. From the observations of the Moon shadow and Crab Nebula, the results achieved using the two methods are consistent with each other. According to the observation of the Crab Nebula at energies from 25 TeV to 100 TeV, the time averaged pointing errors are estimated to be $-0.003^{\circ} \pm 0.005^{\circ}$ and $0.001^{\circ} \pm 0.006^{\circ}$ in the R.A. and Dec directions, respectively.

Submitted 13 June, 2024; v1 submitted 20 May, 2024; originally announced May 2024.

Comments: 15 pages, 9 figures

arXiv:2405.11324 [pdf, other]

Transverse polarization measurement of $Λ$ hyperons in $p$Ne collisions at $\sqrt{s_{NN}}$ = 68.4 GeV with the $\mbox{LHCb}$ detector

Authors: LHCb collaboration, R. Aaij, A. S. W. Abdelmotteleb, C. Abellan Beteta, F. Abudinén, T. Ackernley, A. A. Adefisoye, B. Adeva, M. Adinolfi, P. Adlarson, C. Agapopoulou, C. A. Aidala, Z. Ajaltouni, S. Akar, K. Akiba, P. Albicocco, J. Albrecht, F. Alessio, M. Alexander, Z. Aliouche, P. Alvarez Cartelle, R. Amalric, S. Amato, J. L. Amey, Y. Amhis , et al. (1065 additional authors not shown)

Abstract: A measurement of the transverse polarization of the $Λ$ and $\barΛ$ hyperons in $p$Ne fixed-target collisions at $\sqrt{s_{NN}}$ = 68.4 GeV is presented using data collected by the LHCb detector. The polarization is studied using the decay $Λ\rightarrow p π^-$ together with its charge conjugated process, the integrated values measured are $$ P_Λ = 0.029 \pm 0.019 \, (\rm{stat}) \pm 0.012 \, (\rm{syst}) \, , $$ $$ P_{\barΛ} = 0.003 \pm 0.023 \, (\rm{stat}) \pm 0.014 \,(\rm{syst}) \,. $$ Furthermore, the results are shown as a function of the Feynman~$x$~variable, transverse momentum, pseudorapidity and rapidity of the hyperons, and are compared with previous measurements.

Submitted 24 May, 2024; v1 submitted 18 May, 2024; originally announced May 2024.

Comments: All figures and tables, along with any supplementary material and additional information, are available at https://lbfence.cern.ch/alcm/public/analysis/full-details/3120 (LHCb public pages)

Report number: CERN-EP-2024-121, LHCb-PAPER-2024-009

arXiv:2405.10791 [pdf, other]

Multiwavelength Radiation from the Interaction between Magnetar Bursts and Companion Star in a Binary System

Authors: Yu-Jia Wei, Yuan-Pei Yang, Da-Ming Wei, Zi-Gao Dai

Abstract: Magnetars are young, highly magnetized neutron stars that are associated with magnetar short bursts (MSBs), magnetar giant flares (MGFs), and at least a part of fast radio bursts (FRBs). In this work, we consider that a magnetar and a main sequence star are in a binary system and analyze the properties of the electromagnetic signals generated by the interaction between the magnetar bursts and the companion star. During the pre-burst period, the persistent radiation could be generated by the interaction between the $e^+e^-$-pair wind from the magnetar and the companion or its stellar wind. We find that for a newborn magnetar, the pre-burst persistent radiation from the strong magnetar wind can be dominant, and it is mainly at the optical and ultraviolet (UV) bands. For relatively older magnetars, the reemission from a burst interacting with the companion is larger than the pre-burst persistent radiation and the luminosity of the companion itself. The transient reemission produced by the heating process has a duration of $0.1 - 10^5 {\rm~s}$ at the optical, UV, and X-ray bands. Additionally, we find that if these phenomena occur in nearby galaxies within a few hundred kiloparsecs, they could be detected by current or future optical telescopes.

Submitted 17 May, 2024; originally announced May 2024.

Comments: 15 pages, 8 figures; Accepted for publication in A&A

arXiv:2405.10775 [pdf, other]

doi 10.3847/2041-8213/ad4ce1

A Novel Model for the MeV Emission Line in GRB 221009A

Authors: Yu-Jia Wei, Jia Ren, Hao-Ning He, Yuan-Pei Yang, Da-Ming Wei, Zi-Gao Dai, B. Theodore Zhang

Abstract: Gamma-ray bursts (GRBs) have long been considered potential sources of ultra-high-energy cosmic rays (UHECRs; with energy $\gtrsim 10^{18} {\rm~eV}$). In this work, we propose a novel model generating MeV emission lines in GRB, which can constrain the properties of heavy nuclei that potentially exist in GRB jets. Specifically, we find that relativistic hydrogen-like high-atomic-number ions originating from the $β$ decay of unstable nuclei and/or the recombination entrained in the GRB jet can generate narrow MeV emission lines through the de-excitation of excited-electrons. This model can successfully explain the MeV emission line observed in the most luminous GRB ever recorded, GRB~221009A, with suitable parameters including a Lorentz factor $γ\sim 820-1700$ and a total mass of heavy nuclei $M_{\rm tot} \sim 10^{23} - 10^{26}$~g. Especially, the emission line broadening can be reasonably attributed to both the expansion of the jet shell and the thermal motion of nuclei, naturally resulting in a narrow width ($σ_{\rm line} / E_{\rm line} \lesssim 0.2$) consistent with the observation. Furthermore, we predict that different GRBs can exhibit lines in different bands with various evolving behaviors, which might be confirmed with further observations. Finally, our model provides indirect evidence that GRBs may be one of the sources of UHECRs.

Submitted 8 June, 2024; v1 submitted 17 May, 2024; originally announced May 2024.

Comments: 13 pages, 4 figures; Published in ApJL, https://doi.org/10.3847/2041-8213/ad4ce1

arXiv:2405.10663 [pdf, ps, other]

Instability of Circumnuclear Gas Supply as An Origin of "Changing-look" Phenomenon of Supermassive Blackholes

Authors: J. Wang, D. W. Xu, Xinwu Cao, C. Gao, C. H. Xie, J. Y. Wei

Abstract: The origin of the "Changing-look" (CL) phenomenon in supermassive black holes (SMBHs) remains an open issue. This study aims to shed light on this phenomenon by focusing on a sample that encompasses all known repeating CL active galactic nuclei (AGNs). Through the identification of a characteristic time scale for the CL phenomenon, it was observed that larger SMBHs possess shorter characteristic timescales, while smaller SMBHs exhibit longer timescales. These findings reveal a significant contrast to the traditional AGN variability that has been adequately explained by the AGN's disk instability model. This stark discrepancy highlights a distinct origin of the CL phenomenon, distinguishing it from traditional AGN variability. By properly predicting the characteristic time scale and its dependence on SMBH mass, we propose that the CL phenomenon is likely a result of a variation in accretion rate caused by a sudden change in the supply of circumnuclear gas during the transition between active and passive SMBH fueling stages.

Submitted 17 May, 2024; originally announced May 2024.

Comments: 14 pages, 4 figures and 2 tables, accepted by ApJ

arXiv:2405.08639 [pdf, ps, other]

Upwards homogeneity in iterated symmetric extensions

Authors: Calliope Ryan-Smith, Jonathan Schilhan, Yujun Wei

Abstract: It is sometimes desirable in choiceless constructions of set theory that one iteratively extends some ground model without adding new sets of ordinals after the first extension. Pushing this further, one may wish to have models $V \subseteq M \subseteq N$ of $\mathsf{ZF}$ such that $N$ contains no subsets of $V$ that do not already appear in $M$. We isolate, in the case that $M$ and $N$ are symmetric extensions (particular inner models of a generic extension of $V$), the exact conditions that cause this behaviour and show how it can broadly be applied to many known constructions. We call this behaviour upwards homogeneity.

Submitted 14 May, 2024; originally announced May 2024.

Comments: 16 pages, 1 figure

MSC Class: 03E25 (Primary) 03E35; 03E40 (Secondary)

arXiv:2405.07691 [pdf, other]

Discovery of Very-high-energy Gamma-ray Emissions from the Low Luminosity AGN NGC 4278 by LHAASO

Authors: Zhen Cao, F. Aharonian, Q. An, Axikegu, Y. X. Bai, Y. W. Bao, D. Bastieri, X. J. Bi, Y. J. Bi, J. T. Cai, Q. Cao, W. Y. Cao, Zhe Cao, J. Chang, J. F. Chang, A. M. Chen, E. S. Chen, Liang Chen, Lin Chen, Long Chen, M. J. Chen, M. L. Chen, Q. H. Chen, S. H. Chen, S. Z. Chen , et al. (255 additional authors not shown)

Abstract: The first source catalog of Large High Altitude Air Shower Observatory reported the detection of a very-high-energy gamma ray source, 1LHAASO J1219+2915. In this paper a further detailed study of the spectral and temporal behavior of this point-like source has been carried out. The best-fit position of the TeV source ($\rm{RA}=185.05^{\circ}\pm0.04^{\circ}$, $\rm{Dec}=29.25^{\circ}\pm0.03^{\circ}$) is compatible with NGC 4278 within $\sim0.03$ degree. Variation analysis shows an indication of the variability at a few months level in the TeV band, which is consistent with low frequency observations. Based on these observations, we report the detection of TeV $γ$-ray emissions from this low-luminosity AGN NGC 4278. The observations by LHAASO-WCDA during active period has a significance level of 8.8\,$σ$ with best-fit photon spectral index $\varGamma=2.56\pm0.14$ and a flux $f_{1-10\,\rm{TeV}}=(7.0\pm1.1_{\rm{sta}}\pm0.35_{\rm{syst}})\times10^{-13}\,\rm{photons\,cm^{-2}\,s^{-1}}$, or approximately $5\%$ of the Crab Nebula. The discovery of VHE from NGC 4278 indicates that the compact, weak radio jet can efficiently accelerate particles and emit TeV photons.

Submitted 13 May, 2024; originally announced May 2024.

Comments: 11 pages, 5 figures

arXiv:2405.07156 [pdf]

Direct visualization of the impurity occupancy roadmap in Ni-substituted van der Waals ferromagnet Fe3GaTe2

Authors: Jian Yuan, Haonan Wang, Xiaofei Hou, Binshuo Zhang, Yurui Wei, Jiangteng Guo, Lu Sun, Zhenhai Yu, Zhikai Li, Xiangqi Liu, Wei Xia, Xia Wang, Xuerong Liu, Yulin Chen, Shihao Zhang, Xuewen Fu, Ke Qu, Zhenzhong Yang, Yanfeng Guo

Abstract: Impurity substitution is a general strategy to study the intrinsic properties of a quantum material. However, when the target element has more than one Wyckoff position in the lattice, it is a big challenge but with extreme necessity to know the exact position and order of the occupancy of impurity atoms. Via comprehensive experimental and theoretical investigations, we establish herein the roadmap for Ni substitution in Fe3GaTe2, a van der Waals ferromagnet with the Curie temperature TC even reaching ~ 380 K. The results unambiguously reveal that in (Fe1-xNix)3GaTe2, Ni atoms initially form an van der Waals interlayer gap Ni3 sites when x < 0.1, and then gradually occupy the Fe2 sites. After replacing the Fe2 sites at x of ~ 0.75, they start to substitute for the Fe1 sites and eventually realize a full occupation at x = 1.0. Accordingly, TC and saturation magnetic moments of (Fe1-xNix)3GaTe2 both show nonlinear decrease, which is tightly tied to the Ni occupancy order as well as the different roles of Ni3, Fe1 and Fe2 sites in the spin Hamiltonian. The results not only yield fruitful insights into the essential roles of different Fe sites in producing the above room temperature high TC, but also set a paradigm for future impurity substitution study on other quantum materials.

Submitted 12 May, 2024; originally announced May 2024.

Comments: 24 pages, 5 main figures + 4 SI figures + 2 SI tables

arXiv:2405.07147 [pdf, ps, other]

Randomized algorithms for computing the tensor train approximation and their applications

Authors: Maolin Che, Yimin Wei, Hong Yan

Abstract: In this paper, we focus on the fixed TT-rank and precision problems of finding an approximation of the tensor train (TT) decomposition of a tensor. Note that the TT-SVD and TT-cross are two well-known algorithms for these two problems. Firstly, by combining the random projection technique with the power scheme, we obtain two types of randomized algorithms for the fixed TT-rank problem. Secondly, by using the non-asymptotic theory of sub-random Gaussian matrices, we derive the upper bounds of the proposed randomized algorithms. Thirdly, we deduce a new deterministic strategy to estimate the desired TT-rank with a given tolerance and another adaptive randomized algorithm that finds a low TT-rank representation satisfying a given tolerance, and is beneficial when the target TT-rank is not known in advance. We finally illustrate the accuracy of the proposed algorithms via some test tensors from synthetic and real databases. In particular, for the fixed TT-rank problem, the proposed algorithms can be several times faster than the TT-SVD, and the accuracy of the proposed algorithms and the TT-SVD are comparable for several test tensors.

Submitted 11 May, 2024; originally announced May 2024.

Comments: 43 pages, 9 figures and 4 tables

MSC Class: 15A18; 15A69; 65F55; 68W20

arXiv:2405.07009 [pdf, ps, other]

Quantum search in many-body interacting system with long-range interaction

Authors: Fan Xing, Yan Wei, Zeyang Liao

Abstract: Continuous-time quantum walks provide an alternative method for quantum search problems. Most of the earlier studies confirmed that quadratic speedup exists in some synthetic Hamiltonians, but whether there is quadratic speedup in real physical systems is elusive. Here, we investigate three physical systems with long-range atom-atom interaction which are possible good candidates for realizing the quantum search, including one-dimensional atom arrays either trapped in an optical lattice or coupled to waveguide near band edge or dispersively coupled to a good cavity. We find that all three systems can provide near-optimal quantum search if there is no dissipation. However, if the dissipation is considered only the latter two systems (i.e., waveguide-QED and cavity-QED systems) can still have high success probabilities because the latter two systems can significantly enhance the atom-atom interaction even if they are far apart and the spectra gap can be much larger which can reduce the search time and the effects of dissipation significantly. Our studies here can provide helpful instructions for realizing quantum search in real physical systems in the noisy intermediate-scale quantum era.

Submitted 11 May, 2024; originally announced May 2024.

arXiv:2405.06876 [pdf, other]

Enhancing Low-Energy Neutron and Gamma Ray Detection Using Convolutional Neural Networks with EJ-276 Scintillators

Authors: Fengzhao Shen, Tao Li, **gkui He, Shenghui Xie, Yuehuan Wei, Tuchen Huang, Wei Wang

Abstract: Organic scintillators, such as plastic scintillators, are widely used to detect fast neutrons and gamma rays. The EJ-276 scintillator offers a versatile solution for detecting fast neutrons and gamma rays simultaneously, making it ideal for mixed neutron-gamma field detection applications. This study evaluates the Pulse Shape Discrimination (PSD) capabilities of the EJ-276 scintillator paired with silicon photomultiplier (SiPM) array readouts. Integrating the 1-inch EJ-276 scintillator with SiPM arrays achieved a Figure of Merit (FOM) of 1.13 at an energy threshold of 200 keVee (electron equivalent). However, using the Charge Comparison Method (CCM) to distinguish between neutrons and gamma rays was challenging, especially at energies below 200 keVee. To improve low-energy resolution, the Convolutional Neural Network (CNN) approach was adopted. The InceptionTime and EfficientNetV2 models were developed, using one-dimensional time series and two-dimensional matrix image inputs, respectively. The transformation from one-dimensional arrays to two-dimensional images was achieved using three techniques: Gramian Angular Summation Field(GASF), Recurrence Plot(RP), and Relative Position Matrix(RPM). These methods demonstrated high accuracy at energy levels above 200 keVee. At lower energy regions, CNN methods, particularly the InceptionTime model, outperformed CCM methods.
Notably, CNN methods reached accuracies of 96.79% and 98.33% in the 0-100 keVee and 100-200 keVee ranges, respectively, significantly higher than the 85.49% and 94.56% achieved by CCM, representing improvements of 13.22% and 3.99%. These results highlight the superior performance of CNN methods in differentiating between neutrons and gamma rays, especially in low-energy regions.

Submitted 10 May, 2024; originally announced May 2024.

arXiv:2405.06556 [pdf, other]

Search for time-dependent $CP$ violation in $D^0 \rightarrow π^+ π^- π^0$ decays

Authors: LHCb collaboration, R. Aaij, A. S. W. Abdelmotteleb, C. Abellan Beteta, F. Abudinén, T. Ackernley, A. A. Adefisoye, B. Adeva, M. Adinolfi, P. Adlarson, C. Agapopoulou, C. A. Aidala, Z. Ajaltouni, S. Akar, K. Akiba, P. Albicocco, J. Albrecht, F. Alessio, M. Alexander, Z. Aliouche, P. Alvarez Cartelle, R. Amalric, S. Amato, J. L. Amey, Y. Amhis , et al. (1062 additional authors not shown)

Abstract: A measurement of time-dependent $CP$ violation in $D^0 \rightarrow π^+ π^- π^0$ decays using a $pp$ collision data sample collected by the LHCb experiment in 2012 and from 2015 to 2018, corresponding to an integrated luminosity of 7.7$\,\mathrm{fb}^{-1}$, is presented. The initial flavour of each $D^0$ candidate is determined from the charge of the pion produced in the $D^*(2010)^+ \rightarrow D^0 π^+$ decay. The decay $D^0 \rightarrow K^- π^+ π^0$ is used as a control channel to validate the measurement procedure. The gradient of the time-dependent $CP$ asymmetry, $ΔY$, in $D^0 \rightarrow π^+ π^- π^0$ decays is measured to be \begin{equation*} ΔY = (-1.3 \pm 6.3 \pm 2.4) \times 10^{-4}, \end{equation*} where the first uncertainty is statistical and the second is systematic, which is compatible with $CP$ conservation.

Submitted 10 May, 2024; originally announced May 2024.

Comments: All figures and tables, along with any supplementary material and additional information, are available at https://lhcbproject.web.cern.ch/Publications/p/LHCb-PAPER-2024-003.html (LHCb public pages)

Report number: LHCb-PAPER-2024-003, CERN-EP-2024-111

arXiv:2405.05806 [pdf, other]

MasterWeaver: Taming Editability and Identity for Personalized Text-to-Image Generation

Authors: Yuxiang Wei, Zhilong Ji, **feng Bai, Hongzhi Zhang, Lei Zhang, Wangmeng Zuo

Abstract: Text-to-image (T2I) diffusion models have shown significant success in personalized text-to-image generation, which aims to generate novel images with human identities indicated by the reference images. Despite promising identity fidelity has been achieved by several tuning-free methods, they usually suffer from overfitting issues. The learned identity tends to entangle with irrelevant information, resulting in unsatisfied text controllability, especially on faces. In this work, we present MasterWeaver, a test-time tuning-free method designed to generate personalized images with both faithful identity fidelity and flexible editability. Specifically, MasterWeaver adopts an encoder to extract identity features and steers the image generation through additional introduced cross attention. To improve editability while maintaining identity fidelity, we propose an editing direction loss for training, which aligns the editing directions of our MasterWeaver with those of the original T2I model. Additionally, a face-augmented dataset is constructed to facilitate disentangled identity learning, and further improve the editability. Extensive experiments demonstrate that our MasterWeaver can not only generate personalized images with faithful identity, but also exhibit superiority in text controllability. Our code will be publicly available at https://github.com/csyxwei/MasterWeaver.

Submitted 10 May, 2024; v1 submitted 9 May, 2024; originally announced May 2024.

Comments: 34 pages

arXiv:2405.05647 [pdf]

Letter to the Editor: What are the legal and ethical considerations of submitting radiology reports to ChatGPT?

Authors: Siddharth Agarwal, David Wood, Robin Carpenter, Yiran Wei, Marc Modat, Thomas C Booth

Abstract: This letter critically examines the recent article by Infante et al. assessing the utility of large language models (LLMs) like GPT-4, Perplexity, and Bard in identifying urgent findings in emergency radiology reports. While acknowledging the potential of LLMs in generating labels for computer vision, concerns are raised about the ethical implications of using patient data without explicit approval, highlighting the necessity of stringent data protection measures under GDPR.

Submitted 9 May, 2024; originally announced May 2024.

arXiv:2405.05554 [pdf, other]

RELICS: a REactor neutrino LIquid xenon Coherent elastic Scattering experiment

Authors: Chang Cai, Guocai Chen, Jiangyu Chen, Rundong Fang, Fei Gao, Xiaoran Guo, Jiheng Guo, Tingyi He, Chengjie Jia, Gaojun **, Yipin **g, Gaojun Ju, Yang Lei, Jiayi Li, Kaihang Li, Meng Li, Minhua Li, Shengchao Li, Siyin Li, Tao Li, Qing Lin, Jiajun Liu, Minghao Liu, Sheng Lv, Guang Luo , et al. (24 additional authors not shown)

Abstract: Coherent elastic neutrino-nucleus scattering (CEvNS) provides a unique probe for neutrino properties Beyond the Standard Model (BSM) physics. REactor neutrino LIquid xenon Coherent Scattering experiment (RELICS), a proposed reactor neutrino program using liquid xenon time projection chamber (LXeTPC) technology, aims to investigate the CEvNS process of antineutrinos off xenon atomic nuclei. In this work, the design of the experiment is studied and optimized based on Monte Carlo (MC) simulations. To achieve a sufficiently low energy threshold for CEvNS detection, an ionization-only analysis channel is adopted for RELICS. A high emission rate of delayed electrons after a big ionization signal is the major background, leading to an analysis threshold of 120 photo-electrons in the CEvNS search. The second largest background, nuclear recoils induced by cosmic-ray neutrons, is suppressed via a passive water shield. The physics potential of RELICS is explored with a 32 kg-yr exposure at a baseline of 25 m from a reactor core with a 3 GW thermal power. In an energy range of 120 to 240 PE, we expect 4902.4 CEvNS and 1318.4 background events. The sensitivity of RELICS to the weak mixing angle is investigated at a low momentum transfer. Our study shows that RELICS can further improve the constraints on the non-standard neutrino interaction (NSI) compared to the current best results.

Submitted 12 June, 2024; v1 submitted 9 May, 2024; originally announced May 2024.

arXiv:2405.05120 [pdf, ps, other]

GRB afterglows with energy injections in AGN accretion disks

Authors: Bao-Quan Huang, Tong Liu, Xiao-Yan Li, Yun-Feng Wei

Abstract: Active galactic nucleus (AGN) disks are widely considered potential hosts for various high-energy transients, including gamma-ray bursts (GRBs). The reactivation of GRB central engines can provide additional energy to shocks formed during the interaction of the initially ejected GRB jets with the circumburst material, commonly referred to as energy injections. In this paper, we study GRBs occurring in AGN disks within the context of energy injections. We adopt the standard external forward shock (EFS) model and consider both short- and long-duration GRB scenarios. Light curves for two types of radiation, namely the radiation from the heated disk material (RHDM) and GRB afterglows, are computed. We find that the energy injection facilitates the EFS to break out from the photosphere of the low-density AGN disk at relativistic velocity. Moreover, the energy injection almost does not affect the RHDM but significantly enhances the peak flux of the GRB afterglows.

Submitted 8 May, 2024; originally announced May 2024.

Comments: 12 pages, 6 figures, 2 tables, accepted for publication in ApJ

arXiv:2405.04434 [pdf, other]

DeepSeek-V2: A Strong, Economical, and Efficient Mixture-of-Experts Language Model

Authors: DeepSeek-AI, Aixin Liu, Bei Feng, Bin Wang, Bingxuan Wang, Bo Liu, Chenggang Zhao, Chengqi Dengr, Chong Ruan, Damai Dai, Daya Guo, Dejian Yang, Deli Chen, Dongjie Ji, Erhang Li, Fangyun Lin, Fuli Luo, Guangbo Hao, Guanting Chen, Guowei Li, H. Zhang, Hanwei Xu, Hao Yang, Haowei Zhang, Honghui Ding , et al. (132 additional authors not shown)

Abstract: We present DeepSeek-V2, a strong Mixture-of-Experts (MoE) language model characterized by economical training and efficient inference. It comprises 236B total parameters, of which 21B are activated for each token, and supports a context length of 128K tokens. DeepSeek-V2 adopts innovative architectures including Multi-head Latent Attention (MLA) and DeepSeekMoE. MLA guarantees efficient inference through significantly compressing the Key-Value (KV) cache into a latent vector, while DeepSeekMoE enables training strong models at an economical cost through sparse computation. Compared with DeepSeek 67B, DeepSeek-V2 achieves significantly stronger performance, and meanwhile saves 42.5% of training costs, reduces the KV cache by 93.3%, and boosts the maximum generation throughput to 5.76 times. We pretrain DeepSeek-V2 on a high-quality and multi-source corpus consisting of 8.1T tokens, and further perform Supervised Fine-Tuning (SFT) and Reinforcement Learning (RL) to fully unlock its potential. Evaluation results show that, even with only 21B activated parameters, DeepSeek-V2 and its chat versions still achieve top-tier performance among open-source models.

Submitted 19 June, 2024; v1 submitted 7 May, 2024; originally announced May 2024.

arXiv:2405.03188 [pdf, other]

Hyperbolic Geometric Latent Diffusion Model for Graph Generation

Authors: Xingcheng Fu, Yisen Gao, Yuecen Wei, Qingyun Sun, Hao Peng, Jianxin Li, Xianxian Li

Abstract: Diffusion models have made significant contributions to computer vision, sparking a growing interest in the community recently regarding the application of them to graph generation. Existing discrete graph diffusion models exhibit heightened computational complexity and diminished training efficiency. A preferable and natural way is to directly diffuse the graph within the latent space. However, due to the non-Euclidean structure of graphs is not isotropic in the latent space, the existing latent diffusion models effectively make it difficult to capture and preserve the topological information of graphs. To address the above challenges, we propose a novel geometrically latent diffusion framework HypDiff. Specifically, we first establish a geometrically latent space with interpretability measures based on hyperbolic geometry, to define anisotropic latent diffusion processes for graphs. Then, we propose a geometrically latent diffusion process that is constrained by both radial and angular geometric properties, thereby ensuring the preservation of the original topological properties in the generative graphs. Extensive experimental results demonstrate the superior effectiveness of HypDiff for graph generation with various topologies.

Submitted 6 May, 2024; originally announced May 2024.

Comments: Accepted by the 41st International Conference on Machine Learning (ICML 2024)

arXiv:2405.02782 [pdf]

A self-supervised text-vision framework for automated brain abnormality detection

Authors: David A. Wood, Emily Guilhem, Sina Kafiabadi, Ayisha Al Busaidi, Kishan Dissanayake, Ahmed Hammam, Nina Mansoor, Matthew Townend, Siddharth Agarwal, Yiran Wei, Asif Mazumder, Gareth J. Barker, Peter Sasieni, Sebastien Ourselin, James H. Cole, Thomas C. Booth

Abstract: Artificial neural networks trained on large, expert-labelled datasets are considered state-of-the-art for a range of medical image recognition tasks. However, categorically labelled datasets are time-consuming to generate and constrain classification to a pre-defined, fixed set of classes. For neuroradiological applications in particular, this represents a barrier to clinical adoption. To address these challenges, we present a self-supervised text-vision framework that learns to detect clinically relevant abnormalities in brain MRI scans by directly leveraging the rich information contained in accompanying free-text neuroradiology reports. Our training approach consisted of two-steps. First, a dedicated neuroradiological language model - NeuroBERT - was trained to generate fixed-dimensional vector representations of neuroradiology reports (N = 50,523) via domain-specific self-supervised learning tasks. Next, convolutional neural networks (one per MRI sequence) learnt to map individual brain scans to their corresponding text vector representations by optimising a mean square error loss. Once trained, our text-vision framework can be used to detect abnormalities in unreported brain MRI examinations by scoring scans against suitable query sentences (e.g., 'there is an acute stroke', 'there is hydrocephalus' etc.), enabling a range of classification-based applications including automated triage.
Potentially, our framework could also serve as a clinical decision support tool, not only by suggesting findings to radiologists and detecting errors in provisional reports, but also by retrieving and displaying examples of pathologies from historical examinations that could be relevant to the current case based on textual descriptors.

Submitted 11 June, 2024; v1 submitted 4 May, 2024; originally announced May 2024.

Comments: Under Review

arXiv:2405.01400 [pdf, other]

Scalable Ab Initio Electronic Structure Methods with Near Chemical Accuracy for Main Group Chemistry

Authors: Yu**g Wei, Sibali Debnath, John L. Weber, Ankit Mahajan, David R. Reichman, Richard A. Friesner

Abstract: This study evaluates the precision of widely recognized quantum chemical methodologies, CCSD(T), DLPNO-CCSD(T) and localized ph-AFQMC, for determining the thermochemistry of main group elements. DLPNO-CCSD(T) and localized ph-AFQMC, which offer greater scalability compared to canonical CCSD(T), have emerged over the last decade as pivotal in producing precise benchmark chemical data. Our investigation includes closed-shell, neutral molecules, focusing on their heat of formation and atomization energy sourced from four specific small molecule datasets. Firstly, we selected molecules from the G2 and G3 datasets, noted for their reliable experimental heat of formation data. Additionally, we incorporate molecules from the W4-11 and W4-17 sets, which provide high-level theoretical reference values for atomization energy at 0 K. Our findings reveal that both DLPNO-CCSD(T) and ph-AFQMC methods are capable of achieving a root-mean-square deviation (RMSD) of less than 1 kcal/mol across the combined dataset, aligning with the threshold for chemical accuracy. Moreover, we make efforts to confine the maximum deviations within 2 kcal/mol, a degree of precision that significantly broadens the applicability of these methods in fields such as biology and materials science.

Submitted 24 June, 2024; v1 submitted 2 May, 2024; originally announced May 2024.

arXiv:2405.00984 [pdf, other]

FREE: Faster and Better Data-Free Meta-Learning

Authors: Yongxian Wei, Zixuan Hu, Zhenyi Wang, Li Shen, Chun Yuan, Dacheng Tao

Abstract: Data-Free Meta-Learning (DFML) aims to extract knowledge from a collection of pre-trained models without requiring the original data, presenting practical benefits in contexts constrained by data privacy concerns. Current DFML methods primarily focus on the data recovery from these pre-trained models. However, they suffer from slow recovery speed and overlook gaps inherent in heterogeneous pre-trained models. In response to these challenges, we introduce the Faster and Better Data-Free Meta-Learning (FREE) framework, which contains: (i) a meta-generator for rapidly recovering training tasks from pre-trained models; and (ii) a meta-learner for generalizing to new unseen tasks. Specifically, within the module Faster Inversion via Meta-Generator, each pre-trained model is perceived as a distinct task. The meta-generator can rapidly adapt to a specific task in just five steps, significantly accelerating the data recovery. Furthermore, we propose Better Generalization via Meta-Learner and introduce an implicit gradient alignment algorithm to optimize the meta-learner. This is achieved as aligned gradient directions alleviate potential conflicts among tasks from heterogeneous pre-trained models. Empirical experiments on multiple benchmarks affirm the superiority of our approach, marking a notable speed-up (20$\times$) and performance enhancement (1.42\% $\sim$ 4.78\%) in comparison to the state-of-the-art.

Submitted 1 May, 2024; originally announced May 2024.

arXiv:2405.00098 [pdf, other]

Amplitude analysis and branching fraction measurement of $B^{+}\to D^{*-}D^{+}_{s}π^{+}$ decays

Authors: LHCb collaboration, R. Aaij, A. S. W. Abdelmotteleb, C. Abellan Beteta, F. Abudinén, T. Ackernley, A. A. Adefisoye, B. Adeva, M. Adinolfi, P. Adlarson, C. Agapopoulou, C. A. Aidala, Z. Ajaltouni, S. Akar, K. Akiba, P. Albicocco, J. Albrecht, F. Alessio, M. Alexander, Z. Aliouche, P. Alvarez Cartelle, R. Amalric, S. Amato, J. L. Amey, Y. Amhis , et al. (1057 additional authors not shown)

Abstract: The decays of the $B^{+}$ meson to the final state $D^{*-}D^{+}_{s}π^{+}$ are studied in proton-proton collision data collected with the LHCb detector at centre-of-mass energies of 7, 8, and 13 TeV, corresponding to a total integrated luminosity of 9 fb$^{-1}$. The ratio of branching fractions of the $B^{+}\to D^{*-}D^{+}_{s}π^{+}$ and $B^{0}\to D^{*-}D^{+}_{s}$ decays is measured to be $0.173\pm 0.006\pm 0.010$, where the first uncertainty is statistical and the second is systematic. Using partially reconstructed $D^{*+}_{s}\to D^{+}_{s}γ$ and $D^{+}_{s}π^{0}$ decays, the ratio of branching fractions between the $B^{+}\to D^{*-}D^{*+}_{s}π^{+}$ and $B^{+}\to D^{*-}D^{+}_{s}π^{+}$ decays is determined as $1.31\pm 0.07\pm 0.14$. An amplitude analysis of the $B^{+}\to D^{*-}D^{+}_{s}π^{+}$ decay is performed for the first time, revealing dominant contributions from known excited charm resonances decaying to the $D^{*-}π^{+}$ final state. No significant evidence of exotic contributions in the $D^{+}_{s}π^{+}$ or $D^{*-}D^{+}_{s}$ channels is found. The fit fraction of the scalar state $T_{c\bar{s} 0}^{\ast}(2900)^{++}$ observed in the $B^{+}\to D^{-}D^{+}_{s}π^{+}$ decay is determined to be less than 2.3% at a 90% confidence level.

Submitted 30 April, 2024; originally announced May 2024.

Comments: All figures and tables, along with any supplementary material and additional information, are available at https://cern.ch/lhcbproject/Publications/p/LHCb-PAPER-2024-001.html (LHCb public pages)

Report number: LHCb-PAPER-2024-001, CERN-EP-2024-110

arXiv:2404.19534 [pdf, other]

MIPI 2024 Challenge on Nighttime Flare Removal: Methods and Results

Authors: Yuekun Dai, Dafeng Zhang, Xiaoming Li, Zongsheng Yue, Chongyi Li, Shangchen Zhou, Ruicheng Feng, Peiqing Yang, Zhezhu **, Guanqun Liu, Chen Change Loy, Lize Zhang, Shuai Liu, Chaoyu Feng, Luyang Wang, Shuan Chen, Guangqi Shao, Xiaotao Wang, Lei Lei, Qirui Yang, Qihua Cheng, Zhiqiang Xu, Yihao Liu, Huan**g Yue, **gyu Yang , et al. (38 additional authors not shown)

Abstract: The increasing demand for computational photography and imaging on mobile platforms has led to the widespread development and integration of advanced image sensors with novel algorithms in camera systems. However, the scarcity of high-quality data for research and the rare opportunity for in-depth exchange of views from industry and academia constrain the development of mobile intelligent photography and imaging (MIPI). Building on the achievements of the previous MIPI Workshops held at ECCV 2022 and CVPR 2023, we introduce our third MIPI challenge including three tracks focusing on novel image sensors and imaging algorithms. In this paper, we summarize and review the Nighttime Flare Removal track on MIPI 2024. In total, 170 participants were successfully registered, and 14 teams submitted results in the final testing phase. The developed solutions in this challenge achieved state-of-the-art performance on Nighttime Flare Removal. More details of this challenge and the link to the dataset can be found at https://mipi-challenge.org/MIPI2024/.

Submitted 27 May, 2024; v1 submitted 30 April, 2024; originally announced April 2024.

Comments: CVPR 2024 Mobile Intelligent Photography and Imaging (MIPI) Workshop--Nighttime Flare Removal Challenge Report. Website: https://mipi-challenge.org/MIPI2024/

arXiv:2404.19510 [pdf, other]

First observation of $Λ_{b}^{0} \rightarrow Σ_c^{(*)++} D^{(*)-} K^{-}$ decays

Authors: LHCb collaboration, R. Aaij, A. S. W. Abdelmotteleb, C. Abellan Beteta, F. Abudinén, T. Ackernley, A. A. Adefisoye, B. Adeva, M. Adinolfi, P. Adlarson, C. Agapopoulou, C. A. Aidala, Z. Ajaltouni, S. Akar, K. Akiba, P. Albicocco, J. Albrecht, F. Alessio, M. Alexander, Z. Aliouche, P. Alvarez Cartelle, R. Amalric, S. Amato, J. L. Amey, Y. Amhis , et al. (1067 additional authors not shown)

Abstract: The four decays, $Λ_{b}^{0} \rightarrow Σ_c^{(*)++} D^{(*)-} K^{-}$, are observed for the first time using proton-proton collision data collected with the LHCb detector at a centre-of-mass energy of $13\,\rm{TeV}$, corresponding to an integrated luminosity of $6\,\rm{fb}^{-1}$. By considering the $Λ_b^0 \rightarrow Λ_c^{+} \overline{D}^0 K^{-}$ decay as reference channel, the following branching fraction ratios are measured to be, $$\frac{\cal{B} (Λ_{b}^{0} \rightarrow Σ_{c}^{++} \rm{D}^{-} {K}^{-})}{\cal{B}(Λ_{b}^{0} \rightarrow Λ_c^{+} \rm \overline{D}^0 {K}^{-})} = {0.282}\pm{0.016}\pm{0.016}\pm{0.005}, \frac{\cal{B}(Λ_{b}^{0} \rightarrow Σ_{c}^{*++} \rm {D}^{-} {K}^{-})}{\cal{B}(Λ_{b}^{0} \rightarrow Σ_c^{++} \rm {D}^{-} {K}^{-})} = {0.460}\pm{0.052}\pm{0.028}, \frac{\cal{B}(Λ_{b}^{0} \rightarrow Σ_{c}^{++} \rm {D}^{*-} {K}^{-})}{\cal{B}(Λ_{b}^{0} \rightarrow Σ_c^{++} \rm {D}^{-} {K}^{-})} = {2.261}\pm{0.202}\pm{0.129}\pm{0.046}, \frac{\cal{B}(Λ_{b}^{0} \rightarrow Σ_{c}^{*++} \rm D^{*-} K^{-})}{\cal{B}(Λ_{b}^{0} \rightarrow Σ_c^{++} \rm D^{-} K^{-})} = {0.896}\pm{0.137}\pm{0.066}\pm{0.018},$$ where the first uncertainties are statistical, the second are systematic, and the third are due to uncertainties in the branching fractions of intermediate particle decays. These initial observations mark the beginning of pentaquark searches in these modes, with more data set to become available following the LHCb upgrade.

Submitted 11 June, 2024; v1 submitted 30 April, 2024; originally announced April 2024.

Comments: All figures and tables, along with machine-readable versions and any supplementary material and additional information, are available at https://cern.ch/lhcbproject/Publications/p/LHCb-PAPER-2023-044.html (LHCb public pages)

Report number: LHCb-PAPER-2023-044, CERN-EP-2024-098

arXiv:2404.19265 [pdf, other]

Mapping New Realities: Ground Truth Image Creation with Pix2Pix Image-to-Image Translation

Authors: Zhenglin Li, Bo Guan, Yuanzhou Wei, Yiming Zhou, **gyu Zhang, **xin Xu

Abstract: Generative Adversarial Networks (GANs) have significantly advanced image processing, with Pix2Pix being a notable framework for image-to-image translation. This paper explores a novel application of Pix2Pix to transform abstract map images into realistic ground truth images, addressing the scarcity of such images crucial for domains like urban planning and autonomous vehicle training. We detail the Pix2Pix model's utilization for generating high-fidelity datasets, supported by a dataset of paired map and aerial images, and enhanced by a tailored training regimen. The results demonstrate the model's capability to accurately render complex urban features, establishing its efficacy and potential for broad real-world applications.

Submitted 30 April, 2024; v1 submitted 30 April, 2024; originally announced April 2024.

arXiv:2404.18947 [pdf, other]

Multimodal Fusion on Low-quality Data: A Comprehensive Survey

Authors: Qingyang Zhang, Yake Wei, Zongbo Han, Huazhu Fu, Xi Peng, Cheng Deng, Qinghua Hu, Cai Xu, Jie Wen, Di Hu, Changqing Zhang

Abstract: Multimodal fusion focuses on integrating information from multiple modalities with the goal of more accurate prediction, which has achieved remarkable progress in a wide range of scenarios, including autonomous driving and medical diagnosis. However, the reliability of multimodal fusion remains largely unexplored, especially under low-quality data settings. This paper surveys the common challenges and recent advances of multimodal fusion in the wild and presents them in a comprehensive taxonomy. From a data-centric view, we identify four main challenges that are faced by multimodal fusion on low-quality data, namely (1) noisy multimodal data that are contaminated with heterogeneous noises, (2) incomplete multimodal data in which some modalities are missing, (3) imbalanced multimodal data in which the qualities or properties of different modalities are significantly different and (4) quality-varying multimodal data in which the quality of each modality dynamically changes with respect to different samples. This new taxonomy will enable researchers to understand the state of the field and identify several potential directions. We also provide discussion for the open problems in this field together with interesting future research directions.

Submitted 5 May, 2024; v1 submitted 27 April, 2024; originally announced April 2024.

Comments: Feel free to comment on our manuscript: [email protected]

arXiv:2404.18135 [pdf, other]

Dexterous Grasp Transformer

Authors: Guo-Hao Xu, Yi-Lin Wei, Dian Zheng, Xiao-Ming Wu, Wei-Shi Zheng

Abstract: In this work, we propose a novel discriminative framework for dexterous grasp generation, named Dexterous Grasp TRansformer (DGTR), capable of predicting a diverse set of feasible grasp poses by processing the object point cloud with only one forward pass. We formulate dexterous grasp generation as a set prediction task and design a transformer-based grasping model for it. However, we identify that this set prediction paradigm encounters several optimization challenges in the field of dexterous grasping and results in restricted performance. To address these issues, we propose progressive strategies for both the training and testing phases. First, the dynamic-static matching training (DSMT) strategy is presented to enhance the optimization stability during the training phase. Second, we introduce the adversarial-balanced test-time adaptation (AB-TTA) with a pair of adversarial losses to improve grasping quality during the testing phase. Experimental results on the DexGraspNet dataset demonstrate the capability of DGTR to predict dexterous grasp poses with both high quality and diversity. Notably, while keeping high quality, the diversity of grasp poses predicted by DGTR significantly outperforms previous works in multiple metrics without any data pre-processing. Codes are available at https://github.com/iSEE-Laboratory/DGTR.

Submitted 28 April, 2024; originally announced April 2024.

Comments: Accepted to CVPR 2024

arXiv:2404.16555 [pdf, other]

MMGRec: Multimodal Generative Recommendation with Transformer Model

Authors: Han Liu, Yinwei Wei, Xuemeng Song, Weili Guan, Yuan-Fang Li, Liqiang Nie

Abstract: Multimodal recommendation aims to recommend user-preferred candidates based on her/his historically interacted items and associated multimodal information. Previous studies commonly employ an embed-and-retrieve paradigm: learning user and item representations in the same embedding space, then retrieving similar candidate items for a user via embedding inner product. However, this paradigm suffers from inference cost, interaction modeling, and false-negative issues. Toward this end, we propose a new MMGRec model to introduce a generative paradigm into multimodal recommendation. Specifically, we first devise a hierarchical quantization method Graph RQ-VAE to assign Rec-ID for each item from its multimodal and CF information. Consisting of a tuple of semantically meaningful tokens, Rec-ID serves as the unique identifier of each item. Afterward, we train a Transformer-based recommender to generate the Rec-IDs of user-preferred items based on historical interaction sequences. The generative paradigm is qualified since this model systematically predicts the tuple of tokens identifying the recommended item in an autoregressive manner. Moreover, a relation-aware self-attention mechanism is devised for the Transformer to handle non-sequential interaction sequences, which explores the element pairwise relation to replace absolute positional encoding. Extensive experiments evaluate MMGRec's effectiveness compared with state-of-the-art methods.

Submitted 25 April, 2024; originally announced April 2024.

arXiv:2404.16524 [pdf]

3D deep learning for enhanced atom probe tomography analysis of nanoscale microstructures

Authors: Jiwei Yu, Zhangwei Wang, Aparna Saksena, Shaolou Wei, Ye Wei, Timoteo Colnaghi, Andreas Marek, Markus Rampp, Min Song, Baptiste Gault, Yue Li

Abstract: Quantitative analysis of microstructural features on the nanoscale, including precipitates, local chemical orderings (LCOs) or structural defects (e.g. stacking faults) plays a pivotal role in understanding the mechanical and physical responses of engineering materials. Atom probe tomography (APT), known for its exceptional combination of chemical sensitivity and sub-nanometer resolution, primarily identifies microstructures through compositional segregations. However, this fails when there is no significant segregation, as can be the case for LCOs and stacking faults. Here, we introduce a 3D deep learning approach, AtomNet, designed to process APT point cloud data at the single-atom level for nanoscale microstructure extraction, simultaneously considering compositional and structural information. AtomNet is showcased in segmenting L1₂-type nanoprecipitates from the matrix in an AlLiMg alloy, irrespective of crystallographic orientations, which outperforms previous methods. AtomNet also allows for 3D imaging of L1₀-type LCOs in an AuCu alloy, a challenging task for conventional analysis due to their small size and subtle compositional differences. Finally, we demonstrate the use of AtomNet for revealing 2D stacking faults in a Co-based superalloy, without any defected training data, expanding the capabilities of APT for automated exploration of hidden microstructures. AtomNet pushes the boundaries of APT analysis, and holds promise in establishing precise quantitative microstructure-property relationships across a diverse range of metallic materials.

Submitted 25 April, 2024; originally announced April 2024.

arXiv:2404.15875 [pdf, other]

doi 10.1145/3626772.3657727

Simple but Effective Raw-Data Level Multimodal Fusion for Composed Image Retrieval

Authors: Haokun Wen, Xuemeng Song, Xiaolin Chen, Yinwei Wei, Liqiang Nie, Tat-Seng Chua

Abstract: Composed image retrieval (CIR) aims to retrieve the target image based on a multimodal query, i.e., a reference image paired with corresponding modification text. Recent CIR studies leverage vision-language pre-trained (VLP) methods as the feature extraction backbone, and perform nonlinear feature-level multimodal query fusion to retrieve the target image. Despite the promising performance, we argue that their nonlinear feature-level multimodal fusion may lead to the fused feature deviating from the original embedding space, potentially hurting the retrieval performance. To address this issue, in this work, we propose shifting the multimodal fusion from the feature level to the raw-data level to fully exploit the VLP model's multimodal encoding and cross-modal alignment abilities. In particular, we introduce a Dual Query Unification-based Composed Image Retrieval framework (DQU-CIR), whose backbone simply involves a VLP model's image encoder and a text encoder. Specifically, DQU-CIR first employs two training-free query unification components: text-oriented query unification and vision-oriented query unification, to derive a unified textual and visual query based on the raw data of the multimodal query, respectively. The unified textual query is derived by concatenating the modification text with the extracted reference image's textual description, while the unified visual query is created by writing the key modification words onto the reference image. Ultimately, to address diverse search intentions, DQU-CIR linearly combines the features of the two unified queries encoded by the VLP model to retrieve the target image. Extensive experiments on four real-world datasets validate the effectiveness of our proposed method.

Submitted 24 April, 2024; originally announced April 2024.

Comments: ACM SIGIR 2024

arXiv:2404.15815 [pdf, other]

Single-View Scene Point Cloud Human Grasp Generation

Authors: Yan-Kang Wang, Chengyi Xing, Yi-Lin Wei, Xiao-Ming Wu, Wei-Shi Zheng

Abstract: In this work, we explore a novel task of generating human grasps based on single-view scene point clouds, which more accurately mirrors the typical real-world situation of observing objects from a single viewpoint. Due to the incompleteness of object point clouds and the presence of numerous scene points, the generated hand is prone to penetrating into the invisible parts of the object and the model is easily affected by scene points. Thus, we introduce S2HGrasp, a framework composed of two key modules: the Global Perception module that globally perceives partial object point clouds, and the DiffuGrasp module designed to generate high-quality human grasps based on complex inputs that include scene points. Additionally, we introduce the S2HGD dataset, which comprises approximately 99,000 single-object single-view scene point clouds of 1,668 unique objects, each annotated with one human grasp. Our extensive experiments demonstrate that S2HGrasp can not only generate natural human grasps regardless of scene points, but also effectively prevent penetration between the hand and invisible parts of the object. Moreover, our model showcases strong generalization capability when applied to unseen objects. Our code and dataset are available at https://github.com/iSEE-Laboratory/S2HGrasp.

Submitted 24 April, 2024; originally announced April 2024.

arXiv:2404.15247 [pdf, other]

XFT: Unlocking the Power of Code Instruction Tuning by Simply Merging Upcycled Mixture-of-Experts

Authors: Yifeng Ding, Jiawei Liu, Yuxiang Wei, Terry Yue Zhuo, Lingming Zhang

Abstract: We introduce XFT, a simple yet powerful training scheme, by simply merging upcycled Mixture-of-Experts (MoE) to unleash the performance limit of instruction-tuned code Large Language Models (LLMs). While vanilla sparse upcycling fails to improve instruction tuning, XFT introduces a shared expert mechanism with a novel routing weight normalization strategy into sparse upcycling, which significantly boosts instruction tuning. After fine-tuning the upcycled MoE model, XFT introduces a learnable model merging mechanism to compile the upcycled MoE model back to a dense model, achieving upcycled MoE-level performance with only dense-model compute. By applying XFT to a 1.3B model, we create a new state-of-the-art tiny code LLM (<3B) with 67.1 and 64.6 pass@1 on HumanEval and HumanEval+ respectively. With the same data and model architecture, XFT improves supervised fine-tuning (SFT) by 13% on HumanEval+, along with consistent improvements from 2% to 13% on MBPP+, MultiPL-E, and DS-1000, demonstrating its generalizability. XFT is fully orthogonal to existing techniques such as Evol-Instruct and OSS-Instruct, opening a new dimension for improving code instruction tuning. Codes are available at https://github.com/ise-uiuc/xft.

Submitted 6 June, 2024; v1 submitted 23 April, 2024; originally announced April 2024.

arXiv:2404.14682 [pdf, other]

Uncovering Name-Based Biases in Large Language Models Through Simulated Trust Game

Authors: Yumou Wei, Paulo F. Carvalho, John Stamper

Abstract: Gender and race inferred from an individual's name are a notable source of stereotypes and biases that subtly influence social interactions. Abundant evidence from human experiments has revealed the preferential treatment that one receives when one's name suggests a predominant gender or race. As large language models acquire more capabilities and begin to support everyday applications, it becomes crucial to examine whether they manifest similar biases when encountering names in a complex social interaction. In contrast to previous work that studies name-based biases in language models at a more fundamental level, such as word representations, we challenge three prominent models to predict the outcome of a modified Trust Game, a well-publicized paradigm for studying trust and reciprocity. To ensure the internal validity of our experiments, we have carefully curated a list of racially representative surnames to identify players in a Trust Game and rigorously verified the construct validity of our prompts. The results of our experiments show that our approach can detect name-based biases in both base and instruction-tuned models.

Submitted 22 April, 2024; originally announced April 2024.

arXiv:2404.14248 [pdf, other]

NTIRE 2024 Challenge on Low Light Image Enhancement: Methods and Results

Authors: Xiaoning Liu, Zongwei Wu, Ao Li, Florin-Alexandru Vasluianu, Yulun Zhang, Shuhang Gu, Le Zhang, Ce Zhu, Radu Timofte, Zhi **, Hongjun Wu, Chenxi Wang, Haitao Ling, Yuanhao Cai, Hao Bian, Yuxin Zheng, **g Lin, Alan Yuille, Ben Shao, ** Guo, Tianli Liu, Mohao Wu, Yixu Feng, Shuo Hou, Haotian Lin , et al. (87 additional authors not shown)

Abstract: This paper reviews the NTIRE 2024 low light image enhancement challenge, highlighting the proposed solutions and results. The aim of this challenge is to discover an effective network design or solution capable of generating brighter, clearer, and visually appealing results when dealing with a variety of conditions, including ultra-high resolution (4K and beyond), non-uniform illumination, backlighting, extreme darkness, and night scenes. A notable total of 428 participants registered for the challenge, with 22 teams ultimately making valid submissions. This paper meticulously evaluates the state-of-the-art advancements in enhancing low-light images, reflecting the significant progress and creativity in this field.

Submitted 22 April, 2024; originally announced April 2024.

Comments: NTIRE 2024 Challenge Report

arXiv:2404.13707 [pdf, other]

Robust inference for the unification of confidence intervals in meta-analysis

Authors: Wei Liang, Haicheng Huang, Hongsheng Dai, Yinghui Wei

Abstract: Traditional meta-analysis assumes that the effect sizes estimated in individual studies follow a Gaussian distribution. However, this distributional assumption is not always satisfied in practice, leading to potentially biased results. When the number of studies, denoted as K, is large, the cumulative Gaussian approximation errors from each study could make the final estimation unreliable. When K is small, it is not realistic to assume that the random effect follows a Gaussian distribution. In this paper, we present a novel empirical likelihood method for combining confidence intervals under the meta-analysis framework. This method is free of the Gaussian assumption both in the effect size estimates from individual studies and in the random effects. We establish the large-sample properties of the non-parametric estimator, and introduce a criterion governing the relationship between the number of studies, K, and the sample size of each study, n_i. Our methodology supersedes conventional meta-analysis techniques in both theoretical robustness and computational efficiency. We assess the performance of our proposed methods using simulation studies, and apply our proposed methods to two examples.

Submitted 21 April, 2024; originally announced April 2024.

arXiv:2404.13525 [pdf, other]

QR Decomposition of Dual Matrices and its Application to Traveling Wave Identification in the Brain

Authors: Renjie Xu, Tong Wei, Yimin Wei, Pengpeng Xie

Abstract: Matrix decompositions in dual number representations have played an important role in fields such as kinematics and computer graphics in recent years. In this paper, we present a QR decomposition algorithm for dual number matrices, specifically geared towards its application in traveling wave identification, utilizing the concept of proper orthogonal decomposition. When dealing with large-scale problems, we present explicit solutions for the QR, thin QR, and randomized QR decompositions of dual number matrices, along with their respective algorithms with column pivoting. The QR decomposition of dual matrices is an accurate first-order perturbation, with the Q-factor satisfying rigorous perturbation bounds, leading to enhanced orthogonality. In numerical experiments, we discuss the suitability of different QR algorithms when confronted with various large-scale dual matrices, providing their respective domains of applicability. Subsequently, we employed the QR decomposition of dual matrices to compute the DMPGI, thereby attaining results of higher precision. Moreover, we apply the QR decomposition in the context of traveling wave identification, employing the notion of proper orthogonal decomposition to perform a validation analysis of large-scale functional magnetic resonance imaging (fMRI) data for brain functional circuits. Our approach significantly improves the identification of two types of wave signals compared to previous research, providing empirical evidence for cognitive neuroscience theories.

Submitted 21 April, 2024; originally announced April 2024.

arXiv:2404.13289 [pdf, other]

Double Mixture: Towards Continual Event Detection from Speech

Authors: **gqi Kang, Tongtong Wu, **ming Zhao, Guitao Wang, Yinwei Wei, Hao Yang, Guilin Qi, Yuan-Fang Li, Gholamreza Haffari

Abstract: Speech event detection is crucial for multimedia retrieval, involving the tagging of both semantic and acoustic events. Traditional ASR systems often overlook the interplay between these events, focusing solely on content, even though the interpretation of dialogue can vary with environmental context. This paper tackles two primary challenges in speech event detection: the continual integration of new events without forgetting previous ones, and the disentanglement of semantic from acoustic events. We introduce a new task, continual event detection from speech, for which we also provide two benchmark datasets. To address the challenges of catastrophic forgetting and effective disentanglement, we propose a novel method, 'Double Mixture.' This method merges speech expertise with robust memory mechanisms to enhance adaptability and prevent forgetting. Our comprehensive experiments show that this task presents significant challenges that are not effectively addressed by current state-of-the-art methods in either computer vision or natural language processing. Our approach achieves the lowest rates of forgetting and the highest levels of generalization, proving robust across various continual learning sequences. Our code and data are available at https://anonymous.4open.science/status/Continual-SpeechED-6461.

Submitted 20 April, 2024; originally announced April 2024.

Comments: The first two authors contributed equally to this work

arXiv:2404.12916 [pdf, other]

Physical Backdoor Attack can Jeopardize Driving with Vision-Large-Language Models

Authors: Zhenyang Ni, Rui Ye, Yuxi Wei, Zhen Xiang, Yanfeng Wang, Siheng Chen

Abstract: Vision-Large-Language-models (VLMs) have great application prospects in autonomous driving. Despite the ability of VLMs to comprehend and make decisions in complex scenarios, their integration into safety-critical autonomous driving systems poses serious security risks. In this paper, we propose BadVLMDriver, the first backdoor attack against VLMs for autonomous driving that can be launched in practice using physical objects. Unlike existing backdoor attacks against VLMs that rely on digital modifications, BadVLMDriver uses common physical items, such as a red balloon, to induce unsafe actions like sudden acceleration, highlighting a significant real-world threat to autonomous vehicle safety. To execute BadVLMDriver, we develop an automated pipeline utilizing natural language instructions to generate backdoor training samples with embedded malicious behaviors. This approach allows for flexible trigger and behavior selection, enhancing the stealth and practicality of the attack in diverse scenarios. We conduct extensive experiments to evaluate BadVLMDriver for two representative VLMs, five different trigger objects, and two types of malicious backdoor behaviors. BadVLMDriver achieves a 92% attack success rate in inducing a sudden acceleration when coming across a pedestrian holding a red balloon. Thus, BadVLMDriver not only demonstrates a critical security risk but also emphasizes the urgent need for developing robust defense mechanisms to protect against such vulnerabilities in autonomous driving technologies.

Submitted 22 April, 2024; v1 submitted 19 April, 2024; originally announced April 2024.

arXiv:2404.12711 [pdf, other]

Dynamic Temperature Knowledge Distillation

Authors: Yukang Wei, Yu Bai

Abstract: Temperature plays a pivotal role in moderating label softness in the realm of knowledge distillation (KD). Traditional approaches often employ a static temperature throughout the KD process, which fails to address the nuanced complexities of samples with varying levels of difficulty and overlooks the distinct capabilities of different teacher-student pairings. This leads to a less-than-ideal transfer of knowledge. To improve the process of knowledge propagation, we propose Dynamic Temperature Knowledge Distillation (DTKD), which introduces a dynamic, cooperative temperature control for both teacher and student models simultaneously within each training iteration. In particular, we propose "sharpness" as a metric to quantify the smoothness of a model's output distribution. By minimizing the sharpness difference between the teacher and the student, we can derive sample-specific temperatures for them respectively. Extensive experiments on CIFAR-100 and ImageNet-2012 demonstrate that DTKD performs comparably to leading KD techniques, with added robustness in Target Class KD and None-target Class KD scenarios. The code is available at https://github.com/**Yu1998/DTKD.

Submitted 19 April, 2024; originally announced April 2024.
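
Illustrative note on the entry above: the abstract refers to temperature controlling "label softness" in knowledge distillation. The Python sketch below shows only the standard temperature-scaled softmax that this idea rests on. It is not the authors' DTKD method, and it does not implement their "sharpness" metric; the function name `softmax_with_temperature` and the example logits are assumptions made for illustration.

```python
import math


def softmax_with_temperature(logits, temperature):
    """Standard temperature-scaled softmax (hypothetical helper, not from DTKD).

    Dividing the logits by a temperature above 1 flattens the output
    distribution (softer labels); a temperature below 1 sharpens it.
    """
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [z / temperature for z in logits]
    peak = max(scaled)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


# Example logits are made up for illustration only.
example_logits = [4.0, 1.0, 0.5]
sharp = softmax_with_temperature(example_logits, 1.0)
soft = softmax_with_temperature(example_logits, 4.0)
print("T=1:", [round(p, 3) for p in sharp])
print("T=4:", [round(p, 3) for p in soft])
```

With a fixed temperature, every sample is softened by the same amount; the abstract's stated motivation is that a single static value may not suit samples of differing difficulty.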

arXiv:2404.12104 [pdf, other]

Ethical-Lens: Curbing Malicious Usages of Open-Source Text-to-Image Models

Authors: Yuzhu Cai, Sheng Yin, Yuxi Wei, Chenxin Xu, Weibo Mao, Felix Juefei-Xu, Siheng Chen, Yanfeng Wang

Abstract: The burgeoning landscape of text-to-image models, exemplified by innovations such as Midjourney and DALLE 3, has revolutionized content creation across diverse sectors. However, these advancements bring forth critical ethical concerns, particularly with the misuse of open-source models to generate content that violates societal norms. Addressing this, we introduce Ethical-Lens, a framework designed to facilitate the value-aligned usage of text-to-image tools without necessitating internal model revision. Ethical-Lens ensures value alignment in text-to-image models across toxicity and bias dimensions by refining user commands and rectifying model outputs. Systematic evaluation metrics, combining GPT4-V, HEIM, and FairFace scores, assess alignment capability. Our experiments reveal that Ethical-Lens enhances alignment capabilities to levels comparable with or superior to commercial models like DALLE 3, ensuring user-generated content adheres to ethical standards while maintaining image quality. This study indicates the potential of Ethical-Lens to ensure the sustainable development of open-source text-to-image tools and their beneficial integration into society. Our code is available at https://github.com/yuzhu-cai/Ethical-Lens.

Submitted 18 April, 2024; originally announced April 2024.

Comments: 42 pages, 17 figures, 29 tables

arXiv:2404.12006 [pdf, other]

Variational Multi-Modal Hypergraph Attention Network for Multi-Modal Relation Extraction

Authors: Qian Li, Cheng Ji, Shu Guo, Yong Zhao, Qianren Mao, Shangguang Wang, Yuntao Wei, Jianxin Li

Abstract: Multi-modal relation extraction (MMRE) is a challenging task that aims to identify relations between entities in text leveraging image information. Existing methods are limited by their neglect of the multiple entity pairs in one sentence sharing very similar contextual information (i.e., the same text and image), resulting in increased difficulty in the MMRE task. To address this limitation, we propose the Variational Multi-Modal Hypergraph Attention Network (VM-HAN) for multi-modal relation extraction. Specifically, we first construct a multi-modal hypergraph for each sentence with the corresponding image, to establish different high-order intra-/inter-modal correlations for different entity pairs in each sentence. We further design the Variational Hypergraph Attention Networks (V-HAN) to obtain representational diversity among different entity pairs using Gaussian distribution and learn a better hypergraph structure via variational attention. VM-HAN achieves state-of-the-art performance on the multi-modal relation extraction task, outperforming existing methods in terms of accuracy and efficiency.

Submitted 18 April, 2024; originally announced April 2024.

arXiv:2404.11991 [pdf, ps, other]

Pohozaev identities and Kelvin transformation of semilinear Grushin equation

Authors: Yawei Wei, Xiaodong Zhou

Abstract: In this paper, we study Pohozaev identities, the Kelvin transformation, and their applications to semilinear Grushin equations. First, we establish two Pohozaev identities generated from translations and use them to determine the location of the concentration point for solutions of one kind of Grushin equation. Next, we establish a Pohozaev identity generated from scaling and use it to prove the nonexistence of nontrivial solutions of another kind of Grushin equation. Finally, we describe how the Grushin operator changes under the Kelvin transformation and use this transformation to obtain the decay rate at infinity of solutions of a critical Grushin equation.

Submitted 18 April, 2024; originally announced April 2024.

MSC Class: 35J70; 35A22; 35B40

arXiv:2404.10343 [pdf, other]

The Ninth NTIRE 2024 Efficient Super-Resolution Challenge Report

Authors: Bin Ren, Yawei Li, Nancy Mehta, Radu Timofte, Hongyuan Yu, Cheng Wan, Yuxin Hong, Bingnan Han, Zhuoyuan Wu, Yajun Zou, Yuqing Liu, Jizhe Li, Keji He, Chao Fan, Heng Zhang, Xiaolin Zhang, Xuanwu Yin, Kunlong Zuo, Bohao Liao, Peizhe Xia, Long Peng, Zhibo Du, Xin Di, Wangkai Li, Yang Wang , et al. (109 additional authors not shown)

Abstract: This paper provides a comprehensive review of the NTIRE 2024 challenge, focusing on efficient single-image super-resolution (ESR) solutions and their outcomes. The task of this challenge is to super-resolve an input image with a magnification factor of x4 based on pairs of low and corresponding high-resolution images. The primary objective is to develop networks that optimize various aspects such as runtime, parameters, and FLOPs, while still maintaining a peak signal-to-noise ratio (PSNR) of approximately 26.90 dB on the DIV2K_LSDIR_valid dataset and 26.99 dB on the DIV2K_LSDIR_test dataset. In addition, this challenge has 4 tracks including the main track (overall performance), sub-track 1 (runtime), sub-track 2 (FLOPs), and sub-track 3 (parameters). In the main track, all three metrics (i.e., runtime, FLOPs, and parameter count) were considered. The ranking of the main track is calculated based on a weighted sum of the scores of all other sub-tracks. In sub-track 1, the practical runtime performance of the submissions was evaluated, and the corresponding score was used to determine the ranking. In sub-track 2, the number of FLOPs was considered. The score calculated based on the corresponding FLOPs was used to determine the ranking. In sub-track 3, the number of parameters was considered. The score calculated based on the corresponding parameters was used to determine the ranking. RLFN is set as the baseline for efficiency measurement. The challenge had 262 registered participants, and 34 teams made valid submissions. They gauge the state-of-the-art in efficient single-image super-resolution. To facilitate the reproducibility of the challenge and enable other researchers to build upon these findings, the code and the pre-trained model of validated solutions are made publicly available at https://github.com/Amazingren/NTIRE2024_ESR/.

Submitted 25 June, 2024; v1 submitted 16 April, 2024; originally announced April 2024.

Comments: The report paper of NTIRE2024 Efficient Super-resolution, accepted by CVPRW2024

arXiv:2404.10298 [pdf, ps, other]

Anisotropic Gauss curvature flow of complete non-compact graphs

Authors: Shu**g Pan, Yong Wei

Abstract: In this paper, we consider the anisotropic $α$-Gauss curvature flow for complete noncompact convex hypersurfaces in the Euclidean space with the anisotropy determined by a smooth closed uniformly convex Wulff shape. We show that for all positive power $α>0$, if the initial hypersurface is complete noncompact and locally uniformly convex, then the solution of the flow exists for all positive time.

Submitted 16 April, 2024; originally announced April 2024.

Comments: 23 pages. All comments are welcome

MSC Class: 53C44; 53C42

Showing 51–100 of 1,555 results for author: Wei, Y