Search | arXiv e-print repository

Test of light-lepton universality in $τ$ decays with the Belle II experiment

Authors: Belle II Collaboration, I. Adachi, K. Adamczyk, L. Aggarwal, H. Aihara, N. Akopov, A. Aloisio, N. Anh Ky, D. M. Asner, H. Atmacan, V. Aushev, M. Aversano, R. Ayad, V. Babu, H. Bae, S. Bahinipati, P. Bambade, Sw. Banerjee, S. Bansal, M. Barrett, J. Baudot, A. Baur, A. Beaubien, F. Becherer, J. Becker, et al. (406 additional authors not shown)

Abstract: We present a measurement of the ratio $R_μ= \mathcal{B}(τ^-\to μ^-\barν_μν_τ) / \mathcal{B}(τ^-\to e^-\barν_eν_τ)$ of branching fractions $\mathcal{B}$ of the $τ$ lepton decaying to muons or electrons using data collected with the Belle II detector at the SuperKEKB $e^+e^-$ collider. The sample has an integrated luminosity of 362 fb$^{-1}$ at a centre-of-mass energy of 10.58 GeV. Using an optimised event selection, a binned maximum likelihood fit is performed using the momentum spectra of the electron and muon candidates. The result, $R_μ= 0.9675 \pm 0.0007 \pm 0.0036$, where the first uncertainty is statistical and the second is systematic, is the most precise to date. It provides a stringent test of the light-lepton universality, translating to a ratio of the couplings of the muon and electron to the $W$ boson in $τ$ decays of $0.9974 \pm 0.0019$, in agreement with the standard model expectation of unity.
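The step from $R_μ$ to the coupling ratio can be illustrated with a short calculation. The sketch below is not the collaboration's procedure: it assumes the leading-order relation $|g_μ/g_e| = \sqrt{R_μ\, f(m_e^2/m_τ^2)/f(m_μ^2/m_τ^2)}$ with the standard phase-space factor $f(x) = 1 - 8x + 8x^3 - x^4 - 12x^2\ln x$, ignores radiative corrections, and uses approximate lepton masses that do not appear in the abstract. Under those assumptions it should land close to the quoted $0.9974 \pm 0.0019$.

```python
import math

# Approximate lepton masses in GeV. These are assumptions for illustration;
# they are not given in the abstract.
M_E, M_MU, M_TAU = 0.000511, 0.105658, 1.77686


def phase_space(x: float) -> float:
    """Leading-order phase-space factor for tau -> l nu nu, with x = m_l^2 / m_tau^2."""
    return 1 - 8 * x + 8 * x**3 - x**4 - 12 * x**2 * math.log(x)


def coupling_ratio(r_mu: float, r_mu_err: float) -> tuple[float, float]:
    """Return (|g_mu/g_e|, uncertainty) from R_mu, ignoring radiative corrections."""
    f_mu = phase_space((M_MU / M_TAU) ** 2)
    f_e = phase_space((M_E / M_TAU) ** 2)
    ratio = math.sqrt(r_mu * f_e / f_mu)
    # The square root halves the relative uncertainty on R_mu.
    err = 0.5 * ratio * r_mu_err / r_mu
    return ratio, err


# Statistical and systematic uncertainties from the abstract, added in quadrature.
total_err = math.hypot(0.0007, 0.0036)
g_ratio, g_err = coupling_ratio(0.9675, total_err)
print(f"|g_mu/g_e| = {g_ratio:.4f} +/- {g_err:.4f}")
```

Adding the two uncertainties in quadrature treats them as uncorrelated, which is a simplification made here for illustration only.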

Submitted 23 May, 2024; originally announced May 2024.

Report number: Belle II Preprint 2024-002, KEK Preprint 2023-49

arXiv:2405.14336 [pdf, other]

I$^2$VC: A Unified Framework for Intra- & Inter-frame Video Compression

Authors: Meiqin Liu, Chenming Xu, Yukai Gu, Chao Yao, Yao Zhao

Abstract: Video compression aims to reconstruct seamless frames by encoding the motion and residual information from existing frames. Previous neural video compression methods necessitate distinct codecs for three types of frames (I-frame, P-frame and B-frame), which hinders a unified approach and generalization across different video contexts. Intra-codec techniques lack the advanced Motion Estimation and Motion Compensation (MEMC) found in inter-codec, leading to fragmented frameworks lacking uniformity. Our proposed Intra- & Inter-frame Video Compression (I$^2$VC) framework employs a single spatio-temporal codec that guides feature compression rates according to content importance. This unified codec transforms the dependence across frames into a conditional coding scheme, thus integrating intra- and inter-frame compression into one cohesive strategy. Given the absence of explicit motion data, achieving competent inter-frame compression with only a conditional codec poses a challenge. To resolve this, our approach includes an implicit inter-frame alignment mechanism. With the pre-trained diffusion denoising process, the utilization of a diffusion-inverted reference feature rather than random noise supports the initial compression state. This process allows for selective denoising of motion-rich regions based on decoded features, facilitating accurate alignment without the need for MEMC.
Our experimental findings, across various compression configurations (AI, LD and RA) and frame types, prove that I$^2$VC outperforms the state-of-the-art perceptual learned codecs. Impressively, it exhibits a 58.4% enhancement in perceptual reconstruction performance when benchmarked against the H.266/VVC standard (VTM). Official implementation can be found at https://github.com/GYukai/I2VC.

Submitted 1 June, 2024; v1 submitted 23 May, 2024; originally announced May 2024.

Comments: 19 pages, 10 figures

arXiv:2405.13646 [pdf]

A Transformer variant for multi-step forecasting of water level and hydrometeorological sensitivity analysis based on explainable artificial intelligence technology

Authors: Mingyu Liu, Nana Bao, Xingting Yan, Chenyang Li, Kai Peng

Abstract: Understanding the combined influences of meteorological and hydrological factors on water level and flood events is essential, particularly in today's changing climate. The Transformer, a cutting-edge deep learning method, offers an effective approach to modeling intricate nonlinear processes, enabling the extraction of key features and the prediction of water levels. EXplainable Artificial Intelligence (XAI) methods play an important role in improving the understanding of how different factors impact water level. In this study, we propose a Transformer variant that integrates a sparse attention mechanism and introduces a nonlinear output layer for the decoder module. The variant model is used for multi-step forecasting of water level, considering meteorological and hydrological factors simultaneously. It is shown that the variant model outperforms the traditional Transformer across different lead times with respect to various evaluation metrics. The sensitivity analyses based on XAI technology demonstrate the significant influence of meteorological factors on water level evolution, in which temperature is shown to be the most dominant meteorological factor. Therefore, incorporating both meteorological and hydrological factors is necessary for reliable hydrological prediction and flood prevention. In addition, XAI technology provides insights into individual predictions, which is beneficial for understanding the prediction results and evaluating their reasonableness.

Submitted 22 May, 2024; originally announced May 2024.

arXiv:2405.13378 [pdf, other]

FedCache 2.0: Exploiting the Potential of Distilled Data in Knowledge Cache-driven Federated Learning

Authors: Quyang Pan, Sheng Sun, Zhiyuan Wu, Yuwei Wang, Min Liu, Bo Gao

Abstract: Federated Edge Learning (FEL) has emerged as a promising approach for enabling edge devices to collaboratively train machine learning models while preserving data privacy. Despite its advantages, practical FEL deployment faces significant challenges related to device constraints and device-server interactions, necessitating heterogeneous, user-adaptive model training with limited and uncertain communication. In this paper, we introduce FedCache 2.0, a novel personalized FEL architecture that simultaneously addresses these challenges. FedCache 2.0 incorporates the benefits of both dataset distillation and knowledge cache-driven federated learning by storing and organizing distilled data as knowledge in the server-side knowledge cache. Moreover, a device-centric cache sampling strategy is introduced to tailor transferred knowledge for individual devices within controlled communication bandwidth. Extensive experiments on five datasets covering image recognition, audio understanding, and mobile sensor data mining tasks demonstrate that (1) FedCache 2.0 significantly outperforms state-of-the-art methods regardless of model structures, data distributions, and modalities; and (2) FedCache 2.0 can train splendid personalized on-device models with at least a $28.6\times$ improvement in communication efficiency.

Submitted 22 May, 2024; originally announced May 2024.

Comments: 20 pages, 8 figures, 10 tables

arXiv:2405.13315 [pdf, other]

Study of the decays $χ_{cJ}\toΛ\barΛω$

Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, O. Afedulidis, X. C. Ai, R. Aliberti, A. Amoroso, Q. An, Y. Bai, O. Bakina, I. Balossino, Y. Ban, H. -R. Bao, V. Batozskaya, K. Begzsuren, N. Berger, M. Berlowski, M. Bertani, D. Bettoni, F. Bianchi, E. Bianco, A. Bortone, I. Boyko, R. A. Briere, et al. (638 additional authors not shown)

Abstract: Using $(27.12\pm 0.14)\times10^{8}$ $ψ(3686)$ events collected with the BESIII detector, we present the first observation of the decays $χ_{cJ}\toΛ\barΛω$, where $J=0, 1, 2$, with statistical significances of $11.7 σ, 11.2 σ$, and $11.8 σ$. The branching fractions of these decays are determined to be $\mathcal{B}(χ_{c0}\toΛ\barΛω)=({2.37 \pm 0.22 \pm 0.23}) \times 10^{-4}$, $\mathcal{B}(χ_{c1}\toΛ\barΛω)=({1.01 \pm 0.10 \pm 0.11}) \times 10^{-4}$, and $\mathcal{B}(χ_{c2}\toΛ\barΛω)=({1.40 \pm 0.13 \pm 0.17}) \times 10^{-4}$, where the first uncertainties are statistical and the second are systematic. We observe no clear intermediate structures.

Submitted 21 May, 2024; originally announced May 2024.

Comments: 11 pages, 10 figures

arXiv:2405.13282 [pdf, other]

doi 10.1103/PhysRevE.109.054123

Quantum criticality of generalized Aubry-André models with exact mobility edges using fidelity susceptibility

Authors: Yu-Bin Liu, Wen-Yi Zhang, Tian-Cheng Yi, Liangsheng Li, Maoxin Liu, Wen-Long You

Abstract: In this study, we explore the quantum critical phenomena in generalized Aubry-André models, with a particular focus on the scaling behavior at various filling states. Our approach involves using quantum fidelity susceptibility to precisely identify the mobility edges in these systems. Through a finite-size scaling analysis of the fidelity susceptibility, we are able to determine both the correlation-length critical exponent and the dynamical critical exponent at the critical point of the generalized Aubry-André model. Based on the Diophantine equation conjecture, we can determine the number of subsequences of the Fibonacci sequence and the corresponding scaling functions for a specific filling fraction, as well as the universality class. Our findings demonstrate the effectiveness of employing the generalized fidelity susceptibility for the analysis of unconventional quantum criticality and the associated universal information of quasiperiodic systems in cutting-edge quantum simulation experiments.

Submitted 21 May, 2024; originally announced May 2024.

Comments: 9 pages, 5 figures

Journal ref: Phys. Rev. E 109, 054123 (2024)

arXiv:2405.12809 [pdf, other]

Precision measurement of the branching fraction of $J/ψ\rightarrow K^+K^-$ via $ψ(2S)\rightarrow π^+π^-J/ψ$

Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, X. C. Ai, R. Aliberti, A. Amoroso, M. R. An, Q. An, Y. Bai, O. Bakina, I. Balossino, Y. Ban, H. -R. Bao, V. Batozskaya, K. Begzsuren, N. Berger, M. Berlowski, M. Bertani, D. Bettoni, F. Bianchi, E. Bianco, A. Bortone, I. Boyko, R. A. Briere, et al. (604 additional authors not shown)

Abstract: Using a sample of $448.1 \times 10^6$ $ψ(2S)$ events collected with the BESIII detector, we perform a study of the decay $J/ψ\rightarrow K^+K^-$ via $ψ(2S)\rightarrow π^+π^-J/ψ$. The branching fraction of $J/ψ\rightarrow K^+K^-$ is determined to be $\mathcal{B}_{K^+K^-}=(3.072\pm 0.023({\rm stat.})\pm 0.050({\rm syst.}))\times 10^{-4}$, which is consistent with previous measurements but with significantly improved precision.

Submitted 21 May, 2024; originally announced May 2024.

Comments: to be submitted to PRD

arXiv:2405.12503 [pdf, other]

CLRKDNet: Speeding up Lane Detection with Knowledge Distillation

Authors: Weiqing Qi, Guoyang Zhao, Fulong Ma, Linwei Zheng, Ming Liu

Abstract: Road lanes are integral components of the visual perception systems in intelligent vehicles, playing a pivotal role in safe navigation. In lane detection tasks, balancing accuracy with real-time performance is essential, yet existing methods often sacrifice one for the other. To address this trade-off, we introduce CLRKDNet, a streamlined model that balances detection accuracy with real-time performance. The state-of-the-art model CLRNet has demonstrated exceptional performance across various datasets, yet its computational overhead is substantial due to its Feature Pyramid Network (FPN) and multi-layer detection head architecture. Our method simplifies both the FPN structure and detection heads, redesigning them to incorporate a novel teacher-student distillation process alongside a newly introduced series of distillation losses. This combination reduces inference time by up to 60% while maintaining detection accuracy comparable to CLRNet. This strategic balance of accuracy and speed makes CLRKDNet a viable solution for real-time lane detection tasks in autonomous driving applications.

Submitted 21 May, 2024; originally announced May 2024.

arXiv:2405.12488 [pdf, other]

First joint oscillation analysis of Super-Kamiokande atmospheric and T2K accelerator neutrino data

Authors: Super-Kamiokande, T2K collaborations, S. Abe, K. Abe, N. Akhlaq, R. Akutsu, H. Alarakia-Charles, A. Ali, Y. I. Alj Hakim, S. Alonso Monsalve, S. Amanai, C. Andreopoulos, L. H. V. Anthony, M. Antonova, S. Aoki, K. A. Apte, T. Arai, T. Arihara, S. Arimoto, Y. Asada, R. Asaka, Y. Ashida, E. T. Atkin, N. Babu, et al. (524 additional authors not shown)

Abstract: The Super-Kamiokande and T2K collaborations present a joint measurement of neutrino oscillation parameters from their atmospheric and beam neutrino data. It uses a common interaction model for events overlapping in neutrino energy and correlated detector systematic uncertainties between the two datasets, which are found to be compatible. Using 3244.4 days of atmospheric data and a beam exposure of $19.7(16.3) \times 10^{20}$ protons on target in (anti)neutrino mode, the analysis finds a 1.9$σ$ exclusion of CP-conservation (defined as $J_{CP}=0$) and a preference for the normal mass ordering.

Submitted 21 May, 2024; originally announced May 2024.

Comments: 10 pages, 3 figures

arXiv:2405.12328 [pdf, other]

Multi-dimension Transformer with Attention-based Filtering for Medical Image Segmentation

Authors: Wentao Wang, Xi Xiao, Mingjie Liu, Qing Tian, Xuanyao Huang, Qizhen Lan, Swalpa Kumar Roy, Tianyang Wang

Abstract: The accurate segmentation of medical images is crucial for diagnosing and treating diseases. Recent studies demonstrate that vision transformer-based methods have significantly improved performance in medical image segmentation, primarily due to their superior ability to establish global relationships among features and adaptability to various inputs. However, these methods struggle with the low signal-to-noise ratio inherent to medical images. Additionally, the effective utilization of channel and spatial information, which are essential for medical image segmentation, is limited by the representation capacity of self-attention. To address these challenges, we propose a multi-dimension transformer with attention-based filtering (MDT-AF), which redesigns the patch embedding and self-attention mechanism for medical image segmentation. MDT-AF incorporates an attention-based feature filtering mechanism into the patch embedding blocks and employs a coarse-to-fine process to mitigate the impact of low signal-to-noise ratio. To better capture complex structures in medical images, MDT-AF extends the self-attention mechanism to incorporate spatial and channel dimensions, enriching feature representation. Moreover, we introduce an interaction mechanism to improve the feature aggregation between spatial and channel dimensions. Experimental results on three public medical image segmentation benchmarks show that MDT-AF achieves state-of-the-art (SOTA) performance.

Submitted 20 May, 2024; originally announced May 2024.

arXiv:2405.12285 [pdf, other]

In-situ Measurements of Dark Photon Dark Matter using Parker Solar Probe: Going beyond the Radio Window

Authors: Haipeng An, Shuailiang Ge, Jia Liu, Mingzhe Liu

Abstract: Dark photon dark matter (DPDM) emerges as a compelling candidate for ultralight bosonic dark matter, detectable through resonant conversion into photons within a plasma environment. This study employs in-situ measurements from the Parker Solar Probe (PSP), the first spacecraft to venture into the solar corona, to probe for DPDM signatures. The PSP in-situ measurements go beyond the traditional radio window, spanning frequencies between about 10 kHz and 20 MHz, a challenging range inaccessible to Earth-based radio astronomy. Additionally, the proximity of PSP to the resonant conversion location enhances the signal flux, providing a distinct advantage over ground-based observations. As a result, the PSP data establishes the most stringent constraints on the kinetic mixing parameter $ε$ for DPDM frequencies between 70 kHz and 20 MHz, with values of $ε\lesssim 10^{-14}-10^{-13}$. Investigating the data from STEREO satellites resulted in weaker constraints compared to those obtained from PSP. By utilizing state-of-the-art solar observations from space, we have surpassed the cosmic microwave background limits established in the early universe.
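To put the quoted frequency band in more familiar units, the sketch below converts a frequency to the corresponding dark photon rest energy via $E = hf$, and to the electron density at which the plasma frequency equals that frequency. Treating "resonant conversion" as the condition that the plasma frequency matches the dark photon mass is an assumption made here; the abstract does not spell out the condition. The function names are illustrative, and the constants are standard SI values.

```python
import math

# Physical constants in SI units (eV*s for the Planck constant).
PLANCK_EV_S = 4.135667696e-15      # Planck constant, eV*s
EPSILON_0 = 8.8541878128e-12       # vacuum permittivity, F/m
ELECTRON_MASS = 9.1093837015e-31   # kg
ELEMENTARY_CHARGE = 1.602176634e-19  # C


def freq_to_mass_ev(freq_hz: float) -> float:
    """Rest energy (eV) of a particle whose Compton frequency is freq_hz: E = h f."""
    return PLANCK_EV_S * freq_hz


def resonant_density_m3(freq_hz: float) -> float:
    """Electron density (m^-3) whose plasma frequency equals freq_hz.

    Inverts f_p = (1 / 2 pi) * sqrt(n_e e^2 / (eps0 m_e)).
    """
    omega = 2 * math.pi * freq_hz
    return omega**2 * EPSILON_0 * ELECTRON_MASS / ELEMENTARY_CHARGE**2


for f in (70e3, 20e6):  # band edges quoted in the abstract
    print(f"{f:.0e} Hz -> m = {freq_to_mass_ev(f):.3e} eV, "
          f"n_e = {resonant_density_m3(f):.3e} m^-3")
```

Under these assumptions the 70 kHz to 20 MHz band corresponds to masses of roughly $3\times10^{-10}$ eV to $8\times10^{-8}$ eV.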

Submitted 20 May, 2024; originally announced May 2024.

Comments: 13 pages, 6 figures

arXiv:2405.11826 [pdf, other]

Data quality control system and long-term performance monitor of the LHAASO-KM2A

Authors: Zhen Cao, F. Aharonian, Axikegu, Y. X. Bai, Y. W. Bao, D. Bastieri, X. J. Bi, Y. J. Bi, W. Bian, A. V. Bukevich, Q. Cao, W. Y. Cao, Zhe Cao, J. Chang, J. F. Chang, A. M. Chen, E. S. Chen, H. X. Chen, Liang Chen, Lin Chen, Long Chen, M. J. Chen, M. L. Chen, Q. H. Chen, S. Chen, et al. (263 additional authors not shown)

Abstract: The KM2A is the largest sub-array of the Large High Altitude Air Shower Observatory (LHAASO). It consists of 5216 electromagnetic particle detectors (EDs) and 1188 muon detectors (MDs). The data recorded by the EDs and MDs are used to reconstruct primary information of cosmic ray and gamma-ray showers. This information is used for physical analysis in gamma-ray astronomy and cosmic ray physics. To ensure the reliability of the LHAASO-KM2A data, a three-level quality control system has been established. It is used to monitor the status of detector units, stability of reconstructed parameters and the performance of the array based on observations of the Crab Nebula and Moon shadow. This paper will introduce the control system and its application on the LHAASO-KM2A data collected from August 2021 to July 2023. During this period, the pointing and angular resolution of the array were stable. From the observations of the Moon shadow and Crab Nebula, the results achieved using the two methods are consistent with each other. According to the observation of the Crab Nebula at energies from 25 TeV to 100 TeV, the time averaged pointing errors are estimated to be $-0.003^{\circ} \pm 0.005^{\circ}$ and $0.001^{\circ} \pm 0.006^{\circ}$ in the R.A. and Dec directions, respectively.

Submitted 13 June, 2024; v1 submitted 20 May, 2024; originally announced May 2024.

Comments: 15 pages, 9 figures

arXiv:2405.11585 [pdf, other]

Improved measurement of the branching fraction of $h_{c}\rightarrowγη^\prime/η$ and search for $h_{c}\rightarrowγπ^0$

Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, O. Afedulidis, X. C. Ai, R. Aliberti, A. Amoroso, Q. An, Y. Bai, O. Bakina, I. Balossino, Y. Ban, H. -R. Bao, V. Batozskaya, K. Begzsuren, N. Berger, M. Berlowski, M. Bertani, D. Bettoni, F. Bianchi, E. Bianco, A. Bortone, I. Boyko, R. A. Briere, et al. (645 additional authors not shown)

Abstract: The processes $h_c\rightarrowγP(P = η^\prime,~η,~π^{0})$ are studied with a sample of $(27.12\pm0.14)\times10^{8}$ $ψ(3686)$ events collected by the BESIII detector at the BEPCII collider. The branching fractions of $h_c\rightarrowγη^\prime$ and $h_c\rightarrowγη$ are measured to be $(1.40\pm0.11\pm0.04\pm0.10)\times10^{-3}$ and $(3.77\pm0.55\pm0.13\pm0.26)\times10^{-4}$, respectively, where the first uncertainties are statistical, the second systematic, and the third from the branching fraction of $ψ(3686)\rightarrowπ^{0}h_c$. The ratio $R_{h_c}=\frac{\mathscr{B}(h_c\rightarrowγη)}{\mathscr{B}(h_c\rightarrowγη^\prime)}$ is calculated to be $(27.0\pm4.4\pm1.0)\%$. The measurements are consistent with the previous results with improved precision by a factor of 2. The results are valuable for gaining a deeper understanding of $η-η^\prime$ mixing, and its manifestation within quantum chromodynamics. No significant signal is found for the decay $h_c\rightarrowγπ^{0}$, and an upper limit is placed on its branching fraction of $\mathscr{B}(h_c\rightarrowγπ^{0})<5.0\times10^{-5}$, at the 90\% confidence level.

Submitted 19 May, 2024; originally announced May 2024.

arXiv:2405.11160 [pdf, other]

Estimating Photometric Distances to Ultracool Dwarfs in Next Generation Space-based Infrared Surveys: Synthetic Photometry and New Absolute Magnitude Versus Spectral Type Relations for JWST, Euclid, and Roman Filters

Authors: Aniket Sanghi, Michael C. Liu, Trent J. Dupuy, William M. Best, Robert J. Siverd, Zhoujian Zhang

Abstract: We synthesize JWST NIRCam photometry for the F164N, F187N, F212N narrow filters, F140M, F162M, F182M, F210M medium filters, and F115W, F150W, F200W wide filters, Euclid Near Infrared Spectrometer and Photometer (NISP) photometry for the $Y_E J_E H_E$ filters, and Roman Wide Field Instrument (WFI) photometry for the F106, F129, F146, F158, F184 and F213 filters using SpeX prism spectra and parallaxes of 688 field-age and 151 young ($\lesssim$ 200 Myr) ultracool dwarfs (spectral types M6-T9). We derive absolute magnitude-spectral type polynomial relations that enable the calculation of photometric distances for ultracool dwarfs observed with JWST, and to be observed with Euclid and Roman, in the absence of parallax measurements. Additionally, using the synthesized photometry to generate color-color figures can help distinguish high-redshift galaxies from brown dwarf interlopers in survey datasets. In particular, anticipating the upcoming Euclid Early Release Observations, we provide synthetic Euclid colors for ultracool dwarfs in our sample.

Submitted 17 May, 2024; originally announced May 2024.

Comments: 6 pages, 3 figures, 1 table

arXiv:2405.10961 [pdf, other]

Simplified discrete model for axisymmetric dielectric elastomer membranes with robotic applications

Authors: Zhaowei Liu, Mingchao Liu, K. Jimmy Hsia, Xiaonan Huang, Weicheng Huang

Abstract: Soft robots utilizing inflatable dielectric membranes can realize intricate functionalities through the application of non-mechanical fields. However, given the current limitations in simulations, including low computational efficiency and difficulty in dealing with complex external interactions, the design and control of such soft robots often require trial and error. Thus, a novel one-dimensional (1D) discrete differential geometry (DDG)-based numerical model is developed for analyzing the highly nonlinear mechanics in axisymmetric inflatable dielectric membranes. The model captures the intricate dynamics of these membranes under both inflationary pressure and electrical stimulation. Comprehensive validations using hyperelastic benchmarks demonstrate the model's accuracy and reliability. Additionally, the focus on the electro-mechanical coupling elucidates critical insights into the membrane's behavior under varying internal pressures and electrical loads. The research further translates these findings into innovative soft robotic applications, including a spherical soft actuator, a soft circular fluid pump, and a soft toroidal gripper, where the snap-through of electroelastic membrane plays a crucial role. Our analyses reveal that the functional ranges of soft robots are amplified by the snap-through of an electroelastic membrane upon electrical stimuli.
This study underscores the potential of DDG-based simulations to advance the understanding of the nonlinear mechanics of electroelastic membranes and guide the design of electroelastic actuators in soft robotics applications.

Submitted 23 April, 2024; originally announced May 2024.

Comments: 27 pages, 8 figures

arXiv:2405.10530 [pdf, other]

CM-UNet: Hybrid CNN-Mamba UNet for Remote Sensing Image Semantic Segmentation

Authors: Mushui Liu, Jun Dan, Ziqian Lu, Yunlong Yu, Yingming Li, Xi Li

Abstract: Due to the large-scale image size and object variations, current CNN-based and Transformer-based approaches for remote sensing image semantic segmentation are suboptimal for capturing the long-range dependency or limited by high computational complexity. In this paper, we propose CM-UNet, comprising a CNN-based encoder for extracting local image features and a Mamba-based decoder for aggregating and integrating global information, facilitating efficient semantic segmentation of remote sensing images. Specifically, a CSMamba block is introduced to build the core segmentation decoder, which employs channel and spatial attention as the gate activation condition of the vanilla Mamba to enhance the feature interaction and global-local information fusion. Moreover, to further refine the output features from the CNN encoder, a Multi-Scale Attention Aggregation (MSAA) module is employed to merge the different scale features. By integrating the CSMamba block and MSAA module, CM-UNet effectively captures the long-range dependencies and multi-scale global contextual information of large-scale remote-sensing images. Experimental results obtained on three benchmarks indicate that the proposed CM-UNet outperforms existing methods in various performance metrics. The codes are available at https://github.com/XiaoBuL/CM-UNet.

Submitted 17 May, 2024; originally announced May 2024.

Comments: 5 pages, 6 figures

arXiv:2405.10474 [pdf, ps, other]

Rethinking ChatGPT's Success: Usability and Cognitive Behaviors Enabled by Auto-regressive LLMs' Prompting

Authors: Xinzhe Li, Ming Liu

Abstract: Over the last decade, a wide range of training and deployment strategies for Large Language Models (LLMs) have emerged. Among these, the prompting paradigms of Auto-regressive LLMs (AR-LLMs) have catalyzed a significant surge in Artificial Intelligence (AI). This paper aims to emphasize the significance of utilizing free-form modalities (forms of input and output) and verbal free-form contexts as user-directed channels (methods for transforming modalities) for downstream deployment. Specifically, we analyze the structure of modalities within two types of LLMs and six task-specific channels during deployment. From the perspective of users, our analysis introduces and applies the analytical metrics of task customizability, transparency, and complexity to gauge their usability, highlighting the superior nature of AR-LLMs' prompting paradigms. Moreover, we examine the stimulation of diverse cognitive behaviors in LLMs through the adoption of free-form text and verbal contexts, mirroring human linguistic expressions of such behaviors. We then detail four common cognitive behaviors to underscore how AR-LLMs' prompting successfully imitates human-like behaviors using this free-form modality and channel. Lastly, the potential for improving LLM deployment, both as autonomous agents and within multi-agent systems, is identified via cognitive behavior concepts and principles.

Submitted 16 May, 2024; originally announced May 2024.

arXiv:2405.10053 [pdf, other]

SHiNe: Semantic Hierarchy Nexus for Open-vocabulary Object Detection

Authors: Mingxuan Liu, Tyler L. Hayes, Elisa Ricci, Gabriela Csurka, Riccardo Volpi

Abstract: Open-vocabulary object detection (OvOD) has transformed detection into a language-guided task, empowering users to freely define their class vocabularies of interest during inference. However, our initial investigation indicates that existing OvOD detectors exhibit significant variability when dealing with vocabularies across various semantic granularities, posing a concern for real-world deployment. To this end, we introduce Semantic Hierarchy Nexus (SHiNe), a novel classifier that uses semantic knowledge from class hierarchies. It runs offline in three steps: i) it retrieves relevant super-/sub-categories from a hierarchy for each target class; ii) it integrates these categories into hierarchy-aware sentences; iii) it fuses these sentence embeddings to generate the nexus classifier vector. Our evaluation on various detection benchmarks demonstrates that SHiNe enhances robustness across diverse vocabulary granularities, achieving up to +31.9% mAP50 with ground truth hierarchies, while retaining improvements using hierarchies generated by large language models. Moreover, when applied to open-vocabulary classification on ImageNet-1k, SHiNe improves the CLIP zero-shot baseline by +2.8% accuracy. SHiNe is training-free and can be seamlessly integrated with any off-the-shelf OvOD detector, without incurring additional computational overhead during inference. The code is open source.

Submitted 16 May, 2024; originally announced May 2024.

Comments: Accepted as a conference paper (highlight) at CVPR 2024
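
The three offline steps listed in the SHiNe abstract can be sketched roughly as follows. This is an illustrative toy, not the authors' implementation: the `HIERARCHY` table, the "which is a kind of" sentence template, the `toy_embed` stand-in for a text encoder, and the mean-then-normalize fusion are all assumptions made for this example.

```python
import hashlib
import math

# Hypothetical toy hierarchy: class -> (super-categories, sub-categories).
HIERARCHY = {
    "dog": (["mammal", "animal"], ["beagle", "poodle"]),
}


def toy_embed(sentence, dim=8):
    """Stand-in for a real text encoder (assumption): a deterministic,
    L2-normalized pseudo-embedding derived from a SHA-256 digest."""
    digest = hashlib.sha256(sentence.encode("utf-8")).digest()
    vec = [b / 255.0 - 0.5 for b in digest[:dim]]
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]


def hierarchy_sentences(cls):
    """Steps i and ii: look up super-/sub-categories and build one
    hierarchy-aware sentence per sub-category (template is assumed)."""
    supers, subs = HIERARCHY[cls]
    joiner = ", which is a kind of "
    sentences = [joiner.join([sub, cls] + supers) for sub in subs]
    # Fall back to the class itself when no sub-categories are known.
    return sentences or [joiner.join([cls] + supers)]


def nexus_vector(cls):
    """Step iii: fuse the sentence embeddings into one classifier vector.
    Mean aggregation followed by L2 normalization is an assumption."""
    embs = [toy_embed(s) for s in hierarchy_sentences(cls)]
    mean = [sum(col) / len(embs) for col in zip(*embs)]
    norm = math.sqrt(sum(v * v for v in mean)) or 1.0
    return [v / norm for v in mean]
```

In a real pipeline the resulting vector would replace the plain class-name embedding in the detector's classifier, which is consistent with the abstract's claim that no extra computation is needed at inference time, since everything above runs offline.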

arXiv:2405.09910 [pdf, other]

Performance testing of a novel short axis photomultiplier tube for the HUNT project

Authors: Yijiang Peng, Zike Wang, Bo Gao, Yiyue Tang, Mingjun Chen, Kai Li, Ling Ren, Xiaohao You, Maoyuan Liu

Abstract: Photomultiplier tubes (PMTs) with large-area cathodes are increasingly being used in cosmic-ray experiments to enhance detection efficiency. The optical modules (OMs) of the High-Energy Underwater Neutrino Telescope (HUNT) have employed a brand new N6205 20-inch microchannel plate photomultiplier tube (MCP-PMT) developed by the North Night Vision Science & Technology (Nanjing) Research Institute Co. Ltd. (NNVT). In order to make the 20-inch PMT fit into the 23-inch diameter pressure-resistant glass sphere, NNVT improved the internal structure of the PMT and shortened its height by more than 10 cm. The first batch of these PMTs has been delivered for preliminary research work. This paper describes a dedicated PMT testing platform built for the first batch of 15 MCP-PMTs, and some performance parameters of the PMTs, such as P/V ratio, TTS and nonlinearity, are measured. The measurement results show that the new PMT still has good performance and can meet the requirements of the HUNT project.

Submitted 16 May, 2024; originally announced May 2024.

arXiv:2405.09776 [pdf]

doi 10.1103/PhysRevB.109.184106

Magnetic structure and magnetoelectric coupling in antiferromagnet Co5(TeO3)4Cl2

Authors: B. Yu, L. Huang, J. S. Li, L. Lin, V. Ovidiu Garlea, Q. Zhang, T. Zou, J. C. Zhang, J. Peng, Y. S. Tang, G. Z. Zhou, J. H. Zhang, S. H. Zheng, M. F. Liu, Z. B. Yan, X. H. Zhou, S. Dong, J. G. Wan, J. -M. Liu

Abstract: The van der Waals (vdW) layered multiferroics, which host simultaneous ferroelectric and magnetic orders, have attracted attention not only for their potential use in nanoelectric devices and spintronics, but also for the alternative opportunities they offer for emergent physical phenomena. To date, vdW layered multiferroic materials are still very rare. In this work, we have investigated the magnetic structure and magnetoelectric effects in Co5(TeO3)4Cl2, a promising new multiferroic compound with antiferromagnetic (AFM) Neel point TN = 18 K. Neutron powder diffraction reveals a non-coplanar AFM state with the preferred Neel vector along the c-axis, while a spin re-orientation occurring between 8 K and 15 K is identified, which results from the distinct temperature dependence of the moments at the non-equivalent Co sites in Co5(TeO3)4Cl2. Moreover, it is found that Co5(TeO3)4Cl2 is one of the best vdW multiferroics studied so far in terms of multiferroic performance. The measured linear ME coefficient exhibits an emergent oscillatory dependence on the angle between magnetic field and electric field, and the maximal value is as large as 45 ps/m. It is suggested that Co5(TeO3)4Cl2 is an appealing platform for exploring the emergent multiferroicity in vdW layered compounds.

Submitted 15 May, 2024; originally announced May 2024.

Comments: 31 pages, 9 figures

Journal ref: Phys. Rev. B 109, 184106 (2024)

arXiv:2405.09586 [pdf, other]

Factual Serialization Enhancement: A Key Innovation for Chest X-ray Report Generation

Authors: Kang Liu, Zhuoqi Ma, Mengmeng Liu, Zhicheng Jiao, Xiaolu Kang, Qiguang Miao, Kun Xie

Abstract: The automation of writing imaging reports is a valuable tool for alleviating the workload of radiologists. Crucial steps in this process involve the cross-modal alignment between medical images and reports, as well as the retrieval of similar historical cases. However, the presence of presentation-style vocabulary (e.g., sentence structure and grammar) in reports poses challenges for cross-modal alignment. Additionally, existing methods for retrieving similar historical cases perform suboptimally owing to the modal gap issue. In response, this paper introduces a novel method, named Factual Serialization Enhancement (FSE), for chest X-ray report generation. FSE begins with the structural entities approach to eliminate presentation-style vocabulary in reports, providing specific input for our model. Then, uni-modal features are learned through cross-modal alignment between images and factual serialization in reports. Subsequently, we present a novel approach to retrieve similar historical cases from the training set, leveraging aligned image features. These features implicitly preserve semantic similarity with their corresponding reference reports, enabling us to calculate similarity solely among aligned features. This effectively eliminates the modal gap issue for knowledge retrieval without the requirement for disease labels. Finally, the cross-modal fusion network is employed to query valuable information from these cases, enriching image features and aiding the text decoder in generating high-quality reports. Experiments on MIMIC-CXR and IU X-ray datasets from both specific and general scenarios demonstrate the superiority of FSE over state-of-the-art approaches in both natural language generation and clinical efficacy metrics.

Submitted 15 May, 2024; originally announced May 2024.

arXiv:2405.09546 [pdf, other]

BEHAVIOR Vision Suite: Customizable Dataset Generation via Simulation

Authors: Yunhao Ge, Yihe Tang, Jiashu Xu, Cem Gokmen, Chengshu Li, Wensi Ai, Benjamin Jose Martinez, Arman Aydin, Mona Anvari, Ayush K Chakravarthy, Hong-Xing Yu, Josiah Wong, Sanjana Srivastava, Sharon Lee, Shengxin Zha, Laurent Itti, Yunzhu Li, Roberto Martín-Martín, Miao Liu, Pengchuan Zhang, Ruohan Zhang, Li Fei-Fei, Jiajun Wu

Abstract: The systematic evaluation and understanding of computer vision models under varying conditions require large amounts of data with comprehensive and customized labels, which real-world vision datasets rarely satisfy. While current synthetic data generators offer a promising alternative, particularly for embodied AI tasks, they often fall short for computer vision tasks due to low asset and rendering quality, limited diversity, and unrealistic physical properties. We introduce the BEHAVIOR Vision Suite (BVS), a set of tools and assets to generate fully customized synthetic data for systematic evaluation of computer vision models, based on the newly developed embodied AI benchmark, BEHAVIOR-1K. BVS supports a large number of adjustable parameters at the scene level (e.g., lighting, object placement), the object level (e.g., joint configuration, attributes such as "filled" and "folded"), and the camera level (e.g., field of view, focal length). Researchers can arbitrarily vary these parameters during data generation to perform controlled experiments. We showcase three example application scenarios: systematically evaluating the robustness of models across different continuous axes of domain shift, evaluating scene understanding models on the same set of images, and training and evaluating simulation-to-real transfer for a novel vision task: unary and binary state prediction. Project website: https://behavior-vision-suite.github.io/

Submitted 15 May, 2024; originally announced May 2024.

Comments: CVPR 2024 (Highlight). Project website: https://behavior-vision-suite.github.io/

arXiv:2405.09066 [pdf, other]

Search for the leptonic decays $D^{*+}\to e^+ν_e$ and $D^{*+}\to μ^+ν_μ$

Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, M. Albrecht, R. Aliberti, A. Amoroso, M. R. An, Q. An, Y. Bai, O. Bakina, R. Baldini Ferroli, I. Balossino, Y. Ban, V. Batozskaya, D. Becker, K. Begzsuren, N. Berger, M. Bertani, D. Bettoni, F. Bianchi, E. Bianco, J. Bloms, A. Bortone, I. Boyko , et al. (559 additional authors not shown)

Abstract: We present the first search for the leptonic decays $D^{*+}\to e^+ν_e$ and $D^{*+}\to μ^+ν_μ$ by analyzing a data sample of electron-positron collisions recorded with the BESIII detector at center-of-mass energies between 4.178 and 4.226 GeV, corresponding to an integrated luminosity of 6.32 fb$^{-1}$. No significant signal is observed. The upper limits on the branching fractions for $D^{*+}\to e^+ν_e$ and $D^{*+}\to μ^+ν_μ$ are set to be $1.1 \times 10^{-5}$ and $4.3 \times 10^{-6}$ at 90% confidence level, respectively.

Submitted 14 May, 2024; originally announced May 2024.

Comments: 14 pages, 7 figures

arXiv:2405.07978 [pdf, other]

Unveiling the Pockels Coefficient of Ferroelectric Nitride ScAlN

Authors: Guangcanlan Yang, Haochen Wang, Sai Mu, Hao Xie, Tyler Wang, Chengxing He, Mohan Shen, Mengxia Liu, Chris G. Van de Walle, Hong X. Tang

Abstract: Nitride ferroelectrics have recently emerged as promising alternatives to oxide ferroelectrics due to their compatibility with mainstream semiconductor processing. ScAlN, in particular, has exhibited remarkable piezoelectric coupling strength ($K^2$) comparable to that of lithium niobate (LN), making it a valuable choice for RF filters in wireless communications. Recently, ScAlN has sparked interest in its use for nanophotonic devices, chiefly due to its large bandgap facilitating operation in blue wavelengths coupled with promises of enhanced nonlinear optical properties such as a large second-order susceptibility ($χ^{(2)}$). It is still an open question whether ScAlN can outperform oxide ferroelectrics concerning the Pockels effect, an electro-optic coupling extensively utilized in optical communications devices. In this paper, we present a comprehensive theoretical analysis and experimental demonstration of ScAlN's Pockels effect. Our findings reveal that the electro-optic coupling of ScAlN, despite being weak at low Sc concentration, may be significantly enhanced at high levels of Sc doping, which points the way for continued research efforts to unlock the full potential of ScAlN.

Submitted 13 May, 2024; originally announced May 2024.

arXiv:2405.07741 [pdf, other]

Search for the radiative transition $χ_{c1}(3872)\toγψ_2(3823)$

Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, O. Afedulidis, X. C. Ai, R. Aliberti, A. Amoroso, M. R. An, Q. An, Y. Bai, O. Bakina, I. Balossino, Y. Ban, H. -R. Bao, V. Batozskaya, K. Begzsuren, N. Berger, M. Berlowski, M. Bertani, D. Bettoni, F. Bianchi, E. Bianco, A. Bortone, I. Boyko , et al. (635 additional authors not shown)

Abstract: Using 9.0 $\rm fb^{-1}$ of $e^+e^-$ collision data collected at center-of-mass energies from 4.178 to 4.278 GeV with the BESIII detector at the BEPCII collider, we perform the first search for the radiative transition $χ_{c1}(3872)\toγψ_2(3823)$. No $χ_{c1}(3872)\toγψ_2(3823)$ signal is observed. The upper limit on the ratio of branching fractions $\mathcal{B}(χ_{c1}(3872)\toγψ_2(3823), ψ_2(3823)\toγχ_{c1})/\mathcal{B}(χ_{c1}(3872)\toπ^+π^- J/ψ)$ is set as 0.075 at the 90% confidence level. Our result contradicts theoretical predictions under the assumption that the $χ_{c1}(3872)$ is the pure charmonium state $χ_{c1}(2P)$.

Submitted 13 May, 2024; originally announced May 2024.

Comments: 8 pages, 2 figures

arXiv:2405.07691 [pdf, other]

Discovery of Very-high-energy Gamma-ray Emissions from the Low Luminosity AGN NGC 4278 by LHAASO

Authors: Zhen Cao, F. Aharonian, Q. An, Axikegu, Y. X. Bai, Y. W. Bao, D. Bastieri, X. J. Bi, Y. J. Bi, J. T. Cai, Q. Cao, W. Y. Cao, Zhe Cao, J. Chang, J. F. Chang, A. M. Chen, E. S. Chen, Liang Chen, Lin Chen, Long Chen, M. J. Chen, M. L. Chen, Q. H. Chen, S. H. Chen, S. Z. Chen , et al. (255 additional authors not shown)

Abstract: The first source catalog of the Large High Altitude Air Shower Observatory reported the detection of a very-high-energy gamma-ray source, 1LHAASO J1219+2915. In this paper, a further detailed study of the spectral and temporal behavior of this point-like source has been carried out. The best-fit position of the TeV source ($\rm{RA}=185.05^{\circ}\pm0.04^{\circ}$, $\rm{Dec}=29.25^{\circ}\pm0.03^{\circ}$) is compatible with NGC 4278 within $\sim0.03$ degree. Variation analysis shows an indication of variability at the level of a few months in the TeV band, which is consistent with low-frequency observations. Based on these observations, we report the detection of TeV $γ$-ray emissions from this low-luminosity AGN NGC 4278. The observations by LHAASO-WCDA during the active period have a significance level of 8.8 $σ$ with best-fit photon spectral index $\varGamma=2.56\pm0.14$ and a flux $f_{1-10\,\rm{TeV}}=(7.0\pm1.1_{\rm{sta}}\pm0.35_{\rm{syst}})\times10^{-13}\,\rm{photons\,cm^{-2}\,s^{-1}}$, or approximately $5\%$ of the Crab Nebula. The discovery of VHE emission from NGC 4278 indicates that the compact, weak radio jet can efficiently accelerate particles and emit TeV photons.

Submitted 13 May, 2024; originally announced May 2024.

Comments: 11 pages, 5 figures

arXiv:2405.07386 [pdf, other]

Search for lepton-flavor-violating $τ^- \to μ^-μ^+μ^-$ decays at Belle II

Authors: Belle II Collaboration, I. Adachi, L. Aggarwal, H. Aihara, N. Akopov, A. Aloisio, N. Althubiti, N. Anh Ky, D. M. Asner, H. Atmacan, V. Aushev, M. Aversano, R. Ayad, V. Babu, H. Bae, S. Bahinipati, P. Bambade, Sw. Banerjee, S. Bansal, M. Barrett, J. Baudot, A. Baur, A. Beaubien, F. Becherer, J. Becker , et al. (407 additional authors not shown)

Abstract: We present the result of a search for the charged-lepton-flavor violating decay $τ^- \to μ^-μ^+μ^-$ using a 424 fb$^{-1}$ sample of data recorded by the Belle II experiment at the SuperKEKB $e^{-}e^{+}$ collider. The selection of $e^{-}e^{+}\toτ^+τ^-$ events is based on an inclusive reconstruction of the non-signal tau decay, and on a boosted decision tree to suppress background. We observe one signal candidate, which is compatible with the expectation from background processes. We set a $90\%$ confidence level upper limit of $1.9 \times 10^{-8}$ on the branching fraction of the $τ^- \to μ^-μ^+μ^-$ decay, which is the most stringent bound to date.

Submitted 12 May, 2024; originally announced May 2024.

Report number: Belle II Preprint 2024-012, KEK Preprint 2024-6

arXiv:2405.07283 [pdf, other]

BeautyMap: Binary-Encoded Adaptable Ground Matrix for Dynamic Points Removal in Global Maps

Authors: Mingkai Jia, Qingwen Zhang, Bowen Yang, ** Wu, Ming Liu, Patric Jensfelt

Abstract: Global point clouds that correctly represent the static environment features can facilitate accurate localization and robust path planning. However, dynamic objects introduce undesired ghost tracks that are mixed up with the static environment. Existing dynamic removal methods normally fail to balance computational efficiency and accuracy. In response, we present BeautyMap to efficiently remove the dynamic points while retaining static features for high-fidelity global maps. Our approach utilizes a binary-encoded matrix to efficiently extract the environment features. With a bit-wise comparison between matrices of each frame and the corresponding map region, we can extract potential dynamic regions. Then we use coarse-to-fine hierarchical segmentation of the $z$-axis to handle terrain variations. The final static restoration module accounts for the range-visibility of each single scan and protects static points out of sight. Comparative experiments underscore BeautyMap's superior performance in both accuracy and efficiency against other dynamic points removal methods. The code is open-sourced at https://github.com/MKJia/BeautyMap.

Submitted 12 May, 2024; originally announced May 2024.

Comments: The first two authors are co-first authors. 8 pages, accepted by RA-L
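
The binary encoding and bit-wise comparison mentioned in the BeautyMap abstract can be illustrated with a small sketch. It is not taken from the linked repository: the cell layout, the function names, and the rule "occupied in the map but empty in the current scan" are assumptions chosen to show the general idea, and the ground handling and visibility checks described in the abstract are left out.

```python
def encode_column(z_values, z_min, bin_size, n_bins):
    """Encode the occupied height bins of one (x, y) cell as bits of an int.
    Bit k is set when at least one point falls into the k-th z bin."""
    bits = 0
    for z in z_values:
        idx = int((z - z_min) // bin_size)
        if 0 <= idx < n_bins:
            bits |= 1 << idx
    return bits


def potential_dynamic_bits(map_bits, scan_bits):
    """Assumed criterion: bins occupied in the map but not in the scan."""
    return map_bits & ~scan_bits


def compare_matrices(map_matrix, scan_matrix):
    """Apply the bit-wise comparison cell by cell over two equally sized
    2D lists of encoded columns."""
    return [
        [potential_dynamic_bits(m, s) for m, s in zip(map_row, scan_row)]
        for map_row, scan_row in zip(map_matrix, scan_matrix)
    ]
```

Because each cell is a single integer, the comparison reduces to one AND and one NOT per cell, which gives some intuition for why such an encoding can be computationally cheap.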

arXiv:2405.06782 [pdf, other]

GraphRelate3D: Context-Dependent 3D Object Detection with Inter-Object Relationship Graphs

Authors: Mingyu Liu, Ekim Yurtsever, Marc Brede, Jun Meng, Walter Zimmer, Xingcheng Zhou, Bare Luka Zagar, Yuning Cui, Alois Knoll

Abstract: Accurate and effective 3D object detection is critical for ensuring the driving safety of autonomous vehicles. Recently, state-of-the-art two-stage 3D object detectors have exhibited promising performance. However, these methods refine proposals individually, ignoring the rich contextual information in the object relationships between the neighbor proposals. In this study, we introduce an object relation module, consisting of a graph generator and a graph neural network (GNN), to learn the spatial information from certain patterns to improve 3D object detection. Specifically, we create an inter-object relationship graph based on proposals in a frame via the graph generator to connect each proposal with its neighbor proposals. Afterward, the GNN module extracts edge features from the generated graph and iteratively refines proposal features with the captured edge features. Ultimately, we leverage the refined features as input to the detection head to obtain detection results. Our approach improves upon the baseline PV-RCNN on the KITTI validation set for the car class across easy, moderate, and hard difficulty levels by 0.82%, 0.74%, and 0.58%, respectively. Additionally, our method outperforms the baseline by more than 1% under the moderate and hard levels BEV AP on the test server.

Submitted 10 May, 2024; originally announced May 2024.
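
A graph generator of the kind the GraphRelate3D abstract describes, one that connects each proposal with its neighbor proposals, might look like the sketch below. The abstract does not say how neighbors are selected, so the radius criterion on proposal centers and the name `build_relation_graph` are assumptions for illustration only.

```python
import math


def build_relation_graph(centers, radius):
    """Return directed edges (i, j) between proposals whose 3D centers lie
    within `radius` of each other. The radius rule is an assumption."""
    edges = []
    for i, ci in enumerate(centers):
        for j, cj in enumerate(centers):
            if i != j and math.dist(ci, cj) <= radius:
                edges.append((i, j))
    return edges
```

A GNN would then compute edge features along these pairs and use them to refine the per-proposal features before the detection head, as outlined in the abstract.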

arXiv:2405.06393 [pdf, other]

Measurement of the ${e}^{+}{e}^{-}\to p \bar{p}π^{0}$ cross section at $\sqrt{s}=2.1000-3.0800$ GeV

Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, O. Afedulidis, X. C. Ai, R. Aliberti, A. Amoroso, Q. An, Y. Bai, O. Bakina, I. Balossino, Y. Ban, H. -R. Bao, V. Batozskaya, K. Begzsuren, N. Berger, M. Berlowski, M. Bertani, D. Bettoni, F. Bianchi, E. Bianco, A. Bortone, I. Boyko, R. A. Briere , et al. (639 additional authors not shown)

Abstract: The process $e^{+}e^{-}\to p\bar{p}π^{0}$ is studied at 20 center-of-mass energies ranging from 2.1000 to 3.0800 GeV using 636.8 pb$^{-1}$ of data collected with the BESIII detector operating at the BEPCII collider. The Born cross sections for $e^{+}e^{-}\to p\bar{p}π^{0}$ are measured with high precision. Since the lowest center-of-mass energy, 2.1000 GeV, is less than 90 MeV above the $p\bar{p}π^0$ energy threshold, we can probe the threshold behavior for this reaction. However, no anomalous threshold enhancement is found in the cross sections for $e^{+}e^{-}\to p\bar{p}π^{0}$.

Submitted 10 May, 2024; originally announced May 2024.
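
The statement that 2.1000 GeV lies less than 90 MeV above the $p\bar{p}π^0$ threshold can be checked with a short calculation. The particle masses below are rounded PDG values supplied for this example; they are not quoted in the abstract.

```python
# Rounded PDG masses in MeV (assumption: supplied here, not from the abstract).
M_PROTON_MEV = 938.272
M_PI0_MEV = 134.977


def threshold_mev():
    """Production threshold for p + pbar + pi0: the sum of the rest masses."""
    return 2 * M_PROTON_MEV + M_PI0_MEV


def margin_mev(sqrt_s_gev):
    """Energy above threshold, in MeV, for a center-of-mass energy in GeV."""
    return sqrt_s_gev * 1000.0 - threshold_mev()
```

With these masses the threshold is about 2011.5 MeV, so `margin_mev(2.1000)` comes out near 88.5 MeV, consistent with the "less than 90 MeV" quoted in the abstract.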

arXiv:2405.06329 [pdf]

ChatGPTest: opportunities and cautionary tales of utilizing AI for questionnaire pretesting

Authors: Francisco Olivos, Minhui Liu

Abstract: The rapid advancements in generative artificial intelligence have opened up new avenues for enhancing various aspects of research, including the design and evaluation of survey questionnaires. However, the recent pioneering applications have not considered questionnaire pretesting. This article explores the use of GPT models as a useful tool for pretesting survey questionnaires, particularly in the early stages of survey design. Illustrated with two applications, the article suggests incorporating GPT feedback as an additional stage before human pretesting, potentially reducing successive iterations. The article also emphasizes the indispensable role of researchers' judgment in interpreting and implementing AI-generated feedback.

Submitted 10 May, 2024; originally announced May 2024.

Comments: 11 pages, 2 Figures

arXiv:2405.06178 [pdf, other]

ACTION: Augmentation and Computation Toolbox for Brain Network Analysis with Functional MRI

Authors: Yuqi Fang, Junhao Zhang, Linmin Wang, Qianqian Wang, Mingxia Liu

Abstract: Functional magnetic resonance imaging (fMRI) has been increasingly employed to investigate functional brain activity. Many fMRI-related software/toolboxes have been developed, providing specialized algorithms for fMRI analysis. However, existing toolboxes seldom consider fMRI data augmentation, which is quite useful, especially in studies with limited or imbalanced data. Moreover, current studies usually focus on analyzing fMRI using conventional machine learning models that rely on human-engineered fMRI features, without investigating deep learning models that can automatically learn data-driven fMRI representations. In this work, we develop an open-source toolbox, called Augmentation and Computation Toolbox for braIn netwOrk aNalysis (ACTION), offering comprehensive functions to streamline fMRI analysis. ACTION is a Python-based, cross-platform toolbox with user-friendly graphical interfaces. It enables automatic fMRI augmentation, covering blood-oxygen-level-dependent (BOLD) signal augmentation and brain network augmentation. Many popular methods for brain network construction and network feature extraction are included. In particular, it supports constructing deep learning models, which leverage large-scale auxiliary unlabeled data (3,800+ resting-state fMRI scans) for model pretraining to enhance model performance for downstream tasks. To facilitate multi-site fMRI studies, it is also equipped with several popular federated learning strategies. Furthermore, it enables users to design and test custom algorithms through scripting, greatly improving its utility and extensibility. We demonstrate the effectiveness and user-friendliness of ACTION on real fMRI data and present the experimental results. The software, along with its source code and manual, can be accessed online.

Submitted 9 May, 2024; originally announced May 2024.

Comments: 14 pages, 5 figures, 5 tables

arXiv:2405.05584 [pdf, other]

A Survey on Backbones for Deep Video Action Recognition

Authors: Zixuan Tang, Youjun Zhao, Yuhang Wen, Mengyuan Liu

Abstract: Action recognition is a key technology in building interactive metaverses. With the rapid development of deep learning, methods in action recognition have also achieved great advancement. Researchers design and implement backbones from multiple standpoints, which leads to a diversity of methods and raises new challenges. This paper reviews several action recognition methods based on deep neural networks. We introduce these methods in three parts: 1) Two-Streams networks and their variants, which, specifically in this paper, use RGB video frame and optical flow modality as input; 2) 3D convolutional networks, which aim to exploit the RGB modality directly, so that separately extracting motion information is no longer necessary; 3) Transformer-based methods, which introduce the model from natural language processing into computer vision and video understanding. We offer objective insights in this review and hope to provide a reference for future research.

Submitted 9 May, 2024; originally announced May 2024.

Comments: This paper has been accepted by ICME workshop

arXiv:2405.05554 [pdf, other]

RELICS: a REactor neutrino LIquid xenon Coherent elastic Scattering experiment

Authors: Chang Cai, Guocai Chen, Jiangyu Chen, Rundong Fang, Fei Gao, Xiaoran Guo, Jiheng Guo, Tingyi He, Chengjie Jia, Gaojun **, Yipin **g, Gaojun Ju, Yang Lei, Jiayi Li, Kaihang Li, Meng Li, Minhua Li, Shengchao Li, Siyin Li, Tao Li, Qing Lin, Jiajun Liu, Minghao Liu, Sheng Lv, Guang Luo , et al. (24 additional authors not shown)

Abstract: Coherent elastic neutrino-nucleus scattering (CEvNS) provides a unique probe of neutrino properties and of physics Beyond the Standard Model (BSM). REactor neutrino LIquid xenon Coherent Scattering experiment (RELICS), a proposed reactor neutrino program using liquid xenon time projection chamber (LXeTPC) technology, aims to investigate the CEvNS process of antineutrinos off xenon atomic nuclei. In this work, the design of the experiment is studied and optimized based on Monte Carlo (MC) simulations. To achieve a sufficiently low energy threshold for CEvNS detection, an ionization-only analysis channel is adopted for RELICS. A high emission rate of delayed electrons after a large ionization signal is the major background, leading to an analysis threshold of 120 photo-electrons (PE) in the CEvNS search. The second largest background, nuclear recoils induced by cosmic-ray neutrons, is suppressed via a passive water shield. The physics potential of RELICS is explored with a 32 kg-yr exposure at a baseline of 25 m from a reactor core with a 3 GW thermal power. In an energy range of 120 to 240 PE, we expect 4902.4 CEvNS and 1318.4 background events. The sensitivity of RELICS to the weak mixing angle is investigated at a low momentum transfer. Our study shows that RELICS can further improve the constraints on the non-standard neutrino interaction (NSI) compared to the current best results.

Submitted 12 June, 2024; v1 submitted 9 May, 2024; originally announced May 2024.

arXiv:2405.05523 [pdf, other]

Prompt When the Animal is: Temporal Animal Behavior Grounding with Positional Recovery Training

Authors: Sheng Yan, Xin Du, Zongying Li, Yi Wang, Hongcang **, Mengyuan Liu

Abstract: Temporal grounding is crucial in multimodal learning, but it poses challenges when applied to animal behavior data due to the sparsity and uniform distribution of moments. To address these challenges, we propose a novel Positional Recovery Training framework (Port), which prompts the model with the start and end times of specific animal behaviors during training. Specifically, Port enhances the baseline model with a Recovering part to predict flipped label sequences and align distributions with a Dual-alignment method. This allows the model to focus on specific temporal regions prompted by ground-truth information. Extensive experiments on the Animal Kingdom dataset demonstrate the effectiveness of Port, achieving an [email protected] of 38.52. It emerges as one of the top performers in the sub-track of MMVRAC in ICME 2024 Grand Challenges.

Submitted 8 May, 2024; originally announced May 2024.

Comments: Accepted by ICMEW 2024. arXiv admin note: text overlap with arXiv:2404.13657

arXiv:2405.04341 [pdf, other]

Investigations on the weak decays of $D\bar{B}$ molecules

Authors: Ming-Zhu Liu, Li-Sheng Geng

Abstract: The decays of exotic states discovered experimentally always proceed via the strong and electromagnetic interactions. Recently, a tetraquark state with the quark content $bc\bar{q}\bar{q}$ was predicted by Lattice QCD simulations. It is below the mass threshold of $D\bar{B}$, which can only decay via the weak interaction. In this work, based on the decay mechanism of $T_{cc}$ as a $DD^*$ molecule, we propose that the decays of the $bc\bar{q}\bar{q}$ tetraquark state as a $D\bar{B}$ molecule proceed via the Cabibbo-favored weak decays of the $\bar{B}$ or $D$ meson, accompanied by the tree-level decay modes and the triangle decay modes. Our results indicate that the branching fraction of the $D\bar{B}$ molecule decaying into $π^+ K^{-} \bar{B}^0$ is sizable, which is a good channel to observe the $D\bar{B}$ molecule in future experiments.

Submitted 7 May, 2024; originally announced May 2024.

arXiv:2405.03806 [pdf, other]

In Situ AI Prototyping: Infusing Multimodal Prompts into Mobile Settings with MobileMaker

Authors: Savvas Petridis, Michael Xieyang Liu, Alexander J. Fiannaca, Vivian Tsai, Michael Terry, Carrie J. Cai

Abstract: Recent advances in multimodal large language models (LLMs) have lowered the barriers to rapidly prototyping AI-powered features via prompting, especially for mobile-intended use cases. Despite the value of situated user feedback, the process of soliciting early, mobile-situated user feedback on AI prototypes remains challenging. The broad scope and flexibility of LLMs mean that, for a given use-case-specific prototype, there is a crucial need to understand the wide range of in-the-wild input likely to be provided by the user, as well as their in-context expectations of the AI's behavior. To explore the concept of in situ AI prototyping and testing, we created MobileMaker: an AI prototyping tool that enables designers to rapidly create mobile AI prototypes that can be tested on-device, and enables testers to make on-device, in-the-field revisions of the prototype through natural language. In an exploratory study with 16 users, we explored how user feedback on prototypes created with MobileMaker compares to that of existing prototyping tools (e.g., Figma, prompt editors). We found that MobileMaker prototypes enabled more serendipitous discovery of: model input edge cases, discrepancies between AI's and user's in-context interpretation of the task, and contextual signals missed by the AI. Furthermore, we learned that while the ability to make in-the-wild revisions led users to feel more fulfilled as active participants in the design process, it might also constrain their feedback to the subset of changes perceived as more actionable or implementable by the prototyping tool.

Submitted 6 May, 2024; originally announced May 2024.

arXiv:2405.02881 [pdf, other]

FedConPE: Efficient Federated Conversational Bandits with Heterogeneous Clients

Authors: Zhuohua Li, Maoli Liu, John C. S. Lui

Abstract: Conversational recommender systems have emerged as a potent solution for efficiently eliciting user preferences. These systems interactively present queries associated with "key terms" to users and leverage user feedback to estimate user preferences more efficiently. Nonetheless, most existing algorithms adopt a centralized approach. In this paper, we introduce FedConPE, a phase elimination-based federated conversational bandit algorithm, where $M$ agents collaboratively solve a global contextual linear bandit problem with the help of a central server while ensuring secure data management. To effectively coordinate all the clients and aggregate their collected data, FedConPE uses an adaptive approach to construct key terms that minimize uncertainty across all dimensions in the feature space. Furthermore, compared with existing federated linear bandit algorithms, FedConPE offers improved computational and communication efficiency as well as enhanced privacy protections. Our theoretical analysis shows that FedConPE is minimax near-optimal in terms of cumulative regret. We also establish upper bounds for communication costs and conversation frequency. Comprehensive evaluations demonstrate that FedConPE outperforms existing conversational bandit algorithms while using fewer conversations.

Submitted 20 June, 2024; v1 submitted 5 May, 2024; originally announced May 2024.

Comments: Accepted to the 33rd International Joint Conference on Artificial Intelligence (IJCAI), 2024

arXiv:2405.02504 [pdf, other]

Functional Imaging Constrained Diffusion for Brain PET Synthesis from Structural MRI

Authors: Minhui Yu, Mengqi Wu, Ling Yue, Andrea Bozoki, Mingxia Liu

Abstract: Magnetic resonance imaging (MRI) and positron emission tomography (PET) are increasingly used in multimodal analysis of neurodegenerative disorders. While MRI is broadly utilized in clinical settings, PET is less accessible. Many studies have attempted to use deep generative models to synthesize PET from MRI scans. However, they often suffer from unstable training and inadequately preserve brain functional information conveyed by PET. To this end, we propose a functional imaging constrained diffusion (FICD) framework for 3D brain PET image synthesis with paired structural MRI as input condition, through a new constrained diffusion model (CDM). The FICD introduces noise to PET and then progressively removes it with CDM, ensuring high output fidelity throughout a stable training phase. The CDM learns to predict denoised PET with a functional imaging constraint introduced to ensure voxel-wise alignment between each denoised PET and its ground truth. Quantitative and qualitative analyses conducted on 293 subjects with paired T1-weighted MRI and 18F-fluorodeoxyglucose (FDG)-PET scans suggest that FICD achieves superior performance in generating FDG-PET data compared to state-of-the-art methods. We further validate the effectiveness of the proposed FICD on data from a total of 1,262 subjects through three downstream tasks, with experimental results suggesting its utility and generalizability.

Submitted 8 May, 2024; v1 submitted 3 May, 2024; originally announced May 2024.

arXiv:2405.01750 [pdf, other]

PointCompress3D -- A Point Cloud Compression Framework for Roadside LiDARs in Intelligent Transportation Systems

Authors: Walter Zimmer, Ramandika Pranamulia, Xingcheng Zhou, Mingyu Liu, Alois C. Knoll

Abstract: In the context of Intelligent Transportation Systems (ITS), efficient data compression is crucial for managing large-scale point cloud data acquired by roadside LiDAR sensors. The demand for efficient storage, streaming, and real-time object detection capabilities for point cloud data is substantial. This work introduces PointCompress3D, a novel point cloud compression framework tailored specifically for roadside LiDARs. Our framework addresses the challenges of compressing high-resolution point clouds while maintaining accuracy and compatibility with roadside LiDAR sensors. We adapt, extend, integrate, and evaluate three cutting-edge compression methods using our real-world-based TUMTraf dataset family. We achieve a frame rate of 10 FPS while keeping compression sizes below 105 Kb, a reduction of 50 times, and maintaining object detection performance on par with the original data. In extensive experiments and ablation studies, we finally achieved a PSNR d2 of 94.46 and a BPP of 6.54 on our dataset. Future work includes the deployment on the live system. The code is available on our project website: https://pointcompress3d.github.io.

Submitted 2 May, 2024; originally announced May 2024.

arXiv:2405.01461 [pdf, other]

SATO: Stable Text-to-Motion Framework

Authors: Wenshuo Chen, Hongru Xiao, Erhang Zhang, Lijie Hu, Lei Wang, Mengyuan Liu, Chen Chen

Abstract: Is the Text to Motion model robust? Recent advancements in Text to Motion models primarily stem from more accurate predictions of specific actions. However, the text modality typically relies solely on pre-trained Contrastive Language-Image Pretraining (CLIP) models. Our research has uncovered a significant issue with the text-to-motion model: its predictions often exhibit inconsistent outputs, resulting in vastly different or even incorrect poses when presented with semantically similar or identical text inputs. In this paper, we undertake an analysis to elucidate the underlying causes of this instability, establishing a clear link between the unpredictability of model outputs and the erratic attention patterns of the text encoder module. Consequently, we introduce a formal framework aimed at addressing this issue, which we term the Stable Text-to-Motion Framework (SATO). SATO consists of three modules, each dedicated to stable attention, stable prediction, and maintaining a balance between accuracy and robustness trade-off. We present a methodology for constructing an SATO that satisfies the stability of attention and prediction. To verify the stability of the model, we introduced a new textual synonym perturbation dataset based on HumanML3D and KIT-ML. Results show that SATO is significantly more stable against synonyms and other slight perturbations while keeping its high accuracy performance.

Submitted 3 May, 2024; v1 submitted 2 May, 2024; originally announced May 2024.

arXiv:2405.00961 [pdf, ps, other]

Right-handed neutrino dark matter in $U(1)_X$SSM

Authors: Ming-Yue Liu, Shu-Min Zhao, Song Gao, Long Ruan, Tai-Fu Feng

Abstract: There is strong evidence for the existence of dark matter in a number of current experiments. We study dark matter using the $U(1)_X$SSM obtained from the $U(1)_X$ extension of the minimal supersymmetric standard model (MSSM). In the $U(1)_X$SSM, we use the right-handed neutrino as a dark matter candidate, whose lightest mass eigenstate has cold dark matter features. In this paper, the relic density of right-handed neutrino as dark matter is investigated. For dark matter scattering, both spin-independent and spin-dependent cross sections are studied. In the final numerical results obtained, some parameter spaces can satisfy the constraints of the relic density and dark matter direct detection experiments.

Submitted 1 May, 2024; originally announced May 2024.

arXiv:2405.00704 [pdf, ps, other]

A Survey on the Real Power of ChatGPT

Authors: Ming Liu, Ran Liu, Ye Zhu, Hua Wang, Youyang Qu, Rongsheng Li, Yongpan Sheng, Wray Buntine

Abstract: ChatGPT has changed the AI community and an active research line is the performance evaluation of ChatGPT. A key challenge for the evaluation is that ChatGPT is still closed-source and traditional benchmark datasets may have been used by ChatGPT as the training data. In this paper, (i) we survey recent studies which uncover the real performance levels of ChatGPT in seven categories of NLP tasks, (ii) review the social implications and safety issues of ChatGPT, and (iii) emphasize key challenges and opportunities for its evaluation. We hope our survey can shed some light on its blackbox manner, so that researchers are not misled by its surface generation.

Submitted 9 May, 2024; v1 submitted 22 April, 2024; originally announced May 2024.

Comments: 18 pages, 2 tables

arXiv:2405.00700 [pdf]

Oxygen vacancies modulated VO2 for neurons and Spiking Neural Network construction

Authors: Liang Li, Ting Zhou, Tong Liu, Zhiwei Liu, Ya** Li, Shuo Wu, Shanguang Zhao, **glin Zhu, Meiling Liu, Zhihan Lin, Bowen Sun, Jianjun Li, Fangwen Sun, Chongwen Zou

Abstract: Artificial neuronal devices are the basic building blocks for neuromorphic computing systems, which have been motivated by realistic brain emulation. Aiming for these applications, various device concepts have been proposed to mimic the neuronal dynamics and functions. Until now, however, artificial neuron devices with high efficiency, high stability and low power consumption have remained far from practical application. Due to the special insulator-metal phase transition, Vanadium Dioxide (VO2) has been considered as an ideal candidate for neuronal device fabrication. However, its intrinsic insulating state requires the VO2 neuronal device to be driven under large bias voltage, resulting in high power consumption and low frequency. Thus in the current study, we have addressed this challenge by preparing oxygen vacancies modulated VO2 film (VO2-x) and fabricating the VO2-x neuronal devices for Spiking Neural Networks (SNNs) construction. Results indicate the neuron devices can be operated under lower voltage with improved processing speed. The proposed VO2-x based back-propagation SNNs (BP-SNNs) system, trained with the MNIST dataset, demonstrates excellent accuracy in image recognition. Our study not only demonstrates the VO2-x based neurons and SNN system for practical application, but also offers an effective way to optimize the future neuromorphic computing systems by defect engineering strategy.

Submitted 16 April, 2024; originally announced May 2024.

Comments: 18 pages, 4 figures

arXiv:2405.00254 [pdf, other]

RLHF from Heterogeneous Feedback via Personalization and Preference Aggregation

Authors: Chanwoo Park, Mingyang Liu, Dingwen Kong, Kaiqing Zhang, Asuman Ozdaglar

Abstract: Reinforcement learning from human feedback (RLHF) has been an effective technique for aligning AI systems with human values, with remarkable successes in fine-tuning large-language models recently. Most existing RLHF paradigms make the underlying assumption that human preferences are relatively homogeneous, and can be encoded by a single reward model. In this paper, we focus on addressing the issues due to the inherent heterogeneity in human preferences, as well as their potential strategic behavior in providing feedback. Specifically, we propose two frameworks to address heterogeneous human feedback in principled ways: a personalization-based one and an aggregation-based one. For the former, we propose two approaches based on representation learning and clustering, respectively, for learning multiple reward models that trade off the bias (due to preference heterogeneity) and variance (due to the use of fewer data for learning each model by personalization). We then establish sample complexity guarantees for both approaches. For the latter, we aim to adhere to the single-model framework, as already deployed in the current RLHF paradigm, by carefully aggregating diverse and truthful preferences from humans. We propose two approaches based on reward and preference aggregation, respectively: the former utilizes both utilitarianism and Leximin approaches to aggregate individual reward models, with sample complexity guarantees; the latter directly aggregates the human feedback in the form of probabilistic opinions. Under the probabilistic-opinion-feedback model, we also develop an approach to handle strategic human labelers who may bias and manipulate the aggregated preferences with untruthful feedback. Based on the ideas in mechanism design, our approach ensures truthful preference reporting, with the induced aggregation rule maximizing social welfare functions.

Submitted 27 May, 2024; v1 submitted 30 April, 2024; originally announced May 2024.

Comments: Added experiments
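
Illustration (not from the paper): the abstract above names utilitarian and Leximin aggregation of individual reward models. The following is a minimal sketch of how those two textbook rules can rank the same candidates differently. The candidate names and reward values are hypothetical, and the functions are generic social-choice definitions, not the authors' algorithm.

```python
from typing import Dict, List, Sequence, Tuple


def utilitarian(rewards: Sequence[float]) -> float:
    """Utilitarian welfare: the sum of the individual rewards."""
    return sum(rewards)


def leximin_key(rewards: Sequence[float]) -> Tuple[float, ...]:
    """Leximin compares reward vectors sorted worst-off first, lexicographically."""
    return tuple(sorted(rewards))


# Hypothetical rewards that three individual reward models assign to two responses.
candidates: Dict[str, List[float]] = {
    "response_a": [0.9, 0.8, 0.1],  # high total, but one annotator is poorly served
    "response_b": [0.5, 0.5, 0.5],  # lower total, but no annotator is left behind
}

best_utilitarian = max(candidates, key=lambda c: utilitarian(candidates[c]))
best_leximin = max(candidates, key=lambda c: leximin_key(candidates[c]))

print("utilitarian choice:", best_utilitarian)
print("leximin choice:", best_leximin)
```

With these made-up numbers the utilitarian rule favours the response with the larger total reward, while Leximin favours the one whose worst-off annotator fares best.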

arXiv:2404.19752 [pdf, other]

Visual Fact Checker: Enabling High-Fidelity Detailed Caption Generation

Authors: Yunhao Ge, Xiaohui Zeng, Jacob Samuel Huffman, Tsung-Yi Lin, Ming-Yu Liu, Yin Cui

Abstract: Existing automatic captioning methods for visual content face challenges such as lack of detail, content hallucination, and poor instruction following. In this work, we propose VisualFactChecker (VFC), a flexible training-free pipeline that generates high-fidelity and detailed captions for both 2D images and 3D objects. VFC consists of three steps: 1) proposal, where image-to-text captioning models propose multiple initial captions; 2) verification, where a large language model (LLM) utilizes tools such as object detection and VQA models to fact-check proposed captions; 3) captioning, where an LLM generates the final caption by summarizing caption proposals and the fact check verification results. In this step, VFC can flexibly generate captions in various styles following complex instructions. We conduct comprehensive captioning evaluations using four metrics: 1) CLIP-Score for image-text similarity; 2) CLIP-Image-Score for measuring the image-image similarity between the original and the reconstructed image generated by a text-to-image model using the caption; 3) a human study on Amazon Mechanical Turk; 4) GPT-4V for fine-grained evaluation. Evaluation results show that VFC outperforms state-of-the-art open-sourced captioning methods for 2D images on the COCO dataset and 3D assets on the Objaverse dataset. Our study demonstrates that by combining open-source models into a pipeline, we can attain captioning capability comparable to proprietary models such as GPT-4V, despite being over 10x smaller in model size.

Submitted 30 April, 2024; originally announced April 2024.

Comments: CVPR 2024

arXiv:2404.19615 [pdf, other]

SemiPL: A Semi-supervised Method for Event Sound Source Localization

Authors: Yue Li, Baiqiao Yin, **fu Liu, Jiajun Wen, Jiaying Lin, Mengyuan Liu

Abstract: In recent years, Event Sound Source Localization has been widely applied in various fields. Recent works typically relying on the contrastive learning framework show impressive performance. However, all work is based on large relatively simple datasets. It's also crucial to understand and analyze human behaviors (actions and interactions of people), voices, and sounds in chaotic events in many applications, e.g., crowd management, and emergency response services. In this paper, we apply the existing model to a more complex dataset, explore the influence of parameters on the model, and propose a semi-supervised improvement method SemiPL. With the increase in data quantity and the influence of label quality, self-supervised learning will be an unstoppable trend. The experiment shows that the parameter adjustment will positively affect the existing model. In particular, SSPL achieved an improvement of 12.2% cIoU and 0.56% AUC in Chaotic World compared to the results provided. The code is available at: https://github.com/ly245422/SSPL

Submitted 30 April, 2024; originally announced April 2024.

arXiv:2404.19518 [pdf, other]

MGCBS: An Optimal and Efficient Algorithm for Solving Multi-Goal Multi-Agent Path Finding Problem

Authors: Mingkai Tang, Yuanhang Li, Hongji Liu, Yingbing Chen, Ming Liu, Lujia Wang

Abstract: With the expansion of the scale of robotics applications, the multi-goal multi-agent pathfinding (MG-MAPF) problem began to gain widespread attention. This problem requires each agent to visit pre-assigned multiple goal points at least once without conflict. Some previous methods have been proposed to solve the MG-MAPF problem based on Decoupling the goal Vertex visiting order search and the Single-agent pathfinding (DVS). However, this paper demonstrates that the methods based on DVS cannot always obtain the optimal solution. To obtain the optimal result, we propose the Multi-Goal Conflict-Based Search (MGCBS), which is based on Decoupling the goal Safe interval visiting order search and the Single-agent pathfinding (DSS). Additionally, we present the Time-Interval-Space Forest (TIS Forest) to enhance the efficiency of MGCBS by maintaining the shortest paths from any start point at any start time step to each safe interval at the goal points. The experiment demonstrates that our method can consistently obtain optimal results and execute up to 7 times faster than the state-of-the-art method in our evaluation.

Submitted 30 April, 2024; originally announced April 2024.

Comments: to be published in IJCAI2024

arXiv:2404.19321 [pdf, other]

Observation of strain-rate softening behavior in jammed granular media

Authors: Mingchao Liu, Weining Mao, Yiqiu Zhao, Qin Xu, Yixiang Gan, Yifan Wang, K Jimmy Hsia

Abstract: The strain-rate sensitivity of confined granular materials has been widely explored, with most findings exhibiting rate-strengthening behaviors. This study, however, reveals a distinct rate-softening behavior across a certain strain rate range based on triaxial tests on particle clusters of various materials with different surface properties, particle sizes, shapes, and stiffness. This softening effect is especially pronounced in the case of common rice particles. By examining the behavior of rice particles under different confining pressure and surface conditions, and directly measuring the frictional coefficient across various loading rates, we find that the reduction in surface frictional coefficient with the increasing strain rate predominantly contributes to this rate-softening behavior. This conclusion is validated by results from Finite Element Method (FEM) simulations. Additionally, we identify confining pressure as a critical factor regulating the normal stress between particles, and thereby enhancing frictional behavior. Rheometer tests reveal that the shear modulus exhibits a similar rate-softening trend. This study of rate-softening behavior in granular materials enhances our understanding of the mechanisms during their deformation under confining pressure. It also suggests that local inter-particle tribology significantly impacts overall granular behavior.

Submitted 30 April, 2024; originally announced April 2024.

Comments: 16 pages, 12 figures

arXiv:2404.19232 [pdf, other]

GRAMMAR: Grounded and Modular Methodology for Assessment of Domain-Specific Retrieval-Augmented Language Model

Authors: Xinzhe Li, Ming Liu, Shang Gao

Abstract: Retrieval-augmented Generation (RAG) systems have been actively studied and deployed across various industries to query on domain-specific knowledge base. However, evaluating these systems presents unique challenges due to the scarcity of domain-specific queries and corresponding ground truths, as well as a lack of systematic approaches to diagnosing the cause of failure cases -- whether they stem from knowledge deficits or issues related to system robustness. To address these challenges, we introduce GRAMMAR (GRounded And Modular Methodology for Assessment of RAG), an evaluation framework comprising two key elements: 1) a data generation process that leverages relational databases and LLMs to efficiently produce scalable query-answer pairs. This method facilitates the separation of query logic from linguistic variations for enhanced debugging capabilities; and 2) an evaluation framework that differentiates knowledge gaps from robustness and enables the identification of defective modules. Our empirical results underscore the limitations of current reference-free evaluation approaches and the reliability of GRAMMAR to accurately identify model vulnerabilities.

Submitted 29 May, 2024; v1 submitted 29 April, 2024; originally announced April 2024.

Showing 101–150 of 3,833 results for author: Liu, M