Search | arXiv e-print repository

GLID: Pre-training a Generalist Encoder-Decoder Vision Model

Authors: Jihao Liu, **liang Zheng, Yu Liu, Hongsheng Li

Abstract: This paper proposes a GeneraLIst encoder-Decoder (GLID) pre-training method for better handling various downstream computer vision tasks. While self-supervised pre-training approaches, e.g., Masked Autoencoder, have shown success in transfer learning, task-specific sub-architectures are still required to be appended for different downstream tasks, which cannot enjoy the benefits of large-scale pre… ▽ More This paper proposes a GeneraLIst encoder-Decoder (GLID) pre-training method for better handling various downstream computer vision tasks. While self-supervised pre-training approaches, e.g., Masked Autoencoder, have shown success in transfer learning, task-specific sub-architectures are still required to be appended for different downstream tasks, which cannot enjoy the benefits of large-scale pre-training. GLID overcomes this challenge by allowing the pre-trained generalist encoder-decoder to be fine-tuned on various vision tasks with minimal task-specific architecture modifications. In the GLID training scheme, pre-training pretext task and other downstream tasks are modeled as "query-to-answer" problems, including the pre-training pretext task and other downstream tasks. We pre-train a task-agnostic encoder-decoder with query-mask pairs. During fine-tuning, GLID maintains the pre-trained encoder-decoder and queries, only replacing the topmost linear transformation layer with task-specific linear heads. This minimizes the pretrain-finetune architecture inconsistency and enables the pre-trained model to better adapt to downstream tasks. GLID achieves competitive performance on various vision tasks, including object detection, image segmentation, pose estimation, and depth estimation, outperforming or matching specialist models such as Mask2Former, DETR, ViTPose, and BinsFormer. △ Less

Submitted 11 April, 2024; originally announced April 2024.

Comments: CVPR 2024

arXiv:2404.07436 [pdf, other]

Measurement of $e^{+}e^{-}\to ωη^{\prime}$ cross sections at $\sqrt{s}=$ 2.000 to 3.080 GeV

Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, X. C. Ai, R. Aliberti, A. Amoroso, M. R. An, Q. An, Y. Bai, O. Bakina, I. Balossino, Y. Ban, V. Batozskaya, K. Begzsuren, N. Berger, M. Berlowski, M. Bertani, D. Bettoni, F. Bianchi, E. Bianco, A. Bortone, I. Boyko, R. A. Briere, A. Brueggemann , et al. (599 additional authors not shown)

Abstract: The Born cross sections for the process $e^{+}e^{-}\to ωη^{\prime}$ are measured at 22 center-of-mass energies from 2.000 to 3.080 GeV using data collected with the BESIII detector at the BEPCII collider. A resonant structure is observed with a statistical significance of 9.6$σ$. A Breit-Wigner fit determines its mass to be $M_R=(2153\pm30\pm31)~{\rm{MeV}}/c^{2}$ and its width to be… ▽ More The Born cross sections for the process $e^{+}e^{-}\to ωη^{\prime}$ are measured at 22 center-of-mass energies from 2.000 to 3.080 GeV using data collected with the BESIII detector at the BEPCII collider. A resonant structure is observed with a statistical significance of 9.6$σ$. A Breit-Wigner fit determines its mass to be $M_R=(2153\pm30\pm31)~{\rm{MeV}}/c^{2}$ and its width to be $Γ_{R}=(167\pm77\pm7)~\rm{MeV}$, where the first uncertainties are statistical and the second are systematic. △ Less

Submitted 10 April, 2024; originally announced April 2024.

arXiv:2404.07149 [pdf, other]

Tianyu: search for the second solar system and explore the dynamic universe

Authors: Fabo Feng, Yicheng Rui, Zhimao Du, Qing Lin, Congcong Zhang, Dan Zhou, Kaiming Cui, Masahiro Ogihara, Ming Yang, Jie Lin, Yongzhi Cai, Taozhi Yang, Xiaoying Pang, Mingjie Jian, Wenxiong Li, Hengxiao Guo, Xian Shi, Jianchun Shi, Jianyang Li, Kangrou Guo, Song Yao, Aming Chen, Peng Jia, Xianyu Tan, James S. Jenkins , et al. (10 additional authors not shown)

Abstract: Giant planets like Jupiter and Saturn, play important roles in the formation and habitability of Earth-like planets. The detection of solar system analogs that have multiple cold giant planets is essential for our understanding of planet habitability and planet formation. Although transit surveys such as Kepler and TESS have discovered thousands of exoplanets, these missions are not sensitive to l… ▽ More Giant planets like Jupiter and Saturn, play important roles in the formation and habitability of Earth-like planets. The detection of solar system analogs that have multiple cold giant planets is essential for our understanding of planet habitability and planet formation. Although transit surveys such as Kepler and TESS have discovered thousands of exoplanets, these missions are not sensitive to long period planets due to their limited observation baseline. The Tianyu project, comprising two 1-meter telescopes (Tianyu-I and II), is designed to detect transiting cold giant planets in order to find solar system analogs. Featuring a large field of view and equipped with a high-speed CMOS camera, Tianyu-I will perform a high-precision photometric survey of about 100 million stars, measuring light curves at hour-long cadence. The candidates found by Tianyu-I will be confirmed by Tianyu-II and other surveys and follow-up facilities through multi-band photometry, spectroscopy, and high resolution imaging. Tianyu telescopes will be situated at an elevation about 4000 meters in Lenghu, China. With a photometric precision of 1% for stars with V < 18 mag, Tianyu is expected to find more than 300 transiting exoplanets, including about 12 cold giant planets, over five years. A five-year survey of Tianyu would discover 1-2 solar system analogs. Moreover, Tianyu is also designed for non-exoplanetary exploration, incorporating multiple survey modes covering timescales from sub-seconds to months, with a particular emphasis on events occurring within the sub-second to hour range. It excels in observing areas such as infant supernovae, rare variable stars and binaries, tidal disruption events, Be stars, cometary activities, and interstellar objects. These discoveries not only enhance our comprehension of the universe but also offer compelling opportunities for public engagement in scientific exploration. △ Less

Submitted 10 April, 2024; v1 submitted 10 April, 2024; originally announced April 2024.

Comments: 48 pages, 16 figures, accepted by Acta Astronomica Sinica

arXiv:2404.06809 [pdf, other]

Not All Contexts Are Equal: Teaching LLMs Credibility-aware Generation

Authors: Ruotong Pan, Boxi Cao, Hongyu Lin, Xianpei Han, Jia Zheng, Sirui Wang, Xunliang Cai, Le Sun

Abstract: The rapid development of large language models has led to the widespread adoption of Retrieval-Augmented Generation (RAG), which integrates external knowledge to alleviate knowledge bottlenecks and mitigate hallucinations. However, the existing RAG paradigm inevitably suffers from the impact of flawed information introduced during the retrieval phrase, thereby diminishing the reliability and corre… ▽ More The rapid development of large language models has led to the widespread adoption of Retrieval-Augmented Generation (RAG), which integrates external knowledge to alleviate knowledge bottlenecks and mitigate hallucinations. However, the existing RAG paradigm inevitably suffers from the impact of flawed information introduced during the retrieval phrase, thereby diminishing the reliability and correctness of the generated outcomes. In this paper, we propose Credibility-aware Generation (CAG), a universally applicable framework designed to mitigate the impact of flawed information in RAG. At its core, CAG aims to equip models with the ability to discern and process information based on its credibility. To this end, we propose an innovative data transformation framework that generates data based on credibility, thereby effectively endowing models with the capability of CAG. Furthermore, to accurately evaluate the models' capabilities of CAG, we construct a comprehensive benchmark covering three critical real-world scenarios. Experimental results demonstrate that our model can effectively understand and utilize credibility for generation, significantly outperform other models with retrieval augmentation, and exhibit resilience against the disruption caused by noisy documents, thereby maintaining robust performance. Moreover, our model supports customized credibility, offering a wide range of potential applications. △ Less

Submitted 8 May, 2024; v1 submitted 10 April, 2024; originally announced April 2024.

Comments: Our code, benchmark, and models are available at https://github.com/panruotong/CAG

arXiv:2404.06718 [pdf, other]

Measurement of the Born cross section for $e^{+}e^{-}\to ηh_c $ at center-of-mass energies between 4.1 and 4.6\,GeV

Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, O. Afedulidis, X. C. Ai, R. Aliberti, A. Amoroso, Q. An, Y. Bai, O. Bakina, I. Balossino, Y. Ban, H. -R. Bao, V. Batozskaya, K. Begzsuren, N. Berger, M. Berlowski, M. Bertani, D. Bettoni, F. Bianchi, E. Bianco, A. Bortone, I. Boyko, R. A. Briere , et al. (634 additional authors not shown)

Abstract: We measure the Born cross section for the reaction $e^{+}e^{-} \rightarrow ηh_c$ from $\sqrt{s} = 4.129$ to $4.600$~GeV using data sets collected by the BESIII detector running at the BEPCII collider. A resonant structure in the cross section line shape near 4.200~GeV is observed with a statistical significance of 7$σ$. The parameters of this resonance are measured to be \MeasMass\ and \MeasWidth,… ▽ More We measure the Born cross section for the reaction $e^{+}e^{-} \rightarrow ηh_c$ from $\sqrt{s} = 4.129$ to $4.600$~GeV using data sets collected by the BESIII detector running at the BEPCII collider. A resonant structure in the cross section line shape near 4.200~GeV is observed with a statistical significance of 7$σ$. The parameters of this resonance are measured to be \MeasMass\ and \MeasWidth, where the first uncertainties are statistical and the second systematic. △ Less

Submitted 10 April, 2024; originally announced April 2024.

arXiv:2404.06211 [pdf, other]

Unified Physical-Digital Attack Detection Challenge

Authors: Haocheng Yuan, Ajian Liu, Junze Zheng, Jun Wan, Jiankang Deng, Sergio Escalera, Hugo Jair Escalante, Isabelle Guyon, Zhen Lei

Abstract: Face Anti-Spoofing (FAS) is crucial to safeguard Face Recognition (FR) Systems. In real-world scenarios, FRs are confronted with both physical and digital attacks. However, existing algorithms often address only one type of attack at a time, which poses significant limitations in real-world scenarios where FR systems face hybrid physical-digital threats. To facilitate the research of Unified Attac… ▽ More Face Anti-Spoofing (FAS) is crucial to safeguard Face Recognition (FR) Systems. In real-world scenarios, FRs are confronted with both physical and digital attacks. However, existing algorithms often address only one type of attack at a time, which poses significant limitations in real-world scenarios where FR systems face hybrid physical-digital threats. To facilitate the research of Unified Attack Detection (UAD) algorithms, a large-scale UniAttackData dataset has been collected. UniAttackData is the largest public dataset for Unified Attack Detection, with a total of 28,706 videos, where each unique identity encompasses all advanced attack types. Based on this dataset, we organized a Unified Physical-Digital Face Attack Detection Challenge to boost the research in Unified Attack Detections. It attracted 136 teams for the development phase, with 13 qualifying for the final round. The results re-verified by the organizing team were used for the final ranking. This paper comprehensively reviews the challenge, detailing the dataset introduction, protocol definition, evaluation criteria, and a summary of published results. Finally, we focus on the detailed analysis of the highest-performing algorithms and offer potential directions for unified physical-digital attack detection inspired by this competition. Challenge Website: https://sites.google.com/view/face-anti-spoofing-challenge/welcome/challengecvpr2024. △ Less

Submitted 18 April, 2024; v1 submitted 9 April, 2024; originally announced April 2024.

Comments: 11 pages, 10 figures

arXiv:2404.05973 [pdf, ps, other]

Search for the Rare Decays $D_s^+\to h^+(h^{0})e^+e^-$

Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, O. Afedulidis, X. C. Ai, R. Aliberti, A. Amoroso, Q. An, Y. Bai, O. Bakina, I. Balossino, Y. Ban, H. -R. Bao, V. Batozskaya, K. Begzsuren, N. Berger, M. Berlowski, M. Bertani, D. Bettoni, F. Bianchi, E. Bianco, A. Bortone, I. Boyko, R. A. Briere , et al. (618 additional authors not shown)

Abstract: Using 7.33~fb$^{-1}$ of $e^{+}e^{-}$ collision data collected by the BESIII detector at center-of-mass energies in the range of $\sqrt{s}=4.128 - 4.226$~GeV, we search for the rare decays $D_{s}^+\to h^+(h^{0})e^{+}e^{-}$, where $h$ represents a kaon or pion. By requiring the $e^{+}e^{-}$ invariant mass to be consistent with a $φ(1020)$, $0.98<M(e^{+}e^{-})<1.04$ ~GeV/$c^2$, the decay… ▽ More Using 7.33~fb$^{-1}$ of $e^{+}e^{-}$ collision data collected by the BESIII detector at center-of-mass energies in the range of $\sqrt{s}=4.128 - 4.226$~GeV, we search for the rare decays $D_{s}^+\to h^+(h^{0})e^{+}e^{-}$, where $h$ represents a kaon or pion. By requiring the $e^{+}e^{-}$ invariant mass to be consistent with a $φ(1020)$, $0.98<M(e^{+}e^{-})<1.04$ ~GeV/$c^2$, the decay $D_s^+\toπ^+φ,φ\to e^{+}e^{-}$ is observed with a statistical significance of 7.8$σ$, and evidence for the decay $D_s^+\toρ^+φ,φ\to e^{+}e^{-}$ is found for the first time with a statistical significance of 4.4$σ$. The decay branching fractions are measured to be $\mathcal{B}(D_s^+\toπ^+φ, φ\to e^{+}e^{-} )=(1.17^{+0.23}_{-0.21}\pm0.03)\times 10^{-5}$, and $\mathcal{B}(D_s^+\toρ^+φ, φ\to e^{+}e^{-} )=(2.44^{+0.67}_{-0.62}\pm 0.16)\times 10^{-5}$, where the first uncertainties are statistical and the second systematic. No significant signal for the three four-body decays of $D_{s}^{+}\to π^{+}π^{0}e^{+}e^{-},\ D_{s}^{+}\to K^{+}π^{0}e^{+}e^{-}$, and $D_{s}^{+}\to K_{S}^{0}π^{+}e^{+}e^{-}$ is observed. For $D_{s}^{+}\to π^{+}π^{0}e^{+}e^{-}$, the $φ$ mass region is vetoed to minimize the long-distance effects. The 90$\%$ confidence level upper limits set on the branching fractions of these decays are in the range of $(7.0-8.1)\times 10^{-5}$. △ Less

Submitted 8 April, 2024; originally announced April 2024.

Comments: 10 pages, 2 figures, 1 table

arXiv:2404.05105 [pdf, other]

VMambaMorph: a Multi-Modality Deformable Image Registration Framework based on Visual State Space Model with Cross-Scan Module

Authors: Ziyang Wang, Jian-Qing Zheng, Chao Ma, Tao Guo

Abstract: Image registration, a critical process in medical imaging, involves aligning different sets of medical imaging data into a single unified coordinate system. Deep learning networks, such as the Convolutional Neural Network (CNN)-based VoxelMorph, Vision Transformer (ViT)-based TransMorph, and State Space Model (SSM)-based MambaMorph, have demonstrated effective performance in this domain. The recen… ▽ More Image registration, a critical process in medical imaging, involves aligning different sets of medical imaging data into a single unified coordinate system. Deep learning networks, such as the Convolutional Neural Network (CNN)-based VoxelMorph, Vision Transformer (ViT)-based TransMorph, and State Space Model (SSM)-based MambaMorph, have demonstrated effective performance in this domain. The recent Visual State Space Model (VMamba), which incorporates a cross-scan module with SSM, has exhibited promising improvements in modeling global-range dependencies with efficient computational cost in computer vision tasks. This paper hereby introduces an exploration of VMamba with image registration, named VMambaMorph. This novel hybrid VMamba-CNN network is designed specifically for 3D image registration. Utilizing a U-shaped network architecture, VMambaMorph computes the deformation field based on target and source volumes. The VMamba-based block with 2D cross-scan module is redesigned for 3D volumetric feature processing. To overcome the complex motion and structure on multi-modality images, we further propose a fine-tune recursive registration framework. We validate VMambaMorph using a public benchmark brain MR-CT registration dataset, comparing its performance against current state-of-the-art methods. The results indicate that VMambaMorph achieves competitive registration quality. The code for VMambaMorph with all baseline methods is available on GitHub. △ Less

Submitted 14 April, 2024; v1 submitted 7 April, 2024; originally announced April 2024.

arXiv:2404.04936 [pdf, other]

Bootstrap** Chest CT Image Understanding by Distilling Knowledge from X-ray Expert Models

Authors: Weiwei Cao, Jianpeng Zhang, Yingda Xia, Tony C. W. Mok, Zi Li, Xianghua Ye, Le Lu, Jian Zheng, Yuxing Tang, Ling Zhang

Abstract: Radiologists highly desire fully automated versatile AI for medical imaging interpretation. However, the lack of extensively annotated large-scale multi-disease datasets has hindered the achievement of this goal. In this paper, we explore the feasibility of leveraging language as a naturally high-quality supervision for chest CT imaging. In light of the limited availability of image-report pairs,… ▽ More Radiologists highly desire fully automated versatile AI for medical imaging interpretation. However, the lack of extensively annotated large-scale multi-disease datasets has hindered the achievement of this goal. In this paper, we explore the feasibility of leveraging language as a naturally high-quality supervision for chest CT imaging. In light of the limited availability of image-report pairs, we bootstrap the understanding of 3D chest CT images by distilling chest-related diagnostic knowledge from an extensively pre-trained 2D X-ray expert model. Specifically, we propose a language-guided retrieval method to match each 3D CT image with its semantically closest 2D X-ray image, and perform pair-wise and semantic relation knowledge distillation. Subsequently, we use contrastive learning to align images and reports within the same patient while distinguishing them from the other patients. However, the challenge arises when patients have similar semantic diagnoses, such as healthy patients, potentially confusing if treated as negatives. We introduce a robust contrastive learning that identifies and corrects these false negatives. We train our model with over 12,000 pairs of chest CT images and radiology reports. Extensive experiments across multiple scenarios, including zero-shot learning, report generation, and fine-tuning processes, demonstrate the model's feasibility in interpreting chest CT images. △ Less

Submitted 7 April, 2024; originally announced April 2024.

Comments: Accepted by CVPR 2024

arXiv:2404.04917 [pdf, ps, other]

Search for $η_c(2S)\to 2(π^+π^-)$ and improved measurement of $χ_{cJ}\to 2(π^+π^-)$

Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, O. Afedulidis, X. C. Ai, R. Aliberti, A. Amoroso, Q. An, Y. Bai, O. Bakina, I. Balossino, Y. Ban, H. -R. Bao, V. Batozskaya, K. Begzsuren, N. Berger, M. Berlowski, M. Bertani, D. Bettoni, F. Bianchi, E. Bianco, A. Bortone, I. Boyko, R. A. Briere , et al. (638 additional authors not shown)

Abstract: We search for the hadronic decay $η_c(2S)\to 2(π^+π^-)$ in the $ψ(3686)\toγη_c(2S)$ radiative decay using $(27.12\pm 0.14)\times 10^8$ $ψ(3686)$ events collected by the BESIII detector at the BEPCII collider. No significant signal is found, and the upper limit of $\mathcal{B}[ψ(3686)\toγη_c(2S)]\mathcal{B}[η_c(2S)\to 2(π^+π^-)]$ is determined to be $0.78\times 10^{-6}$ at the 90\% confidence level… ▽ More We search for the hadronic decay $η_c(2S)\to 2(π^+π^-)$ in the $ψ(3686)\toγη_c(2S)$ radiative decay using $(27.12\pm 0.14)\times 10^8$ $ψ(3686)$ events collected by the BESIII detector at the BEPCII collider. No significant signal is found, and the upper limit of $\mathcal{B}[ψ(3686)\toγη_c(2S)]\mathcal{B}[η_c(2S)\to 2(π^+π^-)]$ is determined to be $0.78\times 10^{-6}$ at the 90\% confidence level. Using $ψ(3686)\toγχ_{cJ}$ transitions, we also measure the branching fractions of $\mathcal{B}[χ_{cJ(J=0,1,2)}\to 2(π^+π^-)]$, which are $\mathcal{B}[χ_{c0}\to 2(π^+π^-)]=(2.127\pm 0.002~(\mathrm{stat.})\pm 0.101~(\mathrm{syst.}))$\%, $\mathcal{B}[χ_{c1}\to 2(π^+π^-)]=(0.685\pm 0.001~(\mathrm{stat.})\pm 0.031~\mathrm{syst.}))$\%, and $\mathcal{B}[χ_{c2}\to 2(π^+π^-)]=(1.153\pm 0.001~(\mathrm{stat.})\pm 0.063~(\mathrm{syst.}))$\%. △ Less

Submitted 7 April, 2024; originally announced April 2024.

arXiv:2404.04823 [pdf, other]

3D Building Reconstruction from Monocular Remote Sensing Images with Multi-level Supervisions

Authors: Weijia Li, Haote Yang, Zhenghao Hu, Juepeng Zheng, Gui-Song Xia, Conghui He

Abstract: 3D building reconstruction from monocular remote sensing images is an important and challenging research problem that has received increasing attention in recent years, owing to its low cost of data acquisition and availability for large-scale applications. However, existing methods rely on expensive 3D-annotated samples for fully-supervised training, restricting their application to large-scale c… ▽ More 3D building reconstruction from monocular remote sensing images is an important and challenging research problem that has received increasing attention in recent years, owing to its low cost of data acquisition and availability for large-scale applications. However, existing methods rely on expensive 3D-annotated samples for fully-supervised training, restricting their application to large-scale cross-city scenarios. In this work, we propose MLS-BRN, a multi-level supervised building reconstruction network that can flexibly utilize training samples with different annotation levels to achieve better reconstruction results in an end-to-end manner. To alleviate the demand on full 3D supervision, we design two new modules, Pseudo Building Bbox Calculator and Roof-Offset guided Footprint Extractor, as well as new tasks and training strategies for different types of samples. Experimental results on several public and new datasets demonstrate that our proposed MLS-BRN achieves competitive performance using much fewer 3D-annotated samples, and significantly improves the footprint extraction and 3D reconstruction performance compared with current state-of-the-art. The code and datasets of this work will be released at https://github.com/opendatalab/MLS-BRN.git. △ Less

Submitted 7 April, 2024; originally announced April 2024.

Comments: accepted by CVPR 2024

arXiv:2404.04801 [pdf, ps, other]

doi 10.1007/s41605-024-00467-8

LHAASO-KM2A detector simulation using Geant4

Authors: Zhen Cao, F. Aharonian, Q. An, Axikegu, Y. X. Bai, Y. W. Bao, D. Bastieri, X. J. Bi, Y. J. Bi, J. T. Cai, Q. Cao, W. Y. Cao, Zhe Cao, J. Chang, J. F. Chang, A. M. Chen, E. S. Chen, Liang Chen, Lin Chen, Long Chen, M. J. Chen, M. L. Chen, Q. H. Chen, S. H. Chen, S. Z. Chen , et al. (254 additional authors not shown)

Abstract: KM2A is one of the main sub-arrays of LHAASO, working on gamma ray astronomy and cosmic ray physics at energies above 10 TeV. Detector simulation is the important foundation for estimating detector performance and data analysis. It is a big challenge to simulate the KM2A detector in the framework of Geant4 due to the need to track numerous photons from a large number of detector units (>6000) with… ▽ More KM2A is one of the main sub-arrays of LHAASO, working on gamma ray astronomy and cosmic ray physics at energies above 10 TeV. Detector simulation is the important foundation for estimating detector performance and data analysis. It is a big challenge to simulate the KM2A detector in the framework of Geant4 due to the need to track numerous photons from a large number of detector units (>6000) with large altitude difference (30 m) and huge coverage (1.3 km^2). In this paper, the design of the KM2A simulation code G4KM2A based on Geant4 is introduced. The process of G4KM2A is optimized mainly in memory consumption to avoid memory overffow. Some simpliffcations are used to signiffcantly speed up the execution of G4KM2A. The running time is reduced by at least 30 times compared to full detector simulation. The particle distributions and the core/angle resolution comparison between simulation and experimental data of the full KM2A array are also presented, which show good agreement. △ Less

Submitted 7 April, 2024; originally announced April 2024.

arXiv:2404.04640 [pdf, other]

Search for di-photon decays of an axion-like particle in radiative J/ψdecays

Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, O. Afedulidis, X. C. Ai, R. Aliberti, A. Amoroso, M. R. An, Q. An, Y. Bai, O. Bakina, I. Balossino, Y. Ban, H. -R. Bao, V. Batozskaya, K. Begzsuren, N. Berger, M. Berlowski, M. Bertani, D. Bettoni, F. Bianchi, E. Bianco, A. Bortone, I. Boyko , et al. (604 additional authors not shown)

Abstract: We search for the di-photon decay of a light pseudoscalar axion-like particle, $a$, in radiative $J/ψ$ decays, using 10 billion $J/ψ$ events collected with the BESIII detector. We find no evidence of a signal and set upper limits at the $95\%$ confidence level on the product branching fraction $\mathcal{B}(J/ψ\to γa) \times \mathcal{B}(a \to γγ)$ and the axion-like particle photon coupling constan… ▽ More We search for the di-photon decay of a light pseudoscalar axion-like particle, $a$, in radiative $J/ψ$ decays, using 10 billion $J/ψ$ events collected with the BESIII detector. We find no evidence of a signal and set upper limits at the $95\%$ confidence level on the product branching fraction $\mathcal{B}(J/ψ\to γa) \times \mathcal{B}(a \to γγ)$ and the axion-like particle photon coupling constant $g_{a γγ}$ in the ranges of $(3.7-48.5) \times 10^{-8}$ and $(2.2 -101.8)\times 10^{-4}$ GeV$^{-1}$, respectively, for $0.18 \le m_a \le 2.85$ GeV/$c^2$. These are the most stringent limits to date in this mass region. △ Less

Submitted 3 July, 2024; v1 submitted 6 April, 2024; originally announced April 2024.

Comments: 9 pages, 5 figures, To be published in Phys. Rev. D (Letter)

Report number: BESIII Analysis Memo - 671

arXiv:2404.03248 [pdf, other]

Learning Transferable Negative Prompts for Out-of-Distribution Detection

Authors: Tianqi Li, Guansong Pang, Xiao Bai, Wenjun Miao, ** Zheng

Abstract: Existing prompt learning methods have shown certain capabilities in Out-of-Distribution (OOD) detection, but the lack of OOD images in the target dataset in their training can lead to mismatches between OOD images and In-Distribution (ID) categories, resulting in a high false positive rate. To address this issue, we introduce a novel OOD detection method, named 'NegPrompt', to learn a set of negat… ▽ More Existing prompt learning methods have shown certain capabilities in Out-of-Distribution (OOD) detection, but the lack of OOD images in the target dataset in their training can lead to mismatches between OOD images and In-Distribution (ID) categories, resulting in a high false positive rate. To address this issue, we introduce a novel OOD detection method, named 'NegPrompt', to learn a set of negative prompts, each representing a negative connotation of a given class label, for delineating the boundaries between ID and OOD images. It learns such negative prompts with ID data only, without any reliance on external outlier data. Further, current methods assume the availability of samples of all ID classes, rendering them ineffective in open-vocabulary learning scenarios where the inference stage can contain novel ID classes not present during training. In contrast, our learned negative prompts are transferable to novel class labels. Experiments on various ImageNet benchmarks show that NegPrompt surpasses state-of-the-art prompt-learning-based OOD detection methods and maintains a consistent lead in hard OOD detection in closed- and open-vocabulary classification scenarios. Code is available at https://github.com/mala-lab/negprompt. △ Less

Submitted 4 April, 2024; originally announced April 2024.

Comments: Accepted at CVPR 2024

arXiv:2404.03217 [pdf, other]

Evidence of the $h_c\to K_S^0 K^+π^-+c.c.$ decay

Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, O. Afedulidis, X. C. Ai, R. Aliberti, A. Amoroso, Q. An, Y. Bai, O. Bakina, I. Balossino, Y. Ban, H. -R. Bao, V. Batozskaya, K. Begzsuren, N. Berger, M. Berlowski, M. Bertani, D. Bettoni, F. Bianchi, E. Bianco, A. Bortone, I. Boyko, R. A. Briere , et al. (638 additional authors not shown)

Abstract: Based on $(2.712\pm0.014)\times10^9$ $ψ(3686)$ events collected by the BESIII collaboration, evidence of the hadronic decay $h_c\to K_S^0K^+π^-+c.c.$ is found with a significance of $4.3σ$ in the $ψ(3686)\toπ^0 h_c$ process. The branching fraction of $h_c\to K_S^0 K^+π^- +c.c.$ is measured to be $(7.3\pm0.8\pm1.8)\times10^{-4}$, where the first and second uncertainties are statistical and systemat… ▽ More Based on $(2.712\pm0.014)\times10^9$ $ψ(3686)$ events collected by the BESIII collaboration, evidence of the hadronic decay $h_c\to K_S^0K^+π^-+c.c.$ is found with a significance of $4.3σ$ in the $ψ(3686)\toπ^0 h_c$ process. The branching fraction of $h_c\to K_S^0 K^+π^- +c.c.$ is measured to be $(7.3\pm0.8\pm1.8)\times10^{-4}$, where the first and second uncertainties are statistical and systematic, respectively. Combining with the exclusive decay width of $η_c\to K\bar{K}π$, our result indicates inconsistencies with both pQCD and NRQCD predictions. △ Less

Submitted 4 April, 2024; originally announced April 2024.

arXiv:2404.02880 [pdf, other]

doi 10.1145/3613904.3642608

Fragmented Moments, Balanced Choices: How Do People Make Use of Their Waiting Time?

Authors: Jian Zheng, Ge Gao

Abstract: Everyone spends some time waiting every day. HCI research has developed tools for boosting productivity while waiting. However, little is known about how people naturally spend their waiting time. We conducted an experience sampling study with 21 working adults who used a mobile app to report their daily waiting time activities over two weeks. The aim of this study is to understand the activities… ▽ More Everyone spends some time waiting every day. HCI research has developed tools for boosting productivity while waiting. However, little is known about how people naturally spend their waiting time. We conducted an experience sampling study with 21 working adults who used a mobile app to report their daily waiting time activities over two weeks. The aim of this study is to understand the activities people do while waiting and the effect of situational factors. We found that participants spent about 60% of their waiting time on leisure activities, 20% on productive activities, and 20% on maintenance activities. These choices are sensitive to situational factors, including accessible device, location, and certain routines of the day. Our study complements previous ones by demonstrating that people purpose waiting time for various goals beyond productivity and to maintain work-life balance. Our findings shed light on future empirical research and system design for time management. △ Less

Submitted 3 April, 2024; originally announced April 2024.

Comments: 14 pages. 6 figures. Published at ACM CHI'24

ACM Class: H.5.m

arXiv:2404.02686 [pdf, other]

Design2Cloth: 3D Cloth Generation from 2D Masks

Authors: Jiali Zheng, Rolandos Alexandros Potamias, Stefanos Zafeiriou

Abstract: In recent years, there has been a significant shift in the field of digital avatar research, towards modeling, animating and reconstructing clothed human representations, as a key step towards creating realistic avatars. However, current 3D cloth generation methods are garment specific or trained completely on synthetic data, hence lacking fine details and realism. In this work, we make a step tow… ▽ More In recent years, there has been a significant shift in the field of digital avatar research, towards modeling, animating and reconstructing clothed human representations, as a key step towards creating realistic avatars. However, current 3D cloth generation methods are garment specific or trained completely on synthetic data, hence lacking fine details and realism. In this work, we make a step towards automatic realistic garment design and propose Design2Cloth, a high fidelity 3D generative model trained on a real world dataset from more than 2000 subject scans. To provide vital contribution to the fashion industry, we developed a user-friendly adversarial model capable of generating diverse and detailed clothes simply by drawing a 2D cloth mask. Under a series of both qualitative and quantitative experiments, we showcase that Design2Cloth outperforms current state-of-the-art cloth generative models by a large margin. In addition to the generative properties of our network, we showcase that the proposed method can be used to achieve high quality reconstructions from single in-the-wild images and 3D scans. Dataset, code and pre-trained model will become publicly available. △ Less

Submitted 3 April, 2024; originally announced April 2024.

Comments: Accepted to CVPR 2024, Project page: https://jiali-zheng.github.io/Design2Cloth/

arXiv:2404.02033 [pdf, other]

Search for $C$-even states decaying to $D_{s}^{\pm}D_{s}^{*\mp}$ with masses between $4.08$ and $4.32$ $\rm GeV/{\it c}^{2}$

Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, O. Afedulidis, X. C. Ai, R. Aliberti, A. Amoroso, Q. An, Y. Bai, O. Bakina, I. Balossino, Y. Ban, H. -R. Bao, V. Batozskaya, K. Begzsuren, N. Berger, M. Berlowski, M. Bertani, D. Bettoni, F. Bianchi, E. Bianco, A. Bortone, I. Boyko, R. A. Briere , et al. (638 additional authors not shown)

Abstract: Six $C$-even states, denoted as $X$, with quantum numbers $J^{PC}=0^{-+}$, $1^{\pm+}$, or $2^{\pm+}$, are searched for via the $e^+e^-\toγD_{s}^{\pm}D_{s}^{*\mp}$ process using $(1667.39\pm8.84)~\mathrm{pb}^{-1}$ of $e^+e^-$ collision data collected with the BESIII detector operating at the BEPCII storage ring at center-of-mass energy of $\sqrt{s}=(4681.92\pm0.30)~\mathrm{MeV}$. No statistically s… ▽ More Six $C$-even states, denoted as $X$, with quantum numbers $J^{PC}=0^{-+}$, $1^{\pm+}$, or $2^{\pm+}$, are searched for via the $e^+e^-\toγD_{s}^{\pm}D_{s}^{*\mp}$ process using $(1667.39\pm8.84)~\mathrm{pb}^{-1}$ of $e^+e^-$ collision data collected with the BESIII detector operating at the BEPCII storage ring at center-of-mass energy of $\sqrt{s}=(4681.92\pm0.30)~\mathrm{MeV}$. No statistically significant signal is observed in the mass range from $4.08$ to $4.32~\mathrm{GeV}/c^{2}$. The upper limits of $σ[e^+e^-\toγX]\cdot \mathcal{B}[X \to D_{s}^{\pm}D_{s}^{*\mp}]$ at a $90\%$ confidence level are determined. △ Less

Submitted 2 April, 2024; originally announced April 2024.

arXiv:2404.01139 [pdf, other]

Structured Initialization for Attention in Vision Transformers

Authors: Jianqiao Zheng, Xueqian Li, Simon Lucey

Abstract: The training of vision transformer (ViT) networks on small-scale datasets poses a significant challenge. By contrast, convolutional neural networks (CNNs) have an architectural inductive bias enabling them to perform well on such problems. In this paper, we argue that the architectural bias inherent to CNNs can be reinterpreted as an initialization bias within ViT. This insight is significant as i… ▽ More The training of vision transformer (ViT) networks on small-scale datasets poses a significant challenge. By contrast, convolutional neural networks (CNNs) have an architectural inductive bias enabling them to perform well on such problems. In this paper, we argue that the architectural bias inherent to CNNs can be reinterpreted as an initialization bias within ViT. This insight is significant as it empowers ViTs to perform equally well on small-scale problems while maintaining their flexibility for large-scale applications. Our inspiration for this ``structured'' initialization stems from our empirical observation that random impulse filters can achieve comparable performance to learned filters within CNNs. Our approach achieves state-of-the-art performance for data-efficient ViT learning across numerous benchmarks including CIFAR-10, CIFAR-100, and SVHN. △ Less

Submitted 1 April, 2024; originally announced April 2024.

Comments: 20 pages, 5 figures, 8 tables

arXiv:2404.01003 [pdf, other]

On the Brun--Titchmarsh theorem

Authors: ** Xi, Junren Zheng

Abstract: The classical Brun--Titchmarsh theorem gives an upper bound, which is of correct order of magnitude in the full range, for the number of primes $p\leqslant x$ satisfying $p\equiv a\bmod q$. We strength this inequality for $\log q/\log x$ in different ranges, improving upon previous works by Motohashi, Goldfeld, Iwaniec, Friedlander and Iwaniec, and Maynard for general or special moduli. In particu… ▽ More The classical Brun--Titchmarsh theorem gives an upper bound, which is of correct order of magnitude in the full range, for the number of primes $p\leqslant x$ satisfying $p\equiv a\bmod q$. We strength this inequality for $\log q/\log x$ in different ranges, improving upon previous works by Motohashi, Goldfeld, Iwaniec, Friedlander and Iwaniec, and Maynard for general or special moduli. In particular, we obtain a Burgess-like constant in the full range $q<x^{1/2-}.$ The proof is based on various estimates for character and exponential sums, by appealing to arithmetic exponent pairs and bilinear forms with algebraic trace functions from $\ell$-adic cohomology, and sums of Kloosterman sums from spectral theory of automorphic forms, as well as large value theorem for Dirichlet polynomials due to Huxley. △ Less

Submitted 7 April, 2024; v1 submitted 1 April, 2024; originally announced April 2024.

Comments: Bounded by 50 pages. All comments are welcome!

arXiv:2404.00451 [pdf, other]

Thin-Shell Object Manipulations With Differentiable Physics Simulations

Authors: Yian Wang, Juntian Zheng, Zhehuan Chen, Zhou Xian, Gu Zhang, Chao Liu, Chuang Gan

Abstract: In this work, we aim to teach robots to manipulate various thin-shell materials. Prior works studying thin-shell object manipulation mostly rely on heuristic policies or learn policies from real-world video demonstrations, and only focus on limited material types and tasks (e.g., cloth unfolding). However, these approaches face significant challenges when extended to a wider variety of thin-shell… ▽ More In this work, we aim to teach robots to manipulate various thin-shell materials. Prior works studying thin-shell object manipulation mostly rely on heuristic policies or learn policies from real-world video demonstrations, and only focus on limited material types and tasks (e.g., cloth unfolding). However, these approaches face significant challenges when extended to a wider variety of thin-shell materials and a diverse range of tasks. While virtual simulations are shown to be effective in diverse robot skill learning and evaluation, prior thin-shell simulation environments only support a subset of thin-shell materials, which also limits their supported range of tasks. We introduce ThinShellLab - a fully differentiable simulation platform tailored for robotic interactions with diverse thin-shell materials possessing varying material properties, enabling flexible thin-shell manipulation skill learning and evaluation. Our experiments suggest that manipulating thin-shell objects presents several unique challenges: 1) thin-shell manipulation relies heavily on frictional forces due to the objects' co-dimensional nature, 2) the materials being manipulated are highly sensitive to minimal variations in interaction actions, and 3) the constant and frequent alteration in contact pairs makes trajectory optimization methods susceptible to local optima, and neither standard reinforcement learning algorithms nor trajectory optimization methods (either gradient-based or gradient-free) are able to solve the tasks alone. To overcome these challenges, we present an optimization scheme that couples sampling-based trajectory optimization and gradient-based optimization, boosting both learning efficiency and converged performance across various proposed tasks. In addition, the differentiable nature of our platform facilitates a smooth sim-to-real transition. △ Less

Submitted 30 March, 2024; originally announced April 2024.

Comments: ICLR 2024

arXiv:2403.19877 [pdf]

Superconductivity from On-Chip Metallization on 2D Topological Chalcogenides

Authors: Yanyu Jia, Guo Yu, Tiancheng Song, Fang Yuan, Ayelet J Uzan, Yue Tang, Pengjie Wang, Ratnadwip Singha, Michael Onyszczak, Zhaoyi Joy Zheng, Kenji Watanabe, Takashi Taniguchi, Leslie M Schoop, Sanfeng Wu

Abstract: Two-dimensional (2D) transition metal dichalcogenides (TMDs) is a versatile class of quantum materials of interest to various fields including, e.g., nanoelectronics, optical devices, and topological and correlated quantum matter. Tailoring the electronic properties of TMDs is essential to their applications in many directions. Here, we report that a highly controllable and uniform on-chip 2D meta… ▽ More Two-dimensional (2D) transition metal dichalcogenides (TMDs) is a versatile class of quantum materials of interest to various fields including, e.g., nanoelectronics, optical devices, and topological and correlated quantum matter. Tailoring the electronic properties of TMDs is essential to their applications in many directions. Here, we report that a highly controllable and uniform on-chip 2D metallization process converts a class of atomically thin TMDs into robust superconductors, a property belonging to none of the starting materials. As examples, we demonstrate the introduction of superconductivity into a class of 2D air-sensitive topological TMDs, including monolayers of Td-WTe2, 1T'-MoTe2 and 2H-MoTe2, as well as their natural and twisted bilayers, metalized with an ultrathin layer of Palladium. This class of TMDs are known to exhibit intriguing topological phases ranging from topological insulator, Weyl semimetal to fractional Chern insulator. The unique, high-quality two-dimensional metallization process is based on our recent findings of the long-distance, non-Fickian in-plane mass transport and chemistry in 2D that occur at relatively low temperatures and in devices fully encapsulated with inert insulating layers. Highly compatible with existing nanofabrication techniques for van der Waals (vdW) stacks, our results offer a route to designing and engineering superconductivity and topological phases in a class of correlated 2D materials. △ Less

Submitted 30 April, 2024; v1 submitted 28 March, 2024; originally announced March 2024.

Comments: 22 pages, 12 figures. Accepted to Physical Review X

arXiv:2403.19615 [pdf, other]

SA-GS: Scale-Adaptive Gaussian Splatting for Training-Free Anti-Aliasing

Authors: Xiaowei Song, Jv Zheng, Shiran Yuan, Huan-ang Gao, **gwei Zhao, Xiang He, Weihao Gu, Hao Zhao

Abstract: In this paper, we present a Scale-adaptive method for Anti-aliasing Gaussian Splatting (SA-GS). While the state-of-the-art method Mip-Splatting needs modifying the training procedure of Gaussian splatting, our method functions at test-time and is training-free. Specifically, SA-GS can be applied to any pretrained Gaussian splatting field as a plugin to significantly improve the field's anti-alisin… ▽ More In this paper, we present a Scale-adaptive method for Anti-aliasing Gaussian Splatting (SA-GS). While the state-of-the-art method Mip-Splatting needs modifying the training procedure of Gaussian splatting, our method functions at test-time and is training-free. Specifically, SA-GS can be applied to any pretrained Gaussian splatting field as a plugin to significantly improve the field's anti-alising performance. The core technique is to apply 2D scale-adaptive filters to each Gaussian during test time. As pointed out by Mip-Splatting, observing Gaussians at different frequencies leads to mismatches between the Gaussian scales during training and testing. Mip-Splatting resolves this issue using 3D smoothing and 2D Mip filters, which are unfortunately not aware of testing frequency. In this work, we show that a 2D scale-adaptive filter that is informed of testing frequency can effectively match the Gaussian scale, thus making the Gaussian primitive distribution remain consistent across different testing frequencies. When scale inconsistency is eliminated, sampling rates smaller than the scene frequency result in conventional jaggedness, and we propose to integrate the projected 2D Gaussian within each pixel during testing. This integration is actually a limiting case of super-sampling, which significantly improves anti-aliasing performance over vanilla Gaussian Splatting. Through extensive experiments using various settings and both bounded and unbounded scenes, we show SA-GS performs comparably with or better than Mip-Splatting. Note that super-sampling and integration are only effective when our scale-adaptive filtering is activated. Our codes, data and models are available at https://github.com/zsy1987/SA-GS. △ Less

Submitted 28 March, 2024; originally announced March 2024.

Comments: Project page: https://kevinsong729.github.io/project-pages/SA-GS/ Code: https://github.com/zsy1987/SA-GS

arXiv:2403.19256 [pdf, other]

Measurement of absolute branching fractions of $D_s^+$ hadronic decays

Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, O. Afedulidis, X. C. Ai, R. Aliberti, A. Amoroso, Q. An, Y. Bai, O. Bakina, I. Balossino, Y. Ban, H. -R. Bao, V. Batozskaya, K. Begzsuren, N. Berger, M. Berlowski, M. Bertani, D. Bettoni, F. Bianchi, E. Bianco, A. Bortone, I. Boyko, R. A. Briere , et al. (632 additional authors not shown)

Abstract: Using $e^+ e^-$ collision data collected at the BESIII detector at center-of-mass energies between 4.128 and 4.226 GeV, corresponding to an integrated luminosity of $7.33~{\rm fb}^{-1}$, we determine the absolute branching fractions of fifteen hadronic $D_s^{+}$ decays with a double-tag technique. In particular, we make precise measurements of the branching fractions… ▽ More Using $e^+ e^-$ collision data collected at the BESIII detector at center-of-mass energies between 4.128 and 4.226 GeV, corresponding to an integrated luminosity of $7.33~{\rm fb}^{-1}$, we determine the absolute branching fractions of fifteen hadronic $D_s^{+}$ decays with a double-tag technique. In particular, we make precise measurements of the branching fractions $\mathcal{B}(D_s^+ \to K^+ K^- π^+)=(5.49 \pm 0.04 \pm 0.07)\%$, $\mathcal{B}(D_s^+ \to K_S^0 K^+)=(1.50 \pm 0.01 \pm 0.01)\%$ and $\mathcal{B}(D_s^+ \to K^+ K^- π^+ π^0)=(5.50 \pm 0.05 \pm 0.11)\%$, where the first uncertainties are statistical and the second ones are systematic. The \emph{CP} asymmetries in these decays are also measured and all are found to be compatible with zero. △ Less

Submitted 30 May, 2024; v1 submitted 28 March, 2024; originally announced March 2024.

arXiv:2403.19191 [pdf, ps, other]

doi 10.1103/PhysRevA.109.043312

Superfluid Oscillator Circuit with Quantum Current Regulator

Authors: Xue Yang, Wenkai Bai, Chen Jiao, Wu-Ming Liu, Jun-Hui Zheng, Tao Yang

Abstract: We examine the properties of atomic current in a superfluid oscillating circuit consisting of a mesoscopic channel that connects two reservoirs of a Bose-Einstein condensate. We investigate the presence of a critical current in the channel and examine how the amplitude of the oscillations in the number imbalance between the two reservoirs varies with system parameters. In addition to highlighting… ▽ More We examine the properties of atomic current in a superfluid oscillating circuit consisting of a mesoscopic channel that connects two reservoirs of a Bose-Einstein condensate. We investigate the presence of a critical current in the channel and examine how the amplitude of the oscillations in the number imbalance between the two reservoirs varies with system parameters. In addition to highlighting that the dissipative resistance stems from the formation of vortex pairs, we also illustrate the role of these vortex pairs as a quantum current regulator. The dissipation strength is discrete based on the number imbalance, which corresponds to the emergence of vortex pairs in the system. Our findings indicate that the circuit demonstrates characteristics of both voltage-limiting and current-limiting mechanisms. To model the dam** behavior of the atomic superfluid circuit, we develop an equivalent LC oscillator circuit with a quantum current regulator. △ Less

Submitted 28 March, 2024; originally announced March 2024.

Comments: 6 figures

Journal ref: Physical Review A, 109 (2024) 043312

arXiv:2403.19091 [pdf, other]

Observation of the semileptonic decays $D^0\rightarrow K_S^0π^-π^0 e^+ ν_e$ and $D^+\rightarrow K_S^0π^+π^- e^+ ν_e$

Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, X. C. Ai, R. Aliberti, A. Amoroso, M. R. An, Q. An, Y. Bai, O. Bakina, I. Balossino, Y. Ban, V. Batozskaya, K. Begzsuren, N. Berger, M. Berlowski, M. Bertani, D. Bettoni, F. Bianchi, E. Bianco, A. Bortone, I. Boyko, R. A. Briere, A. Brueggemann , et al. (600 additional authors not shown)

Abstract: By analyzing $e^+e^-$ annihilation data corresponding to an integrated luminosity of 2.93 $\rm fb^{-1}$ collected at a center-of-mass energy of 3.773 GeV with the \text{BESIII} detector, the first observation of the semileptonic decays $D^0\rightarrow K_S^0π^-π^0 e^+ ν_e$ and $D^+\rightarrow K_S^0π^+π^- e^+ ν_e$ is reported. With a dominant hadronic contribution from $K_1(1270)$, the branching fra… ▽ More By analyzing $e^+e^-$ annihilation data corresponding to an integrated luminosity of 2.93 $\rm fb^{-1}$ collected at a center-of-mass energy of 3.773 GeV with the \text{BESIII} detector, the first observation of the semileptonic decays $D^0\rightarrow K_S^0π^-π^0 e^+ ν_e$ and $D^+\rightarrow K_S^0π^+π^- e^+ ν_e$ is reported. With a dominant hadronic contribution from $K_1(1270)$, the branching fractions are measured to be $\mathcal{B}(D^0\rightarrow {K}_1(1270)^-(\to K^0_Sπ^-π^0)e^+ν_e)=(1.69^{+0.53}_{-0.46}\pm0.15)\times10^{-4}$ and $\mathcal{B}(D^+\to \bar{K}_1(1270)^0(\to K^0_Sπ^+π^-)e^+ν_e)=(1.47^{+0.45}_{-0.40}\pm0.20)\times10^{-4}$ with statistical significance of 5.4$σ$ and 5.6$σ$, respectively. When combined with measurements of the $K_1(1270)\to K^+π^-π$ decays, the absolute branching fractions are determined to be $\mathcal{B}(D^0\to K_1(1270)^-e^+ν_e)=(1.05^{+0.33}_{-0.28}\pm0.12\pm0.12)\times10^{-3}$ and $\mathcal{B}(D^+\to \bar{K}_1(1270)^0e^+ν_e)=(1.29^{+0.40}_{-0.35}\pm0.18\pm0.15)\times10^{-3}$. The first and second uncertainties are statistical and systematic, respectively, and the third uncertainties originate from the assumed branching fractions of the $K_1(1270)\to Kππ$ decays. △ Less

Submitted 27 March, 2024; originally announced March 2024.

Comments: 19pages

arXiv:2403.18876 [pdf, ps, other]

doi 10.1088/1674-1056/20/6/067802

Electromagnetic chirality-induced negative refraction with the same amplitude and anti-phase of the two chirality coefficients

Authors: Shun-Cai Zhao, Zheng-Dong Liu, Jun Zheng, Gen Li

Abstract: We suggest a scheme of electromagnetic chirality-induced negative refraction utilizing magneto-electric cross coupling in a four-level atomic system. The negative refraction can be achieved with the two chirality coefficients having the same amplitude but the opposite phase,and without requiring the simultaneous presence of an electric-dipole and a magnetic-dipole transition near the same transiti… ▽ More We suggest a scheme of electromagnetic chirality-induced negative refraction utilizing magneto-electric cross coupling in a four-level atomic system. The negative refraction can be achieved with the two chirality coefficients having the same amplitude but the opposite phase,and without requiring the simultaneous presence of an electric-dipole and a magnetic-dipole transition near the same transition frequency. The simultaneously negative electric permittivity and magnetic permeability does not require, either. △ Less

Submitted 27 March, 2024; originally announced March 2024.

Comments: 8 pages, 4 figures

Journal ref: Chin. Phys. B, 20 (6): 067802 (2011)

arXiv:2403.17460 [pdf, other]

Building Bridges across Spatial and Temporal Resolutions: Reference-Based Super-Resolution via Change Priors and Conditional Diffusion Model

Authors: Runmin Dong, Shuai Yuan, Bin Luo, Mengxuan Chen, **xiao Zhang, Lixian Zhang, Weijia Li, Juepeng Zheng, Haohuan Fu

Abstract: Reference-based super-resolution (RefSR) has the potential to build bridges across spatial and temporal resolutions of remote sensing images. However, existing RefSR methods are limited by the faithfulness of content reconstruction and the effectiveness of texture transfer in large scaling factors. Conditional diffusion models have opened up new opportunities for generating realistic high-resoluti… ▽ More Reference-based super-resolution (RefSR) has the potential to build bridges across spatial and temporal resolutions of remote sensing images. However, existing RefSR methods are limited by the faithfulness of content reconstruction and the effectiveness of texture transfer in large scaling factors. Conditional diffusion models have opened up new opportunities for generating realistic high-resolution images, but effectively utilizing reference images within these models remains an area for further exploration. Furthermore, content fidelity is difficult to guarantee in areas without relevant reference information. To solve these issues, we propose a change-aware diffusion model named Ref-Diff for RefSR, using the land cover change priors to guide the denoising process explicitly. Specifically, we inject the priors into the denoising model to improve the utilization of reference information in unchanged areas and regulate the reconstruction of semantically relevant content in changed areas. With this powerful guidance, we decouple the semantics-guided denoising and reference texture-guided denoising processes to improve the model performance. Extensive experiments demonstrate the superior effectiveness and robustness of the proposed method compared with state-of-the-art RefSR methods in both quantitative and qualitative evaluations. The code and data are available at https://github.com/dongrunmin/RefDiff. △ Less

Submitted 26 March, 2024; originally announced March 2024.

Comments: Accepted by CVPR2024

arXiv:2403.17301 [pdf, other]

Physical 3D Adversarial Attacks against Monocular Depth Estimation in Autonomous Driving

Authors: Junhao Zheng, Chenhao Lin, Jiahao Sun, Zhengyu Zhao, Qian Li, Chao Shen

Abstract: Deep learning-based monocular depth estimation (MDE), extensively applied in autonomous driving, is known to be vulnerable to adversarial attacks. Previous physical attacks against MDE models rely on 2D adversarial patches, so they only affect a small, localized region in the MDE map but fail under various viewpoints. To address these limitations, we propose 3D Depth Fool (3D$^2$Fool), the first 3… ▽ More Deep learning-based monocular depth estimation (MDE), extensively applied in autonomous driving, is known to be vulnerable to adversarial attacks. Previous physical attacks against MDE models rely on 2D adversarial patches, so they only affect a small, localized region in the MDE map but fail under various viewpoints. To address these limitations, we propose 3D Depth Fool (3D$^2$Fool), the first 3D texture-based adversarial attack against MDE models. 3D$^2$Fool is specifically optimized to generate 3D adversarial textures agnostic to model types of vehicles and to have improved robustness in bad weather conditions, such as rain and fog. Experimental results validate the superior performance of our 3D$^2$Fool across various scenarios, including vehicles, MDE models, weather conditions, and viewpoints. Real-world experiments with printed 3D textures on physical vehicle models further demonstrate that our 3D$^2$Fool can cause an MDE error of over 10 meters. △ Less

Submitted 27 March, 2024; v1 submitted 25 March, 2024; originally announced March 2024.

Comments: Accepted by CVPR 2024

arXiv:2403.16811 [pdf, ps, other]

Cross section measurement of $e^+e^-\to ηψ(2S)$ and search for $e^+e^-\toη\tilde{X}(3872)$

Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, O. Afedulidis, X. C. Ai, R. Aliberti, A. Amoroso, Q. An, Y. Bai, O. Bakina, I. Balossino, Y. Ban, H. -R. Bao, V. Batozskaya, K. Begzsuren, N. Berger, M. Berlowski, M. Bertani, D. Bettoni, F. Bianchi, E. Bianco, A. Bortone, I. Boyko, R. A. Briere , et al. (639 additional authors not shown)

Abstract: The energy-dependent cross section for $e^+e^-\to ηψ(2S)$ is measured at eighteen center of mass energies from 4.288 GeV to 4.951 GeV using the BESIII detector. Using the same data samples, we also perform the first search for the reaction $e^+e^-\toη\tilde{X}(3872)$, but no evidence is found for the $\tilde{X}(3872)$ in the $π^+π^- J/ψ$ mass distribution. At each of the eighteen center of mass en… ▽ More The energy-dependent cross section for $e^+e^-\to ηψ(2S)$ is measured at eighteen center of mass energies from 4.288 GeV to 4.951 GeV using the BESIII detector. Using the same data samples, we also perform the first search for the reaction $e^+e^-\toη\tilde{X}(3872)$, but no evidence is found for the $\tilde{X}(3872)$ in the $π^+π^- J/ψ$ mass distribution. At each of the eighteen center of mass energies, upper limits at the 90\% confidence level on the cross section for $e^+e^-\toηψ(2S)$ and on the product of the $e^+e^-\toη\tilde{X}(3872)$ cross section with the branching fraction of $\tilde{X}(3872)\toπ^+π^- J/ψ$ are reported. △ Less

Submitted 25 March, 2024; originally announced March 2024.

arXiv:2403.14998 [pdf, other]

Precise measurement of the $e^+e^-\to D_s^+D_s^-$ cross sections at center-of-mass energies from threshold to 4.95 GeV

Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, O. Afedulidis, X. C. Ai, R. Aliberti, A. Amoroso, Q. An, Y. Bai, O. Bakina, I. Balossino, Y. Ban, H. -R. Bao, V. Batozskaya, K. Begzsuren, N. Berger, M. Berlowski, M. Bertani, D. Bettoni, F. Bianchi, E. Bianco, A. Bortone, I. Boyko, R. A. Briere , et al. (638 additional authors not shown)

Abstract: Using the $e^+e^-$ collision data collected with the BESIII detector operating at the BEPCII collider, at center-of-mass energies from the threshold to $4.95$~GeV, we present precise measurements of the cross sections for the process $e^+e^-\to D_s^+D_s^-$ using a single tag method. The resulting cross section lineshape exhibits several new structures, thereby offering an input for coupled channel… ▽ More Using the $e^+e^-$ collision data collected with the BESIII detector operating at the BEPCII collider, at center-of-mass energies from the threshold to $4.95$~GeV, we present precise measurements of the cross sections for the process $e^+e^-\to D_s^+D_s^-$ using a single tag method. The resulting cross section lineshape exhibits several new structures, thereby offering an input for coupled channel analysis and model tests, which are critical to understand vector charmonium-like states with masses between 4 and 5~GeV. △ Less

Submitted 22 March, 2024; originally announced March 2024.

Comments: 9 pages, 4 figures, published to PRL

arXiv:2403.14442 [pdf, other]

RoDLA: Benchmarking the Robustness of Document Layout Analysis Models

Authors: Yufan Chen, Jiaming Zhang, Kunyu Peng, Junwei Zheng, Rui** Liu, Philip Torr, Rainer Stiefelhagen

Abstract: Before develo** a Document Layout Analysis (DLA) model in real-world applications, conducting comprehensive robustness testing is essential. However, the robustness of DLA models remains underexplored in the literature. To address this, we are the first to introduce a robustness benchmark for DLA models, which includes 450K document images of three datasets. To cover realistic corruptions, we pr… ▽ More Before develo** a Document Layout Analysis (DLA) model in real-world applications, conducting comprehensive robustness testing is essential. However, the robustness of DLA models remains underexplored in the literature. To address this, we are the first to introduce a robustness benchmark for DLA models, which includes 450K document images of three datasets. To cover realistic corruptions, we propose a perturbation taxonomy with 36 common document perturbations inspired by real-world document processing. Additionally, to better understand document perturbation impacts, we propose two metrics, Mean Perturbation Effect (mPE) for perturbation assessment and Mean Robustness Degradation (mRD) for robustness evaluation. Furthermore, we introduce a self-titled model, i.e., Robust Document Layout Analyzer (RoDLA), which improves attention mechanisms to boost extraction of robust features. Experiments on the proposed benchmarks (PubLayNet-P, DocLayNet-P, and M$^6$Doc-P) demonstrate that RoDLA obtains state-of-the-art mRD scores of 115.7, 135.4, and 150.4, respectively. Compared to previous methods, RoDLA achieves notable improvements in mAP of +3.8%, +7.1% and +12.1%, respectively. △ Less

Submitted 21 March, 2024; originally announced March 2024.

Comments: Accepted by CVPR 2024. Project page: https://yufanchen96.github.io/projects/RoDLA

arXiv:2403.13437 [pdf, other]

Search for $ΔS=2$ nonleptonic hyperon decays $Ω^-\toΣ^{0}π^{-}$ and $Ω^-\to nK^{-}$

Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, O. Afedulidis, X. C. Ai, R. Aliberti, A. Amoroso, Q. An, Y. Bai, O. Bakina, I. Balossino, Y. Ban, H. -R. Bao, V. Batozskaya, K. Begzsuren, N. Berger, M. Berlowski, M. Bertani, D. Bettoni, F. Bianchi, E. Bianco, A. Bortone, I. Boyko, R. A. Briere , et al. (638 additional authors not shown)

Abstract: Using $(27.12 \pm 0.14) \times 10^{8}$ $ψ(3686)$ events collected by the BESIII detector at the center-of-mass energy of $\sqrt{s} = 3.686$ GeV, we search for the first time for two nonleptonic hyperon decays that change strangeness by two units, $Ω^-\toΣ^{0}π^-$ and $Ω^-\to nK^{-}$. No significant signal is observed. The upper limits on their decay branching fractions are determined to be… ▽ More Using $(27.12 \pm 0.14) \times 10^{8}$ $ψ(3686)$ events collected by the BESIII detector at the center-of-mass energy of $\sqrt{s} = 3.686$ GeV, we search for the first time for two nonleptonic hyperon decays that change strangeness by two units, $Ω^-\toΣ^{0}π^-$ and $Ω^-\to nK^{-}$. No significant signal is observed. The upper limits on their decay branching fractions are determined to be $\mathcal{B}(Ω^-\toΣ^{0}π^-) < 5.4\times 10^{-4}$ and $\mathcal{B}(Ω^-\to nK^{-}) < 2.4\times 10^{-4}$ at the $90\%$ confidence level. △ Less

Submitted 14 April, 2024; v1 submitted 20 March, 2024; originally announced March 2024.

arXiv:2403.13338 [pdf, other]

Adaptive Critical Subgraph Mining for Cognitive Impairment Conversion Prediction with T1-MRI-based Brain Network

Authors: Yilin Leng, Wenju Cui, Bai Chen, Xi Jiang, Shuangqing Chen, Jian Zheng

Abstract: Prediction the conversion to early-stage dementia is critical for mitigating its progression but remains challenging due to subtle cognitive impairments and structural brain changes. Traditional T1-weighted magnetic resonance imaging (T1-MRI) research focus on identifying brain atrophy regions but often fails to address the intricate connectivity between them. This limitation underscores the neces… ▽ More Prediction the conversion to early-stage dementia is critical for mitigating its progression but remains challenging due to subtle cognitive impairments and structural brain changes. Traditional T1-weighted magnetic resonance imaging (T1-MRI) research focus on identifying brain atrophy regions but often fails to address the intricate connectivity between them. This limitation underscores the necessity of focuing on inter-regional connectivity for a comprehensive understand of the brain's complex network. Moreover, there is a pressing demand for methods that adaptively preserve and extract critical information, particularly specialized subgraph mining techniques for brain networks. These are essential for develo** high-quality feature representations that reveal critical spatial impacts of structural brain changes and its topology. In this paper, we propose Brain-SubGNN, a novel graph representation network to mine and enhance critical subgraphs based on T1-MRI. This network provides a subgraph-level interpretation, enhancing interpretability and insights for graph analysis. The process begins by extracting node features and a correlation matrix between nodes to construct a task-oriented brain network. Brain-SubGNN then adaptively identifies and enhances critical subgraphs, capturing both loop and neighbor subgraphs. This method reflects the loop topology and local changes, indicative of long-range connections, and maintains local and global brain attributes. Extensive experiments validate the effectiveness and advantages of Brain-SubGNN, demonstrating its potential as a powerful tool for understanding and diagnosing early-stage dementia. Source code is available at https://github.com/Leng-10/Brain-SubGNN. △ Less

Submitted 26 June, 2024; v1 submitted 20 March, 2024; originally announced March 2024.

Comments: 20 pages

arXiv:2403.13277 [pdf]

Quantum valley Hall states in low-buckled counterparts of graphene bilayer

Authors: Yu-Hao Shen, Jun-Ding Zheng, Wen-Yi Tong, Chun-Gang Duan

Abstract: With low-buckled structure for each layer in graphene bilayer system, there breaks inversion symmetry (P-symmetry) for one stacking when both A and B sublattices in top layer are aligned with those in bottom layer. In consideration of spin-orbit coupling (SOC), there opens nontrivial topological gap in each monolayer system to achieve quantum spin Hall effect (QSHE). As long as time-reversal symme… ▽ More With low-buckled structure for each layer in graphene bilayer system, there breaks inversion symmetry (P-symmetry) for one stacking when both A and B sublattices in top layer are aligned with those in bottom layer. In consideration of spin-orbit coupling (SOC), there opens nontrivial topological gap in each monolayer system to achieve quantum spin Hall effect (QSHE). As long as time-reversal symmetry (T-symmetry) is preserved the gapless edge states is robust in each individual layer even for the bilayer absent of PT symmetry. Based on this platform and through tight-binding (TB) model calculations we find it becomes a typical system that can exhibit quantum valley Hall effect (QVHE) when introduced a layer-resolved Rashba SOC that leads to band inversion at each K valley in the hexagonal Brillion zone (BZ). The topological transition comes from that the valley Chern number Cv = CK - CK' switches from 0 to 2, which characterizes the nontrivial QVHE phase transited from two coupled Z2 topological insulators. We also point that the layer-resolved Rashba SOC can be introduced equivalently by twisting two van der Waals touched layers. And through TB calculations, it is shown that the K bands inverts in its corresponding mini BZ when the two layers twisted by a small angle. Our findings advance potential applications for the devices design in topological valleytronics and twistronics. △ Less

Submitted 19 March, 2024; originally announced March 2024.

arXiv:2403.12767 [pdf, other]

doi 10.1016/j.eswa.2023.122093

Inter- and intra-uncertainty based feature aggregation model for semi-supervised histopathology image segmentation

Authors: Qiangguo **, Hui Cui, Changming Sun, Yang Song, Jiangbin Zheng, Leilei Cao, Leyi Wei, Ran Su

Abstract: Acquiring pixel-level annotations is often limited in applications such as histology studies that require domain expertise. Various semi-supervised learning approaches have been developed to work with limited ground truth annotations, such as the popular teacher-student models. However, hierarchical prediction uncertainty within the student model (intra-uncertainty) and image prediction uncertaint… ▽ More Acquiring pixel-level annotations is often limited in applications such as histology studies that require domain expertise. Various semi-supervised learning approaches have been developed to work with limited ground truth annotations, such as the popular teacher-student models. However, hierarchical prediction uncertainty within the student model (intra-uncertainty) and image prediction uncertainty (inter-uncertainty) have not been fully utilized by existing methods. To address these issues, we first propose a novel inter- and intra-uncertainty regularization method to measure and constrain both inter- and intra-inconsistencies in the teacher-student architecture. We also propose a new two-stage network with pseudo-mask guided feature aggregation (PG-FANet) as the segmentation model. The two-stage structure complements with the uncertainty regularization strategy to avoid introducing extra modules in solving uncertainties and the aggregation mechanisms enable multi-scale and multi-stage feature integration. Comprehensive experimental results over the MoNuSeg and CRAG datasets show that our PG-FANet outperforms other state-of-the-art methods and our semi-supervised learning framework yields competitive performance with a limited amount of labeled data. △ Less

Submitted 19 March, 2024; originally announced March 2024.

Journal ref: Expert Systems with Applications, 2024, 238: 122093

arXiv:2403.12475 [pdf]

Dielectric response in twisted MoS2 bilayer facilitated by spin-orbit coupling effect

Authors: Yu-Hao Shen, Jun-Ding Zheng, Wen-Yi Tong, Xian-Gang Wan, Chun-Gang Duan

Abstract: Twisted van der Waals bilayers provide ideal two-dimensional (2D) platforms to study the interplay between spin and charge degree of freedom of electron. An exotic dielectric response behavior in such system is what we find here through the investigation of twisted bilayer MoS2 with two different stackings but same size of the commensurate supercell, which forms H-type like and R-type like stacked… ▽ More Twisted van der Waals bilayers provide ideal two-dimensional (2D) platforms to study the interplay between spin and charge degree of freedom of electron. An exotic dielectric response behavior in such system is what we find here through the investigation of twisted bilayer MoS2 with two different stackings but same size of the commensurate supercell, which forms H-type like and R-type like stacked bilayer system. Our first-principles calculations show that applying an out-of-plane electric field gives rise to different response of the electric polarization, whose susceptibility is suppressed in H-type like case compared with that of R-type like one. Further analysis shows the dielectric response to perpendicular electric field is linked to the moire ferroelectric dipole texture. Their underlying link comes from that, through Rashba spin-orbit coupling (SOC) effect, the external electric field tends to change the internal pseudo-spin texture, which further depends on the twist induced dipole texture. Consequently, the suppression of dielectric response can be attributed to the topology protection of such ferroelectric dipole structure. △ Less

Submitted 19 March, 2024; originally announced March 2024.

arXiv:2403.12361 [pdf]

Multi-State, Ultra-thin, BEOL-Compatible AlScN Ferroelectric Diodes

Authors: Kwan-Ho Kim, Zirun Han, Yinuo Zhang, Pariasadat Musavigharavi, Jeffrey Zheng, Dhiren K. Pradhan, Eric A. Stach, Roy H. Olsson III, Deep Jariwala

Abstract: The growth in data generation necessitates efficient data processing technologies to address the von Neumann bottleneck in conventional computer architecture. Memory-driven computing, which integrates non-volatile memory (NVM) devices in a 3D stack, is gaining attention, with CMOS back-end-of-line (BEOL) compatible ferroelectric (FE) diodes being ideal due to their two-terminal design and inherent… ▽ More The growth in data generation necessitates efficient data processing technologies to address the von Neumann bottleneck in conventional computer architecture. Memory-driven computing, which integrates non-volatile memory (NVM) devices in a 3D stack, is gaining attention, with CMOS back-end-of-line (BEOL) compatible ferroelectric (FE) diodes being ideal due to their two-terminal design and inherently selector-free nature, facilitating high-density crossbar arrays. Here, we demonstrate BEOL-compatible, high-performance FE-diodes scaled to 5, 10, and 20 nm FE Al0.72Sc0.28N/Al0.64Sc0.36N films. Through interlayer (IL) engineering, we show substantial improvements in the ON/OFF ratios (>166 times) and rectification ratios (>176 times) in these scaled devices. The superlative characteristics also enables 5-bit multi-state operation with a stable retention. We also experimentally and theoretically demonstrate the counterintuitive result that the inclusion of an IL can lead to a decrease in the ferroelectric switching voltage of the device. An in-depth analysis into the device transport mechanisms is performed, and our compact model aligns seamlessly with the experimental results. Our results suggest the possibility of using scaled AlxSc1-xN FE-diodes for high performance, low-power, embedded NVM. △ Less

Submitted 18 March, 2024; originally announced March 2024.

arXiv:2403.12359 [pdf, other]

The Lyapunov exponent as a signature of dissipative many-body quantum chaos

Authors: Antonio M. García-García, Jacobus J. M. Verbaarschot, Jie-** Zheng

Abstract: A distinct feature of Hermitian quantum chaotic dynamics is the exponential increase of certain out-of-time-order-correlation (OTOC) functions around the Ehrenfest time with a rate given by a Lyapunov exponent. Physically, the OTOCs describe the growth of quantum uncertainty that crucially depends on the nature of the quantum motion. Here, we employ the OTOC in order to provide a precise definitio… ▽ More A distinct feature of Hermitian quantum chaotic dynamics is the exponential increase of certain out-of-time-order-correlation (OTOC) functions around the Ehrenfest time with a rate given by a Lyapunov exponent. Physically, the OTOCs describe the growth of quantum uncertainty that crucially depends on the nature of the quantum motion. Here, we employ the OTOC in order to provide a precise definition of dissipative quantum chaos. For this purpose, we compute analytically the Lyapunov exponent for the vectorized formulation of the large $q$-limit of a $q$-body Sachdev-Ye-Kitaev model coupled to a Markovian bath. These analytic results are confirmed by an explicit numerical calculation of the Lyapunov exponent for several values of $q \geq 4$ based on the solutions of the Schwinger-Dyson and Bethe-Salpeter equations. We show that the Lyapunov exponent decreases monotonically as the coupling to the bath increases and eventually becomes negative at a critical value of the coupling signaling a transition to a dynamics which is no longer quantum chaotic. Therefore, a positive Lyapunov exponent is a defining feature of dissipative many-body quantum chaos. The observation of the breaking of the exponential growth for sufficiently strong coupling suggests that dissipative quantum chaos may require in certain cases a sufficiently weak coupling to the environment. △ Less

Submitted 14 June, 2024; v1 submitted 18 March, 2024; originally announced March 2024.

Comments: 32 pages, 4 figures, we have added references and made minor corrections

arXiv:2403.10952 [pdf, other]

Universal Fluctuation-Response Relations of Nonequilibrium Dynamics: A Trajectory Information Geometry Framework

Authors: Jiming Zheng, Zhiyue Lu

Abstract: Unraveling the universal principles governing the response of complex systems to environmental changes is crucial for predicting and controlling their behavior. While fluctuation-dissipation relations have been established for systems near equilibrium, a general framework for understanding the responsiveness of systems far from steady states remains elusive. Here, we introduce a novel approach bas… ▽ More Unraveling the universal principles governing the response of complex systems to environmental changes is crucial for predicting and controlling their behavior. While fluctuation-dissipation relations have been established for systems near equilibrium, a general framework for understanding the responsiveness of systems far from steady states remains elusive. Here, we introduce a novel approach based on the information geometry of stochastic trajectories to derive a set of universal thermodynamic bounds on the response of any Markov system, regardless of its proximity to steady states. This theory establishes a new paradigm in non-equilibrium statistical mechanics, providing a unified perspective on the behavior of non-stationary systems, from biological processes to engineered devices, and paving the way for designing complex responsiveness in far-from-equilibrium systems. △ Less

Submitted 7 May, 2024; v1 submitted 16 March, 2024; originally announced March 2024.

arXiv:2403.10877 [pdf, ps, other]

Test of lepton universality and measurement of the form factors of $D^0\to K^{*}(892)^-μ^+ν_μ$

Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, O. Afedulidis, X. C. Ai, R. Aliberti, A. Amoroso, Q. An, Y. Bai, O. Bakina, I. Balossino, Y. Ban, H. -R. Bao, V. Batozskaya, K. Begzsuren, N. Berger, M. Berlowski, M. Bertani, D. Bettoni, F. Bianchi, E. Bianco, A. Bortone, I. Boyko, R. A. Briere , et al. (637 additional authors not shown)

Abstract: We report a first study of the semileptonic decay $D^0\rightarrow K^-π^0μ^{+}ν_μ$ by analyzing an $e^+e^-$ annihilation data sample of $7.9~\mathrm{fb}^{-1}$ collected at the center-of-mass energy of 3.773 GeV with the BESIII detector. The absolute branching fraction of $D^0\to K^-π^0μ^{+}ν_μ$ is measured for the first time to be $(0.729 \pm 0.014_{\rm stat} \pm 0.011_{\rm syst})\%$. Based on an a… ▽ More We report a first study of the semileptonic decay $D^0\rightarrow K^-π^0μ^{+}ν_μ$ by analyzing an $e^+e^-$ annihilation data sample of $7.9~\mathrm{fb}^{-1}$ collected at the center-of-mass energy of 3.773 GeV with the BESIII detector. The absolute branching fraction of $D^0\to K^-π^0μ^{+}ν_μ$ is measured for the first time to be $(0.729 \pm 0.014_{\rm stat} \pm 0.011_{\rm syst})\%$. Based on an amplitude analysis, the $S\text{-}{\rm wave}$ contribution is determined to be $(5.76 \pm 0.35_{\rm stat} \pm 0.29_{\rm syst})\%$ of the total decay rate in addition to the dominated $K^{*}(892)^-$ component. The branching fraction of $D^0\to K^{*}(892)^-μ^+ν_μ$ is given to be $(2.062 \pm 0.039_{\rm stat} \pm 0.032_{\rm syst})\%$, which improves the precision of the world average by a factor of 5. Combining with the world average of ${\mathcal B}(D^0\to K^{*}(892)^-e^+ν_e)$, the ratio of the branching fractions obtained is $\frac{{\mathcal B}(D^0\to K^{*}(892)^-μ^+ν_μ)}{{\mathcal B}(D^0\to K^{*}(892)^-e^+ν_e)} = 0.96\pm0.08$, in agreement with lepton flavor universality. Furthermore, assuming single-pole dominance parameterization, the most precise hadronic form factor ratios for $D^0\to K^{*}(892)^{-} μ^+ν_μ$ are extracted to be $r_{V}=V(0)/A_1(0)=1.37 \pm 0.09_{\rm stat} \pm 0.03_{\rm syst}$ and $r_{2}=A_2(0)/A_1(0)=0.76 \pm 0.06_{\rm stat} \pm 0.02_{\rm syst}$. △ Less

Submitted 16 March, 2024; originally announced March 2024.

Comments: 9 pages, 3 figures

arXiv:2403.09975 [pdf, other]

Skeleton-Based Human Action Recognition with Noisy Labels

Authors: Yi Xu, Kunyu Peng, Di Wen, Rui** Liu, Junwei Zheng, Yufan Chen, Jiaming Zhang, Alina Roitberg, Kailun Yang, Rainer Stiefelhagen

Abstract: Understanding human actions from body poses is critical for assistive robots sharing space with humans in order to make informed and safe decisions about the next interaction. However, precise temporal localization and annotation of activity sequences is time-consuming and the resulting labels are often noisy. If not effectively addressed, label noise negatively affects the model's training, resul… ▽ More Understanding human actions from body poses is critical for assistive robots sharing space with humans in order to make informed and safe decisions about the next interaction. However, precise temporal localization and annotation of activity sequences is time-consuming and the resulting labels are often noisy. If not effectively addressed, label noise negatively affects the model's training, resulting in lower recognition quality. Despite its importance, addressing label noise for skeleton-based action recognition has been overlooked so far. In this study, we bridge this gap by implementing a framework that augments well-established skeleton-based human action recognition methods with label-denoising strategies from various research areas to serve as the initial benchmark. Observations reveal that these baselines yield only marginal performance when dealing with sparse skeleton data. Consequently, we introduce a novel methodology, NoiseEraSAR, which integrates global sample selection, co-teaching, and Cross-Modal Mixture-of-Experts (CM-MOE) strategies, aimed at mitigating the adverse impacts of label noise. Our proposed approach demonstrates better performance on the established benchmark, setting new state-of-the-art standards. The source code for this study will be made accessible at https://github.com/xuyizdby/NoiseEraSAR. △ Less

Submitted 14 March, 2024; originally announced March 2024.

Comments: The source code will be made accessible at https://github.com/xuyizdby/NoiseEraSAR

arXiv:2403.09034 [pdf, other]

rFaceNet: An End-to-End Network for Enhanced Physiological Signal Extraction through Identity-Specific Facial Contours

Authors: Dali Zhu, Wenli Zhang, Hualin Zeng, Xiaohao Liu, Long Yang, Jiaqi Zheng

Abstract: Remote photoplethysmography (rPPG) technique extracts blood volume pulse (BVP) signals from subtle pixel changes in video frames. This study introduces rFaceNet, an advanced rPPG method that enhances the extraction of facial BVP signals with a focus on facial contours. rFaceNet integrates identity-specific facial contour information and eliminates redundant data. It efficiently extracts facial con… ▽ More Remote photoplethysmography (rPPG) technique extracts blood volume pulse (BVP) signals from subtle pixel changes in video frames. This study introduces rFaceNet, an advanced rPPG method that enhances the extraction of facial BVP signals with a focus on facial contours. rFaceNet integrates identity-specific facial contour information and eliminates redundant data. It efficiently extracts facial contours from temporally normalized frame inputs through a Temporal Compressor Unit (TCU) and steers the model focus to relevant facial regions by using the Cross-Task Feature Combiner (CTFC). Through elaborate training, the quality and interpretability of facial physiological signals extracted by rFaceNet are greatly improved compared to previous methods. Moreover, our novel approach demonstrates superior performance than SOTA methods in various heart rate estimation benchmarks. △ Less

Submitted 17 March, 2024; v1 submitted 13 March, 2024; originally announced March 2024.

Comments: under-review

arXiv:2403.07705 [pdf, other]

Robust Synthetic-to-Real Transfer for Stereo Matching

Authors: Jiawei Zhang, Jiahe Li, Lei Huang, Xiaohan Yu, Lin Gu, ** Zheng, Xiao Bai

Abstract: With advancements in domain generalized stereo matching networks, models pre-trained on synthetic data demonstrate strong robustness to unseen domains. However, few studies have investigated the robustness after fine-tuning them in real-world scenarios, during which the domain generalization ability can be seriously degraded. In this paper, we explore fine-tuning stereo matching networks without c… ▽ More With advancements in domain generalized stereo matching networks, models pre-trained on synthetic data demonstrate strong robustness to unseen domains. However, few studies have investigated the robustness after fine-tuning them in real-world scenarios, during which the domain generalization ability can be seriously degraded. In this paper, we explore fine-tuning stereo matching networks without compromising their robustness to unseen domains. Our motivation stems from comparing Ground Truth (GT) versus Pseudo Label (PL) for fine-tuning: GT degrades, but PL preserves the domain generalization ability. Empirically, we find the difference between GT and PL implies valuable information that can regularize networks during fine-tuning. We also propose a framework to utilize this difference for fine-tuning, consisting of a frozen Teacher, an exponential moving average (EMA) Teacher, and a Student network. The core idea is to utilize the EMA Teacher to measure what the Student has learned and dynamically improve GT and PL for fine-tuning. We integrate our framework with state-of-the-art networks and evaluate its effectiveness on several real-world datasets. Extensive experiments show that our method effectively preserves the domain generalization ability during fine-tuning. △ Less

Submitted 12 March, 2024; originally announced March 2024.

Comments: Accepted at CVPR 2024

arXiv:2403.06912 [pdf, other]

DNGaussian: Optimizing Sparse-View 3D Gaussian Radiance Fields with Global-Local Depth Normalization

Authors: Jiahe Li, Jiawei Zhang, Xiao Bai, ** Zheng, Xin Ning, Jun Zhou, Lin Gu

Abstract: Radiance fields have demonstrated impressive performance in synthesizing novel views from sparse input views, yet prevailing methods suffer from high training costs and slow inference speed. This paper introduces DNGaussian, a depth-regularized framework based on 3D Gaussian radiance fields, offering real-time and high-quality few-shot novel view synthesis at low costs. Our motivation stems from t… ▽ More Radiance fields have demonstrated impressive performance in synthesizing novel views from sparse input views, yet prevailing methods suffer from high training costs and slow inference speed. This paper introduces DNGaussian, a depth-regularized framework based on 3D Gaussian radiance fields, offering real-time and high-quality few-shot novel view synthesis at low costs. Our motivation stems from the highly efficient representation and surprising quality of the recent 3D Gaussian Splatting, despite it will encounter a geometry degradation when input views decrease. In the Gaussian radiance fields, we find this degradation in scene geometry primarily lined to the positioning of Gaussian primitives and can be mitigated by depth constraint. Consequently, we propose a Hard and Soft Depth Regularization to restore accurate scene geometry under coarse monocular depth supervision while maintaining a fine-grained color appearance. To further refine detailed geometry resha**, we introduce Global-Local Depth Normalization, enhancing the focus on small local depth changes. Extensive experiments on LLFF, DTU, and Blender datasets demonstrate that DNGaussian outperforms state-of-the-art methods, achieving comparable or better results with significantly reduced memory cost, a $25 \times$ reduction in training time, and over $3000 \times$ faster rendering speed. △ Less

Submitted 24 March, 2024; v1 submitted 11 March, 2024; originally announced March 2024.

Comments: Accepted at CVPR 2024. Project page: https://fictionarry.github.io/DNGaussian/

arXiv:2403.06766 [pdf, other]

Determination of the number of $ψ(3686)$ events taken at BESIII

Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, O. Afedulidis, X. C. Ai, R. Aliberti, A. Amoroso, Q. An, Y. Bai, O. Bakina, I. Balossino, Y. Ban, H. -R. Bao, V. Batozskaya, K. Begzsuren, N. Berger, M. Berlowski, M. Bertani, D. Bettoni, F. Bianchi, E. Bianco, A. Bortone, I. Boyko, R. A. Briere , et al. (639 additional authors not shown)

Abstract: The number of $ψ(3686)$ events collected by the BESIII detector during the 2021 run period is determined to be $(2259.3\pm 11.1)\times 10^6$ by counting inclusive $ψ(3686)$ hadronic events. The uncertainty is systematic and the statistical uncertainty is negligible. Meanwhile, the numbers of $ψ(3686)$ events collected during the 2009 and 2012 run periods are updated to be… ▽ More The number of $ψ(3686)$ events collected by the BESIII detector during the 2021 run period is determined to be $(2259.3\pm 11.1)\times 10^6$ by counting inclusive $ψ(3686)$ hadronic events. The uncertainty is systematic and the statistical uncertainty is negligible. Meanwhile, the numbers of $ψ(3686)$ events collected during the 2009 and 2012 run periods are updated to be $(107.7\pm0.6)\times 10^6$ and $(345.4\pm 2.6)\times 10^6$, respectively. Both numbers are consistent with the previous measurements within one standard deviation. The total number of $ψ(3686)$ events in the three data samples is $(2712.4\pm14.3)\times10^6$. △ Less

Submitted 28 May, 2024; v1 submitted 11 March, 2024; originally announced March 2024.

arXiv:2403.05883 [pdf, other]

Precision premium transformation -- a high-precision astrometric solution based on the precision premium curve

Authors: Z. J. Zheng, Q. Y. Peng, F. R. Lin, D. Li, Y. Zheng

Abstract: Context. In Gaia era, atmospheric turbulence, which causes stochastic wander of a star image, is a fundamental limitation to the astrometric accuracy of ground-based optical imaging. However, the positional bias caused by turbulence (called turbulence error here) can be effectively reduced by measuring a target relative to another reference (a star or a fast-moving target) which locates in the ran… ▽ More Context. In Gaia era, atmospheric turbulence, which causes stochastic wander of a star image, is a fundamental limitation to the astrometric accuracy of ground-based optical imaging. However, the positional bias caused by turbulence (called turbulence error here) can be effectively reduced by measuring a target relative to another reference (a star or a fast-moving target) which locates in the range of only several tens of arcsec, since they suffer from similar turbulence errors. This phenomenon is called the precision premium and has been effectively applied to the astrometry of solar system. Further investigation for the precision premium shows that, the precision premium works at less than about 100 arcsec for two specific objects and the relative positional precision as a function of their angular seperation can be well fitted by a sigmoidal function, called the precision premium curve (PPC). Aims. We want to reduce the turbulence error of a target if it is imaged in an area of high stellar density of a ground-based observation by taking advantage of more Gaia reference stars. Methods. Based on the PPC, we proposed a high-precision astrometric solution called precision premium transformation (PPT) in this paper, which takes advantage of high similarity of turbulence errors in a small region and the dense Gaia reference stars in the region to reduce the turbulence errors on the observation, through a weighted solution. Results. Through systematic analysis, the PPT method exhibits significant advantages in terms of not only precision but also applicability when a target is imaged in an area of high stellar density. The PPT method is also applied to the determination of the proper motion of an open cluster, and the results demonstrate and quantify benefits that the PPT method bestows on ground-based astrometry. △ Less

Submitted 9 March, 2024; originally announced March 2024.

arXiv:2403.05314 [pdf, other]

Advances of Deep Learning in Protein Science: A Comprehensive Survey

Authors: Bozhen Hu, Cheng Tan, Lirong Wu, Jiangbin Zheng, Jun Xia, Zhangyang Gao, Zicheng Liu, Fandi Wu, Guijun Zhang, Stan Z. Li

Abstract: Protein representation learning plays a crucial role in understanding the structure and function of proteins, which are essential biomolecules involved in various biological processes. In recent years, deep learning has emerged as a powerful tool for protein modeling due to its ability to learn complex patterns and representations from large-scale protein data. This comprehensive survey aims to pr… ▽ More Protein representation learning plays a crucial role in understanding the structure and function of proteins, which are essential biomolecules involved in various biological processes. In recent years, deep learning has emerged as a powerful tool for protein modeling due to its ability to learn complex patterns and representations from large-scale protein data. This comprehensive survey aims to provide an overview of the recent advances in deep learning techniques applied to protein science. The survey begins by introducing the developments of deep learning based protein models and emphasizes the importance of protein representation learning in drug discovery, protein engineering, and function annotation. It then delves into the fundamentals of deep learning, including convolutional neural networks, recurrent neural networks, attention models, and graph neural networks in modeling protein sequences, structures, and functions, and explores how these techniques can be used to extract meaningful features and capture intricate relationships within protein data. Next, the survey presents various applications of deep learning in the field of proteins, including protein structure prediction, protein-protein interaction prediction, protein function prediction, etc. Furthermore, it highlights the challenges and limitations of these deep learning techniques and also discusses potential solutions and future directions for overcoming these challenges. This comprehensive survey provides a valuable resource for researchers and practitioners in the field of proteins who are interested in harnessing the power of deep learning techniques. By consolidating the latest advancements and discussing potential avenues for improvement, this review contributes to the ongoing progress in protein research and paves the way for future breakthroughs in the field. △ Less

Submitted 8 March, 2024; originally announced March 2024.

arXiv:2403.05131 [pdf, other]

Sora as an AGI World Model? A Complete Survey on Text-to-Video Generation

Authors: Joseph Cho, Fachrina Dewi Puspitasari, Sheng Zheng, **gyao Zheng, Lik-Hang Lee, Tae-Ho Kim, Choong Seon Hong, Chaoning Zhang

Abstract: The evolution of video generation from text, starting with animating MNIST numbers to simulating the physical world with Sora, has progressed at a breakneck speed over the past seven years. While often seen as a superficial expansion of the predecessor text-to-image generation model, text-to-video generation models are developed upon carefully engineered constituents. Here, we systematically discu… ▽ More The evolution of video generation from text, starting with animating MNIST numbers to simulating the physical world with Sora, has progressed at a breakneck speed over the past seven years. While often seen as a superficial expansion of the predecessor text-to-image generation model, text-to-video generation models are developed upon carefully engineered constituents. Here, we systematically discuss these elements consisting of but not limited to core building blocks (vision, language, and temporal) and supporting features from the perspective of their contributions to achieving a world model. We employ the PRISMA framework to curate 97 impactful research articles from renowned scientific databases primarily studying video synthesis using text conditions. Upon minute exploration of these manuscripts, we observe that text-to-video generation involves more intricate technologies beyond the plain extension of text-to-image generation. Our additional review into the shortcomings of Sora-generated videos pinpoints the call for more in-depth studies in various enabling aspects of video generation such as dataset, evaluation metric, efficient architecture, and human-controlled generation. Finally, we conclude that the study of the text-to-video generation may still be in its infancy, requiring contribution from the cross-discipline research community towards its advancement as the first step to realize artificial general intelligence (AGI). △ Less

Submitted 7 June, 2024; v1 submitted 8 March, 2024; originally announced March 2024.

Comments: First complete survey on Text-to-Video Generation, 44 pages, 20 figures

arXiv:2403.05053 [pdf, other]

PrimeComposer: Faster Progressively Combined Diffusion for Image Composition with Attention Steering

Authors: Yibin Wang, Weizhong Zhang, Jianwei Zheng, Cheng **

Abstract: Image composition involves seamlessly integrating given objects into a specific visual context. The current training-free methods rely on composing attention weights from several samplers to guide the generator. However, since these weights are derived from disparate contexts, their combination leads to coherence confusion in synthesis and loss of appearance information. These issues worsen with t… ▽ More Image composition involves seamlessly integrating given objects into a specific visual context. The current training-free methods rely on composing attention weights from several samplers to guide the generator. However, since these weights are derived from disparate contexts, their combination leads to coherence confusion in synthesis and loss of appearance information. These issues worsen with their excessive focus on background generation, even when unnecessary in this task. This not only slows down inference but also compromises foreground generation quality. Moreover, these methods introduce unwanted artifacts in the transition area. In this paper, we formulate image composition as a subject-based local editing task, solely focusing on foreground generation. At each step, the edited foreground is combined with the noisy background to maintain scene consistency. To address the remaining issues, we propose PrimeComposer, a faster training-free diffuser that composites the images by well-designed attention steering across different noise levels. This steering is predominantly achieved by our Correlation Diffuser, utilizing its self-attention layers at each step. Within these layers, the synthesized subject interacts with both the referenced object and background, capturing intricate details and coherent relationships. This prior information is encoded into the attention weights, which are then integrated into the self-attention layers of the generator to guide the synthesis process. Besides, we introduce a Region-constrained Cross-Attention to confine the impact of specific subject-related words to desired regions, addressing the unwanted artifacts shown in the prior method thereby further improving the coherence in the transition area. Our method exhibits the fastest inference efficiency and extensive experiments demonstrate our superiority both qualitatively and quantitatively. △ Less

Submitted 7 March, 2024; originally announced March 2024.

Showing 101–150 of 1,629 results for author: Zheng, J