-
GLID: Pre-training a Generalist Encoder-Decoder Vision Model
Authors:
Jihao Liu,
**liang Zheng,
Yu Liu,
Hongsheng Li
Abstract:
This paper proposes a GeneraLIst encoder-Decoder (GLID) pre-training method for better handling various downstream computer vision tasks. While self-supervised pre-training approaches, e.g., Masked Autoencoder, have shown success in transfer learning, task-specific sub-architectures are still required to be appended for different downstream tasks, which cannot enjoy the benefits of large-scale pre…
▽ More
This paper proposes a GeneraLIst encoder-Decoder (GLID) pre-training method for better handling various downstream computer vision tasks. While self-supervised pre-training approaches, e.g., Masked Autoencoder, have shown success in transfer learning, task-specific sub-architectures are still required to be appended for different downstream tasks, which cannot enjoy the benefits of large-scale pre-training. GLID overcomes this challenge by allowing the pre-trained generalist encoder-decoder to be fine-tuned on various vision tasks with minimal task-specific architecture modifications. In the GLID training scheme, pre-training pretext task and other downstream tasks are modeled as "query-to-answer" problems, including the pre-training pretext task and other downstream tasks. We pre-train a task-agnostic encoder-decoder with query-mask pairs. During fine-tuning, GLID maintains the pre-trained encoder-decoder and queries, only replacing the topmost linear transformation layer with task-specific linear heads. This minimizes the pretrain-finetune architecture inconsistency and enables the pre-trained model to better adapt to downstream tasks. GLID achieves competitive performance on various vision tasks, including object detection, image segmentation, pose estimation, and depth estimation, outperforming or matching specialist models such as Mask2Former, DETR, ViTPose, and BinsFormer.
△ Less
Submitted 11 April, 2024;
originally announced April 2024.
-
Measurement of $e^{+}e^{-}\to ωη^{\prime}$ cross sections at $\sqrt{s}=$ 2.000 to 3.080 GeV
Authors:
BESIII Collaboration,
M. Ablikim,
M. N. Achasov,
P. Adlarson,
X. C. Ai,
R. Aliberti,
A. Amoroso,
M. R. An,
Q. An,
Y. Bai,
O. Bakina,
I. Balossino,
Y. Ban,
V. Batozskaya,
K. Begzsuren,
N. Berger,
M. Berlowski,
M. Bertani,
D. Bettoni,
F. Bianchi,
E. Bianco,
A. Bortone,
I. Boyko,
R. A. Briere,
A. Brueggemann
, et al. (599 additional authors not shown)
Abstract:
The Born cross sections for the process $e^{+}e^{-}\to ωη^{\prime}$ are measured at 22 center-of-mass energies from 2.000 to 3.080 GeV using data collected with the BESIII detector at the BEPCII collider. A resonant structure is observed with a statistical significance of 9.6$σ$. A Breit-Wigner fit determines its mass to be $M_R=(2153\pm30\pm31)~{\rm{MeV}}/c^{2}$ and its width to be…
▽ More
The Born cross sections for the process $e^{+}e^{-}\to ωη^{\prime}$ are measured at 22 center-of-mass energies from 2.000 to 3.080 GeV using data collected with the BESIII detector at the BEPCII collider. A resonant structure is observed with a statistical significance of 9.6$σ$. A Breit-Wigner fit determines its mass to be $M_R=(2153\pm30\pm31)~{\rm{MeV}}/c^{2}$ and its width to be $Γ_{R}=(167\pm77\pm7)~\rm{MeV}$, where the first uncertainties are statistical and the second are systematic.
△ Less
Submitted 10 April, 2024;
originally announced April 2024.
-
Tianyu: search for the second solar system and explore the dynamic universe
Authors:
Fabo Feng,
Yicheng Rui,
Zhimao Du,
Qing Lin,
Congcong Zhang,
Dan Zhou,
Kaiming Cui,
Masahiro Ogihara,
Ming Yang,
Jie Lin,
Yongzhi Cai,
Taozhi Yang,
Xiaoying Pang,
Mingjie Jian,
Wenxiong Li,
Hengxiao Guo,
Xian Shi,
Jianchun Shi,
Jianyang Li,
Kangrou Guo,
Song Yao,
Aming Chen,
Peng Jia,
Xianyu Tan,
James S. Jenkins
, et al. (10 additional authors not shown)
Abstract:
Giant planets like Jupiter and Saturn, play important roles in the formation and habitability of Earth-like planets. The detection of solar system analogs that have multiple cold giant planets is essential for our understanding of planet habitability and planet formation. Although transit surveys such as Kepler and TESS have discovered thousands of exoplanets, these missions are not sensitive to l…
▽ More
Giant planets like Jupiter and Saturn, play important roles in the formation and habitability of Earth-like planets. The detection of solar system analogs that have multiple cold giant planets is essential for our understanding of planet habitability and planet formation. Although transit surveys such as Kepler and TESS have discovered thousands of exoplanets, these missions are not sensitive to long period planets due to their limited observation baseline. The Tianyu project, comprising two 1-meter telescopes (Tianyu-I and II), is designed to detect transiting cold giant planets in order to find solar system analogs. Featuring a large field of view and equipped with a high-speed CMOS camera, Tianyu-I will perform a high-precision photometric survey of about 100 million stars, measuring light curves at hour-long cadence. The candidates found by Tianyu-I will be confirmed by Tianyu-II and other surveys and follow-up facilities through multi-band photometry, spectroscopy, and high resolution imaging. Tianyu telescopes will be situated at an elevation about 4000 meters in Lenghu, China. With a photometric precision of 1% for stars with V < 18 mag, Tianyu is expected to find more than 300 transiting exoplanets, including about 12 cold giant planets, over five years. A five-year survey of Tianyu would discover 1-2 solar system analogs. Moreover, Tianyu is also designed for non-exoplanetary exploration, incorporating multiple survey modes covering timescales from sub-seconds to months, with a particular emphasis on events occurring within the sub-second to hour range. It excels in observing areas such as infant supernovae, rare variable stars and binaries, tidal disruption events, Be stars, cometary activities, and interstellar objects. These discoveries not only enhance our comprehension of the universe but also offer compelling opportunities for public engagement in scientific exploration.
△ Less
Submitted 10 April, 2024; v1 submitted 10 April, 2024;
originally announced April 2024.
-
Not All Contexts Are Equal: Teaching LLMs Credibility-aware Generation
Authors:
Ruotong Pan,
Boxi Cao,
Hongyu Lin,
Xianpei Han,
Jia Zheng,
Sirui Wang,
Xunliang Cai,
Le Sun
Abstract:
The rapid development of large language models has led to the widespread adoption of Retrieval-Augmented Generation (RAG), which integrates external knowledge to alleviate knowledge bottlenecks and mitigate hallucinations. However, the existing RAG paradigm inevitably suffers from the impact of flawed information introduced during the retrieval phrase, thereby diminishing the reliability and corre…
▽ More
The rapid development of large language models has led to the widespread adoption of Retrieval-Augmented Generation (RAG), which integrates external knowledge to alleviate knowledge bottlenecks and mitigate hallucinations. However, the existing RAG paradigm inevitably suffers from the impact of flawed information introduced during the retrieval phrase, thereby diminishing the reliability and correctness of the generated outcomes. In this paper, we propose Credibility-aware Generation (CAG), a universally applicable framework designed to mitigate the impact of flawed information in RAG. At its core, CAG aims to equip models with the ability to discern and process information based on its credibility. To this end, we propose an innovative data transformation framework that generates data based on credibility, thereby effectively endowing models with the capability of CAG. Furthermore, to accurately evaluate the models' capabilities of CAG, we construct a comprehensive benchmark covering three critical real-world scenarios. Experimental results demonstrate that our model can effectively understand and utilize credibility for generation, significantly outperform other models with retrieval augmentation, and exhibit resilience against the disruption caused by noisy documents, thereby maintaining robust performance. Moreover, our model supports customized credibility, offering a wide range of potential applications.
△ Less
Submitted 8 May, 2024; v1 submitted 10 April, 2024;
originally announced April 2024.
-
Measurement of the Born cross section for $e^{+}e^{-}\to ηh_c $ at center-of-mass energies between 4.1 and 4.6\,GeV
Authors:
BESIII Collaboration,
M. Ablikim,
M. N. Achasov,
P. Adlarson,
O. Afedulidis,
X. C. Ai,
R. Aliberti,
A. Amoroso,
Q. An,
Y. Bai,
O. Bakina,
I. Balossino,
Y. Ban,
H. -R. Bao,
V. Batozskaya,
K. Begzsuren,
N. Berger,
M. Berlowski,
M. Bertani,
D. Bettoni,
F. Bianchi,
E. Bianco,
A. Bortone,
I. Boyko,
R. A. Briere
, et al. (634 additional authors not shown)
Abstract:
We measure the Born cross section for the reaction $e^{+}e^{-} \rightarrow ηh_c$ from $\sqrt{s} = 4.129$ to $4.600$~GeV using data sets collected by the BESIII detector running at the BEPCII collider. A resonant structure in the cross section line shape near 4.200~GeV is observed with a statistical significance of 7$σ$. The parameters of this resonance are measured to be \MeasMass\ and \MeasWidth,…
▽ More
We measure the Born cross section for the reaction $e^{+}e^{-} \rightarrow ηh_c$ from $\sqrt{s} = 4.129$ to $4.600$~GeV using data sets collected by the BESIII detector running at the BEPCII collider. A resonant structure in the cross section line shape near 4.200~GeV is observed with a statistical significance of 7$σ$. The parameters of this resonance are measured to be \MeasMass\ and \MeasWidth, where the first uncertainties are statistical and the second systematic.
△ Less
Submitted 10 April, 2024;
originally announced April 2024.
-
Unified Physical-Digital Attack Detection Challenge
Authors:
Haocheng Yuan,
Ajian Liu,
Junze Zheng,
Jun Wan,
Jiankang Deng,
Sergio Escalera,
Hugo Jair Escalante,
Isabelle Guyon,
Zhen Lei
Abstract:
Face Anti-Spoofing (FAS) is crucial to safeguard Face Recognition (FR) Systems. In real-world scenarios, FRs are confronted with both physical and digital attacks. However, existing algorithms often address only one type of attack at a time, which poses significant limitations in real-world scenarios where FR systems face hybrid physical-digital threats. To facilitate the research of Unified Attac…
▽ More
Face Anti-Spoofing (FAS) is crucial to safeguard Face Recognition (FR) Systems. In real-world scenarios, FRs are confronted with both physical and digital attacks. However, existing algorithms often address only one type of attack at a time, which poses significant limitations in real-world scenarios where FR systems face hybrid physical-digital threats. To facilitate the research of Unified Attack Detection (UAD) algorithms, a large-scale UniAttackData dataset has been collected. UniAttackData is the largest public dataset for Unified Attack Detection, with a total of 28,706 videos, where each unique identity encompasses all advanced attack types. Based on this dataset, we organized a Unified Physical-Digital Face Attack Detection Challenge to boost the research in Unified Attack Detections. It attracted 136 teams for the development phase, with 13 qualifying for the final round. The results re-verified by the organizing team were used for the final ranking. This paper comprehensively reviews the challenge, detailing the dataset introduction, protocol definition, evaluation criteria, and a summary of published results. Finally, we focus on the detailed analysis of the highest-performing algorithms and offer potential directions for unified physical-digital attack detection inspired by this competition. Challenge Website: https://sites.google.com/view/face-anti-spoofing-challenge/welcome/challengecvpr2024.
△ Less
Submitted 18 April, 2024; v1 submitted 9 April, 2024;
originally announced April 2024.
-
Search for the Rare Decays $D_s^+\to h^+(h^{0})e^+e^-$
Authors:
BESIII Collaboration,
M. Ablikim,
M. N. Achasov,
P. Adlarson,
O. Afedulidis,
X. C. Ai,
R. Aliberti,
A. Amoroso,
Q. An,
Y. Bai,
O. Bakina,
I. Balossino,
Y. Ban,
H. -R. Bao,
V. Batozskaya,
K. Begzsuren,
N. Berger,
M. Berlowski,
M. Bertani,
D. Bettoni,
F. Bianchi,
E. Bianco,
A. Bortone,
I. Boyko,
R. A. Briere
, et al. (618 additional authors not shown)
Abstract:
Using 7.33~fb$^{-1}$ of $e^{+}e^{-}$ collision data collected by the BESIII detector at center-of-mass energies in the range of $\sqrt{s}=4.128 - 4.226$~GeV, we search for the rare decays $D_{s}^+\to h^+(h^{0})e^{+}e^{-}$, where $h$ represents a kaon or pion. By requiring the $e^{+}e^{-}$ invariant mass to be consistent with a $φ(1020)$, $0.98<M(e^{+}e^{-})<1.04$ ~GeV/$c^2$, the decay…
▽ More
Using 7.33~fb$^{-1}$ of $e^{+}e^{-}$ collision data collected by the BESIII detector at center-of-mass energies in the range of $\sqrt{s}=4.128 - 4.226$~GeV, we search for the rare decays $D_{s}^+\to h^+(h^{0})e^{+}e^{-}$, where $h$ represents a kaon or pion. By requiring the $e^{+}e^{-}$ invariant mass to be consistent with a $φ(1020)$, $0.98<M(e^{+}e^{-})<1.04$ ~GeV/$c^2$, the decay $D_s^+\toπ^+φ,φ\to e^{+}e^{-}$ is observed with a statistical significance of 7.8$σ$, and evidence for the decay $D_s^+\toρ^+φ,φ\to e^{+}e^{-}$ is found for the first time with a statistical significance of 4.4$σ$. The decay branching fractions are measured to be $\mathcal{B}(D_s^+\toπ^+φ, φ\to e^{+}e^{-} )=(1.17^{+0.23}_{-0.21}\pm0.03)\times 10^{-5}$, and $\mathcal{B}(D_s^+\toρ^+φ, φ\to e^{+}e^{-} )=(2.44^{+0.67}_{-0.62}\pm 0.16)\times 10^{-5}$, where the first uncertainties are statistical and the second systematic. No significant signal for the three four-body decays of $D_{s}^{+}\to π^{+}π^{0}e^{+}e^{-},\ D_{s}^{+}\to K^{+}π^{0}e^{+}e^{-}$, and $D_{s}^{+}\to K_{S}^{0}π^{+}e^{+}e^{-}$ is observed. For $D_{s}^{+}\to π^{+}π^{0}e^{+}e^{-}$, the $φ$ mass region is vetoed to minimize the long-distance effects. The 90$\%$ confidence level upper limits set on the branching fractions of these decays are in the range of $(7.0-8.1)\times 10^{-5}$.
△ Less
Submitted 8 April, 2024;
originally announced April 2024.
-
VMambaMorph: a Multi-Modality Deformable Image Registration Framework based on Visual State Space Model with Cross-Scan Module
Authors:
Ziyang Wang,
Jian-Qing Zheng,
Chao Ma,
Tao Guo
Abstract:
Image registration, a critical process in medical imaging, involves aligning different sets of medical imaging data into a single unified coordinate system. Deep learning networks, such as the Convolutional Neural Network (CNN)-based VoxelMorph, Vision Transformer (ViT)-based TransMorph, and State Space Model (SSM)-based MambaMorph, have demonstrated effective performance in this domain. The recen…
▽ More
Image registration, a critical process in medical imaging, involves aligning different sets of medical imaging data into a single unified coordinate system. Deep learning networks, such as the Convolutional Neural Network (CNN)-based VoxelMorph, Vision Transformer (ViT)-based TransMorph, and State Space Model (SSM)-based MambaMorph, have demonstrated effective performance in this domain. The recent Visual State Space Model (VMamba), which incorporates a cross-scan module with SSM, has exhibited promising improvements in modeling global-range dependencies with efficient computational cost in computer vision tasks. This paper hereby introduces an exploration of VMamba with image registration, named VMambaMorph. This novel hybrid VMamba-CNN network is designed specifically for 3D image registration. Utilizing a U-shaped network architecture, VMambaMorph computes the deformation field based on target and source volumes. The VMamba-based block with 2D cross-scan module is redesigned for 3D volumetric feature processing. To overcome the complex motion and structure on multi-modality images, we further propose a fine-tune recursive registration framework. We validate VMambaMorph using a public benchmark brain MR-CT registration dataset, comparing its performance against current state-of-the-art methods. The results indicate that VMambaMorph achieves competitive registration quality. The code for VMambaMorph with all baseline methods is available on GitHub.
△ Less
Submitted 14 April, 2024; v1 submitted 7 April, 2024;
originally announced April 2024.
-
Bootstrap** Chest CT Image Understanding by Distilling Knowledge from X-ray Expert Models
Authors:
Weiwei Cao,
Jianpeng Zhang,
Yingda Xia,
Tony C. W. Mok,
Zi Li,
Xianghua Ye,
Le Lu,
Jian Zheng,
Yuxing Tang,
Ling Zhang
Abstract:
Radiologists highly desire fully automated versatile AI for medical imaging interpretation. However, the lack of extensively annotated large-scale multi-disease datasets has hindered the achievement of this goal. In this paper, we explore the feasibility of leveraging language as a naturally high-quality supervision for chest CT imaging. In light of the limited availability of image-report pairs,…
▽ More
Radiologists highly desire fully automated versatile AI for medical imaging interpretation. However, the lack of extensively annotated large-scale multi-disease datasets has hindered the achievement of this goal. In this paper, we explore the feasibility of leveraging language as a naturally high-quality supervision for chest CT imaging. In light of the limited availability of image-report pairs, we bootstrap the understanding of 3D chest CT images by distilling chest-related diagnostic knowledge from an extensively pre-trained 2D X-ray expert model. Specifically, we propose a language-guided retrieval method to match each 3D CT image with its semantically closest 2D X-ray image, and perform pair-wise and semantic relation knowledge distillation. Subsequently, we use contrastive learning to align images and reports within the same patient while distinguishing them from the other patients. However, the challenge arises when patients have similar semantic diagnoses, such as healthy patients, potentially confusing if treated as negatives. We introduce a robust contrastive learning that identifies and corrects these false negatives. We train our model with over 12,000 pairs of chest CT images and radiology reports. Extensive experiments across multiple scenarios, including zero-shot learning, report generation, and fine-tuning processes, demonstrate the model's feasibility in interpreting chest CT images.
△ Less
Submitted 7 April, 2024;
originally announced April 2024.
-
Search for $η_c(2S)\to 2(π^+π^-)$ and improved measurement of $χ_{cJ}\to 2(π^+π^-)$
Authors:
BESIII Collaboration,
M. Ablikim,
M. N. Achasov,
P. Adlarson,
O. Afedulidis,
X. C. Ai,
R. Aliberti,
A. Amoroso,
Q. An,
Y. Bai,
O. Bakina,
I. Balossino,
Y. Ban,
H. -R. Bao,
V. Batozskaya,
K. Begzsuren,
N. Berger,
M. Berlowski,
M. Bertani,
D. Bettoni,
F. Bianchi,
E. Bianco,
A. Bortone,
I. Boyko,
R. A. Briere
, et al. (638 additional authors not shown)
Abstract:
We search for the hadronic decay $η_c(2S)\to 2(π^+π^-)$ in the $ψ(3686)\toγη_c(2S)$ radiative decay using $(27.12\pm 0.14)\times 10^8$ $ψ(3686)$ events collected by the BESIII detector at the BEPCII collider. No significant signal is found, and the upper limit of $\mathcal{B}[ψ(3686)\toγη_c(2S)]\mathcal{B}[η_c(2S)\to 2(π^+π^-)]$ is determined to be $0.78\times 10^{-6}$ at the 90\% confidence level…
▽ More
We search for the hadronic decay $η_c(2S)\to 2(π^+π^-)$ in the $ψ(3686)\toγη_c(2S)$ radiative decay using $(27.12\pm 0.14)\times 10^8$ $ψ(3686)$ events collected by the BESIII detector at the BEPCII collider. No significant signal is found, and the upper limit of $\mathcal{B}[ψ(3686)\toγη_c(2S)]\mathcal{B}[η_c(2S)\to 2(π^+π^-)]$ is determined to be $0.78\times 10^{-6}$ at the 90\% confidence level. Using $ψ(3686)\toγχ_{cJ}$ transitions, we also measure the branching fractions of $\mathcal{B}[χ_{cJ(J=0,1,2)}\to 2(π^+π^-)]$, which are $\mathcal{B}[χ_{c0}\to 2(π^+π^-)]=(2.127\pm 0.002~(\mathrm{stat.})\pm 0.101~(\mathrm{syst.}))$\%, $\mathcal{B}[χ_{c1}\to 2(π^+π^-)]=(0.685\pm 0.001~(\mathrm{stat.})\pm 0.031~\mathrm{syst.}))$\%, and $\mathcal{B}[χ_{c2}\to 2(π^+π^-)]=(1.153\pm 0.001~(\mathrm{stat.})\pm 0.063~(\mathrm{syst.}))$\%.
△ Less
Submitted 7 April, 2024;
originally announced April 2024.
-
3D Building Reconstruction from Monocular Remote Sensing Images with Multi-level Supervisions
Authors:
Weijia Li,
Haote Yang,
Zhenghao Hu,
Juepeng Zheng,
Gui-Song Xia,
Conghui He
Abstract:
3D building reconstruction from monocular remote sensing images is an important and challenging research problem that has received increasing attention in recent years, owing to its low cost of data acquisition and availability for large-scale applications. However, existing methods rely on expensive 3D-annotated samples for fully-supervised training, restricting their application to large-scale c…
▽ More
3D building reconstruction from monocular remote sensing images is an important and challenging research problem that has received increasing attention in recent years, owing to its low cost of data acquisition and availability for large-scale applications. However, existing methods rely on expensive 3D-annotated samples for fully-supervised training, restricting their application to large-scale cross-city scenarios. In this work, we propose MLS-BRN, a multi-level supervised building reconstruction network that can flexibly utilize training samples with different annotation levels to achieve better reconstruction results in an end-to-end manner. To alleviate the demand on full 3D supervision, we design two new modules, Pseudo Building Bbox Calculator and Roof-Offset guided Footprint Extractor, as well as new tasks and training strategies for different types of samples. Experimental results on several public and new datasets demonstrate that our proposed MLS-BRN achieves competitive performance using much fewer 3D-annotated samples, and significantly improves the footprint extraction and 3D reconstruction performance compared with current state-of-the-art. The code and datasets of this work will be released at https://github.com/opendatalab/MLS-BRN.git.
△ Less
Submitted 7 April, 2024;
originally announced April 2024.
-
LHAASO-KM2A detector simulation using Geant4
Authors:
Zhen Cao,
F. Aharonian,
Q. An,
Axikegu,
Y. X. Bai,
Y. W. Bao,
D. Bastieri,
X. J. Bi,
Y. J. Bi,
J. T. Cai,
Q. Cao,
W. Y. Cao,
Zhe Cao,
J. Chang,
J. F. Chang,
A. M. Chen,
E. S. Chen,
Liang Chen,
Lin Chen,
Long Chen,
M. J. Chen,
M. L. Chen,
Q. H. Chen,
S. H. Chen,
S. Z. Chen
, et al. (254 additional authors not shown)
Abstract:
KM2A is one of the main sub-arrays of LHAASO, working on gamma ray astronomy and cosmic ray physics at energies above 10 TeV. Detector simulation is the important foundation for estimating detector performance and data analysis. It is a big challenge to simulate the KM2A detector in the framework of Geant4 due to the need to track numerous photons from a large number of detector units (>6000) with…
▽ More
KM2A is one of the main sub-arrays of LHAASO, working on gamma ray astronomy and cosmic ray physics at energies above 10 TeV. Detector simulation is the important foundation for estimating detector performance and data analysis. It is a big challenge to simulate the KM2A detector in the framework of Geant4 due to the need to track numerous photons from a large number of detector units (>6000) with large altitude difference (30 m) and huge coverage (1.3 km^2). In this paper, the design of the KM2A simulation code G4KM2A based on Geant4 is introduced. The process of G4KM2A is optimized mainly in memory consumption to avoid memory overffow. Some simpliffcations are used to signiffcantly speed up the execution of G4KM2A. The running time is reduced by at least 30 times compared to full detector simulation. The particle distributions and the core/angle resolution comparison between simulation and experimental data of the full KM2A array are also presented, which show good agreement.
△ Less
Submitted 7 April, 2024;
originally announced April 2024.
-
Search for di-photon decays of an axion-like particle in radiative J/ψdecays
Authors:
BESIII Collaboration,
M. Ablikim,
M. N. Achasov,
P. Adlarson,
O. Afedulidis,
X. C. Ai,
R. Aliberti,
A. Amoroso,
M. R. An,
Q. An,
Y. Bai,
O. Bakina,
I. Balossino,
Y. Ban,
H. -R. Bao,
V. Batozskaya,
K. Begzsuren,
N. Berger,
M. Berlowski,
M. Bertani,
D. Bettoni,
F. Bianchi,
E. Bianco,
A. Bortone,
I. Boyko
, et al. (604 additional authors not shown)
Abstract:
We search for the di-photon decay of a light pseudoscalar axion-like particle, $a$, in radiative $J/ψ$ decays, using 10 billion $J/ψ$ events collected with the BESIII detector. We find no evidence of a signal and set upper limits at the $95\%$ confidence level on the product branching fraction $\mathcal{B}(J/ψ\to γa) \times \mathcal{B}(a \to γγ)$ and the axion-like particle photon coupling constan…
▽ More
We search for the di-photon decay of a light pseudoscalar axion-like particle, $a$, in radiative $J/ψ$ decays, using 10 billion $J/ψ$ events collected with the BESIII detector. We find no evidence of a signal and set upper limits at the $95\%$ confidence level on the product branching fraction $\mathcal{B}(J/ψ\to γa) \times \mathcal{B}(a \to γγ)$ and the axion-like particle photon coupling constant $g_{a γγ}$ in the ranges of $(3.7-48.5) \times 10^{-8}$ and $(2.2 -101.8)\times 10^{-4}$ GeV$^{-1}$, respectively, for $0.18 \le m_a \le 2.85$ GeV/$c^2$. These are the most stringent limits to date in this mass region.
△ Less
Submitted 3 July, 2024; v1 submitted 6 April, 2024;
originally announced April 2024.
-
Learning Transferable Negative Prompts for Out-of-Distribution Detection
Authors:
Tianqi Li,
Guansong Pang,
Xiao Bai,
Wenjun Miao,
** Zheng
Abstract:
Existing prompt learning methods have shown certain capabilities in Out-of-Distribution (OOD) detection, but the lack of OOD images in the target dataset in their training can lead to mismatches between OOD images and In-Distribution (ID) categories, resulting in a high false positive rate. To address this issue, we introduce a novel OOD detection method, named 'NegPrompt', to learn a set of negat…
▽ More
Existing prompt learning methods have shown certain capabilities in Out-of-Distribution (OOD) detection, but the lack of OOD images in the target dataset in their training can lead to mismatches between OOD images and In-Distribution (ID) categories, resulting in a high false positive rate. To address this issue, we introduce a novel OOD detection method, named 'NegPrompt', to learn a set of negative prompts, each representing a negative connotation of a given class label, for delineating the boundaries between ID and OOD images. It learns such negative prompts with ID data only, without any reliance on external outlier data. Further, current methods assume the availability of samples of all ID classes, rendering them ineffective in open-vocabulary learning scenarios where the inference stage can contain novel ID classes not present during training. In contrast, our learned negative prompts are transferable to novel class labels. Experiments on various ImageNet benchmarks show that NegPrompt surpasses state-of-the-art prompt-learning-based OOD detection methods and maintains a consistent lead in hard OOD detection in closed- and open-vocabulary classification scenarios. Code is available at https://github.com/mala-lab/negprompt.
△ Less
Submitted 4 April, 2024;
originally announced April 2024.
-
Evidence of the $h_c\to K_S^0 K^+π^-+c.c.$ decay
Authors:
BESIII Collaboration,
M. Ablikim,
M. N. Achasov,
P. Adlarson,
O. Afedulidis,
X. C. Ai,
R. Aliberti,
A. Amoroso,
Q. An,
Y. Bai,
O. Bakina,
I. Balossino,
Y. Ban,
H. -R. Bao,
V. Batozskaya,
K. Begzsuren,
N. Berger,
M. Berlowski,
M. Bertani,
D. Bettoni,
F. Bianchi,
E. Bianco,
A. Bortone,
I. Boyko,
R. A. Briere
, et al. (638 additional authors not shown)
Abstract:
Based on $(2.712\pm0.014)\times10^9$ $ψ(3686)$ events collected by the BESIII collaboration, evidence of the hadronic decay $h_c\to K_S^0K^+π^-+c.c.$ is found with a significance of $4.3σ$ in the $ψ(3686)\toπ^0 h_c$ process. The branching fraction of $h_c\to K_S^0 K^+π^- +c.c.$ is measured to be $(7.3\pm0.8\pm1.8)\times10^{-4}$, where the first and second uncertainties are statistical and systemat…
▽ More
Based on $(2.712\pm0.014)\times10^9$ $ψ(3686)$ events collected by the BESIII collaboration, evidence of the hadronic decay $h_c\to K_S^0K^+π^-+c.c.$ is found with a significance of $4.3σ$ in the $ψ(3686)\toπ^0 h_c$ process. The branching fraction of $h_c\to K_S^0 K^+π^- +c.c.$ is measured to be $(7.3\pm0.8\pm1.8)\times10^{-4}$, where the first and second uncertainties are statistical and systematic, respectively. Combining with the exclusive decay width of $η_c\to K\bar{K}π$, our result indicates inconsistencies with both pQCD and NRQCD predictions.
△ Less
Submitted 4 April, 2024;
originally announced April 2024.
-
Fragmented Moments, Balanced Choices: How Do People Make Use of Their Waiting Time?
Authors:
Jian Zheng,
Ge Gao
Abstract:
Everyone spends some time waiting every day. HCI research has developed tools for boosting productivity while waiting. However, little is known about how people naturally spend their waiting time. We conducted an experience sampling study with 21 working adults who used a mobile app to report their daily waiting time activities over two weeks. The aim of this study is to understand the activities…
▽ More
Everyone spends some time waiting every day. HCI research has developed tools for boosting productivity while waiting. However, little is known about how people naturally spend their waiting time. We conducted an experience sampling study with 21 working adults who used a mobile app to report their daily waiting time activities over two weeks. The aim of this study is to understand the activities people do while waiting and the effect of situational factors. We found that participants spent about 60% of their waiting time on leisure activities, 20% on productive activities, and 20% on maintenance activities. These choices are sensitive to situational factors, including accessible device, location, and certain routines of the day. Our study complements previous ones by demonstrating that people purpose waiting time for various goals beyond productivity and to maintain work-life balance. Our findings shed light on future empirical research and system design for time management.
△ Less
Submitted 3 April, 2024;
originally announced April 2024.
-
Design2Cloth: 3D Cloth Generation from 2D Masks
Authors:
Jiali Zheng,
Rolandos Alexandros Potamias,
Stefanos Zafeiriou
Abstract:
In recent years, there has been a significant shift in the field of digital avatar research, towards modeling, animating and reconstructing clothed human representations, as a key step towards creating realistic avatars. However, current 3D cloth generation methods are garment specific or trained completely on synthetic data, hence lacking fine details and realism. In this work, we make a step tow…
▽ More
In recent years, there has been a significant shift in the field of digital avatar research, towards modeling, animating and reconstructing clothed human representations, as a key step towards creating realistic avatars. However, current 3D cloth generation methods are garment specific or trained completely on synthetic data, hence lacking fine details and realism. In this work, we make a step towards automatic realistic garment design and propose Design2Cloth, a high fidelity 3D generative model trained on a real world dataset from more than 2000 subject scans. To provide vital contribution to the fashion industry, we developed a user-friendly adversarial model capable of generating diverse and detailed clothes simply by drawing a 2D cloth mask. Under a series of both qualitative and quantitative experiments, we showcase that Design2Cloth outperforms current state-of-the-art cloth generative models by a large margin. In addition to the generative properties of our network, we showcase that the proposed method can be used to achieve high quality reconstructions from single in-the-wild images and 3D scans. Dataset, code and pre-trained model will become publicly available.
△ Less
Submitted 3 April, 2024;
originally announced April 2024.
-
Search for $C$-even states decaying to $D_{s}^{\pm}D_{s}^{*\mp}$ with masses between $4.08$ and $4.32$ $\rm GeV/{\it c}^{2}$
Authors:
BESIII Collaboration,
M. Ablikim,
M. N. Achasov,
P. Adlarson,
O. Afedulidis,
X. C. Ai,
R. Aliberti,
A. Amoroso,
Q. An,
Y. Bai,
O. Bakina,
I. Balossino,
Y. Ban,
H. -R. Bao,
V. Batozskaya,
K. Begzsuren,
N. Berger,
M. Berlowski,
M. Bertani,
D. Bettoni,
F. Bianchi,
E. Bianco,
A. Bortone,
I. Boyko,
R. A. Briere
, et al. (638 additional authors not shown)
Abstract:
Six $C$-even states, denoted as $X$, with quantum numbers $J^{PC}=0^{-+}$, $1^{\pm+}$, or $2^{\pm+}$, are searched for via the $e^+e^-\toγD_{s}^{\pm}D_{s}^{*\mp}$ process using $(1667.39\pm8.84)~\mathrm{pb}^{-1}$ of $e^+e^-$ collision data collected with the BESIII detector operating at the BEPCII storage ring at center-of-mass energy of $\sqrt{s}=(4681.92\pm0.30)~\mathrm{MeV}$. No statistically s…
▽ More
Six $C$-even states, denoted as $X$, with quantum numbers $J^{PC}=0^{-+}$, $1^{\pm+}$, or $2^{\pm+}$, are searched for via the $e^+e^-\toγD_{s}^{\pm}D_{s}^{*\mp}$ process using $(1667.39\pm8.84)~\mathrm{pb}^{-1}$ of $e^+e^-$ collision data collected with the BESIII detector operating at the BEPCII storage ring at center-of-mass energy of $\sqrt{s}=(4681.92\pm0.30)~\mathrm{MeV}$. No statistically significant signal is observed in the mass range from $4.08$ to $4.32~\mathrm{GeV}/c^{2}$. The upper limits of $σ[e^+e^-\toγX]\cdot \mathcal{B}[X \to D_{s}^{\pm}D_{s}^{*\mp}]$ at a $90\%$ confidence level are determined.
△ Less
Submitted 2 April, 2024;
originally announced April 2024.
-
Structured Initialization for Attention in Vision Transformers
Authors:
Jianqiao Zheng,
Xueqian Li,
Simon Lucey
Abstract:
The training of vision transformer (ViT) networks on small-scale datasets poses a significant challenge. By contrast, convolutional neural networks (CNNs) have an architectural inductive bias enabling them to perform well on such problems. In this paper, we argue that the architectural bias inherent to CNNs can be reinterpreted as an initialization bias within ViT. This insight is significant as i…
▽ More
The training of vision transformer (ViT) networks on small-scale datasets poses a significant challenge. By contrast, convolutional neural networks (CNNs) have an architectural inductive bias enabling them to perform well on such problems. In this paper, we argue that the architectural bias inherent to CNNs can be reinterpreted as an initialization bias within ViT. This insight is significant as it empowers ViTs to perform equally well on small-scale problems while maintaining their flexibility for large-scale applications. Our inspiration for this ``structured'' initialization stems from our empirical observation that random impulse filters can achieve comparable performance to learned filters within CNNs. Our approach achieves state-of-the-art performance for data-efficient ViT learning across numerous benchmarks including CIFAR-10, CIFAR-100, and SVHN.
△ Less
Submitted 1 April, 2024;
originally announced April 2024.
-
On the Brun--Titchmarsh theorem
Authors:
** Xi,
Junren Zheng
Abstract:
The classical Brun--Titchmarsh theorem gives an upper bound, which is of correct order of magnitude in the full range, for the number of primes $p\leqslant x$ satisfying $p\equiv a\bmod q$. We strength this inequality for $\log q/\log x$ in different ranges, improving upon previous works by Motohashi, Goldfeld, Iwaniec, Friedlander and Iwaniec, and Maynard for general or special moduli. In particu…
▽ More
The classical Brun--Titchmarsh theorem gives an upper bound, which is of correct order of magnitude in the full range, for the number of primes $p\leqslant x$ satisfying $p\equiv a\bmod q$. We strength this inequality for $\log q/\log x$ in different ranges, improving upon previous works by Motohashi, Goldfeld, Iwaniec, Friedlander and Iwaniec, and Maynard for general or special moduli. In particular, we obtain a Burgess-like constant in the full range $q<x^{1/2-}.$ The proof is based on various estimates for character and exponential sums, by appealing to arithmetic exponent pairs and bilinear forms with algebraic trace functions from $\ell$-adic cohomology, and sums of Kloosterman sums from spectral theory of automorphic forms, as well as large value theorem for Dirichlet polynomials due to Huxley.
△ Less
Submitted 7 April, 2024; v1 submitted 1 April, 2024;
originally announced April 2024.
-
Thin-Shell Object Manipulations With Differentiable Physics Simulations
Authors:
Yian Wang,
Juntian Zheng,
Zhehuan Chen,
Zhou Xian,
Gu Zhang,
Chao Liu,
Chuang Gan
Abstract:
In this work, we aim to teach robots to manipulate various thin-shell materials. Prior works studying thin-shell object manipulation mostly rely on heuristic policies or learn policies from real-world video demonstrations, and only focus on limited material types and tasks (e.g., cloth unfolding). However, these approaches face significant challenges when extended to a wider variety of thin-shell…
▽ More
In this work, we aim to teach robots to manipulate various thin-shell materials. Prior works studying thin-shell object manipulation mostly rely on heuristic policies or learn policies from real-world video demonstrations, and only focus on limited material types and tasks (e.g., cloth unfolding). However, these approaches face significant challenges when extended to a wider variety of thin-shell materials and a diverse range of tasks. While virtual simulations are shown to be effective in diverse robot skill learning and evaluation, prior thin-shell simulation environments only support a subset of thin-shell materials, which also limits their supported range of tasks. We introduce ThinShellLab - a fully differentiable simulation platform tailored for robotic interactions with diverse thin-shell materials possessing varying material properties, enabling flexible thin-shell manipulation skill learning and evaluation. Our experiments suggest that manipulating thin-shell objects presents several unique challenges: 1) thin-shell manipulation relies heavily on frictional forces due to the objects' co-dimensional nature, 2) the materials being manipulated are highly sensitive to minimal variations in interaction actions, and 3) the constant and frequent alteration in contact pairs makes trajectory optimization methods susceptible to local optima, and neither standard reinforcement learning algorithms nor trajectory optimization methods (either gradient-based or gradient-free) are able to solve the tasks alone. To overcome these challenges, we present an optimization scheme that couples sampling-based trajectory optimization and gradient-based optimization, boosting both learning efficiency and converged performance across various proposed tasks. In addition, the differentiable nature of our platform facilitates a smooth sim-to-real transition.
△ Less
Submitted 30 March, 2024;
originally announced April 2024.
-
Superconductivity from On-Chip Metallization on 2D Topological Chalcogenides
Authors:
Yanyu Jia,
Guo Yu,
Tiancheng Song,
Fang Yuan,
Ayelet J Uzan,
Yue Tang,
Pengjie Wang,
Ratnadwip Singha,
Michael Onyszczak,
Zhaoyi Joy Zheng,
Kenji Watanabe,
Takashi Taniguchi,
Leslie M Schoop,
Sanfeng Wu
Abstract:
Two-dimensional (2D) transition metal dichalcogenides (TMDs) is a versatile class of quantum materials of interest to various fields including, e.g., nanoelectronics, optical devices, and topological and correlated quantum matter. Tailoring the electronic properties of TMDs is essential to their applications in many directions. Here, we report that a highly controllable and uniform on-chip 2D meta…
▽ More
Two-dimensional (2D) transition metal dichalcogenides (TMDs) is a versatile class of quantum materials of interest to various fields including, e.g., nanoelectronics, optical devices, and topological and correlated quantum matter. Tailoring the electronic properties of TMDs is essential to their applications in many directions. Here, we report that a highly controllable and uniform on-chip 2D metallization process converts a class of atomically thin TMDs into robust superconductors, a property belonging to none of the starting materials. As examples, we demonstrate the introduction of superconductivity into a class of 2D air-sensitive topological TMDs, including monolayers of Td-WTe2, 1T'-MoTe2 and 2H-MoTe2, as well as their natural and twisted bilayers, metalized with an ultrathin layer of Palladium. This class of TMDs are known to exhibit intriguing topological phases ranging from topological insulator, Weyl semimetal to fractional Chern insulator. The unique, high-quality two-dimensional metallization process is based on our recent findings of the long-distance, non-Fickian in-plane mass transport and chemistry in 2D that occur at relatively low temperatures and in devices fully encapsulated with inert insulating layers. Highly compatible with existing nanofabrication techniques for van der Waals (vdW) stacks, our results offer a route to designing and engineering superconductivity and topological phases in a class of correlated 2D materials.
△ Less
Submitted 30 April, 2024; v1 submitted 28 March, 2024;
originally announced March 2024.
-
SA-GS: Scale-Adaptive Gaussian Splatting for Training-Free Anti-Aliasing
Authors:
Xiaowei Song,
Jv Zheng,
Shiran Yuan,
Huan-ang Gao,
**gwei Zhao,
Xiang He,
Weihao Gu,
Hao Zhao
Abstract:
In this paper, we present a Scale-adaptive method for Anti-aliasing Gaussian Splatting (SA-GS). While the state-of-the-art method Mip-Splatting needs modifying the training procedure of Gaussian splatting, our method functions at test-time and is training-free. Specifically, SA-GS can be applied to any pretrained Gaussian splatting field as a plugin to significantly improve the field's anti-alisin…
▽ More
In this paper, we present a Scale-adaptive method for Anti-aliasing Gaussian Splatting (SA-GS). While the state-of-the-art method Mip-Splatting needs modifying the training procedure of Gaussian splatting, our method functions at test-time and is training-free. Specifically, SA-GS can be applied to any pretrained Gaussian splatting field as a plugin to significantly improve the field's anti-alising performance. The core technique is to apply 2D scale-adaptive filters to each Gaussian during test time. As pointed out by Mip-Splatting, observing Gaussians at different frequencies leads to mismatches between the Gaussian scales during training and testing. Mip-Splatting resolves this issue using 3D smoothing and 2D Mip filters, which are unfortunately not aware of testing frequency. In this work, we show that a 2D scale-adaptive filter that is informed of testing frequency can effectively match the Gaussian scale, thus making the Gaussian primitive distribution remain consistent across different testing frequencies. When scale inconsistency is eliminated, sampling rates smaller than the scene frequency result in conventional jaggedness, and we propose to integrate the projected 2D Gaussian within each pixel during testing. This integration is actually a limiting case of super-sampling, which significantly improves anti-aliasing performance over vanilla Gaussian Splatting. Through extensive experiments using various settings and both bounded and unbounded scenes, we show SA-GS performs comparably with or better than Mip-Splatting. Note that super-sampling and integration are only effective when our scale-adaptive filtering is activated. Our codes, data and models are available at https://github.com/zsy1987/SA-GS.
△ Less
Submitted 28 March, 2024;
originally announced March 2024.
-
Measurement of absolute branching fractions of $D_s^+$ hadronic decays
Authors:
BESIII Collaboration,
M. Ablikim,
M. N. Achasov,
P. Adlarson,
O. Afedulidis,
X. C. Ai,
R. Aliberti,
A. Amoroso,
Q. An,
Y. Bai,
O. Bakina,
I. Balossino,
Y. Ban,
H. -R. Bao,
V. Batozskaya,
K. Begzsuren,
N. Berger,
M. Berlowski,
M. Bertani,
D. Bettoni,
F. Bianchi,
E. Bianco,
A. Bortone,
I. Boyko,
R. A. Briere
, et al. (632 additional authors not shown)
Abstract:
Using $e^+ e^-$ collision data collected at the BESIII detector at center-of-mass energies between 4.128 and 4.226 GeV, corresponding to an integrated luminosity of $7.33~{\rm fb}^{-1}$, we determine the absolute branching fractions of fifteen hadronic $D_s^{+}$ decays with a double-tag technique. In particular, we make precise measurements of the branching fractions…
▽ More
Using $e^+ e^-$ collision data collected at the BESIII detector at center-of-mass energies between 4.128 and 4.226 GeV, corresponding to an integrated luminosity of $7.33~{\rm fb}^{-1}$, we determine the absolute branching fractions of fifteen hadronic $D_s^{+}$ decays with a double-tag technique. In particular, we make precise measurements of the branching fractions $\mathcal{B}(D_s^+ \to K^+ K^- π^+)=(5.49 \pm 0.04 \pm 0.07)\%$, $\mathcal{B}(D_s^+ \to K_S^0 K^+)=(1.50 \pm 0.01 \pm 0.01)\%$ and $\mathcal{B}(D_s^+ \to K^+ K^- π^+ π^0)=(5.50 \pm 0.05 \pm 0.11)\%$, where the first uncertainties are statistical and the second ones are systematic. The \emph{CP} asymmetries in these decays are also measured and all are found to be compatible with zero.
△ Less
Submitted 30 May, 2024; v1 submitted 28 March, 2024;
originally announced March 2024.
-
Superfluid Oscillator Circuit with Quantum Current Regulator
Authors:
Xue Yang,
Wenkai Bai,
Chen Jiao,
Wu-Ming Liu,
Jun-Hui Zheng,
Tao Yang
Abstract:
We examine the properties of atomic current in a superfluid oscillating circuit consisting of a mesoscopic channel that connects two reservoirs of a Bose-Einstein condensate. We investigate the presence of a critical current in the channel and examine how the amplitude of the oscillations in the number imbalance between the two reservoirs varies with system parameters. In addition to highlighting…
▽ More
We examine the properties of atomic current in a superfluid oscillating circuit consisting of a mesoscopic channel that connects two reservoirs of a Bose-Einstein condensate. We investigate the presence of a critical current in the channel and examine how the amplitude of the oscillations in the number imbalance between the two reservoirs varies with system parameters. In addition to highlighting that the dissipative resistance stems from the formation of vortex pairs, we also illustrate the role of these vortex pairs as a quantum current regulator. The dissipation strength is discrete based on the number imbalance, which corresponds to the emergence of vortex pairs in the system. Our findings indicate that the circuit demonstrates characteristics of both voltage-limiting and current-limiting mechanisms. To model the dam** behavior of the atomic superfluid circuit, we develop an equivalent LC oscillator circuit with a quantum current regulator.
△ Less
Submitted 28 March, 2024;
originally announced March 2024.
-
Observation of the semileptonic decays $D^0\rightarrow K_S^0π^-π^0 e^+ ν_e$ and $D^+\rightarrow K_S^0π^+π^- e^+ ν_e$
Authors:
BESIII Collaboration,
M. Ablikim,
M. N. Achasov,
P. Adlarson,
X. C. Ai,
R. Aliberti,
A. Amoroso,
M. R. An,
Q. An,
Y. Bai,
O. Bakina,
I. Balossino,
Y. Ban,
V. Batozskaya,
K. Begzsuren,
N. Berger,
M. Berlowski,
M. Bertani,
D. Bettoni,
F. Bianchi,
E. Bianco,
A. Bortone,
I. Boyko,
R. A. Briere,
A. Brueggemann
, et al. (600 additional authors not shown)
Abstract:
By analyzing $e^+e^-$ annihilation data corresponding to an integrated luminosity of 2.93 $\rm fb^{-1}$ collected at a center-of-mass energy of 3.773 GeV with the \text{BESIII} detector, the first observation of the semileptonic decays $D^0\rightarrow K_S^0π^-π^0 e^+ ν_e$ and $D^+\rightarrow K_S^0π^+π^- e^+ ν_e$ is reported. With a dominant hadronic contribution from $K_1(1270)$, the branching fra…
▽ More
By analyzing $e^+e^-$ annihilation data corresponding to an integrated luminosity of 2.93 $\rm fb^{-1}$ collected at a center-of-mass energy of 3.773 GeV with the \text{BESIII} detector, the first observation of the semileptonic decays $D^0\rightarrow K_S^0π^-π^0 e^+ ν_e$ and $D^+\rightarrow K_S^0π^+π^- e^+ ν_e$ is reported. With a dominant hadronic contribution from $K_1(1270)$, the branching fractions are measured to be $\mathcal{B}(D^0\rightarrow {K}_1(1270)^-(\to K^0_Sπ^-π^0)e^+ν_e)=(1.69^{+0.53}_{-0.46}\pm0.15)\times10^{-4}$ and $\mathcal{B}(D^+\to \bar{K}_1(1270)^0(\to K^0_Sπ^+π^-)e^+ν_e)=(1.47^{+0.45}_{-0.40}\pm0.20)\times10^{-4}$ with statistical significance of 5.4$σ$ and 5.6$σ$, respectively. When combined with measurements of the $K_1(1270)\to K^+π^-π$ decays, the absolute branching fractions are determined to be $\mathcal{B}(D^0\to K_1(1270)^-e^+ν_e)=(1.05^{+0.33}_{-0.28}\pm0.12\pm0.12)\times10^{-3}$ and $\mathcal{B}(D^+\to \bar{K}_1(1270)^0e^+ν_e)=(1.29^{+0.40}_{-0.35}\pm0.18\pm0.15)\times10^{-3}$. The first and second uncertainties are statistical and systematic, respectively, and the third uncertainties originate from the assumed branching fractions of the $K_1(1270)\to Kππ$ decays.
△ Less
Submitted 27 March, 2024;
originally announced March 2024.
-
Electromagnetic chirality-induced negative refraction with the same amplitude and anti-phase of the two chirality coefficients
Authors:
Shun-Cai Zhao,
Zheng-Dong Liu,
Jun Zheng,
Gen Li
Abstract:
We suggest a scheme of electromagnetic chirality-induced negative refraction utilizing magneto-electric cross coupling in a four-level atomic system. The negative refraction can be achieved with the two chirality coefficients having the same amplitude but the opposite phase,and without requiring the simultaneous presence of an electric-dipole and a magnetic-dipole transition near the same transiti…
▽ More
We suggest a scheme of electromagnetic chirality-induced negative refraction utilizing magneto-electric cross coupling in a four-level atomic system. The negative refraction can be achieved with the two chirality coefficients having the same amplitude but the opposite phase,and without requiring the simultaneous presence of an electric-dipole and a magnetic-dipole transition near the same transition frequency. The simultaneously negative electric permittivity and magnetic permeability does not require, either.
△ Less
Submitted 27 March, 2024;
originally announced March 2024.
-
Building Bridges across Spatial and Temporal Resolutions: Reference-Based Super-Resolution via Change Priors and Conditional Diffusion Model
Authors:
Runmin Dong,
Shuai Yuan,
Bin Luo,
Mengxuan Chen,
**xiao Zhang,
Lixian Zhang,
Weijia Li,
Juepeng Zheng,
Haohuan Fu
Abstract:
Reference-based super-resolution (RefSR) has the potential to build bridges across spatial and temporal resolutions of remote sensing images. However, existing RefSR methods are limited by the faithfulness of content reconstruction and the effectiveness of texture transfer in large scaling factors. Conditional diffusion models have opened up new opportunities for generating realistic high-resoluti…
▽ More
Reference-based super-resolution (RefSR) has the potential to build bridges across spatial and temporal resolutions of remote sensing images. However, existing RefSR methods are limited by the faithfulness of content reconstruction and the effectiveness of texture transfer in large scaling factors. Conditional diffusion models have opened up new opportunities for generating realistic high-resolution images, but effectively utilizing reference images within these models remains an area for further exploration. Furthermore, content fidelity is difficult to guarantee in areas without relevant reference information. To solve these issues, we propose a change-aware diffusion model named Ref-Diff for RefSR, using the land cover change priors to guide the denoising process explicitly. Specifically, we inject the priors into the denoising model to improve the utilization of reference information in unchanged areas and regulate the reconstruction of semantically relevant content in changed areas. With this powerful guidance, we decouple the semantics-guided denoising and reference texture-guided denoising processes to improve the model performance. Extensive experiments demonstrate the superior effectiveness and robustness of the proposed method compared with state-of-the-art RefSR methods in both quantitative and qualitative evaluations. The code and data are available at https://github.com/dongrunmin/RefDiff.
△ Less
Submitted 26 March, 2024;
originally announced March 2024.
-
Physical 3D Adversarial Attacks against Monocular Depth Estimation in Autonomous Driving
Authors:
Junhao Zheng,
Chenhao Lin,
Jiahao Sun,
Zhengyu Zhao,
Qian Li,
Chao Shen
Abstract:
Deep learning-based monocular depth estimation (MDE), extensively applied in autonomous driving, is known to be vulnerable to adversarial attacks. Previous physical attacks against MDE models rely on 2D adversarial patches, so they only affect a small, localized region in the MDE map but fail under various viewpoints. To address these limitations, we propose 3D Depth Fool (3D$^2$Fool), the first 3…
▽ More
Deep learning-based monocular depth estimation (MDE), extensively applied in autonomous driving, is known to be vulnerable to adversarial attacks. Previous physical attacks against MDE models rely on 2D adversarial patches, so they only affect a small, localized region in the MDE map but fail under various viewpoints. To address these limitations, we propose 3D Depth Fool (3D$^2$Fool), the first 3D texture-based adversarial attack against MDE models. 3D$^2$Fool is specifically optimized to generate 3D adversarial textures agnostic to model types of vehicles and to have improved robustness in bad weather conditions, such as rain and fog. Experimental results validate the superior performance of our 3D$^2$Fool across various scenarios, including vehicles, MDE models, weather conditions, and viewpoints. Real-world experiments with printed 3D textures on physical vehicle models further demonstrate that our 3D$^2$Fool can cause an MDE error of over 10 meters.
△ Less
Submitted 27 March, 2024; v1 submitted 25 March, 2024;
originally announced March 2024.
-
Cross section measurement of $e^+e^-\to ηψ(2S)$ and search for $e^+e^-\toη\tilde{X}(3872)$
Authors:
BESIII Collaboration,
M. Ablikim,
M. N. Achasov,
P. Adlarson,
O. Afedulidis,
X. C. Ai,
R. Aliberti,
A. Amoroso,
Q. An,
Y. Bai,
O. Bakina,
I. Balossino,
Y. Ban,
H. -R. Bao,
V. Batozskaya,
K. Begzsuren,
N. Berger,
M. Berlowski,
M. Bertani,
D. Bettoni,
F. Bianchi,
E. Bianco,
A. Bortone,
I. Boyko,
R. A. Briere
, et al. (639 additional authors not shown)
Abstract:
The energy-dependent cross section for $e^+e^-\to ηψ(2S)$ is measured at eighteen center of mass energies from 4.288 GeV to 4.951 GeV using the BESIII detector. Using the same data samples, we also perform the first search for the reaction $e^+e^-\toη\tilde{X}(3872)$, but no evidence is found for the $\tilde{X}(3872)$ in the $π^+π^- J/ψ$ mass distribution. At each of the eighteen center of mass en…
▽ More
The energy-dependent cross section for $e^+e^-\to ηψ(2S)$ is measured at eighteen center of mass energies from 4.288 GeV to 4.951 GeV using the BESIII detector. Using the same data samples, we also perform the first search for the reaction $e^+e^-\toη\tilde{X}(3872)$, but no evidence is found for the $\tilde{X}(3872)$ in the $π^+π^- J/ψ$ mass distribution. At each of the eighteen center of mass energies, upper limits at the 90\% confidence level on the cross section for $e^+e^-\toηψ(2S)$ and on the product of the $e^+e^-\toη\tilde{X}(3872)$ cross section with the branching fraction of $\tilde{X}(3872)\toπ^+π^- J/ψ$ are reported.
△ Less
Submitted 25 March, 2024;
originally announced March 2024.
-
Precise measurement of the $e^+e^-\to D_s^+D_s^-$ cross sections at center-of-mass energies from threshold to 4.95 GeV
Authors:
BESIII Collaboration,
M. Ablikim,
M. N. Achasov,
P. Adlarson,
O. Afedulidis,
X. C. Ai,
R. Aliberti,
A. Amoroso,
Q. An,
Y. Bai,
O. Bakina,
I. Balossino,
Y. Ban,
H. -R. Bao,
V. Batozskaya,
K. Begzsuren,
N. Berger,
M. Berlowski,
M. Bertani,
D. Bettoni,
F. Bianchi,
E. Bianco,
A. Bortone,
I. Boyko,
R. A. Briere
, et al. (638 additional authors not shown)
Abstract:
Using the $e^+e^-$ collision data collected with the BESIII detector operating at the BEPCII collider, at center-of-mass energies from the threshold to $4.95$~GeV, we present precise measurements of the cross sections for the process $e^+e^-\to D_s^+D_s^-$ using a single tag method. The resulting cross section lineshape exhibits several new structures, thereby offering an input for coupled channel…
▽ More
Using the $e^+e^-$ collision data collected with the BESIII detector operating at the BEPCII collider, at center-of-mass energies from the threshold to $4.95$~GeV, we present precise measurements of the cross sections for the process $e^+e^-\to D_s^+D_s^-$ using a single tag method. The resulting cross section lineshape exhibits several new structures, thereby offering an input for coupled channel analysis and model tests, which are critical to understand vector charmonium-like states with masses between 4 and 5~GeV.
△ Less
Submitted 22 March, 2024;
originally announced March 2024.
-
RoDLA: Benchmarking the Robustness of Document Layout Analysis Models
Authors:
Yufan Chen,
Jiaming Zhang,
Kunyu Peng,
Junwei Zheng,
Rui** Liu,
Philip Torr,
Rainer Stiefelhagen
Abstract:
Before develo** a Document Layout Analysis (DLA) model in real-world applications, conducting comprehensive robustness testing is essential. However, the robustness of DLA models remains underexplored in the literature. To address this, we are the first to introduce a robustness benchmark for DLA models, which includes 450K document images of three datasets. To cover realistic corruptions, we pr…
▽ More
Before develo** a Document Layout Analysis (DLA) model in real-world applications, conducting comprehensive robustness testing is essential. However, the robustness of DLA models remains underexplored in the literature. To address this, we are the first to introduce a robustness benchmark for DLA models, which includes 450K document images of three datasets. To cover realistic corruptions, we propose a perturbation taxonomy with 36 common document perturbations inspired by real-world document processing. Additionally, to better understand document perturbation impacts, we propose two metrics, Mean Perturbation Effect (mPE) for perturbation assessment and Mean Robustness Degradation (mRD) for robustness evaluation. Furthermore, we introduce a self-titled model, i.e., Robust Document Layout Analyzer (RoDLA), which improves attention mechanisms to boost extraction of robust features. Experiments on the proposed benchmarks (PubLayNet-P, DocLayNet-P, and M$^6$Doc-P) demonstrate that RoDLA obtains state-of-the-art mRD scores of 115.7, 135.4, and 150.4, respectively. Compared to previous methods, RoDLA achieves notable improvements in mAP of +3.8%, +7.1% and +12.1%, respectively.
△ Less
Submitted 21 March, 2024;
originally announced March 2024.
-
Search for $ΔS=2$ nonleptonic hyperon decays $Ω^-\toΣ^{0}π^{-}$ and $Ω^-\to nK^{-}$
Authors:
BESIII Collaboration,
M. Ablikim,
M. N. Achasov,
P. Adlarson,
O. Afedulidis,
X. C. Ai,
R. Aliberti,
A. Amoroso,
Q. An,
Y. Bai,
O. Bakina,
I. Balossino,
Y. Ban,
H. -R. Bao,
V. Batozskaya,
K. Begzsuren,
N. Berger,
M. Berlowski,
M. Bertani,
D. Bettoni,
F. Bianchi,
E. Bianco,
A. Bortone,
I. Boyko,
R. A. Briere
, et al. (638 additional authors not shown)
Abstract:
Using $(27.12 \pm 0.14) \times 10^{8}$ $ψ(3686)$ events collected by the BESIII detector at the center-of-mass energy of $\sqrt{s} = 3.686$ GeV, we search for the first time for two nonleptonic hyperon decays that change strangeness by two units, $Ω^-\toΣ^{0}π^-$ and $Ω^-\to nK^{-}$. No significant signal is observed. The upper limits on their decay branching fractions are determined to be…
▽ More
Using $(27.12 \pm 0.14) \times 10^{8}$ $ψ(3686)$ events collected by the BESIII detector at the center-of-mass energy of $\sqrt{s} = 3.686$ GeV, we search for the first time for two nonleptonic hyperon decays that change strangeness by two units, $Ω^-\toΣ^{0}π^-$ and $Ω^-\to nK^{-}$. No significant signal is observed. The upper limits on their decay branching fractions are determined to be $\mathcal{B}(Ω^-\toΣ^{0}π^-) < 5.4\times 10^{-4}$ and $\mathcal{B}(Ω^-\to nK^{-}) < 2.4\times 10^{-4}$ at the $90\%$ confidence level.
△ Less
Submitted 14 April, 2024; v1 submitted 20 March, 2024;
originally announced March 2024.
-
Adaptive Critical Subgraph Mining for Cognitive Impairment Conversion Prediction with T1-MRI-based Brain Network
Authors:
Yilin Leng,
Wenju Cui,
Bai Chen,
Xi Jiang,
Shuangqing Chen,
Jian Zheng
Abstract:
Prediction the conversion to early-stage dementia is critical for mitigating its progression but remains challenging due to subtle cognitive impairments and structural brain changes. Traditional T1-weighted magnetic resonance imaging (T1-MRI) research focus on identifying brain atrophy regions but often fails to address the intricate connectivity between them. This limitation underscores the neces…
▽ More
Prediction the conversion to early-stage dementia is critical for mitigating its progression but remains challenging due to subtle cognitive impairments and structural brain changes. Traditional T1-weighted magnetic resonance imaging (T1-MRI) research focus on identifying brain atrophy regions but often fails to address the intricate connectivity between them. This limitation underscores the necessity of focuing on inter-regional connectivity for a comprehensive understand of the brain's complex network. Moreover, there is a pressing demand for methods that adaptively preserve and extract critical information, particularly specialized subgraph mining techniques for brain networks. These are essential for develo** high-quality feature representations that reveal critical spatial impacts of structural brain changes and its topology. In this paper, we propose Brain-SubGNN, a novel graph representation network to mine and enhance critical subgraphs based on T1-MRI. This network provides a subgraph-level interpretation, enhancing interpretability and insights for graph analysis. The process begins by extracting node features and a correlation matrix between nodes to construct a task-oriented brain network. Brain-SubGNN then adaptively identifies and enhances critical subgraphs, capturing both loop and neighbor subgraphs. This method reflects the loop topology and local changes, indicative of long-range connections, and maintains local and global brain attributes. Extensive experiments validate the effectiveness and advantages of Brain-SubGNN, demonstrating its potential as a powerful tool for understanding and diagnosing early-stage dementia. Source code is available at https://github.com/Leng-10/Brain-SubGNN.
△ Less
Submitted 26 June, 2024; v1 submitted 20 March, 2024;
originally announced March 2024.
-
Quantum valley Hall states in low-buckled counterparts of graphene bilayer
Authors:
Yu-Hao Shen,
Jun-Ding Zheng,
Wen-Yi Tong,
Chun-Gang Duan
Abstract:
With low-buckled structure for each layer in graphene bilayer system, there breaks inversion symmetry (P-symmetry) for one stacking when both A and B sublattices in top layer are aligned with those in bottom layer. In consideration of spin-orbit coupling (SOC), there opens nontrivial topological gap in each monolayer system to achieve quantum spin Hall effect (QSHE). As long as time-reversal symme…
▽ More
With low-buckled structure for each layer in graphene bilayer system, there breaks inversion symmetry (P-symmetry) for one stacking when both A and B sublattices in top layer are aligned with those in bottom layer. In consideration of spin-orbit coupling (SOC), there opens nontrivial topological gap in each monolayer system to achieve quantum spin Hall effect (QSHE). As long as time-reversal symmetry (T-symmetry) is preserved the gapless edge states is robust in each individual layer even for the bilayer absent of PT symmetry. Based on this platform and through tight-binding (TB) model calculations we find it becomes a typical system that can exhibit quantum valley Hall effect (QVHE) when introduced a layer-resolved Rashba SOC that leads to band inversion at each K valley in the hexagonal Brillion zone (BZ). The topological transition comes from that the valley Chern number Cv = CK - CK' switches from 0 to 2, which characterizes the nontrivial QVHE phase transited from two coupled Z2 topological insulators. We also point that the layer-resolved Rashba SOC can be introduced equivalently by twisting two van der Waals touched layers. And through TB calculations, it is shown that the K bands inverts in its corresponding mini BZ when the two layers twisted by a small angle. Our findings advance potential applications for the devices design in topological valleytronics and twistronics.
△ Less
Submitted 19 March, 2024;
originally announced March 2024.
-
Inter- and intra-uncertainty based feature aggregation model for semi-supervised histopathology image segmentation
Authors:
Qiangguo **,
Hui Cui,
Changming Sun,
Yang Song,
Jiangbin Zheng,
Leilei Cao,
Leyi Wei,
Ran Su
Abstract:
Acquiring pixel-level annotations is often limited in applications such as histology studies that require domain expertise. Various semi-supervised learning approaches have been developed to work with limited ground truth annotations, such as the popular teacher-student models. However, hierarchical prediction uncertainty within the student model (intra-uncertainty) and image prediction uncertaint…
▽ More
Acquiring pixel-level annotations is often limited in applications such as histology studies that require domain expertise. Various semi-supervised learning approaches have been developed to work with limited ground truth annotations, such as the popular teacher-student models. However, hierarchical prediction uncertainty within the student model (intra-uncertainty) and image prediction uncertainty (inter-uncertainty) have not been fully utilized by existing methods. To address these issues, we first propose a novel inter- and intra-uncertainty regularization method to measure and constrain both inter- and intra-inconsistencies in the teacher-student architecture. We also propose a new two-stage network with pseudo-mask guided feature aggregation (PG-FANet) as the segmentation model. The two-stage structure complements with the uncertainty regularization strategy to avoid introducing extra modules in solving uncertainties and the aggregation mechanisms enable multi-scale and multi-stage feature integration. Comprehensive experimental results over the MoNuSeg and CRAG datasets show that our PG-FANet outperforms other state-of-the-art methods and our semi-supervised learning framework yields competitive performance with a limited amount of labeled data.
△ Less
Submitted 19 March, 2024;
originally announced March 2024.
-
Dielectric response in twisted MoS2 bilayer facilitated by spin-orbit coupling effect
Authors:
Yu-Hao Shen,
Jun-Ding Zheng,
Wen-Yi Tong,
Xian-Gang Wan,
Chun-Gang Duan
Abstract:
Twisted van der Waals bilayers provide ideal two-dimensional (2D) platforms to study the interplay between spin and charge degree of freedom of electron. An exotic dielectric response behavior in such system is what we find here through the investigation of twisted bilayer MoS2 with two different stackings but same size of the commensurate supercell, which forms H-type like and R-type like stacked…
▽ More
Twisted van der Waals bilayers provide ideal two-dimensional (2D) platforms to study the interplay between spin and charge degree of freedom of electron. An exotic dielectric response behavior in such system is what we find here through the investigation of twisted bilayer MoS2 with two different stackings but same size of the commensurate supercell, which forms H-type like and R-type like stacked bilayer system. Our first-principles calculations show that applying an out-of-plane electric field gives rise to different response of the electric polarization, whose susceptibility is suppressed in H-type like case compared with that of R-type like one. Further analysis shows the dielectric response to perpendicular electric field is linked to the moire ferroelectric dipole texture. Their underlying link comes from that, through Rashba spin-orbit coupling (SOC) effect, the external electric field tends to change the internal pseudo-spin texture, which further depends on the twist induced dipole texture. Consequently, the suppression of dielectric response can be attributed to the topology protection of such ferroelectric dipole structure.
△ Less
Submitted 19 March, 2024;
originally announced March 2024.
-
Multi-State, Ultra-thin, BEOL-Compatible AlScN Ferroelectric Diodes
Authors:
Kwan-Ho Kim,
Zirun Han,
Yinuo Zhang,
Pariasadat Musavigharavi,
Jeffrey Zheng,
Dhiren K. Pradhan,
Eric A. Stach,
Roy H. Olsson III,
Deep Jariwala
Abstract:
The growth in data generation necessitates efficient data processing technologies to address the von Neumann bottleneck in conventional computer architecture. Memory-driven computing, which integrates non-volatile memory (NVM) devices in a 3D stack, is gaining attention, with CMOS back-end-of-line (BEOL) compatible ferroelectric (FE) diodes being ideal due to their two-terminal design and inherent…
▽ More
The growth in data generation necessitates efficient data processing technologies to address the von Neumann bottleneck in conventional computer architecture. Memory-driven computing, which integrates non-volatile memory (NVM) devices in a 3D stack, is gaining attention, with CMOS back-end-of-line (BEOL) compatible ferroelectric (FE) diodes being ideal due to their two-terminal design and inherently selector-free nature, facilitating high-density crossbar arrays. Here, we demonstrate BEOL-compatible, high-performance FE-diodes scaled to 5, 10, and 20 nm FE Al0.72Sc0.28N/Al0.64Sc0.36N films. Through interlayer (IL) engineering, we show substantial improvements in the ON/OFF ratios (>166 times) and rectification ratios (>176 times) in these scaled devices. The superlative characteristics also enables 5-bit multi-state operation with a stable retention. We also experimentally and theoretically demonstrate the counterintuitive result that the inclusion of an IL can lead to a decrease in the ferroelectric switching voltage of the device. An in-depth analysis into the device transport mechanisms is performed, and our compact model aligns seamlessly with the experimental results. Our results suggest the possibility of using scaled AlxSc1-xN FE-diodes for high performance, low-power, embedded NVM.
△ Less
Submitted 18 March, 2024;
originally announced March 2024.
-
The Lyapunov exponent as a signature of dissipative many-body quantum chaos
Authors:
Antonio M. García-García,
Jacobus J. M. Verbaarschot,
Jie-** Zheng
Abstract:
A distinct feature of Hermitian quantum chaotic dynamics is the exponential increase of certain out-of-time-order-correlation (OTOC) functions around the Ehrenfest time with a rate given by a Lyapunov exponent. Physically, the OTOCs describe the growth of quantum uncertainty that crucially depends on the nature of the quantum motion. Here, we employ the OTOC in order to provide a precise definitio…
▽ More
A distinct feature of Hermitian quantum chaotic dynamics is the exponential increase of certain out-of-time-order-correlation (OTOC) functions around the Ehrenfest time with a rate given by a Lyapunov exponent. Physically, the OTOCs describe the growth of quantum uncertainty that crucially depends on the nature of the quantum motion. Here, we employ the OTOC in order to provide a precise definition of dissipative quantum chaos. For this purpose, we compute analytically the Lyapunov exponent for the vectorized formulation of the large $q$-limit of a $q$-body Sachdev-Ye-Kitaev model coupled to a Markovian bath. These analytic results are confirmed by an explicit numerical calculation of the Lyapunov exponent for several values of $q \geq 4$ based on the solutions of the Schwinger-Dyson and Bethe-Salpeter equations. We show that the Lyapunov exponent decreases monotonically as the coupling to the bath increases and eventually becomes negative at a critical value of the coupling signaling a transition to a dynamics which is no longer quantum chaotic. Therefore, a positive Lyapunov exponent is a defining feature of dissipative many-body quantum chaos. The observation of the breaking of the exponential growth for sufficiently strong coupling suggests that dissipative quantum chaos may require in certain cases a sufficiently weak coupling to the environment.
△ Less
Submitted 14 June, 2024; v1 submitted 18 March, 2024;
originally announced March 2024.
-
Universal Fluctuation-Response Relations of Nonequilibrium Dynamics: A Trajectory Information Geometry Framework
Authors:
Jiming Zheng,
Zhiyue Lu
Abstract:
Unraveling the universal principles governing the response of complex systems to environmental changes is crucial for predicting and controlling their behavior. While fluctuation-dissipation relations have been established for systems near equilibrium, a general framework for understanding the responsiveness of systems far from steady states remains elusive. Here, we introduce a novel approach bas…
▽ More
Unraveling the universal principles governing the response of complex systems to environmental changes is crucial for predicting and controlling their behavior. While fluctuation-dissipation relations have been established for systems near equilibrium, a general framework for understanding the responsiveness of systems far from steady states remains elusive. Here, we introduce a novel approach based on the information geometry of stochastic trajectories to derive a set of universal thermodynamic bounds on the response of any Markov system, regardless of its proximity to steady states. This theory establishes a new paradigm in non-equilibrium statistical mechanics, providing a unified perspective on the behavior of non-stationary systems, from biological processes to engineered devices, and paving the way for designing complex responsiveness in far-from-equilibrium systems.
△ Less
Submitted 7 May, 2024; v1 submitted 16 March, 2024;
originally announced March 2024.
-
Test of lepton universality and measurement of the form factors of $D^0\to K^{*}(892)^-μ^+ν_μ$
Authors:
BESIII Collaboration,
M. Ablikim,
M. N. Achasov,
P. Adlarson,
O. Afedulidis,
X. C. Ai,
R. Aliberti,
A. Amoroso,
Q. An,
Y. Bai,
O. Bakina,
I. Balossino,
Y. Ban,
H. -R. Bao,
V. Batozskaya,
K. Begzsuren,
N. Berger,
M. Berlowski,
M. Bertani,
D. Bettoni,
F. Bianchi,
E. Bianco,
A. Bortone,
I. Boyko,
R. A. Briere
, et al. (637 additional authors not shown)
Abstract:
We report a first study of the semileptonic decay $D^0\rightarrow K^-π^0μ^{+}ν_μ$ by analyzing an $e^+e^-$ annihilation data sample of $7.9~\mathrm{fb}^{-1}$ collected at the center-of-mass energy of 3.773 GeV with the BESIII detector. The absolute branching fraction of $D^0\to K^-π^0μ^{+}ν_μ$ is measured for the first time to be $(0.729 \pm 0.014_{\rm stat} \pm 0.011_{\rm syst})\%$. Based on an a…
▽ More
We report a first study of the semileptonic decay $D^0\rightarrow K^-π^0μ^{+}ν_μ$ by analyzing an $e^+e^-$ annihilation data sample of $7.9~\mathrm{fb}^{-1}$ collected at the center-of-mass energy of 3.773 GeV with the BESIII detector. The absolute branching fraction of $D^0\to K^-π^0μ^{+}ν_μ$ is measured for the first time to be $(0.729 \pm 0.014_{\rm stat} \pm 0.011_{\rm syst})\%$. Based on an amplitude analysis, the $S\text{-}{\rm wave}$ contribution is determined to be $(5.76 \pm 0.35_{\rm stat} \pm 0.29_{\rm syst})\%$ of the total decay rate in addition to the dominated $K^{*}(892)^-$ component. The branching fraction of $D^0\to K^{*}(892)^-μ^+ν_μ$ is given to be $(2.062 \pm 0.039_{\rm stat} \pm 0.032_{\rm syst})\%$, which improves the precision of the world average by a factor of 5. Combining with the world average of ${\mathcal B}(D^0\to K^{*}(892)^-e^+ν_e)$, the ratio of the branching fractions obtained is $\frac{{\mathcal B}(D^0\to K^{*}(892)^-μ^+ν_μ)}{{\mathcal B}(D^0\to K^{*}(892)^-e^+ν_e)} = 0.96\pm0.08$, in agreement with lepton flavor universality. Furthermore, assuming single-pole dominance parameterization, the most precise hadronic form factor ratios for $D^0\to K^{*}(892)^{-} μ^+ν_μ$ are extracted to be $r_{V}=V(0)/A_1(0)=1.37 \pm 0.09_{\rm stat} \pm 0.03_{\rm syst}$ and $r_{2}=A_2(0)/A_1(0)=0.76 \pm 0.06_{\rm stat} \pm 0.02_{\rm syst}$.
△ Less
Submitted 16 March, 2024;
originally announced March 2024.
-
Skeleton-Based Human Action Recognition with Noisy Labels
Authors:
Yi Xu,
Kunyu Peng,
Di Wen,
Rui** Liu,
Junwei Zheng,
Yufan Chen,
Jiaming Zhang,
Alina Roitberg,
Kailun Yang,
Rainer Stiefelhagen
Abstract:
Understanding human actions from body poses is critical for assistive robots sharing space with humans in order to make informed and safe decisions about the next interaction. However, precise temporal localization and annotation of activity sequences is time-consuming and the resulting labels are often noisy. If not effectively addressed, label noise negatively affects the model's training, resul…
▽ More
Understanding human actions from body poses is critical for assistive robots sharing space with humans in order to make informed and safe decisions about the next interaction. However, precise temporal localization and annotation of activity sequences is time-consuming and the resulting labels are often noisy. If not effectively addressed, label noise negatively affects the model's training, resulting in lower recognition quality. Despite its importance, addressing label noise for skeleton-based action recognition has been overlooked so far. In this study, we bridge this gap by implementing a framework that augments well-established skeleton-based human action recognition methods with label-denoising strategies from various research areas to serve as the initial benchmark. Observations reveal that these baselines yield only marginal performance when dealing with sparse skeleton data. Consequently, we introduce a novel methodology, NoiseEraSAR, which integrates global sample selection, co-teaching, and Cross-Modal Mixture-of-Experts (CM-MOE) strategies, aimed at mitigating the adverse impacts of label noise. Our proposed approach demonstrates better performance on the established benchmark, setting new state-of-the-art standards. The source code for this study will be made accessible at https://github.com/xuyizdby/NoiseEraSAR.
△ Less
Submitted 14 March, 2024;
originally announced March 2024.
-
rFaceNet: An End-to-End Network for Enhanced Physiological Signal Extraction through Identity-Specific Facial Contours
Authors:
Dali Zhu,
Wenli Zhang,
Hualin Zeng,
Xiaohao Liu,
Long Yang,
Jiaqi Zheng
Abstract:
Remote photoplethysmography (rPPG) technique extracts blood volume pulse (BVP) signals from subtle pixel changes in video frames. This study introduces rFaceNet, an advanced rPPG method that enhances the extraction of facial BVP signals with a focus on facial contours. rFaceNet integrates identity-specific facial contour information and eliminates redundant data. It efficiently extracts facial con…
▽ More
Remote photoplethysmography (rPPG) technique extracts blood volume pulse (BVP) signals from subtle pixel changes in video frames. This study introduces rFaceNet, an advanced rPPG method that enhances the extraction of facial BVP signals with a focus on facial contours. rFaceNet integrates identity-specific facial contour information and eliminates redundant data. It efficiently extracts facial contours from temporally normalized frame inputs through a Temporal Compressor Unit (TCU) and steers the model focus to relevant facial regions by using the Cross-Task Feature Combiner (CTFC). Through elaborate training, the quality and interpretability of facial physiological signals extracted by rFaceNet are greatly improved compared to previous methods. Moreover, our novel approach demonstrates superior performance than SOTA methods in various heart rate estimation benchmarks.
△ Less
Submitted 17 March, 2024; v1 submitted 13 March, 2024;
originally announced March 2024.
-
Robust Synthetic-to-Real Transfer for Stereo Matching
Authors:
Jiawei Zhang,
Jiahe Li,
Lei Huang,
Xiaohan Yu,
Lin Gu,
** Zheng,
Xiao Bai
Abstract:
With advancements in domain generalized stereo matching networks, models pre-trained on synthetic data demonstrate strong robustness to unseen domains. However, few studies have investigated the robustness after fine-tuning them in real-world scenarios, during which the domain generalization ability can be seriously degraded. In this paper, we explore fine-tuning stereo matching networks without c…
▽ More
With advancements in domain generalized stereo matching networks, models pre-trained on synthetic data demonstrate strong robustness to unseen domains. However, few studies have investigated the robustness after fine-tuning them in real-world scenarios, during which the domain generalization ability can be seriously degraded. In this paper, we explore fine-tuning stereo matching networks without compromising their robustness to unseen domains. Our motivation stems from comparing Ground Truth (GT) versus Pseudo Label (PL) for fine-tuning: GT degrades, but PL preserves the domain generalization ability. Empirically, we find the difference between GT and PL implies valuable information that can regularize networks during fine-tuning. We also propose a framework to utilize this difference for fine-tuning, consisting of a frozen Teacher, an exponential moving average (EMA) Teacher, and a Student network. The core idea is to utilize the EMA Teacher to measure what the Student has learned and dynamically improve GT and PL for fine-tuning. We integrate our framework with state-of-the-art networks and evaluate its effectiveness on several real-world datasets. Extensive experiments show that our method effectively preserves the domain generalization ability during fine-tuning.
△ Less
Submitted 12 March, 2024;
originally announced March 2024.
-
DNGaussian: Optimizing Sparse-View 3D Gaussian Radiance Fields with Global-Local Depth Normalization
Authors:
Jiahe Li,
Jiawei Zhang,
Xiao Bai,
** Zheng,
Xin Ning,
Jun Zhou,
Lin Gu
Abstract:
Radiance fields have demonstrated impressive performance in synthesizing novel views from sparse input views, yet prevailing methods suffer from high training costs and slow inference speed. This paper introduces DNGaussian, a depth-regularized framework based on 3D Gaussian radiance fields, offering real-time and high-quality few-shot novel view synthesis at low costs. Our motivation stems from t…
▽ More
Radiance fields have demonstrated impressive performance in synthesizing novel views from sparse input views, yet prevailing methods suffer from high training costs and slow inference speed. This paper introduces DNGaussian, a depth-regularized framework based on 3D Gaussian radiance fields, offering real-time and high-quality few-shot novel view synthesis at low costs. Our motivation stems from the highly efficient representation and surprising quality of the recent 3D Gaussian Splatting, despite it will encounter a geometry degradation when input views decrease. In the Gaussian radiance fields, we find this degradation in scene geometry primarily lined to the positioning of Gaussian primitives and can be mitigated by depth constraint. Consequently, we propose a Hard and Soft Depth Regularization to restore accurate scene geometry under coarse monocular depth supervision while maintaining a fine-grained color appearance. To further refine detailed geometry resha**, we introduce Global-Local Depth Normalization, enhancing the focus on small local depth changes. Extensive experiments on LLFF, DTU, and Blender datasets demonstrate that DNGaussian outperforms state-of-the-art methods, achieving comparable or better results with significantly reduced memory cost, a $25 \times$ reduction in training time, and over $3000 \times$ faster rendering speed.
△ Less
Submitted 24 March, 2024; v1 submitted 11 March, 2024;
originally announced March 2024.
-
Determination of the number of $ψ(3686)$ events taken at BESIII
Authors:
BESIII Collaboration,
M. Ablikim,
M. N. Achasov,
P. Adlarson,
O. Afedulidis,
X. C. Ai,
R. Aliberti,
A. Amoroso,
Q. An,
Y. Bai,
O. Bakina,
I. Balossino,
Y. Ban,
H. -R. Bao,
V. Batozskaya,
K. Begzsuren,
N. Berger,
M. Berlowski,
M. Bertani,
D. Bettoni,
F. Bianchi,
E. Bianco,
A. Bortone,
I. Boyko,
R. A. Briere
, et al. (639 additional authors not shown)
Abstract:
The number of $ψ(3686)$ events collected by the BESIII detector during the 2021 run period is determined to be $(2259.3\pm 11.1)\times 10^6$ by counting inclusive $ψ(3686)$ hadronic events. The uncertainty is systematic and the statistical uncertainty is negligible. Meanwhile, the numbers of $ψ(3686)$ events collected during the 2009 and 2012 run periods are updated to be…
▽ More
The number of $ψ(3686)$ events collected by the BESIII detector during the 2021 run period is determined to be $(2259.3\pm 11.1)\times 10^6$ by counting inclusive $ψ(3686)$ hadronic events. The uncertainty is systematic and the statistical uncertainty is negligible. Meanwhile, the numbers of $ψ(3686)$ events collected during the 2009 and 2012 run periods are updated to be $(107.7\pm0.6)\times 10^6$ and $(345.4\pm 2.6)\times 10^6$, respectively. Both numbers are consistent with the previous measurements within one standard deviation. The total number of $ψ(3686)$ events in the three data samples is $(2712.4\pm14.3)\times10^6$.
△ Less
Submitted 28 May, 2024; v1 submitted 11 March, 2024;
originally announced March 2024.
-
Precision premium transformation -- a high-precision astrometric solution based on the precision premium curve
Authors:
Z. J. Zheng,
Q. Y. Peng,
F. R. Lin,
D. Li,
Y. Zheng
Abstract:
Context. In Gaia era, atmospheric turbulence, which causes stochastic wander of a star image, is a fundamental limitation to the astrometric accuracy of ground-based optical imaging. However, the positional bias caused by turbulence (called turbulence error here) can be effectively reduced by measuring a target relative to another reference (a star or a fast-moving target) which locates in the ran…
▽ More
Context. In Gaia era, atmospheric turbulence, which causes stochastic wander of a star image, is a fundamental limitation to the astrometric accuracy of ground-based optical imaging. However, the positional bias caused by turbulence (called turbulence error here) can be effectively reduced by measuring a target relative to another reference (a star or a fast-moving target) which locates in the range of only several tens of arcsec, since they suffer from similar turbulence errors. This phenomenon is called the precision premium and has been effectively applied to the astrometry of solar system. Further investigation for the precision premium shows that, the precision premium works at less than about 100 arcsec for two specific objects and the relative positional precision as a function of their angular seperation can be well fitted by a sigmoidal function, called the precision premium curve (PPC). Aims. We want to reduce the turbulence error of a target if it is imaged in an area of high stellar density of a ground-based observation by taking advantage of more Gaia reference stars. Methods. Based on the PPC, we proposed a high-precision astrometric solution called precision premium transformation (PPT) in this paper, which takes advantage of high similarity of turbulence errors in a small region and the dense Gaia reference stars in the region to reduce the turbulence errors on the observation, through a weighted solution. Results. Through systematic analysis, the PPT method exhibits significant advantages in terms of not only precision but also applicability when a target is imaged in an area of high stellar density. The PPT method is also applied to the determination of the proper motion of an open cluster, and the results demonstrate and quantify benefits that the PPT method bestows on ground-based astrometry.
△ Less
Submitted 9 March, 2024;
originally announced March 2024.
-
Advances of Deep Learning in Protein Science: A Comprehensive Survey
Authors:
Bozhen Hu,
Cheng Tan,
Lirong Wu,
Jiangbin Zheng,
Jun Xia,
Zhangyang Gao,
Zicheng Liu,
Fandi Wu,
Guijun Zhang,
Stan Z. Li
Abstract:
Protein representation learning plays a crucial role in understanding the structure and function of proteins, which are essential biomolecules involved in various biological processes. In recent years, deep learning has emerged as a powerful tool for protein modeling due to its ability to learn complex patterns and representations from large-scale protein data. This comprehensive survey aims to pr…
▽ More
Protein representation learning plays a crucial role in understanding the structure and function of proteins, which are essential biomolecules involved in various biological processes. In recent years, deep learning has emerged as a powerful tool for protein modeling due to its ability to learn complex patterns and representations from large-scale protein data. This comprehensive survey aims to provide an overview of the recent advances in deep learning techniques applied to protein science. The survey begins by introducing the developments of deep learning based protein models and emphasizes the importance of protein representation learning in drug discovery, protein engineering, and function annotation. It then delves into the fundamentals of deep learning, including convolutional neural networks, recurrent neural networks, attention models, and graph neural networks in modeling protein sequences, structures, and functions, and explores how these techniques can be used to extract meaningful features and capture intricate relationships within protein data. Next, the survey presents various applications of deep learning in the field of proteins, including protein structure prediction, protein-protein interaction prediction, protein function prediction, etc. Furthermore, it highlights the challenges and limitations of these deep learning techniques and also discusses potential solutions and future directions for overcoming these challenges. This comprehensive survey provides a valuable resource for researchers and practitioners in the field of proteins who are interested in harnessing the power of deep learning techniques. By consolidating the latest advancements and discussing potential avenues for improvement, this review contributes to the ongoing progress in protein research and paves the way for future breakthroughs in the field.
△ Less
Submitted 8 March, 2024;
originally announced March 2024.
-
Sora as an AGI World Model? A Complete Survey on Text-to-Video Generation
Authors:
Joseph Cho,
Fachrina Dewi Puspitasari,
Sheng Zheng,
**gyao Zheng,
Lik-Hang Lee,
Tae-Ho Kim,
Choong Seon Hong,
Chaoning Zhang
Abstract:
The evolution of video generation from text, starting with animating MNIST numbers to simulating the physical world with Sora, has progressed at a breakneck speed over the past seven years. While often seen as a superficial expansion of the predecessor text-to-image generation model, text-to-video generation models are developed upon carefully engineered constituents. Here, we systematically discu…
▽ More
The evolution of video generation from text, starting with animating MNIST numbers to simulating the physical world with Sora, has progressed at a breakneck speed over the past seven years. While often seen as a superficial expansion of the predecessor text-to-image generation model, text-to-video generation models are developed upon carefully engineered constituents. Here, we systematically discuss these elements consisting of but not limited to core building blocks (vision, language, and temporal) and supporting features from the perspective of their contributions to achieving a world model. We employ the PRISMA framework to curate 97 impactful research articles from renowned scientific databases primarily studying video synthesis using text conditions. Upon minute exploration of these manuscripts, we observe that text-to-video generation involves more intricate technologies beyond the plain extension of text-to-image generation. Our additional review into the shortcomings of Sora-generated videos pinpoints the call for more in-depth studies in various enabling aspects of video generation such as dataset, evaluation metric, efficient architecture, and human-controlled generation. Finally, we conclude that the study of the text-to-video generation may still be in its infancy, requiring contribution from the cross-discipline research community towards its advancement as the first step to realize artificial general intelligence (AGI).
△ Less
Submitted 7 June, 2024; v1 submitted 8 March, 2024;
originally announced March 2024.
-
PrimeComposer: Faster Progressively Combined Diffusion for Image Composition with Attention Steering
Authors:
Yibin Wang,
Weizhong Zhang,
Jianwei Zheng,
Cheng **
Abstract:
Image composition involves seamlessly integrating given objects into a specific visual context. The current training-free methods rely on composing attention weights from several samplers to guide the generator. However, since these weights are derived from disparate contexts, their combination leads to coherence confusion in synthesis and loss of appearance information. These issues worsen with t…
▽ More
Image composition involves seamlessly integrating given objects into a specific visual context. The current training-free methods rely on composing attention weights from several samplers to guide the generator. However, since these weights are derived from disparate contexts, their combination leads to coherence confusion in synthesis and loss of appearance information. These issues worsen with their excessive focus on background generation, even when unnecessary in this task. This not only slows down inference but also compromises foreground generation quality. Moreover, these methods introduce unwanted artifacts in the transition area. In this paper, we formulate image composition as a subject-based local editing task, solely focusing on foreground generation. At each step, the edited foreground is combined with the noisy background to maintain scene consistency. To address the remaining issues, we propose PrimeComposer, a faster training-free diffuser that composites the images by well-designed attention steering across different noise levels. This steering is predominantly achieved by our Correlation Diffuser, utilizing its self-attention layers at each step. Within these layers, the synthesized subject interacts with both the referenced object and background, capturing intricate details and coherent relationships. This prior information is encoded into the attention weights, which are then integrated into the self-attention layers of the generator to guide the synthesis process. Besides, we introduce a Region-constrained Cross-Attention to confine the impact of specific subject-related words to desired regions, addressing the unwanted artifacts shown in the prior method thereby further improving the coherence in the transition area. Our method exhibits the fastest inference efficiency and extensive experiments demonstrate our superiority both qualitatively and quantitatively.
△ Less
Submitted 7 March, 2024;
originally announced March 2024.