-
DIG3D: Marrying Gaussian Splatting with Deformable Transformer for Single Image 3D Reconstruction
Authors:
Jiamin Wu,
Kenkun Liu,
Han Gao,
Xiaoke Jiang,
Lei Zhang
Abstract:
In this paper, we study the problem of 3D reconstruction from a single-view RGB image and propose a novel approach called DIG3D for 3D object reconstruction and novel view synthesis. Our method utilizes an encoder-decoder framework which generates 3D Gaussians in decoder with the guidance of depth-aware image features from encoder. In particular, we introduce the use of deformable transformer, all…
▽ More
In this paper, we study the problem of 3D reconstruction from a single-view RGB image and propose a novel approach called DIG3D for 3D object reconstruction and novel view synthesis. Our method utilizes an encoder-decoder framework which generates 3D Gaussians in decoder with the guidance of depth-aware image features from encoder. In particular, we introduce the use of deformable transformer, allowing efficient and effective decoding through 3D reference point and multi-layer refinement adaptations. By harnessing the benefits of 3D Gaussians, our approach offers an efficient and accurate solution for 3D reconstruction from single-view images. We evaluate our method on the ShapeNet SRN dataset, getting PSNR of 24.21 and 24.98 in car and chair dataset, respectively. The result outperforming the recent method by around 2.25%, demonstrating the effectiveness of our method in achieving superior results.
△ Less
Submitted 25 April, 2024;
originally announced April 2024.
-
Bi-Mamba+: Bidirectional Mamba for Time Series Forecasting
Authors:
Aobo Liang,
Xingguo Jiang,
Yan Sun,
Xiaohou Shi,
Ke Li
Abstract:
Long-term time series forecasting (LTSF) provides longer insights into future trends and patterns. Over the past few years, deep learning models especially Transformers have achieved advanced performance in LTSF tasks. However, LTSF faces inherent challenges such as long-term dependencies capturing and sparse semantic characteristics. Recently, a new state space model (SSM) named Mamba is proposed…
▽ More
Long-term time series forecasting (LTSF) provides longer insights into future trends and patterns. Over the past few years, deep learning models especially Transformers have achieved advanced performance in LTSF tasks. However, LTSF faces inherent challenges such as long-term dependencies capturing and sparse semantic characteristics. Recently, a new state space model (SSM) named Mamba is proposed. With the selective capability on input data and the hardware-aware parallel computing algorithm, Mamba has shown great potential in balancing predicting performance and computational efficiency compared to Transformers. To enhance Mamba's ability to preserve historical information in a longer range, we design a novel Mamba+ block by adding a forget gate inside Mamba to selectively combine the new features with the historical features in a complementary manner. Furthermore, we apply Mamba+ both forward and backward and propose Bi-Mamba+, aiming to promote the model's ability to capture interactions among time series elements. Additionally, multivariate time series data in different scenarios may exhibit varying emphasis on intra- or inter-series dependencies. Therefore, we propose a series-relation-aware decider that controls the utilization of channel-independent or channel-mixing tokenization strategy for specific datasets. Extensive experiments on 8 real-world datasets show that our model achieves more accurate predictions compared with state-of-the-art methods.
△ Less
Submitted 26 June, 2024; v1 submitted 24 April, 2024;
originally announced April 2024.
-
DVF: Advancing Robust and Accurate Fine-Grained Image Retrieval with Retrieval Guidelines
Authors:
Xin Jiang,
Hao Tang,
Rui Yan,
**hui Tang,
Zechao Li
Abstract:
Fine-grained image retrieval (FGIR) is to learn visual representations that distinguish visually similar objects while maintaining generalization. Existing methods propose to generate discriminative features, but rarely consider the particularity of the FGIR task itself. This paper presents a meticulous analysis leading to the proposal of practical guidelines to identify subcategory-specific discr…
▽ More
Fine-grained image retrieval (FGIR) is to learn visual representations that distinguish visually similar objects while maintaining generalization. Existing methods propose to generate discriminative features, but rarely consider the particularity of the FGIR task itself. This paper presents a meticulous analysis leading to the proposal of practical guidelines to identify subcategory-specific discrepancies and generate discriminative features to design effective FGIR models. These guidelines include emphasizing the object (G1), highlighting subcategory-specific discrepancies (G2), and employing effective training strategy (G3). Following G1 and G2, we design a novel Dual Visual Filtering mechanism for the plain visual transformer, denoted as DVF, to capture subcategory-specific discrepancies. Specifically, the dual visual filtering mechanism comprises an object-oriented module and a semantic-oriented module. These components serve to magnify objects and identify discriminative regions, respectively. Following G3, we implement a discriminative model training strategy to improve the discriminability and generalization ability of DVF. Extensive analysis and ablation studies confirm the efficacy of our proposed guidelines. Without bells and whistles, the proposed DVF achieves state-of-the-art performance on three widely-used fine-grained datasets in closed-set and open-set settings.
△ Less
Submitted 24 April, 2024;
originally announced April 2024.
-
Self-generated magnetic field in three-dimensional ablative Rayleigh-Taylor instability
Authors:
Dehua Zhang,
Xian Jiang,
Tao Tao,
Jun Li,
Rui Yan,
De-Jun Sun,
Jian Zheng
Abstract:
The self-generated magnetic field in three-dimensional (3D) single-mode ablative Rayleigh-Taylor instabilities (ARTI) relevant to the acceleration phase of a direct-drive inertial confinement fusion (ICF) implosion is investigated. It is found that stronger magnetic fields up to a few thousands of T can be generated by 3D ARTI than by its two-dimensional (2D) counterpart. The Nernst effects signif…
▽ More
The self-generated magnetic field in three-dimensional (3D) single-mode ablative Rayleigh-Taylor instabilities (ARTI) relevant to the acceleration phase of a direct-drive inertial confinement fusion (ICF) implosion is investigated. It is found that stronger magnetic fields up to a few thousands of T can be generated by 3D ARTI than by its two-dimensional (2D) counterpart. The Nernst effects significantly alter the magnetic fields convection and amplify the magnetic fields. The scaling law for the magnetic flux obtained in the 2D simulations performs reasonably well in the 3D cases. While the magnetic field significantly accelerates the bubble growth in the short-wavelength 2D modes through modifying the heat fluxes, the magnetic field mostly accelerates the spike growth but has little influence on the bubble growth in 3D ARTI.
△ Less
Submitted 24 April, 2024;
originally announced April 2024.
-
Study of $e^+e^-\toωX(3872)$ and $γX(3872)$ from 4.66 to 4.95 GeV
Authors:
BESIII Collaboration,
M. Ablikim,
M. N. Achasov,
P. Adlarson,
O. Afedulidis,
X. C. Ai,
R. Aliberti,
A. Amoroso,
Q. An,
Y. Bai,
O. Bakina,
I. Balossino,
Y. Ban,
H. -R. Bao,
V. Batozskaya,
K. Begzsuren,
N. Berger,
M. Berlowski,
M. Bertani,
D. Bettoni,
F. Bianchi,
E. Bianco,
A. Bortone,
I. Boyko,
R. A. Briere
, et al. (634 additional authors not shown)
Abstract:
Using data samples with an integrated luminosity of $4.5~\text{fb}^{-1}$ collected by the BESIII detector at center-of-mass energies ranging from 4.66 to 4.95 GeV, we study the processes of $e^+e^-\toωX(3872)$ and $e^+e^-\toγX(3872)$. With the $e^+e^-\toωX(3872)$ process, the branching fraction ratio $R\equiv\frac{\mathcal{B}(X(3872)\toγJ/ψ)}{\mathcal{B}(X(3872)\toπ^+π^- J/ψ)}$ is measured to be…
▽ More
Using data samples with an integrated luminosity of $4.5~\text{fb}^{-1}$ collected by the BESIII detector at center-of-mass energies ranging from 4.66 to 4.95 GeV, we study the processes of $e^+e^-\toωX(3872)$ and $e^+e^-\toγX(3872)$. With the $e^+e^-\toωX(3872)$ process, the branching fraction ratio $R\equiv\frac{\mathcal{B}(X(3872)\toγJ/ψ)}{\mathcal{B}(X(3872)\toπ^+π^- J/ψ)}$ is measured to be $0.38\pm0.20_\text{stat.}\pm0.01_\text{syst.}$ ($R< 0.83$ at 90\% confidence level). In addition, we measure the ratio of the average cross section of $e^+e^-\toωX(3872)$ to $e^+e^-\toωχ_{c1}(ωχ_{c2})$ to be $σ_{ωX(3872)}/σ_{ωχ_{c1}}~(σ_{ωX(3872)}/σ_{ωχ_{c2}})=5.2\pm1.0_\text{stat.}\pm1.9_\text{syst.}~ (5.5\pm1.1_\text{stat.}\pm2.4_\text{syst.})$. Finally, we search for the process of $e^+e^-\toγX(3872)$, and no obvious signal is observed. The upper limit on the ratio of the average cross section of $e^+e^-\toγX(3872)$ to $e^+e^-\toωX(3872)$ is set as $σ_{γX(3872)}/σ_{ωX(3872)}<0.23$ at 90\% confidence level.
△ Less
Submitted 21 April, 2024;
originally announced April 2024.
-
Foundation Model assisted Weakly Supervised LiDAR Semantic Segmentation
Authors:
Yilong Chen,
Zongyi Xu,
xiaoshui Huang,
Ruicheng Zhang,
Xinqi Jiang,
Xinbo Gao
Abstract:
Current point cloud semantic segmentation has achieved great advances when given sufficient labels. However, the dense annotation of LiDAR point clouds remains prohibitively expensive and time-consuming, unable to keep up with the continuously growing volume of data. In this paper, we propose annotating images with scattered points, followed by utilizing SAM (a Foundation model) to generate semant…
▽ More
Current point cloud semantic segmentation has achieved great advances when given sufficient labels. However, the dense annotation of LiDAR point clouds remains prohibitively expensive and time-consuming, unable to keep up with the continuously growing volume of data. In this paper, we propose annotating images with scattered points, followed by utilizing SAM (a Foundation model) to generate semantic segmentation labels for the images. Finally, by map** the segmentation labels of the images to the LiDAR space using the intrinsic and extrinsic parameters of the camera and LiDAR, we obtain labels for point cloud semantic segmentation, and release Scatter-KITTI and Scatter-nuScenes, which are the first works to utilize image segmentation-based SAM for weakly supervised point cloud semantic segmentation. Furthermore, to mitigate the influence of erroneous pseudo labels obtained from sparse annotations on point cloud features, we propose a multi-modal weakly supervised network for LiDAR semantic segmentation, called MM-ScatterNet. This network combines features from both point cloud and image modalities, enhancing the representation learning of point clouds by introducing consistency constraints between multi-modal features and point cloud features. On the SemanticKITTI dataset, we achieve 66\% of fully supervised performance using only 0.02% of annotated data, and on the NuScenes dataset, we achieve 95% of fully supervised performance using only 0.1% labeled points.
△ Less
Submitted 19 April, 2024;
originally announced April 2024.
-
RAGCache: Efficient Knowledge Caching for Retrieval-Augmented Generation
Authors:
Chao **,
Zili Zhang,
Xuanlin Jiang,
Fangyue Liu,
Xin Liu,
Xuanzhe Liu,
Xin **
Abstract:
Retrieval-Augmented Generation (RAG) has shown significant improvements in various natural language processing tasks by integrating the strengths of large language models (LLMs) and external knowledge databases. However, RAG introduces long sequence generation and leads to high computation and memory costs. We propose RAGCache, a novel multilevel dynamic caching system tailored for RAG. Our analys…
▽ More
Retrieval-Augmented Generation (RAG) has shown significant improvements in various natural language processing tasks by integrating the strengths of large language models (LLMs) and external knowledge databases. However, RAG introduces long sequence generation and leads to high computation and memory costs. We propose RAGCache, a novel multilevel dynamic caching system tailored for RAG. Our analysis benchmarks current RAG systems, pinpointing the performance bottleneck (i.e., long sequence due to knowledge injection) and optimization opportunities (i.e., caching knowledge's intermediate states). Based on these insights, we design RAGCache, which organizes the intermediate states of retrieved knowledge in a knowledge tree and caches them in the GPU and host memory hierarchy. RAGCache proposes a replacement policy that is aware of LLM inference characteristics and RAG retrieval patterns. It also dynamically overlaps the retrieval and inference steps to minimize the end-to-end latency. We implement RAGCache and evaluate it on vLLM, a state-of-the-art LLM inference system and Faiss, a state-of-the-art vector database. The experimental results show that RAGCache reduces the time to first token (TTFT) by up to 4x and improves the throughput by up to 2.1x compared to vLLM integrated with Faiss.
△ Less
Submitted 25 April, 2024; v1 submitted 18 April, 2024;
originally announced April 2024.
-
Privacy-Preserving UCB Decision Process Verification via zk-SNARKs
Authors:
Xikun Jiang,
He Lyu,
Chenhao Ying,
Yibin Xu,
Boris Düdder,
Yuan Luo
Abstract:
With the increasingly widespread application of machine learning, how to strike a balance between protecting the privacy of data and algorithm parameters and ensuring the verifiability of machine learning has always been a challenge. This study explores the intersection of reinforcement learning and data privacy, specifically addressing the Multi-Armed Bandit (MAB) problem with the Upper Confidenc…
▽ More
With the increasingly widespread application of machine learning, how to strike a balance between protecting the privacy of data and algorithm parameters and ensuring the verifiability of machine learning has always been a challenge. This study explores the intersection of reinforcement learning and data privacy, specifically addressing the Multi-Armed Bandit (MAB) problem with the Upper Confidence Bound (UCB) algorithm. We introduce zkUCB, an innovative algorithm that employs the Zero-Knowledge Succinct Non-Interactive Argument of Knowledge (zk-SNARKs) to enhance UCB. zkUCB is carefully designed to safeguard the confidentiality of training data and algorithmic parameters, ensuring transparent UCB decision-making. Experiments highlight zkUCB's superior performance, attributing its enhanced reward to judicious quantization bit usage that reduces information entropy in the decision-making process. zkUCB's proof size and verification time scale linearly with the execution steps of zkUCB. This showcases zkUCB's adept balance between data security and operational efficiency. This approach contributes significantly to the ongoing discourse on reinforcing data privacy in complex decision-making processes, offering a promising solution for privacy-sensitive applications.
△ Less
Submitted 6 June, 2024; v1 submitted 18 April, 2024;
originally announced April 2024.
-
Tunable magnetism in bilayer transition metal dichalcogenides
Authors:
Li-Ya Qiao,
Xiu-Cai Jiang,
Ze Ruan,
Yu-Zhong Zhang
Abstract:
Twist between neighboring layers and variation of interlayer distance are two extra ways to control the physical properties of stacked two-dimensional van der Waals materials without alteration of chemical compositions or application of external fields, compared to their monolayer counterparts. In this work, we explored the dependence of the magnetic states of the untwisted and twisted bilayer 1T-…
▽ More
Twist between neighboring layers and variation of interlayer distance are two extra ways to control the physical properties of stacked two-dimensional van der Waals materials without alteration of chemical compositions or application of external fields, compared to their monolayer counterparts. In this work, we explored the dependence of the magnetic states of the untwisted and twisted bilayer 1T-VX$_2$ (X = S, Se) on the interlayer distance by density functional theory calculations. We find that, while a magnetic phase transition occurs from interlayer ferromagnetism to interlayer antiferromagnetism either as a function of decreasing interlayer distance for the untwisted bilayer 1T-VX$_2$ or after twist, richer magnetic phase transitions consecutively take place for the twisted bilayer 1T-VX$_2$ as interlayer distance is gradually reduced. Besides, the critical pressures for the phase transition are greatly reduced in twisted bilayer 1T-VX$_2$ compared with the untwisted case. We derived the Heisenberg model with intralayer and interlayer exchange couplings to comprehend the emergence of various magnetic states. Our results point out an easy access towards tunable two-dimensional magnets.
△ Less
Submitted 18 April, 2024;
originally announced April 2024.
-
Prolate spheroids settling in a quiescent fluid: clustering, microstructures and collisions
Authors:
Xinyu Jiang,
Chunxiao Xu,
Lihao Zhao
Abstract:
In the present study, we investigate the settling of prolate spheroidal particles in a quiescent fluid by means of particle-resolved direct numerical simulations. By varying the particle volume fraction (phi) from 0.1% to 10%, we observe a non-monotonic variation of particle mean settling velocity with a local maximum at phi=1%. To explain this finding, we examine the spatial distribution and orie…
▽ More
In the present study, we investigate the settling of prolate spheroidal particles in a quiescent fluid by means of particle-resolved direct numerical simulations. By varying the particle volume fraction (phi) from 0.1% to 10%, we observe a non-monotonic variation of particle mean settling velocity with a local maximum at phi=1%. To explain this finding, we examine the spatial distribution and orientation of dispersed particles. The particle-pairs statistics show the tendency of particles to form column-like microsturctures, revealing the attraction and entrapment of paritcles in wake regions. This effect results in the formation of large-scale particle clusters at intermediate particle volume fractions. The most intense particle clustering at phi=1%, quantified by the standard deviation of Voronoi volume, induces a swarm effect and substantially enhances the particle settling rate by around 25%. While, in the very dilute suspension at phi=0.1%, the particle clustering is attenuated due to the long inter-particle distance. In dense suspensions, however, the crowded particle arrangement disrupts particle wakes, breaks particle microstructures and inhibits the particle clustering. Consequently, hindrance effect dominates and reduces the settling speed. In contrast to particle spatial distribution, the particle orientation plays a less important role in the mean settling velocity. At last, we examine the collision rate of settling particles by computing the collision kernel. As phi increases, the collision kernel decreases monotonically, which is primarily attributed to the decrease of radial distribution function. However, the radial relative velocity is almost a constant with the change of phi. These results reveal the importance of hydrodynamic interactions, including the wake-flow-induced attractions and the lubrication effect, in collisions among settling spheroids.
△ Less
Submitted 17 April, 2024;
originally announced April 2024.
-
Mitigating the Curse of Dimensionality for Certified Robustness via Dual Randomized Smoothing
Authors:
Song Xia,
Yi Yu,
Xudong Jiang,
Henghui Ding
Abstract:
Randomized Smoothing (RS) has been proven a promising method for endowing an arbitrary image classifier with certified robustness. However, the substantial uncertainty inherent in the high-dimensional isotropic Gaussian noise imposes the curse of dimensionality on RS. Specifically, the upper bound of ${\ell_2}$ certified robustness radius provided by RS exhibits a diminishing trend with the expans…
▽ More
Randomized Smoothing (RS) has been proven a promising method for endowing an arbitrary image classifier with certified robustness. However, the substantial uncertainty inherent in the high-dimensional isotropic Gaussian noise imposes the curse of dimensionality on RS. Specifically, the upper bound of ${\ell_2}$ certified robustness radius provided by RS exhibits a diminishing trend with the expansion of the input dimension $d$, proportionally decreasing at a rate of $1/\sqrt{d}$. This paper explores the feasibility of providing ${\ell_2}$ certified robustness for high-dimensional input through the utilization of dual smoothing in the lower-dimensional space. The proposed Dual Randomized Smoothing (DRS) down-samples the input image into two sub-images and smooths the two sub-images in lower dimensions. Theoretically, we prove that DRS guarantees a tight ${\ell_2}$ certified robustness radius for the original input and reveal that DRS attains a superior upper bound on the ${\ell_2}$ robustness radius, which decreases proportionally at a rate of $(1/\sqrt m + 1/\sqrt n )$ with $m+n=d$. Extensive experiments demonstrate the generalizability and effectiveness of DRS, which exhibits a notable capability to integrate with established methodologies, yielding substantial improvements in both accuracy and ${\ell_2}$ certified robustness baselines of RS on the CIFAR-10 and ImageNet datasets. Code is available at https://github.com/xiasong0501/DRS.
△ Less
Submitted 15 June, 2024; v1 submitted 15 April, 2024;
originally announced April 2024.
-
Observation of $D \to a_{0}(980)π$ in the decays $D^{0} \rightarrow π^{+}π^{-}η$ and $D^{+} \rightarrow π^{+}π^{0}η$
Authors:
BESIII Collaboration,
M. Ablikim,
M. N. Achasov,
P. Adlarson,
O. Afedulidis,
X. C. Ai,
R. Aliberti,
A. Amoroso,
Q. An,
Y. Bai,
O. Bakina,
I. Balossino,
Y. Ban,
H. -R. Bao,
V. Batozskaya,
K. Begzsuren,
N. Berger,
M. Berlowski,
M. Bertani,
D. Bettoni,
F. Bianchi,
E. Bianco,
A. Bortone,
I. Boyko,
R. A. Briere
, et al. (634 additional authors not shown)
Abstract:
We report the first amplitude analysis of the decays $D^{0} \to π^{+} π^{-} η$ and $D^{+} \rightarrow π^{+}π^{0}η$ using a data sample taken with the BESIII detector at the center-of-mass energy of 3.773 GeV, corresponding to an integrated luminosity of 7.9 ${\rm fb}^{-1}$. The contribution from the process $D^{0(+)} \to a_{0}(980)^{+} π^{-(0)}$ is significantly larger than the…
▽ More
We report the first amplitude analysis of the decays $D^{0} \to π^{+} π^{-} η$ and $D^{+} \rightarrow π^{+}π^{0}η$ using a data sample taken with the BESIII detector at the center-of-mass energy of 3.773 GeV, corresponding to an integrated luminosity of 7.9 ${\rm fb}^{-1}$. The contribution from the process $D^{0(+)} \to a_{0}(980)^{+} π^{-(0)}$ is significantly larger than the $D^{0(+)} \to a_{0}(980)^{-(0)} π^{+}$ contribution. The ratios $\mathcal{B}(D^{0} \rightarrow a_{0}(980)^{+}π^{-})/\mathcal{B}(D^{0} \rightarrow a_{0}(980)^{-}π^{+})$ and $\mathcal{B}(D^{+} \rightarrow a_{0}(980)^{+}π^{0})/\mathcal{B}(D^{+} \rightarrow a_{0}(980)^{0}π^{+})$ are measured to be $7.5^{+2.5}_{-0.8\,\mathrm{stat.}}\pm1.7_{\mathrm{syst.}}$ and $2.6\pm0.6_{\mathrm{stat.}}\pm0.3_{\mathrm{syst.}}$, respectively. The measured $D^{0}$ ratio disagrees with the theoretical predictions by orders of magnitudes, thus implying a substantial contribution from final-state interactions.
△ Less
Submitted 14 April, 2024;
originally announced April 2024.
-
Federated Optimization with Doubly Regularized Drift Correction
Authors:
Xiaowen Jiang,
Anton Rodomanov,
Sebastian U. Stich
Abstract:
Federated learning is a distributed optimization paradigm that allows training machine learning models across decentralized devices while kee** the data localized. The standard method, FedAvg, suffers from client drift which can hamper performance and increase communication costs over centralized methods. Previous works proposed various strategies to mitigate drift, yet none have shown uniformly…
▽ More
Federated learning is a distributed optimization paradigm that allows training machine learning models across decentralized devices while kee** the data localized. The standard method, FedAvg, suffers from client drift which can hamper performance and increase communication costs over centralized methods. Previous works proposed various strategies to mitigate drift, yet none have shown uniformly improved communication-computation trade-offs over vanilla gradient descent.
In this work, we revisit DANE, an established method in distributed optimization. We show that (i) DANE can achieve the desired communication reduction under Hessian similarity constraints. Furthermore, (ii) we present an extension, DANE+, which supports arbitrary inexact local solvers and has more freedom to choose how to aggregate the local updates. We propose (iii) a novel method, FedRed, which has improved local computational complexity and retains the same communication complexity compared to DANE/DANE+. This is achieved by using doubly regularized drift correction.
△ Less
Submitted 12 April, 2024;
originally announced April 2024.
-
Measurement of $e^{+}e^{-}\to ωη^{\prime}$ cross sections at $\sqrt{s}=$ 2.000 to 3.080 GeV
Authors:
BESIII Collaboration,
M. Ablikim,
M. N. Achasov,
P. Adlarson,
X. C. Ai,
R. Aliberti,
A. Amoroso,
M. R. An,
Q. An,
Y. Bai,
O. Bakina,
I. Balossino,
Y. Ban,
V. Batozskaya,
K. Begzsuren,
N. Berger,
M. Berlowski,
M. Bertani,
D. Bettoni,
F. Bianchi,
E. Bianco,
A. Bortone,
I. Boyko,
R. A. Briere,
A. Brueggemann
, et al. (599 additional authors not shown)
Abstract:
The Born cross sections for the process $e^{+}e^{-}\to ωη^{\prime}$ are measured at 22 center-of-mass energies from 2.000 to 3.080 GeV using data collected with the BESIII detector at the BEPCII collider. A resonant structure is observed with a statistical significance of 9.6$σ$. A Breit-Wigner fit determines its mass to be $M_R=(2153\pm30\pm31)~{\rm{MeV}}/c^{2}$ and its width to be…
▽ More
The Born cross sections for the process $e^{+}e^{-}\to ωη^{\prime}$ are measured at 22 center-of-mass energies from 2.000 to 3.080 GeV using data collected with the BESIII detector at the BEPCII collider. A resonant structure is observed with a statistical significance of 9.6$σ$. A Breit-Wigner fit determines its mass to be $M_R=(2153\pm30\pm31)~{\rm{MeV}}/c^{2}$ and its width to be $Γ_{R}=(167\pm77\pm7)~\rm{MeV}$, where the first uncertainties are statistical and the second are systematic.
△ Less
Submitted 10 April, 2024;
originally announced April 2024.
-
Search for prompt production of pentaquarks in charm hadron final states
Authors:
LHCb collaboration,
R. Aaij,
A. S. W. Abdelmotteleb,
C. Abellan Beteta,
F. Abudinén,
T. Ackernley,
B. Adeva,
M. Adinolfi,
P. Adlarson,
H. Afsharnia,
C. Agapopoulou,
C. A. Aidala,
Z. Ajaltouni,
S. Akar,
K. Akiba,
P. Albicocco,
J. Albrecht,
F. Alessio,
M. Alexander,
A. Alfonso Albero,
Z. Aliouche,
P. Alvarez Cartelle,
R. Amalric,
S. Amato,
J. L. Amey
, et al. (1090 additional authors not shown)
Abstract:
A search for hidden-charm pentaquark states decaying to a range of $Σ_{c}\bar{D}$ and $Λ_{c}\bar{D}$ final states, as well as doubly-charmed pentaquark states to $Σ_{c}D$ and $Λ_{c}^{+}D$, is made using samples of proton-proton collision data corresponding to an integrated luminosity of $5.7fb^{-1}$ recorded by the LHCb detector at $\sqrt{s} = 13Te\kern -0.1em V$. Since no significant signals are…
▽ More
A search for hidden-charm pentaquark states decaying to a range of $Σ_{c}\bar{D}$ and $Λ_{c}\bar{D}$ final states, as well as doubly-charmed pentaquark states to $Σ_{c}D$ and $Λ_{c}^{+}D$, is made using samples of proton-proton collision data corresponding to an integrated luminosity of $5.7fb^{-1}$ recorded by the LHCb detector at $\sqrt{s} = 13Te\kern -0.1em V$. Since no significant signals are found, upper limits are set on the pentaquark yields relative to that of the $Λ_{c}^{+}$ baryon in the $Λ_{c}^{+}\to pK^{-}π^{+}$ decay mode. The known pentaquark states are also investigated, and their signal yields are found to be consistent with zero in all cases.
△ Less
Submitted 2 June, 2024; v1 submitted 10 April, 2024;
originally announced April 2024.
-
HRVDA: High-Resolution Visual Document Assistant
Authors:
Chaohu Liu,
Kun Yin,
Haoyu Cao,
Xinghua Jiang,
Xin Li,
Yinsong Liu,
Deqiang Jiang,
Xing Sun,
Linli Xu
Abstract:
Leveraging vast training data, multimodal large language models (MLLMs) have demonstrated formidable general visual comprehension capabilities and achieved remarkable performance across various tasks. However, their performance in visual document understanding still leaves much room for improvement. This discrepancy is primarily attributed to the fact that visual document understanding is a fine-g…
▽ More
Leveraging vast training data, multimodal large language models (MLLMs) have demonstrated formidable general visual comprehension capabilities and achieved remarkable performance across various tasks. However, their performance in visual document understanding still leaves much room for improvement. This discrepancy is primarily attributed to the fact that visual document understanding is a fine-grained prediction task. In natural scenes, MLLMs typically use low-resolution images, leading to a substantial loss of visual information. Furthermore, general-purpose MLLMs do not excel in handling document-oriented instructions. In this paper, we propose a High-Resolution Visual Document Assistant (HRVDA), which bridges the gap between MLLMs and visual document understanding. This model employs a content filtering mechanism and an instruction filtering module to separately filter out the content-agnostic visual tokens and instruction-agnostic visual tokens, thereby achieving efficient model training and inference for high-resolution images. In addition, we construct a document-oriented visual instruction tuning dataset and apply a multi-stage training strategy to enhance the model's document modeling capabilities. Extensive experiments demonstrate that our model achieves state-of-the-art performance across multiple document understanding datasets, while maintaining training efficiency and inference speed comparable to low-resolution models.
△ Less
Submitted 10 April, 2024;
originally announced April 2024.
-
Tuning-Free Adaptive Style Incorporation for Structure-Consistent Text-Driven Style Transfer
Authors:
Yanqi Ge,
Jiaqi Liu,
Qingnan Fan,
Xi Jiang,
Ye Huang,
Shuai Qin,
Hong Gu,
Wen Li,
Lixin Duan
Abstract:
In this work, we target the task of text-driven style transfer in the context of text-to-image (T2I) diffusion models. The main challenge is consistent structure preservation while enabling effective style transfer effects. The past approaches in this field directly concatenate the content and style prompts for a prompt-level style injection, leading to unavoidable structure distortions. In this w…
▽ More
In this work, we target the task of text-driven style transfer in the context of text-to-image (T2I) diffusion models. The main challenge is consistent structure preservation while enabling effective style transfer effects. The past approaches in this field directly concatenate the content and style prompts for a prompt-level style injection, leading to unavoidable structure distortions. In this work, we propose a novel solution to the text-driven style transfer task, namely, Adaptive Style Incorporation~(ASI), to achieve fine-grained feature-level style incorporation. It consists of the Siamese Cross-Attention~(SiCA) to decouple the single-track cross-attention to a dual-track structure to obtain separate content and style features, and the Adaptive Content-Style Blending (AdaBlending) module to couple the content and style information from a structure-consistent manner. Experimentally, our method exhibits much better performance in both structure preservation and stylized effects.
△ Less
Submitted 10 April, 2024;
originally announced April 2024.
-
Measurement of the Born cross section for $e^{+}e^{-}\to ηh_c $ at center-of-mass energies between 4.1 and 4.6\,GeV
Authors:
BESIII Collaboration,
M. Ablikim,
M. N. Achasov,
P. Adlarson,
O. Afedulidis,
X. C. Ai,
R. Aliberti,
A. Amoroso,
Q. An,
Y. Bai,
O. Bakina,
I. Balossino,
Y. Ban,
H. -R. Bao,
V. Batozskaya,
K. Begzsuren,
N. Berger,
M. Berlowski,
M. Bertani,
D. Bettoni,
F. Bianchi,
E. Bianco,
A. Bortone,
I. Boyko,
R. A. Briere
, et al. (634 additional authors not shown)
Abstract:
We measure the Born cross section for the reaction $e^{+}e^{-} \rightarrow ηh_c$ from $\sqrt{s} = 4.129$ to $4.600$~GeV using data sets collected by the BESIII detector running at the BEPCII collider. A resonant structure in the cross section line shape near 4.200~GeV is observed with a statistical significance of 7$σ$. The parameters of this resonance are measured to be \MeasMass\ and \MeasWidth,…
▽ More
We measure the Born cross section for the reaction $e^{+}e^{-} \rightarrow ηh_c$ from $\sqrt{s} = 4.129$ to $4.600$~GeV using data sets collected by the BESIII detector running at the BEPCII collider. A resonant structure in the cross section line shape near 4.200~GeV is observed with a statistical significance of 7$σ$. The parameters of this resonance are measured to be \MeasMass\ and \MeasWidth, where the first uncertainties are statistical and the second systematic.
△ Less
Submitted 10 April, 2024;
originally announced April 2024.
-
On Achievable Covert Communication Performance under CSI Estimation Error and Feedback Delay
Authors:
Jiaqing Bai,
Ji He,
Yan** Chen,
Yulong Shen,
Xiaohong Jiang
Abstract:
Covert communication's effectiveness critically depends on precise channel state information (CSI). This paper investigates the impact of imperfect CSI on achievable covert communication performance in a two-hop relay system. Firstly, we introduce a two-hop covert transmission scheme utilizing channel inversion power control (CIPC) to manage opportunistic interference, eliminating the receiver's s…
▽ More
Covert communication's effectiveness critically depends on precise channel state information (CSI). This paper investigates the impact of imperfect CSI on achievable covert communication performance in a two-hop relay system. Firstly, we introduce a two-hop covert transmission scheme utilizing channel inversion power control (CIPC) to manage opportunistic interference, eliminating the receiver's self-interference. Given that CSI estimation error (CEE) and feedback delay (FD) are the two primary factors leading to imperfect CSI, we construct a comprehensive theoretical model to accurately characterize their effects on CSI quality. With the aid of this model, we then derive closed-form solutions for detection error probability (DEP) and covert rate (CR), establishing an analytical framework to delineate the inherent relationship between CEE, FD, and covert performance. Furthermore, to mitigate the adverse effects of imperfect CSI on achievable covert performance, we investigate the joint optimization of channel inversion power and data symbol length to maximize CR under DEP constraints and propose an iterative alternating algorithm to solve the bi-dimensional non-convex optimization problem. Finally, extensive experimental results validate our theoretical framework and illustrate the impact of imperfect CSI on achievable covert performance.
△ Less
Submitted 8 April, 2024;
originally announced April 2024.
-
Search for the Rare Decays $D_s^+\to h^+(h^{0})e^+e^-$
Authors:
BESIII Collaboration,
M. Ablikim,
M. N. Achasov,
P. Adlarson,
O. Afedulidis,
X. C. Ai,
R. Aliberti,
A. Amoroso,
Q. An,
Y. Bai,
O. Bakina,
I. Balossino,
Y. Ban,
H. -R. Bao,
V. Batozskaya,
K. Begzsuren,
N. Berger,
M. Berlowski,
M. Bertani,
D. Bettoni,
F. Bianchi,
E. Bianco,
A. Bortone,
I. Boyko,
R. A. Briere
, et al. (618 additional authors not shown)
Abstract:
Using 7.33~fb$^{-1}$ of $e^{+}e^{-}$ collision data collected by the BESIII detector at center-of-mass energies in the range of $\sqrt{s}=4.128 - 4.226$~GeV, we search for the rare decays $D_{s}^+\to h^+(h^{0})e^{+}e^{-}$, where $h$ represents a kaon or pion. By requiring the $e^{+}e^{-}$ invariant mass to be consistent with a $φ(1020)$, $0.98<M(e^{+}e^{-})<1.04$ ~GeV/$c^2$, the decay…
▽ More
Using 7.33~fb$^{-1}$ of $e^{+}e^{-}$ collision data collected by the BESIII detector at center-of-mass energies in the range of $\sqrt{s}=4.128 - 4.226$~GeV, we search for the rare decays $D_{s}^+\to h^+(h^{0})e^{+}e^{-}$, where $h$ represents a kaon or pion. By requiring the $e^{+}e^{-}$ invariant mass to be consistent with a $φ(1020)$, $0.98<M(e^{+}e^{-})<1.04$ ~GeV/$c^2$, the decay $D_s^+\toπ^+φ,φ\to e^{+}e^{-}$ is observed with a statistical significance of 7.8$σ$, and evidence for the decay $D_s^+\toρ^+φ,φ\to e^{+}e^{-}$ is found for the first time with a statistical significance of 4.4$σ$. The decay branching fractions are measured to be $\mathcal{B}(D_s^+\toπ^+φ, φ\to e^{+}e^{-} )=(1.17^{+0.23}_{-0.21}\pm0.03)\times 10^{-5}$, and $\mathcal{B}(D_s^+\toρ^+φ, φ\to e^{+}e^{-} )=(2.44^{+0.67}_{-0.62}\pm 0.16)\times 10^{-5}$, where the first uncertainties are statistical and the second systematic. No significant signal for the three four-body decays of $D_{s}^{+}\to π^{+}π^{0}e^{+}e^{-},\ D_{s}^{+}\to K^{+}π^{0}e^{+}e^{-}$, and $D_{s}^{+}\to K_{S}^{0}π^{+}e^{+}e^{-}$ is observed. For $D_{s}^{+}\to π^{+}π^{0}e^{+}e^{-}$, the $φ$ mass region is vetoed to minimize the long-distance effects. The 90$\%$ confidence level upper limits set on the branching fractions of these decays are in the range of $(7.0-8.1)\times 10^{-5}$.
△ Less
Submitted 8 April, 2024;
originally announced April 2024.
-
Search for $η_c(2S)\to 2(π^+π^-)$ and improved measurement of $χ_{cJ}\to 2(π^+π^-)$
Authors:
BESIII Collaboration,
M. Ablikim,
M. N. Achasov,
P. Adlarson,
O. Afedulidis,
X. C. Ai,
R. Aliberti,
A. Amoroso,
Q. An,
Y. Bai,
O. Bakina,
I. Balossino,
Y. Ban,
H. -R. Bao,
V. Batozskaya,
K. Begzsuren,
N. Berger,
M. Berlowski,
M. Bertani,
D. Bettoni,
F. Bianchi,
E. Bianco,
A. Bortone,
I. Boyko,
R. A. Briere
, et al. (638 additional authors not shown)
Abstract:
We search for the hadronic decay $η_c(2S)\to 2(π^+π^-)$ in the $ψ(3686)\toγη_c(2S)$ radiative decay using $(27.12\pm 0.14)\times 10^8$ $ψ(3686)$ events collected by the BESIII detector at the BEPCII collider. No significant signal is found, and the upper limit of $\mathcal{B}[ψ(3686)\toγη_c(2S)]\mathcal{B}[η_c(2S)\to 2(π^+π^-)]$ is determined to be $0.78\times 10^{-6}$ at the 90\% confidence level…
▽ More
We search for the hadronic decay $η_c(2S)\to 2(π^+π^-)$ in the $ψ(3686)\toγη_c(2S)$ radiative decay using $(27.12\pm 0.14)\times 10^8$ $ψ(3686)$ events collected by the BESIII detector at the BEPCII collider. No significant signal is found, and the upper limit of $\mathcal{B}[ψ(3686)\toγη_c(2S)]\mathcal{B}[η_c(2S)\to 2(π^+π^-)]$ is determined to be $0.78\times 10^{-6}$ at the 90\% confidence level. Using $ψ(3686)\toγχ_{cJ}$ transitions, we also measure the branching fractions of $\mathcal{B}[χ_{cJ(J=0,1,2)}\to 2(π^+π^-)]$, which are $\mathcal{B}[χ_{c0}\to 2(π^+π^-)]=(2.127\pm 0.002~(\mathrm{stat.})\pm 0.101~(\mathrm{syst.}))$\%, $\mathcal{B}[χ_{c1}\to 2(π^+π^-)]=(0.685\pm 0.001~(\mathrm{stat.})\pm 0.031~\mathrm{syst.}))$\%, and $\mathcal{B}[χ_{c2}\to 2(π^+π^-)]=(1.153\pm 0.001~(\mathrm{stat.})\pm 0.063~(\mathrm{syst.}))$\%.
△ Less
Submitted 7 April, 2024;
originally announced April 2024.
-
LHAASO-KM2A detector simulation using Geant4
Authors:
Zhen Cao,
F. Aharonian,
Q. An,
Axikegu,
Y. X. Bai,
Y. W. Bao,
D. Bastieri,
X. J. Bi,
Y. J. Bi,
J. T. Cai,
Q. Cao,
W. Y. Cao,
Zhe Cao,
J. Chang,
J. F. Chang,
A. M. Chen,
E. S. Chen,
Liang Chen,
Lin Chen,
Long Chen,
M. J. Chen,
M. L. Chen,
Q. H. Chen,
S. H. Chen,
S. Z. Chen
, et al. (254 additional authors not shown)
Abstract:
KM2A is one of the main sub-arrays of LHAASO, working on gamma ray astronomy and cosmic ray physics at energies above 10 TeV. Detector simulation is the important foundation for estimating detector performance and data analysis. It is a big challenge to simulate the KM2A detector in the framework of Geant4 due to the need to track numerous photons from a large number of detector units (>6000) with…
▽ More
KM2A is one of the main sub-arrays of LHAASO, working on gamma ray astronomy and cosmic ray physics at energies above 10 TeV. Detector simulation is the important foundation for estimating detector performance and data analysis. It is a big challenge to simulate the KM2A detector in the framework of Geant4 due to the need to track numerous photons from a large number of detector units (>6000) with large altitude difference (30 m) and huge coverage (1.3 km^2). In this paper, the design of the KM2A simulation code G4KM2A based on Geant4 is introduced. The process of G4KM2A is optimized mainly in memory consumption to avoid memory overffow. Some simpliffcations are used to signiffcantly speed up the execution of G4KM2A. The running time is reduced by at least 30 times compared to full detector simulation. The particle distributions and the core/angle resolution comparison between simulation and experimental data of the full KM2A array are also presented, which show good agreement.
△ Less
Submitted 7 April, 2024;
originally announced April 2024.
-
Search for di-photon decays of an axion-like particle in radiative J/ψdecays
Authors:
BESIII Collaboration,
M. Ablikim,
M. N. Achasov,
P. Adlarson,
O. Afedulidis,
X. C. Ai,
R. Aliberti,
A. Amoroso,
M. R. An,
Q. An,
Y. Bai,
O. Bakina,
I. Balossino,
Y. Ban,
H. -R. Bao,
V. Batozskaya,
K. Begzsuren,
N. Berger,
M. Berlowski,
M. Bertani,
D. Bettoni,
F. Bianchi,
E. Bianco,
A. Bortone,
I. Boyko
, et al. (604 additional authors not shown)
Abstract:
We search for the di-photon decay of a light pseudoscalar axion-like particle, $a$, in radiative $J/ψ$ decays, using 10 billion $J/ψ$ events collected with the BESIII detector. We find no evidence of a signal and set upper limits at the $95\%$ confidence level on the product branching fraction $\mathcal{B}(J/ψ\to γa) \times \mathcal{B}(a \to γγ)$ and the axion-like particle photon coupling constan…
▽ More
We search for the di-photon decay of a light pseudoscalar axion-like particle, $a$, in radiative $J/ψ$ decays, using 10 billion $J/ψ$ events collected with the BESIII detector. We find no evidence of a signal and set upper limits at the $95\%$ confidence level on the product branching fraction $\mathcal{B}(J/ψ\to γa) \times \mathcal{B}(a \to γγ)$ and the axion-like particle photon coupling constant $g_{a γγ}$ in the ranges of $(3.7-48.5) \times 10^{-8}$ and $(2.2 -101.8)\times 10^{-4}$ GeV$^{-1}$, respectively, for $0.18 \le m_a \le 2.85$ GeV/$c^2$. These are the most stringent limits to date in this mass region.
△ Less
Submitted 3 July, 2024; v1 submitted 6 April, 2024;
originally announced April 2024.
-
Site-selective insulating phase in twisted bilayer Hubbard model
Authors:
Xiu-Cai Jiang,
Ze Ruan,
Yu-Zhong Zhang
Abstract:
The paramagnetic phase diagrams of the half-filled Hubbard model on a twisted bilayer square lattice are investigated using coherent potential approximation. Besides the conventional metallic, band insulating, and Mott insulating phases, we find two site-selective insulating phases where certain sites exhibit band insulating behaviors while the others display Mott insulating behaviors. These phase…
▽ More
The paramagnetic phase diagrams of the half-filled Hubbard model on a twisted bilayer square lattice are investigated using coherent potential approximation. Besides the conventional metallic, band insulating, and Mott insulating phases, we find two site-selective insulating phases where certain sites exhibit band insulating behaviors while the others display Mott insulating behaviors. These phases are identified by the band gap, the double occupancy, the density of states, as well as the imaginary part of self-energy. Furthermore, we examine the effect of on-site potential on the stability of the site-selective insulating phases. Our results indicate that fruitful site-selective phases can be engineered by twisting.
△ Less
Submitted 6 April, 2024;
originally announced April 2024.
-
GENEVIC: GENetic data Exploration and Visualization via Intelligent interactive Console
Authors:
Anindita Nath,
Savannah Mwesigwa,
Yulin Dai,
Xiaoqian Jiang,
Zhongming Zhao
Abstract:
Summary: The vast generation of genetic data poses a significant challenge in efficiently uncovering valuable knowledge. Introducing GENEVIC, an AI-driven chat framework that tackles this challenge by bridging the gap between genetic data generation and biomedical knowledge discovery. Leveraging generative AI, notably ChatGPT, it serves as a biologist's 'copilot'. It automates the analysis, retrie…
▽ More
Summary: The vast generation of genetic data poses a significant challenge in efficiently uncovering valuable knowledge. Introducing GENEVIC, an AI-driven chat framework that tackles this challenge by bridging the gap between genetic data generation and biomedical knowledge discovery. Leveraging generative AI, notably ChatGPT, it serves as a biologist's 'copilot'. It automates the analysis, retrieval, and visualization of customized domain-specific genetic information, and integrates functionalities to generate protein interaction networks, enrich gene sets, and search scientific literature from PubMed, Google Scholar, and arXiv, making it a comprehensive tool for biomedical research. In its pilot phase, GENEVIC is assessed using a curated database that ranks genetic variants associated with Alzheimer's disease, schizophrenia, and cognition, based on their effect weights from the Polygenic Score Catalog, thus enabling researchers to prioritize genetic variants in complex diseases. GENEVIC's operation is user-friendly, accessible without any specialized training, secured by Azure OpenAI's HIPAA-compliant infrastructure, and evaluated for its efficacy through real-time query testing. As a prototype, GENEVIC is set to advance genetic research, enabling informed biomedical decisions.
Availability and implementation: GENEVIC is publicly accessible at https://genevic-anath2024.streamlit.app. The underlying code is open-source and available via GitHub at https://github.com/anath2110/GENEVIC.git.
△ Less
Submitted 4 April, 2024;
originally announced April 2024.
-
Search for the $B_s^0 \rightarrow μ^+μ^-γ$ decay
Authors:
LHCb collaboration,
R. Aaij,
A. S. W. Abdelmotteleb,
C. Abellan Beteta,
F. Abudinén,
T. Ackernley,
A. A. Adefisoye,
B. Adeva,
M. Adinolfi,
P. Adlarson,
C. Agapopoulou,
C. A. Aidala,
Z. Ajaltouni,
S. Akar,
K. Akiba,
P. Albicocco,
J. Albrecht,
F. Alessio,
M. Alexander,
Z. Aliouche,
P. Alvarez Cartelle,
R. Amalric,
S. Amato,
J. L. Amey,
Y. Amhis
, et al. (1068 additional authors not shown)
Abstract:
A search for the fully reconstructed $B_s^0 \rightarrow μ^+μ^-γ$ decay is performed at the LHCb experiment using proton-proton collisions at $\sqrt{s}=13$\,TeV corresponding to an integrated luminosity of $5.4\,\mathrm{fb^{-1}}$. No significant signal is found and upper limits on the branching fraction in intervals of the dimuon mass are set
\begin{align}
{\cal B}(B_s^0 \rightarrow μ^+μ^-γ) <…
▽ More
A search for the fully reconstructed $B_s^0 \rightarrow μ^+μ^-γ$ decay is performed at the LHCb experiment using proton-proton collisions at $\sqrt{s}=13$\,TeV corresponding to an integrated luminosity of $5.4\,\mathrm{fb^{-1}}$. No significant signal is found and upper limits on the branching fraction in intervals of the dimuon mass are set
\begin{align}
{\cal B}(B_s^0 \rightarrow μ^+μ^-γ) < 4.2\times10^{-8},~&m(μμ)\in[2m_μ,~1.70]\,\mathrm{GeV/c^2} ,\nonumber
{\cal B}(B_s^0 \rightarrow μ^+μ^-γ) < 7.7\times10^{-8},~&m(μμ)\in[1.70,~2.88]\,\mathrm{GeV/c^2},\nonumber
{\cal B}(B_s^0 \rightarrow μ^+μ^-γ) < 4.2\times10^{-8},~&m(μμ)\in[3.92 ,~m_{B_s^0}]\,\mathrm{GeV/c^2},\nonumber \end{align} at 95\% confidence level. Additionally, upper limits are set on the branching fraction in the $[2m_μ,~1.70]\,\mathrm{GeV/c^2}$ dimuon mass region excluding the contribution from the intermediate $φ(1020)$ meson, and in the region combining all dimuon-mass intervals.
△ Less
Submitted 4 April, 2024;
originally announced April 2024.
-
Foundation Model for Advancing Healthcare: Challenges, Opportunities, and Future Directions
Authors:
Yuting He,
Fuxiang Huang,
Xinrui Jiang,
Yuxiang Nie,
Minghao Wang,
Jiguang Wang,
Hao Chen
Abstract:
Foundation model, which is pre-trained on broad data and is able to adapt to a wide range of tasks, is advancing healthcare. It promotes the development of healthcare artificial intelligence (AI) models, breaking the contradiction between limited AI models and diverse healthcare practices. Much more widespread healthcare scenarios will benefit from the development of a healthcare foundation model…
▽ More
Foundation model, which is pre-trained on broad data and is able to adapt to a wide range of tasks, is advancing healthcare. It promotes the development of healthcare artificial intelligence (AI) models, breaking the contradiction between limited AI models and diverse healthcare practices. Much more widespread healthcare scenarios will benefit from the development of a healthcare foundation model (HFM), improving their advanced intelligent healthcare services. Despite the impending widespread deployment of HFMs, there is currently a lack of clear understanding about how they work in the healthcare field, their current challenges, and where they are headed in the future. To answer these questions, a comprehensive and deep survey of the challenges, opportunities, and future directions of HFMs is presented in this survey. It first conducted a comprehensive overview of the HFM including the methods, data, and applications for a quick grasp of the current progress. Then, it made an in-depth exploration of the challenges present in data, algorithms, and computing infrastructures for constructing and widespread application of foundation models in healthcare. This survey also identifies emerging and promising directions in this field for future development. We believe that this survey will enhance the community's comprehension of the current progress of HFM and serve as a valuable source of guidance for future development in this field. The latest HFM papers and related resources are maintained on our website: https://github.com/YutingHe-list/Awesome-Foundation-Models-for-Advancing-Healthcare.
△ Less
Submitted 4 April, 2024;
originally announced April 2024.
-
Evidence of the $h_c\to K_S^0 K^+π^-+c.c.$ decay
Authors:
BESIII Collaboration,
M. Ablikim,
M. N. Achasov,
P. Adlarson,
O. Afedulidis,
X. C. Ai,
R. Aliberti,
A. Amoroso,
Q. An,
Y. Bai,
O. Bakina,
I. Balossino,
Y. Ban,
H. -R. Bao,
V. Batozskaya,
K. Begzsuren,
N. Berger,
M. Berlowski,
M. Bertani,
D. Bettoni,
F. Bianchi,
E. Bianco,
A. Bortone,
I. Boyko,
R. A. Briere
, et al. (638 additional authors not shown)
Abstract:
Based on $(2.712\pm0.014)\times10^9$ $ψ(3686)$ events collected by the BESIII collaboration, evidence of the hadronic decay $h_c\to K_S^0K^+π^-+c.c.$ is found with a significance of $4.3σ$ in the $ψ(3686)\toπ^0 h_c$ process. The branching fraction of $h_c\to K_S^0 K^+π^- +c.c.$ is measured to be $(7.3\pm0.8\pm1.8)\times10^{-4}$, where the first and second uncertainties are statistical and systemat…
▽ More
Based on $(2.712\pm0.014)\times10^9$ $ψ(3686)$ events collected by the BESIII collaboration, evidence of the hadronic decay $h_c\to K_S^0K^+π^-+c.c.$ is found with a significance of $4.3σ$ in the $ψ(3686)\toπ^0 h_c$ process. The branching fraction of $h_c\to K_S^0 K^+π^- +c.c.$ is measured to be $(7.3\pm0.8\pm1.8)\times10^{-4}$, where the first and second uncertainties are statistical and systematic, respectively. Combining with the exclusive decay width of $η_c\to K\bar{K}π$, our result indicates inconsistencies with both pQCD and NRQCD predictions.
△ Less
Submitted 4 April, 2024;
originally announced April 2024.
-
Search for $C$-even states decaying to $D_{s}^{\pm}D_{s}^{*\mp}$ with masses between $4.08$ and $4.32$ $\rm GeV/{\it c}^{2}$
Authors:
BESIII Collaboration,
M. Ablikim,
M. N. Achasov,
P. Adlarson,
O. Afedulidis,
X. C. Ai,
R. Aliberti,
A. Amoroso,
Q. An,
Y. Bai,
O. Bakina,
I. Balossino,
Y. Ban,
H. -R. Bao,
V. Batozskaya,
K. Begzsuren,
N. Berger,
M. Berlowski,
M. Bertani,
D. Bettoni,
F. Bianchi,
E. Bianco,
A. Bortone,
I. Boyko,
R. A. Briere
, et al. (638 additional authors not shown)
Abstract:
Six $C$-even states, denoted as $X$, with quantum numbers $J^{PC}=0^{-+}$, $1^{\pm+}$, or $2^{\pm+}$, are searched for via the $e^+e^-\toγD_{s}^{\pm}D_{s}^{*\mp}$ process using $(1667.39\pm8.84)~\mathrm{pb}^{-1}$ of $e^+e^-$ collision data collected with the BESIII detector operating at the BEPCII storage ring at center-of-mass energy of $\sqrt{s}=(4681.92\pm0.30)~\mathrm{MeV}$. No statistically s…
▽ More
Six $C$-even states, denoted as $X$, with quantum numbers $J^{PC}=0^{-+}$, $1^{\pm+}$, or $2^{\pm+}$, are searched for via the $e^+e^-\toγD_{s}^{\pm}D_{s}^{*\mp}$ process using $(1667.39\pm8.84)~\mathrm{pb}^{-1}$ of $e^+e^-$ collision data collected with the BESIII detector operating at the BEPCII storage ring at center-of-mass energy of $\sqrt{s}=(4681.92\pm0.30)~\mathrm{MeV}$. No statistically significant signal is observed in the mass range from $4.08$ to $4.32~\mathrm{GeV}/c^{2}$. The upper limits of $σ[e^+e^-\toγX]\cdot \mathcal{B}[X \to D_{s}^{\pm}D_{s}^{*\mp}]$ at a $90\%$ confidence level are determined.
△ Less
Submitted 2 April, 2024;
originally announced April 2024.
-
Spin-UP: Spin Light for Natural Light Uncalibrated Photometric Stereo
Authors:
Zongrui Li,
Zhan Lu,
Haojie Yan,
Boxin Shi,
Gang Pan,
Qian Zheng,
Xudong Jiang
Abstract:
Natural Light Uncalibrated Photometric Stereo (NaUPS) relieves the strict environment and light assumptions in classical Uncalibrated Photometric Stereo (UPS) methods. However, due to the intrinsic ill-posedness and high-dimensional ambiguities, addressing NaUPS is still an open question. Existing works impose strong assumptions on the environment lights and objects' material, restricting the effe…
▽ More
Natural Light Uncalibrated Photometric Stereo (NaUPS) relieves the strict environment and light assumptions in classical Uncalibrated Photometric Stereo (UPS) methods. However, due to the intrinsic ill-posedness and high-dimensional ambiguities, addressing NaUPS is still an open question. Existing works impose strong assumptions on the environment lights and objects' material, restricting the effectiveness in more general scenarios. Alternatively, some methods leverage supervised learning with intricate models while lacking interpretability, resulting in a biased estimation. In this work, we proposed Spin Light Uncalibrated Photometric Stereo (Spin-UP), an unsupervised method to tackle NaUPS in various environment lights and objects. The proposed method uses a novel setup that captures the object's images on a rotatable platform, which mitigates NaUPS's ill-posedness by reducing unknowns and provides reliable priors to alleviate NaUPS's ambiguities. Leveraging neural inverse rendering and the proposed training strategies, Spin-UP recovers surface normals, environment light, and isotropic reflectance under complex natural light with low computational cost. Experiments have shown that Spin-UP outperforms other supervised / unsupervised NaUPS methods and achieves state-of-the-art performance on synthetic and real-world datasets. Codes and data are available at https://github.com/LMozart/CVPR2024-SpinUP.
△ Less
Submitted 1 April, 2024;
originally announced April 2024.
-
The radiative decay of scalar glueball from lattice QCD
Authors:
**tao Zou,
Long-Cheng Gui,
Ying Chen,
Jian Liang,
Xiangyu Jiang,
Wen Qin
Abstract:
We perform the first lattice QCD study on the radiative decay of the scalar glueball to the vector meson $φ$ in the quenched approximation. The calculations are carried out on three gauge ensembles with different lattice spaicings, which enable us to do the continuum extrapolation. We first revisit the radiative $J/ψ$ decay into the scalar glueball $G$ and obtain the partial decay width…
▽ More
We perform the first lattice QCD study on the radiative decay of the scalar glueball to the vector meson $φ$ in the quenched approximation. The calculations are carried out on three gauge ensembles with different lattice spaicings, which enable us to do the continuum extrapolation. We first revisit the radiative $J/ψ$ decay into the scalar glueball $G$ and obtain the partial decay width $Γ(J/ψ\to γG)=0.449(44)~\text{keV}$ and the branching fraction $\text{Br}(J/ψ\to γG) = 4.8(5)\times 10^{-3}$, which are in agreement with the previous lattice results. We then extend the similar calculation to the process $G\to γφ$ and get the partial decay width $Γ(G \to γφ)= 0.074(47)~\text{keV}$, which implies that the combined branching fraction of $J/ψ\toγG\to γγφ$ is as small as $\mathcal{O}(10^{-9})$ such that this process is hardly detected by the BESIII experiment even with the large $J/ψ$ sample of $\mathcal{O}(10^{10})$. With the vector meson dominance model, the two-photon decay width of the scalar glueball is estimated to be $Γ(G\toγγ)=0.53(46)~\text{eV}$, which results in a large stickiness $S(G)\sim \mathcal{O}(10^4)$ of the scalar glueball by assuming the stickiness of $f_2(1270)$ to be one.
△ Less
Submitted 4 April, 2024; v1 submitted 1 April, 2024;
originally announced April 2024.
-
LLM-RadJudge: Achieving Radiologist-Level Evaluation for X-Ray Report Generation
Authors:
Zilong Wang,
Xufang Luo,
Xinyang Jiang,
Dongsheng Li,
Lili Qiu
Abstract:
Evaluating generated radiology reports is crucial for the development of radiology AI, but existing metrics fail to reflect the task's clinical requirements. This study proposes a novel evaluation framework using large language models (LLMs) to compare radiology reports for assessment. We compare the performance of various LLMs and demonstrate that, when using GPT-4, our proposed metric achieves e…
▽ More
Evaluating generated radiology reports is crucial for the development of radiology AI, but existing metrics fail to reflect the task's clinical requirements. This study proposes a novel evaluation framework using large language models (LLMs) to compare radiology reports for assessment. We compare the performance of various LLMs and demonstrate that, when using GPT-4, our proposed metric achieves evaluation consistency close to that of radiologists. Furthermore, to reduce costs and improve accessibility, making this method practical, we construct a dataset using LLM evaluation results and perform knowledge distillation to train a smaller model. The distilled model achieves evaluation capabilities comparable to GPT-4. Our framework and distilled model offer an accessible and efficient evaluation method for radiology report generation, facilitating the development of more clinically relevant models. The model will be further open-sourced and accessible.
△ Less
Submitted 1 April, 2024;
originally announced April 2024.
-
Spontaneous charge-ordered state in Bernal-stacked bilayer graphene
Authors:
Xiu-Cai Jiang,
Ze-Yi Song,
Ze Ruan,
Yu-Zhong Zhang
Abstract:
We propose that a weakly spontaneous charge-ordered insulating state probably exists in Bernal-stacked bilayer graphene which can account for experimentally observed non-monotonic behavior of resistance as a function of the gated field, namely, the gap closes and reopens at a critical gated field. The underlying physics is demonstrated by a simple model on a corresponding lattice that contains the…
▽ More
We propose that a weakly spontaneous charge-ordered insulating state probably exists in Bernal-stacked bilayer graphene which can account for experimentally observed non-monotonic behavior of resistance as a function of the gated field, namely, the gap closes and reopens at a critical gated field. The underlying physics is demonstrated by a simple model on a corresponding lattice that contains the nearest intralayer and interlayer hop**s, electric field, and staggered potential between different sublattices. Combining density functional theory calculations with model analyses, we argue that the interlayer van der Waals interactions cooperating with ripples may be responsible for the staggered potential which induces a charge-ordered insulating state in the absence of the electric field.
△ Less
Submitted 31 March, 2024;
originally announced April 2024.
-
Restriction-induced time-dependent transcytolemmal water exchange: Revisiting the Kärger exchange model
Authors:
Diwei Shi,
Fan Liu,
Sisi Li,
Li Chen,
Xiaoyu Jiang,
John C. Gore,
Quanshui Zheng,
Hua Guo,
Junzhong Xu
Abstract:
The Kärger model and its derivatives have been widely used to incorporate transcytolemmal water exchange rate, an essential characteristic of living cells, into analyses of diffusion MRI (dMRI) signals from tissues. The Kärger model consists of two homogeneous exchanging components coupled by an exchange rate constant and assumes measurements are made with sufficiently long diffusion time and slow…
▽ More
The Kärger model and its derivatives have been widely used to incorporate transcytolemmal water exchange rate, an essential characteristic of living cells, into analyses of diffusion MRI (dMRI) signals from tissues. The Kärger model consists of two homogeneous exchanging components coupled by an exchange rate constant and assumes measurements are made with sufficiently long diffusion time and slow water exchange. Despite successful applications, it remains unclear whether these assumptions are generally valid for practical dMRI sequences and biological tissues. In particular, barrier-induced restrictions to diffusion produce inhomogeneous magnetization distributions in relatively large-sized compartments such as cancer cells, violating the above assumptions. The effects of this inhomogeneity are usually overlooked. We performed computer simulations to quantify how restriction effects, which in images produce edge enhancements at compartment boundaries, influence different variants of the Kärger-model. The results show that the edge enhancement effect will produce larger, time-dependent estimates of exchange rates in e.g., tumors with relatively large cell sizes (>10 μm), resulting in overestimations of water exchange as previously reported. Moreover, stronger diffusion gradients, longer diffusion gradient durations, and larger cell sizes, all cause more pronounced edge enhancement effects. This helps us to better understand the feasibility of the Kärger model in estimating water exchange in different tissue types and provides useful guidance on signal acquisition methods that may mitigate the edge enhancement effect. This work also indicates the need to correct the overestimated transcytolemmal water exchange rates obtained assuming the Kärger-model.
△ Less
Submitted 31 March, 2024;
originally announced April 2024.
-
Gate-tunable quantum acoustoelectric transport in graphene
Authors:
Yicheng Mou,
Haonan Chen,
Jiaqi Liu,
Qing Lan,
Jiayu Wang,
Chuanxin Zhang,
Yuxiang Wang,
Jiaming Gu,
Tuoyu Zhao,
Xue Jiang,
Wu Shi,
Cheng Zhang
Abstract:
Transport probes the motion of quasiparticles in response to external excitations. Apart from the well-known electric and thermoelectric transport, acoustoelectric transport induced by traveling acoustic waves has been rarely explored. Here, by adopting a hybrid nanodevices integrated with piezoelectric substrates, we establish a simple design of acoustoelectric transport with gate tunability. We…
▽ More
Transport probes the motion of quasiparticles in response to external excitations. Apart from the well-known electric and thermoelectric transport, acoustoelectric transport induced by traveling acoustic waves has been rarely explored. Here, by adopting a hybrid nanodevices integrated with piezoelectric substrates, we establish a simple design of acoustoelectric transport with gate tunability. We fabricate dual-gated acoustoelectric devices based on BN-encapsuled graphene on LiNbO3. Longitudinal and transverse acoustoelectric voltages are generated by launching pulsed surface acoustic wave. The gate dependence of zero-field longitudinal acoustoelectric signal presents strikingly similar profiles as that of Hall resistivity, providing a valid approach for extracting carrier density without magnetic field. In magnetic fields, acoustoelectric quantum oscillations appear due to Landau quantization, which are more robust and pronounced than Shubnikov-de Haas oscillations. Our work demonstrates a feasible acoustoelectric setup with gate tunability, which can be extended to the broad scope of various Van der Waals materials.
△ Less
Submitted 29 March, 2024;
originally announced March 2024.
-
Negative Label Guided OOD Detection with Pretrained Vision-Language Models
Authors:
Xue Jiang,
Feng Liu,
Zhen Fang,
Hong Chen,
Tongliang Liu,
Feng Zheng,
Bo Han
Abstract:
Out-of-distribution (OOD) detection aims at identifying samples from unknown classes, playing a crucial role in trustworthy models against errors on unexpected inputs. Extensive research has been dedicated to exploring OOD detection in the vision modality. Vision-language models (VLMs) can leverage both textual and visual information for various multi-modal applications, whereas few OOD detection…
▽ More
Out-of-distribution (OOD) detection aims at identifying samples from unknown classes, playing a crucial role in trustworthy models against errors on unexpected inputs. Extensive research has been dedicated to exploring OOD detection in the vision modality. Vision-language models (VLMs) can leverage both textual and visual information for various multi-modal applications, whereas few OOD detection methods take into account information from the text modality. In this paper, we propose a novel post hoc OOD detection method, called NegLabel, which takes a vast number of negative labels from extensive corpus databases. We design a novel scheme for the OOD score collaborated with negative labels. Theoretical analysis helps to understand the mechanism of negative labels. Extensive experiments demonstrate that our method NegLabel achieves state-of-the-art performance on various OOD detection benchmarks and generalizes well on multiple VLM architectures. Furthermore, our method NegLabel exhibits remarkable robustness against diverse domain shifts. The codes are available at https://github.com/tmlr-group/NegLabel.
△ Less
Submitted 29 March, 2024;
originally announced March 2024.
-
Beyond Talking -- Generating Holistic 3D Human Dyadic Motion for Communication
Authors:
Mingze Sun,
Chao Xu,
Xinyu Jiang,
Yang Liu,
Baigui Sun,
Ruqi Huang
Abstract:
In this paper, we introduce an innovative task focused on human communication, aiming to generate 3D holistic human motions for both speakers and listeners. Central to our approach is the incorporation of factorization to decouple audio features and the combination of textual semantic information, thereby facilitating the creation of more realistic and coordinated movements. We separately train VQ…
▽ More
In this paper, we introduce an innovative task focused on human communication, aiming to generate 3D holistic human motions for both speakers and listeners. Central to our approach is the incorporation of factorization to decouple audio features and the combination of textual semantic information, thereby facilitating the creation of more realistic and coordinated movements. We separately train VQ-VAEs with respect to the holistic motions of both speaker and listener. We consider the real-time mutual influence between the speaker and the listener and propose a novel chain-like transformer-based auto-regressive model specifically designed to characterize real-world communication scenarios effectively which can generate the motions of both the speaker and the listener simultaneously. These designs ensure that the results we generate are both coordinated and diverse. Our approach demonstrates state-of-the-art performance on two benchmark datasets. Furthermore, we introduce the HoCo holistic communication dataset, which is a valuable resource for future research. Our HoCo dataset and code will be released for research purposes upon acceptance.
△ Less
Submitted 28 March, 2024;
originally announced March 2024.
-
Measurement of absolute branching fractions of $D_s^+$ hadronic decays
Authors:
BESIII Collaboration,
M. Ablikim,
M. N. Achasov,
P. Adlarson,
O. Afedulidis,
X. C. Ai,
R. Aliberti,
A. Amoroso,
Q. An,
Y. Bai,
O. Bakina,
I. Balossino,
Y. Ban,
H. -R. Bao,
V. Batozskaya,
K. Begzsuren,
N. Berger,
M. Berlowski,
M. Bertani,
D. Bettoni,
F. Bianchi,
E. Bianco,
A. Bortone,
I. Boyko,
R. A. Briere
, et al. (632 additional authors not shown)
Abstract:
Using $e^+ e^-$ collision data collected at the BESIII detector at center-of-mass energies between 4.128 and 4.226 GeV, corresponding to an integrated luminosity of $7.33~{\rm fb}^{-1}$, we determine the absolute branching fractions of fifteen hadronic $D_s^{+}$ decays with a double-tag technique. In particular, we make precise measurements of the branching fractions…
▽ More
Using $e^+ e^-$ collision data collected at the BESIII detector at center-of-mass energies between 4.128 and 4.226 GeV, corresponding to an integrated luminosity of $7.33~{\rm fb}^{-1}$, we determine the absolute branching fractions of fifteen hadronic $D_s^{+}$ decays with a double-tag technique. In particular, we make precise measurements of the branching fractions $\mathcal{B}(D_s^+ \to K^+ K^- π^+)=(5.49 \pm 0.04 \pm 0.07)\%$, $\mathcal{B}(D_s^+ \to K_S^0 K^+)=(1.50 \pm 0.01 \pm 0.01)\%$ and $\mathcal{B}(D_s^+ \to K^+ K^- π^+ π^0)=(5.50 \pm 0.05 \pm 0.11)\%$, where the first uncertainties are statistical and the second ones are systematic. The \emph{CP} asymmetries in these decays are also measured and all are found to be compatible with zero.
△ Less
Submitted 30 May, 2024; v1 submitted 28 March, 2024;
originally announced March 2024.
-
Observation of the semileptonic decays $D^0\rightarrow K_S^0π^-π^0 e^+ ν_e$ and $D^+\rightarrow K_S^0π^+π^- e^+ ν_e$
Authors:
BESIII Collaboration,
M. Ablikim,
M. N. Achasov,
P. Adlarson,
X. C. Ai,
R. Aliberti,
A. Amoroso,
M. R. An,
Q. An,
Y. Bai,
O. Bakina,
I. Balossino,
Y. Ban,
V. Batozskaya,
K. Begzsuren,
N. Berger,
M. Berlowski,
M. Bertani,
D. Bettoni,
F. Bianchi,
E. Bianco,
A. Bortone,
I. Boyko,
R. A. Briere,
A. Brueggemann
, et al. (600 additional authors not shown)
Abstract:
By analyzing $e^+e^-$ annihilation data corresponding to an integrated luminosity of 2.93 $\rm fb^{-1}$ collected at a center-of-mass energy of 3.773 GeV with the \text{BESIII} detector, the first observation of the semileptonic decays $D^0\rightarrow K_S^0π^-π^0 e^+ ν_e$ and $D^+\rightarrow K_S^0π^+π^- e^+ ν_e$ is reported. With a dominant hadronic contribution from $K_1(1270)$, the branching fra…
▽ More
By analyzing $e^+e^-$ annihilation data corresponding to an integrated luminosity of 2.93 $\rm fb^{-1}$ collected at a center-of-mass energy of 3.773 GeV with the \text{BESIII} detector, the first observation of the semileptonic decays $D^0\rightarrow K_S^0π^-π^0 e^+ ν_e$ and $D^+\rightarrow K_S^0π^+π^- e^+ ν_e$ is reported. With a dominant hadronic contribution from $K_1(1270)$, the branching fractions are measured to be $\mathcal{B}(D^0\rightarrow {K}_1(1270)^-(\to K^0_Sπ^-π^0)e^+ν_e)=(1.69^{+0.53}_{-0.46}\pm0.15)\times10^{-4}$ and $\mathcal{B}(D^+\to \bar{K}_1(1270)^0(\to K^0_Sπ^+π^-)e^+ν_e)=(1.47^{+0.45}_{-0.40}\pm0.20)\times10^{-4}$ with statistical significance of 5.4$σ$ and 5.6$σ$, respectively. When combined with measurements of the $K_1(1270)\to K^+π^-π$ decays, the absolute branching fractions are determined to be $\mathcal{B}(D^0\to K_1(1270)^-e^+ν_e)=(1.05^{+0.33}_{-0.28}\pm0.12\pm0.12)\times10^{-3}$ and $\mathcal{B}(D^+\to \bar{K}_1(1270)^0e^+ν_e)=(1.29^{+0.40}_{-0.35}\pm0.18\pm0.15)\times10^{-3}$. The first and second uncertainties are statistical and systematic, respectively, and the third uncertainties originate from the assumed branching fractions of the $K_1(1270)\to Kππ$ decays.
△ Less
Submitted 27 March, 2024;
originally announced March 2024.
-
PDNNet: PDN-Aware GNN-CNN Heterogeneous Network for Dynamic IR Drop Prediction
Authors:
Yuxiang Zhao,
Zhuomin Chai,
Xun Jiang,
Yibo Lin,
Runsheng Wang,
Ru Huang
Abstract:
IR drop on the power delivery network (PDN) is closely related to PDN's configuration and cell current consumption. As the integrated circuit (IC) design is growing larger, dynamic IR drop simulation becomes computationally unaffordable and machine learning based IR drop prediction has been explored as a promising solution. Although CNN-based methods have been adapted to IR drop prediction task in…
▽ More
IR drop on the power delivery network (PDN) is closely related to PDN's configuration and cell current consumption. As the integrated circuit (IC) design is growing larger, dynamic IR drop simulation becomes computationally unaffordable and machine learning based IR drop prediction has been explored as a promising solution. Although CNN-based methods have been adapted to IR drop prediction task in several works, the shortcomings of overlooking PDN configuration is non-negligible. In this paper, we consider not only how to properly represent cell-PDN relation, but also how to model IR drop following its physical nature in the feature aggregation procedure. Thus, we propose a novel graph structure, PDNGraph, to unify the representations of the PDN structure and the fine-grained cell-PDN relation. We further propose a dual-branch heterogeneous network, PDNNet, incorporating two parallel GNN-CNN branches to favorably capture the above features during the learning process. Several key designs are presented to make the dynamic IR drop prediction highly effective and interpretable. We are the first work to apply graph structure to deep-learning based dynamic IR drop prediction method. Experiments show that PDNNet outperforms the state-of-the-art CNN-based methods by up to 39.3% reduction in prediction error and achieves 545x speedup compared to the commercial tool, which demonstrates the superiority of our method.
△ Less
Submitted 27 March, 2024;
originally announced March 2024.
-
Dual-path Mamba: Short and Long-term Bidirectional Selective Structured State Space Models for Speech Separation
Authors:
Xilin Jiang,
Cong Han,
Nima Mesgarani
Abstract:
Transformers have been the most successful architecture for various speech modeling tasks, including speech separation. However, the self-attention mechanism in transformers with quadratic complexity is inefficient in computation and memory. Recent models incorporate new layers and modules along with transformers for better performance but also introduce extra model complexity. In this work, we re…
▽ More
Transformers have been the most successful architecture for various speech modeling tasks, including speech separation. However, the self-attention mechanism in transformers with quadratic complexity is inefficient in computation and memory. Recent models incorporate new layers and modules along with transformers for better performance but also introduce extra model complexity. In this work, we replace transformers with Mamba, a selective state space model, for speech separation. We propose dual-path Mamba, which models short-term and long-term forward and backward dependency of speech signals using selective state spaces. Our experimental results on the WSJ0-2mix data show that our dual-path Mamba models of comparably smaller sizes outperform state-of-the-art RNN model DPRNN, CNN model WaveSplit, and transformer model Sepformer. Code: https://github.com/xi-j/Mamba-TasNet
△ Less
Submitted 30 April, 2024; v1 submitted 27 March, 2024;
originally announced March 2024.
-
Random-coupled Neural Network
Authors:
Haoran Liu,
Mingzhe Liu,
Peng Li,
Jiahui Wu,
Xin Jiang,
Zhuo Zuo,
Bingqi Liu
Abstract:
Improving the efficiency of current neural networks and modeling them in biological neural systems have become popular research directions in recent years. Pulse-coupled neural network (PCNN) is a well applicated model for imitating the computation characteristics of the human brain in computer vision and neural network fields. However, differences between the PCNN and biological neural systems re…
▽ More
Improving the efficiency of current neural networks and modeling them in biological neural systems have become popular research directions in recent years. Pulse-coupled neural network (PCNN) is a well applicated model for imitating the computation characteristics of the human brain in computer vision and neural network fields. However, differences between the PCNN and biological neural systems remain: limited neural connection, high computational cost, and lack of stochastic property. In this study, random-coupled neural network (RCNN) is proposed. It overcomes these difficulties in PCNN's neuromorphic computing via a random inactivation process. This process randomly closes some neural connections in the RCNN model, realized by the random inactivation weight matrix of link input. This releases the computational burden of PCNN, making it affordable to achieve vast neural connections. Furthermore, the image and video processing mechanisms of RCNN are researched. It encodes constant stimuli as periodic spike trains and periodic stimuli as chaotic spike trains, the same as biological neural information encoding characteristics. Finally, the RCNN is applicated to image segmentation, fusion, and pulse shape discrimination subtasks. It is demonstrated to be robust, efficient, and highly anti-noised, with outstanding performance in all applications mentioned above.
△ Less
Submitted 26 March, 2024;
originally announced March 2024.
-
Cross section measurement of $e^+e^-\to ηψ(2S)$ and search for $e^+e^-\toη\tilde{X}(3872)$
Authors:
BESIII Collaboration,
M. Ablikim,
M. N. Achasov,
P. Adlarson,
O. Afedulidis,
X. C. Ai,
R. Aliberti,
A. Amoroso,
Q. An,
Y. Bai,
O. Bakina,
I. Balossino,
Y. Ban,
H. -R. Bao,
V. Batozskaya,
K. Begzsuren,
N. Berger,
M. Berlowski,
M. Bertani,
D. Bettoni,
F. Bianchi,
E. Bianco,
A. Bortone,
I. Boyko,
R. A. Briere
, et al. (639 additional authors not shown)
Abstract:
The energy-dependent cross section for $e^+e^-\to ηψ(2S)$ is measured at eighteen center of mass energies from 4.288 GeV to 4.951 GeV using the BESIII detector. Using the same data samples, we also perform the first search for the reaction $e^+e^-\toη\tilde{X}(3872)$, but no evidence is found for the $\tilde{X}(3872)$ in the $π^+π^- J/ψ$ mass distribution. At each of the eighteen center of mass en…
▽ More
The energy-dependent cross section for $e^+e^-\to ηψ(2S)$ is measured at eighteen center of mass energies from 4.288 GeV to 4.951 GeV using the BESIII detector. Using the same data samples, we also perform the first search for the reaction $e^+e^-\toη\tilde{X}(3872)$, but no evidence is found for the $\tilde{X}(3872)$ in the $π^+π^- J/ψ$ mass distribution. At each of the eighteen center of mass energies, upper limits at the 90\% confidence level on the cross section for $e^+e^-\toηψ(2S)$ and on the product of the $e^+e^-\toη\tilde{X}(3872)$ cross section with the branching fraction of $\tilde{X}(3872)\toπ^+π^- J/ψ$ are reported.
△ Less
Submitted 25 March, 2024;
originally announced March 2024.
-
Visually Guided Generative Text-Layout Pre-training for Document Intelligence
Authors:
Zhiming Mao,
Haoli Bai,
Lu Hou,
Jiansheng Wei,
Xin Jiang,
Qun Liu,
Kam-Fai Wong
Abstract:
Prior study shows that pre-training techniques can boost the performance of visual document understanding (VDU), which typically requires models to gain abilities to perceive and reason both document texts and layouts (e.g., locations of texts and table-cells). To this end, we propose visually guided generative text-layout pre-training, named ViTLP. Given a document image, the model optimizes hier…
▽ More
Prior study shows that pre-training techniques can boost the performance of visual document understanding (VDU), which typically requires models to gain abilities to perceive and reason both document texts and layouts (e.g., locations of texts and table-cells). To this end, we propose visually guided generative text-layout pre-training, named ViTLP. Given a document image, the model optimizes hierarchical language and layout modeling objectives to generate the interleaved text and layout sequence. In addition, to address the limitation of processing long documents by Transformers, we introduce a straightforward yet effective multi-segment generative pre-training scheme, facilitating ViTLP to process word-intensive documents of any length. ViTLP can function as a native OCR model to localize and recognize texts of document images. Besides, ViTLP can be effectively applied to various downstream VDU tasks. Extensive experiments show that ViTLP achieves competitive performance over existing baselines on benchmark VDU tasks, including information extraction, document classification, and document question answering.
△ Less
Submitted 27 March, 2024; v1 submitted 25 March, 2024;
originally announced March 2024.
-
HyPer-EP: Meta-Learning Hybrid Personalized Models for Cardiac Electrophysiology
Authors:
Xiajun Jiang,
Sumeet Vadhavkar,
Yubo Ye,
Maryam Toloubidokhti,
Ryan Missel,
Linwei Wang
Abstract:
Personalized virtual heart models have demonstrated increasing potential for clinical use, although the estimation of their parameters given patient-specific data remain a challenge. Traditional physics-based modeling approaches are computationally costly and often neglect the inherent structural errors in these models due to model simplifications and assumptions. Modern deep learning approaches,…
▽ More
Personalized virtual heart models have demonstrated increasing potential for clinical use, although the estimation of their parameters given patient-specific data remain a challenge. Traditional physics-based modeling approaches are computationally costly and often neglect the inherent structural errors in these models due to model simplifications and assumptions. Modern deep learning approaches, on the other hand, rely heavily on data supervision and lacks interpretability. In this paper, we present a novel hybrid modeling framework to describe a personalized cardiac digital twin as a combination of a physics-based known expression augmented by neural network modeling of its unknown gap to reality. We then present a novel meta-learning framework to enable the separate identification of both the physics-based and neural components in the hybrid model. We demonstrate the feasibility and generality of this hybrid modeling framework with two examples of instantiations and their proof-of-concept in synthetic experiments.
△ Less
Submitted 14 March, 2024;
originally announced March 2024.
-
Precise measurement of the $e^+e^-\to D_s^+D_s^-$ cross sections at center-of-mass energies from threshold to 4.95 GeV
Authors:
BESIII Collaboration,
M. Ablikim,
M. N. Achasov,
P. Adlarson,
O. Afedulidis,
X. C. Ai,
R. Aliberti,
A. Amoroso,
Q. An,
Y. Bai,
O. Bakina,
I. Balossino,
Y. Ban,
H. -R. Bao,
V. Batozskaya,
K. Begzsuren,
N. Berger,
M. Berlowski,
M. Bertani,
D. Bettoni,
F. Bianchi,
E. Bianco,
A. Bortone,
I. Boyko,
R. A. Briere
, et al. (638 additional authors not shown)
Abstract:
Using the $e^+e^-$ collision data collected with the BESIII detector operating at the BEPCII collider, at center-of-mass energies from the threshold to $4.95$~GeV, we present precise measurements of the cross sections for the process $e^+e^-\to D_s^+D_s^-$ using a single tag method. The resulting cross section lineshape exhibits several new structures, thereby offering an input for coupled channel…
▽ More
Using the $e^+e^-$ collision data collected with the BESIII detector operating at the BEPCII collider, at center-of-mass energies from the threshold to $4.95$~GeV, we present precise measurements of the cross sections for the process $e^+e^-\to D_s^+D_s^-$ using a single tag method. The resulting cross section lineshape exhibits several new structures, thereby offering an input for coupled channel analysis and model tests, which are critical to understand vector charmonium-like states with masses between 4 and 5~GeV.
△ Less
Submitted 22 March, 2024;
originally announced March 2024.
-
Tunable band gap in twisted bilayer graphene
Authors:
Xiu-Cai Jiang,
Yi-Yuan Zhao,
Yu-Zhong Zhang
Abstract:
At large commensurate angles, twisted bilayer graphene which holds even parity under sublattice exchange exhibits a tiny gap. Here, we point out a way to tune this tiny gap into a large gap. We start from comprehensive understanding of the physical origin of gap opening by density functional theory calculations. We reveal that the effective inter-layer hop**, intra-layer CDW, or inter-layer char…
▽ More
At large commensurate angles, twisted bilayer graphene which holds even parity under sublattice exchange exhibits a tiny gap. Here, we point out a way to tune this tiny gap into a large gap. We start from comprehensive understanding of the physical origin of gap opening by density functional theory calculations. We reveal that the effective inter-layer hop**, intra-layer CDW, or inter-layer charge imbalance favors a gap. Then, on the basis of tight-binding calculations, we suggest that a periodic transverse inhomogeneous pressure, which can tune inter-layer hop**s in specific regions of the moi$\rm\acute{r}$e supercell, may open a gap of over $100$~meV, which is further confirmed by first-principles calculations. Our results provide a theoretical guidance for experiments to open a large gap in twisted bilayer graphene.
△ Less
Submitted 21 March, 2024;
originally announced March 2024.
-
SoftPatch: Unsupervised Anomaly Detection with Noisy Data
Authors:
Xi Jiang,
Ying Chen,
Qiang Nie,
Yong Liu,
Jianlin Liu,
Bin-Bin Gao,
Jun Liu,
Chengjie Wang,
Feng Zheng
Abstract:
Although mainstream unsupervised anomaly detection (AD) algorithms perform well in academic datasets, their performance is limited in practical application due to the ideal experimental setting of clean training data. Training with noisy data is an inevitable problem in real-world anomaly detection but is seldom discussed. This paper considers label-level noise in image sensory anomaly detection f…
▽ More
Although mainstream unsupervised anomaly detection (AD) algorithms perform well in academic datasets, their performance is limited in practical application due to the ideal experimental setting of clean training data. Training with noisy data is an inevitable problem in real-world anomaly detection but is seldom discussed. This paper considers label-level noise in image sensory anomaly detection for the first time. To solve this problem, we proposed a memory-based unsupervised AD method, SoftPatch, which efficiently denoises the data at the patch level. Noise discriminators are utilized to generate outlier scores for patch-level noise elimination before coreset construction. The scores are then stored in the memory bank to soften the anomaly detection boundary. Compared with existing methods, SoftPatch maintains a strong modeling ability of normal data and alleviates the overconfidence problem in coreset. Comprehensive experiments in various noise scenes demonstrate that SoftPatch outperforms the state-of-the-art AD methods on the MVTecAD and BTAD benchmarks and is comparable to those methods under the setting without noise.
△ Less
Submitted 21 March, 2024;
originally announced March 2024.
-
Toward Multi-class Anomaly Detection: Exploring Class-aware Unified Model against Inter-class Interference
Authors:
Xi Jiang,
Ying Chen,
Qiang Nie,
Jianlin Liu,
Yong Liu,
Chengjie Wang,
Feng Zheng
Abstract:
In the context of high usability in single-class anomaly detection models, recent academic research has become concerned about the more complex multi-class anomaly detection. Although several papers have designed unified models for this task, they often overlook the utility of class labels, a potent tool for mitigating inter-class interference. To address this issue, we introduce a Multi-class Imp…
▽ More
In the context of high usability in single-class anomaly detection models, recent academic research has become concerned about the more complex multi-class anomaly detection. Although several papers have designed unified models for this task, they often overlook the utility of class labels, a potent tool for mitigating inter-class interference. To address this issue, we introduce a Multi-class Implicit Neural representation Transformer for unified Anomaly Detection (MINT-AD), which leverages the fine-grained category information in the training stage. By learning the multi-class distributions, the model generates class-aware query embeddings for the transformer decoder, mitigating inter-class interference within the reconstruction model. Utilizing such an implicit neural representation network, MINT-AD can project category and position information into a feature embedding space, further supervised by classification and prior probability loss functions. Experimental results on multiple datasets demonstrate that MINT-AD outperforms existing unified training models.
△ Less
Submitted 21 March, 2024;
originally announced March 2024.
-
Search for $ΔS=2$ nonleptonic hyperon decays $Ω^-\toΣ^{0}π^{-}$ and $Ω^-\to nK^{-}$
Authors:
BESIII Collaboration,
M. Ablikim,
M. N. Achasov,
P. Adlarson,
O. Afedulidis,
X. C. Ai,
R. Aliberti,
A. Amoroso,
Q. An,
Y. Bai,
O. Bakina,
I. Balossino,
Y. Ban,
H. -R. Bao,
V. Batozskaya,
K. Begzsuren,
N. Berger,
M. Berlowski,
M. Bertani,
D. Bettoni,
F. Bianchi,
E. Bianco,
A. Bortone,
I. Boyko,
R. A. Briere
, et al. (638 additional authors not shown)
Abstract:
Using $(27.12 \pm 0.14) \times 10^{8}$ $ψ(3686)$ events collected by the BESIII detector at the center-of-mass energy of $\sqrt{s} = 3.686$ GeV, we search for the first time for two nonleptonic hyperon decays that change strangeness by two units, $Ω^-\toΣ^{0}π^-$ and $Ω^-\to nK^{-}$. No significant signal is observed. The upper limits on their decay branching fractions are determined to be…
▽ More
Using $(27.12 \pm 0.14) \times 10^{8}$ $ψ(3686)$ events collected by the BESIII detector at the center-of-mass energy of $\sqrt{s} = 3.686$ GeV, we search for the first time for two nonleptonic hyperon decays that change strangeness by two units, $Ω^-\toΣ^{0}π^-$ and $Ω^-\to nK^{-}$. No significant signal is observed. The upper limits on their decay branching fractions are determined to be $\mathcal{B}(Ω^-\toΣ^{0}π^-) < 5.4\times 10^{-4}$ and $\mathcal{B}(Ω^-\to nK^{-}) < 2.4\times 10^{-4}$ at the $90\%$ confidence level.
△ Less
Submitted 14 April, 2024; v1 submitted 20 March, 2024;
originally announced March 2024.