-
Fair Data Generation via Score-based Diffusion Model
Authors:
Yujie Lin,
Dong Li,
Chen Zhao,
Minglai Shao
Abstract:
The fairness of AI decision-making has garnered increasing attention, leading to the proposal of numerous fairness algorithms. In this paper, we aim not to address this issue by directly introducing fair learning algorithms, but rather by generating entirely new, fair synthetic data from biased datasets for use in any downstream tasks. Additionally, the distribution of test data may differ from that of the training set, potentially impacting the performance of the generated synthetic data in downstream tasks. To address these two challenges, we propose a diffusion model-based framework, FADM: Fairness-Aware Diffusion with Meta-training. FADM introduces two types of gradient induction during the sampling phase of the diffusion model: one to ensure that the generated samples belong to the desired target categories, and another to make the sensitive attributes of the generated samples difficult to classify into any specific sensitive attribute category. To overcome data distribution shifts in the test environment, we train the diffusion model and the two classifiers used for induction within a meta-learning framework. Compared to other baselines, FADM allows for flexible control over the categories of the generated samples and exhibits superior generalization capability. Experiments on real datasets demonstrate that FADM achieves better accuracy and optimal fairness in downstream tasks.
Submitted 13 June, 2024;
originally announced June 2024.
-
Search for $X(1870)$ via the decay $J/ψ\to ωK^+ K^-η$
Authors:
BESIII Collaboration,
M. Ablikim,
M. N. Achasov,
P. Adlarson,
O. Afedulidis,
X. C. Ai,
R. Aliberti,
A. Amoroso,
Q. An,
Y. Bai,
O. Bakina,
I. Balossino,
Y. Ban,
H. -R. Bao,
V. Batozskaya,
K. Begzsuren,
N. Berger,
M. Berlowski,
M. Bertani,
D. Bettoni,
F. Bianchi,
E. Bianco,
A. Bortone,
I. Boyko,
R. A. Briere
, et al. (644 additional authors not shown)
Abstract:
Using a sample of $(10087\pm 44)\times10^{6}$ $J/ψ$ events collected by the BESIII detector at the BEPCII collider, we search for the decay $X(1870)\to K^+ K^-η$ via the $J/ψ\to ωK^+ K^- η$ process for the first time. No significant $X(1870)$ signal is observed. The upper limit on the branching fraction of the decay $ J/ψ\to ωX(1870) \toωK^+ K^- η$ is determined to be $9.55\times 10^{-7}$ at the $90\%$ confidence level. In addition, the branching fraction $B(J/ψ\toωK^+ K^- η)$ is measured to be $(3.33\pm0.02(\rm{stat.})\pm 0.12(\rm{syst.}))\times 10^{-4}$.
Submitted 13 June, 2024;
originally announced June 2024.
-
Kinematics and star formation of hub-filament systems in W49A
Authors:
WenJun Zhang,
Jianjun Zhou,
Jarken Esimbek,
Willem Baan,
Yuxin He,
Xindi Tang,
Dalei Li,
Weiguang Ji,
Gang Wu,
Yingxiu Ma,
Jiasheng Li,
Dongdong Zhou,
Kadirya Tursun,
Toktarkhan Komesh
Abstract:
W49A is a prominent giant molecular cloud (GMC) that exhibits strong star formation activities, yet its structural and kinematic properties remain uncertain. Our study aims to investigate the large-scale structure and kinematics of W49A, and elucidate the role of filaments and hub-filament systems (HFSs) in its star formation activity. We utilized continuum data from Herschel and the James Clerk Maxwell Telescope (JCMT) as well as the molecular lines 12CO (3-2), 13CO (3-2), and C18O (3-2) to identify filaments and HFS structures within W49A. Further analysis focused on the physical properties, kinematics, and mass transport within these structures. Additionally, recombination line emission from the H I/OH/Recombination (THOR) line survey was employed to trace the central H II region and ionized gas. Our findings reveal that W49A comprises one blue-shifted (B-S) HFS and one red-shifted (R-S) HFS, each with multiple filaments and dense hubs. Notably, significant velocity gradients were detected along these filaments, indicative of material transport toward the hubs. High mass accretion rates along the filaments facilitate the formation of massive stars in the HFSs. Furthermore, the presence of V-shaped structures around clumps in position-velocity diagrams suggests ongoing gravitational collapse and local star formation within the filaments. Our results indicate that W49A consists of one R-S HFS and one B-S HFS, and that the material transport from filaments to the hub promotes the formation of massive stars in the hub. These findings underscore the significance of HFSs in shaping the star formation history of W49A.
Submitted 13 June, 2024;
originally announced June 2024.
-
LLM-based Knowledge Pruning for Time Series Data Analytics on Edge-computing Devices
Authors:
Ruibing Jin,
Qing Xu,
Min Wu,
Yuecong Xu,
Dan Li,
Xiaoli Li,
Zhenghua Chen
Abstract:
Limited by the scale and diversity of time series data, neural networks trained on time series data often overfit and show unsatisfactory performance. In comparison, large language models (LLMs) have recently exhibited impressive generalization in diverse fields. Although many LLM-based approaches have been proposed for time series tasks, these methods require loading the whole LLM in both training and inference. These high computational demands limit practical applications in resource-constrained settings, such as edge-computing and IoT devices. To address this issue, we propose Knowledge Pruning (KP), a novel paradigm for time series learning. For a specific downstream task, we argue that the world knowledge learned by LLMs is largely redundant and only the related knowledge, termed "pertinent knowledge", is useful. Unlike other methods, our KP aims to prune the redundant knowledge and distill only the pertinent knowledge into the target model. This reduces model size and computational costs significantly. Additionally, unlike existing LLM-based approaches, our KP does not require loading the LLM during training and testing, further easing computational burdens. With our proposed KP, a lightweight network can effectively learn the pertinent knowledge, achieving satisfactory performance at a low computation cost. To verify the effectiveness of our KP, two fundamental tasks on edge-computing devices are investigated in our experiments, where eight diverse environments or benchmarks with different networks are used to verify the generalization of our KP. Through experiments, our KP demonstrates effective learning of pertinent knowledge, achieving notable performance improvements in regression (19.7% on average) and classification (up to 13.7%) tasks, showcasing state-of-the-art results.
Submitted 12 June, 2024;
originally announced June 2024.
-
Constraints on Ultra Heavy Dark Matter Properties from Dwarf Spheroidal Galaxies with LHAASO Observations
Authors:
Zhen Cao,
F. Aharonian,
Q. An,
Axikegu,
Y. X. Bai,
Y. W. Bao,
D. Bastieri,
X. J. Bi,
Y. J. Bi,
J. T. Cai,
Q. Cao,
W. Y. Cao,
Zhe Cao,
J. Chang,
J. F. Chang,
A. M. Chen,
E. S. Chen,
Liang Chen,
Lin Chen,
Long Chen,
M. J. Chen,
M. L. Chen,
Q. H. Chen,
S. H. Chen,
S. Z. Chen
, et al. (255 additional authors not shown)
Abstract:
In this work we search for signals generated by ultra-heavy dark matter in data from the Large High Altitude Air Shower Observatory (LHAASO). We look for possible gamma rays from dark matter annihilation or decay in 16 dwarf spheroidal galaxies in the field of view of LHAASO. Dwarf spheroidal galaxies are among the most promising targets for indirect detection of dark matter, as they have low fluxes of astrophysical $γ$-ray background but large amounts of dark matter. By analyzing more than 700 days of observational data at LHAASO, no significant dark matter signal from 1 TeV to 1 EeV is detected. Accordingly, we derive the most stringent constraints on the ultra-heavy dark matter annihilation cross-section up to EeV. The constraints on the lifetime of dark matter in decay mode are also derived.
Submitted 12 June, 2024;
originally announced June 2024.
-
Observation of $η_{c}$(1S, 2S) and $χ_{cJ}$ decays to 2$(π^{+}π^{-})η$ via $ψ$(3686) radiative transitions
Authors:
BESIII Collaboration,
M. Ablikim,
M. N. Achasov,
P. Adlarson,
O. Afedulidis,
X. C. Ai,
R. Aliberti,
A. Amoroso,
Q. An,
Y. Bai,
O. Bakina,
I. Balossino,
Y. Ban,
H. -R. Bao,
V. Batozskaya,
K. Begzsuren,
N. Berger,
M. Berlowski,
M. Bertani,
D. Bettoni,
F. Bianchi,
E. Bianco,
A. Bortone,
I. Boyko,
R. A. Briere
, et al. (636 additional authors not shown)
Abstract:
Based on $2.7 \times 10^9~ψ(3686)$ decays collected with the BESIII detector, the radiative decay $ψ(3686)\to\gamma2(π^{+}π^{-})η$ is investigated to measure properties of S- and P-wave charmonium states. The branching fraction of the decay $η_{c}(1S) \to 2(π^{+}π^{-})η$, which is found to have a strong dependence on the interference pattern between $η_c(1S)$ and non-$η_c(1S)$ processes, is measured in both destructive and constructive interference scenarios for the first time. The mass and width of the $η_{c}(1S)$ are measured to be $M=(2984.14 \pm 0.13 \pm 0.38)$ MeV/$c^{2}$ and $Γ=(28.82 \pm 0.11 \pm 0.82)$ MeV, respectively. Clear signals for the decays of the $χ_{cJ}(J=0,1,2)$ and the $η_{c}(2S)$ to $2(π^{+}π^{-})η$ are also observed for the first time, and the corresponding branching fractions are measured. The ratio of the branching fractions between the $η_{c}(2S)$ and $η_{c}(1S)$ decays is significantly lower than the theoretical prediction, which might suggest different dynamics in their decays.
Submitted 12 June, 2024;
originally announced June 2024.
-
Multi-Teacher Multi-Objective Meta-Learning for Zero-Shot Hyperspectral Band Selection
Authors:
Jie Feng,
Xiaojian Zhong,
Di Li,
Weisheng Dong,
Ronghua Shang,
Licheng Jiao
Abstract:
Band selection plays a crucial role in hyperspectral image classification by removing redundant and noisy bands and retaining discriminative ones. However, most existing deep learning-based methods are aimed at dealing with a specific band selection dataset, and need to retrain parameters for new datasets, which significantly limits their generalizability. To address this issue, a novel multi-teacher multi-objective meta-learning network (M$^3$BS) is proposed for zero-shot hyperspectral band selection. In M$^3$BS, a generalizable graph convolution network (GCN) is constructed to generate a dataset-agnostic base and extract compatible meta-knowledge from multiple band selection tasks. To enhance the ability of meta-knowledge extraction, multiple band selection teachers are introduced to provide diverse high-quality experiences. Finally, subsequent classification tasks are attached and jointly optimized with multi-teacher band selection tasks through multi-objective meta-learning in an end-to-end trainable way. Multi-objective meta-learning automatically coordinates diverse optimization objectives and adapts to various datasets simultaneously. Once the optimization is accomplished, the acquired meta-knowledge can be directly transferred to unseen datasets without any retraining or fine-tuning. Experimental results demonstrate that the proposed method is effective and efficient, performing on par with state-of-the-art baselines for zero-shot hyperspectral band selection.
Submitted 12 June, 2024;
originally announced June 2024.
-
DLLens: Testing Deep Learning Libraries via LLM-aided Synthesis
Authors:
Meiziniu Li,
Dongze Li,
Jianmeng Liu,
Jialun Cao,
Yongqiang Tian,
Shing-Chi Cheung
Abstract:
Testing is a major approach to ensuring the quality of deep learning (DL) libraries. Existing testing techniques commonly adopt differential testing to relieve the need for test oracle construction. However, these techniques are limited in finding implementations that offer the same functionality and generating diverse test inputs for differential testing. This paper introduces DLLens, a novel differential testing technique for DL library testing. Our insight is that APIs in different DL libraries are commonly designed to accomplish various computations for the same set of published DL algorithms. Although the mapping of these APIs is not often one-to-one, we observe that their computations can be mutually simulated after proper composition and adaptation. The use of these simulation counterparts facilitates differential testing for the detection of functional DL library bugs. Leveraging the insight, we propose DLLens as a novel mechanism that utilizes a large language model (LLM) to synthesize valid counterparts of DL library APIs. To generate diverse test inputs, DLLens incorporates a static analysis method aided by LLM to extract path constraints from all execution paths in each API and its counterpart's implementations. These path constraints are then used to guide the generation of diverse test inputs. We evaluate DLLens on two popular DL libraries, TensorFlow and PyTorch. Our evaluation shows that DLLens can synthesize counterparts for more than twice as many APIs found by state-of-the-art techniques on these libraries. Moreover, DLLens can extract 26.7% more constraints and detect 2.5 times as many bugs as state-of-the-art techniques. DLLens has successfully found 56 bugs in recent TensorFlow and PyTorch libraries. Among them, 41 are previously unknown, 39 of which have been confirmed by developers after reporting, and 19 of those confirmed bugs have been fixed by developers.
Submitted 12 June, 2024;
originally announced June 2024.
-
Investigating Sulfur Chemistry in the HD 163296 disk
Authors:
Rong Ma,
Donghui Quan,
Yan Zhou,
Jarken Esimbek,
Dalei Li,
Xiaohu Li,
Xia Zhang,
Juan Tuo,
Yanan Feng
Abstract:
Sulfur chemistry in the formation process of low-mass stars and planets remains poorly understood. Protoplanetary disks (PPDs) are the birthplace of planets, and their distinctive environment provides an intriguing platform for investigating models of sulfur chemistry. We analyzed the ALMA observations of CS 7-6 transitions in the HD 163296 disk and performed astrochemical modeling to explore its sulfur chemistry. We simulated the distribution of sulfur-containing molecules and compared it with observationally deduced fractional column densities. We found that the simulated column density of CS is consistent with the observationally deduced fractional column densities, while the simulated column density of C$_2$S is lower than the observationally deduced upper limits on column densities. These results indicate that we have a good understanding of the chemical properties of CS and C$_2$S in the disk. We also investigated the influence of the C/O ratio on sulfur-containing molecules and found that the column densities of SO, SO$_2$, and H$_2$S near the central star are dependent on the C/O ratio. Additionally, we found that the $N$[CS]/$N$[SO] ratio can serve as a promising indicator of the disk's C/O ratio in HD 163296. Overall, the disk of HD 163296 provides a favorable environment for the detection of sulfur-containing molecules.
Submitted 12 June, 2024;
originally announced June 2024.
-
Rethinking the impact of noisy labels in graph classification: A utility and privacy perspective
Authors:
De Li,
Xianxian Li,
Zeming Gan,
Qiyu Li,
Bin Qu,
Jinyan Wang
Abstract:
Graph neural networks based on message-passing mechanisms have achieved advanced results in graph classification tasks. However, their generalization performance degrades when noisy labels are present in the training data. Most existing noisy labeling approaches focus on the visual domain or graph node classification tasks and analyze the impact of noisy labels only from a utility perspective. Unlike existing work, in this paper, we measure the effects of noise labels on graph classification from data privacy and model utility perspectives. We find that noise labels degrade the model's generalization performance and enhance the ability of membership inference attacks on graph data privacy. To this end, we propose the robust graph neural network approach with noisy labeled graph classification. Specifically, we first accurately filter the noisy samples by high-confidence samples and the first feature principal component vector of each class. Then, the robust principal component vectors and the model output under data augmentation are utilized to achieve noise label correction guided by dual spatial information. Finally, supervised graph contrastive learning is introduced to enhance the embedding quality of the model and protect the privacy of the training graph data. The utility and privacy of the proposed method are validated by comparing twelve different methods on eight real graph classification datasets. Compared with the state-of-the-art methods, the RGLC method achieves at most and at least 7.8% and 0.8% performance gain at 30% noisy labeling rate, respectively, and reduces the accuracy of privacy attacks to below 60%.
Submitted 11 June, 2024;
originally announced June 2024.
-
TernaryLLM: Ternarized Large Language Model
Authors:
Tianqi Chen,
Zhe Li,
Weixiang Xu,
Zeyu Zhu,
Dong Li,
Lu Tian,
Emad Barsoum,
Peisong Wang,
Jian Cheng
Abstract:
Large language models (LLMs) have achieved remarkable performance on Natural Language Processing (NLP) tasks, but they are hindered by high computational costs and memory requirements. Ternarization, an extreme form of quantization, offers a solution by reducing memory usage and enabling energy-efficient floating-point additions. However, applying ternarization to LLMs faces challenges stemming from outliers in both weights and activations. In this work, observing asymmetric outliers and non-zero means in weights, we introduce Dual Learnable Ternarization (DLT), which enables both scales and shifts to be learnable. We also propose Outlier-Friendly Feature Knowledge Distillation (OFF) to recover the information lost in extremely low-bit quantization. The proposed OFF can incorporate semantic information and is insensitive to outliers. At the core of OFF is maximizing the mutual information between features in ternarized and floating-point models using cosine similarity. Extensive experiments demonstrate that our TernaryLLM surpasses previous low-bit quantization methods on the standard text generation and zero-shot benchmarks for different LLM families. Specifically, for one of the most powerful open-source models, LLaMA-3, our approach (W1.58A16) outperforms the previous state-of-the-art method (W2A16) by 5.8 in terms of perplexity on C4 and by 8.2% in terms of average accuracy on zero-shot tasks.
Submitted 11 June, 2024;
originally announced June 2024.
-
STimage-1K4M: A histopathology image-gene expression dataset for spatial transcriptomics
Authors:
Jiawen Chen,
Muqing Zhou,
Wenrong Wu,
Jinwei Zhang,
Yun Li,
Didong Li
Abstract:
Recent advances in multi-modal algorithms have driven and been driven by the increasing availability of large image-text datasets, leading to significant strides in various fields, including computational pathology. However, in most existing medical image-text datasets, the text typically provides high-level summaries that may not sufficiently describe sub-tile regions within a large pathology image. For example, an image might cover an extensive tissue area containing cancerous and healthy regions, but the accompanying text might only specify that this image is a cancer slide, lacking the nuanced details needed for in-depth analysis. In this study, we introduce STimage-1K4M, a novel dataset designed to bridge this gap by providing genomic features for sub-tile images. STimage-1K4M contains 1,149 images derived from spatial transcriptomics data, which captures gene expression information at the level of individual spatial spots within a pathology image. Specifically, each image in the dataset is broken down into smaller sub-image tiles, with each tile paired with 15,000-30,000 dimensional gene expressions. With 4,293,195 pairs of sub-tile images and gene expressions, STimage-1K4M offers unprecedented granularity, paving the way for a wide range of advanced research in multi-modal data analysis and innovative applications in computational pathology and beyond.
Submitted 20 June, 2024; v1 submitted 10 June, 2024;
originally announced June 2024.
-
Strong and weak $CP$ tests in sequential decays of polarized $Σ^0$ hyperons
Authors:
BESIII Collaboration,
M. Ablikim,
M. N. Achasov,
P. Adlarson,
O. Afedulidis,
X. C. Ai,
R. Aliberti,
A. Amoroso,
Q. An,
Y. Bai,
O. Bakina,
I. Balossino,
Y. Ban,
H. -R. Bao,
V. Batozskaya,
K. Begzsuren,
N. Berger,
M. Berlowski,
M. Bertani,
D. Bettoni,
F. Bianchi,
E. Bianco,
A. Bortone,
I. Boyko,
R. A. Briere
, et al. (644 additional authors not shown)
Abstract:
The $J/ψ, ψ(3686) \to Σ^0 \barΣ^{0}$ processes and subsequent decays are studied using the world's largest $J/ψ$ and $ψ(3686)$ data samples collected with the BESIII detector. The strong-$CP$ symmetry is tested in the decays of the $Σ^0$ hyperons for the first time by measuring the decay parameters, $α_{Σ^0} = -0.0017 \pm 0.0021 \pm 0.0018$ and $\barα_{Σ^0} = 0.0021 \pm 0.0020 \pm 0.0022$. The weak-$CP$ test is performed in the subsequent decays of their daughter particles $Λ$ and $\barΛ$. Also for the first time, the transverse polarizations of the $Σ^0$ hyperons in $J/ψ$ and $ψ(3686)$ decays are observed with opposite directions, and the ratios between the S-wave and D-wave contributions of the $J/ψ, ψ(3686) \to Σ^0 \barΣ^{0}$ decays are obtained. These results are crucial to understand the decay dynamics of the charmonium states and the production mechanism of the $Σ^0-\barΣ^0$ pairs.
Submitted 10 June, 2024;
originally announced June 2024.
-
Efficient algorithm for the oscillatory matrix functions
Authors:
Dongping Li,
Xue Wang,
Xiuying Zhang
Abstract:
This paper introduces an efficient algorithm for computing the general oscillatory matrix functions. These computations are crucial for solving second-order semi-linear initial value problems. The method is exploited using the scaling and restoring technique based on a quadruple angle formula in conjunction with a truncated Taylor series. The choice of the scaling parameter and the degree of the Taylor polynomial relies on a forward error analysis. Numerical experiments show that the new algorithm behaves in a stable fashion and performs well in both accuracy and efficiency.
Submitted 10 June, 2024;
originally announced June 2024.
-
Async Learned User Embeddings for Ads Delivery Optimization
Authors:
Mingwei Tang,
Meng Liu,
Hong Li,
Junjie Yang,
Chenglin Wei,
Boyang Li,
Dai Li,
Rengan Xu,
Yifan Xu,
Zehua Zhang,
Xiangyu Wang,
Linfeng Liu,
Yuelei Xie,
Chengye Liu,
Labib Fawaz,
Li Li,
Hongnan Wang,
Bill Zhu,
Sri Reddy
Abstract:
In recommendation systems, high-quality user embeddings can capture subtle preferences, enable precise similarity calculations, and adapt to changing preferences over time to maintain relevance. The effectiveness of recommendation systems depends on the quality of user embeddings. We propose to asynchronously learn high-fidelity user embeddings for billions of users each day from sequence-based multimodal user activities through a Transformer-like large-scale feature learning module. The async learned user representation embeddings (ALURE) are further converted to user similarity graphs through graph learning and then combined with users' real-time activities to retrieve highly related ad candidates for the ads delivery system. Our method shows significant gains in both offline and online experiments.
Submitted 23 June, 2024; v1 submitted 9 June, 2024;
originally announced June 2024.
-
Measurement of the integrated luminosity of the data collected at 3.773 GeV by BESIII from 2021 to 2024
Authors:
BESIII Collaboration,
M. Ablikim,
M. N. Achasov,
P. Adlarson,
O. Afedulidis,
X. C. Ai,
R. Aliberti,
A. Amoroso,
Q. An,
Y. Bai,
O. Bakina,
I. Balossino,
Y. Ban,
H. -R. Bao,
V. Batozskaya,
K. Begzsuren,
N. Berger,
M. Berlowski,
M. Bertani,
D. Bettoni,
F. Bianchi,
E. Bianco,
A. Bortone,
I. Boyko,
R. A. Briere
, et al. (634 additional authors not shown)
Abstract:
We present a measurement of the integrated luminosity of $e^+e^-$ collision data collected with the BESIII detector at the BEPCII collider at a center-of-mass energy of $E_{\rm cm} = 3.773$~GeV. The integrated luminosities of the data sets taken from December 2021 to June 2022, from November 2022 to June 2023, and from October 2023 to February 2024 are determined to be $4.995 \pm 0.019$~fb$^{-1}$, $8.157 \pm 0.031$~fb$^{-1}$, and $4.191 \pm 0.016$~fb$^{-1}$, respectively, by analyzing large angle Bhabha scattering events. The uncertainties are dominated by systematic effects and the statistical uncertainties are negligible. Our results provide essential input for future analyses and precision measurements.
Submitted 9 June, 2024;
originally announced June 2024.
-
Perturbation Towards Easy Samples Improves Targeted Adversarial Transferability
Authors:
Junqi Gao,
Biqing Qi,
Yao Li,
Zhichang Guo,
Dong Li,
Yuming Xing,
Dazhi Zhang
Abstract:
The transferability of adversarial perturbations provides an effective shortcut for black-box attacks. Targeted perturbations have greater practicality but are more difficult to transfer between models. In this paper, we experimentally and theoretically demonstrate that neural networks trained on the same dataset have more consistent performance in High-Sample-Density-Regions (HSDR) of each class than in low sample density regions. Therefore, in the targeted setting, adding perturbations towards the HSDR of the target class is more effective in improving transferability. However, density estimation is challenging in high-dimensional scenarios. Further theoretical and experimental verification demonstrates that easy samples with low loss are more likely to be located in HSDR. Perturbations towards such easy samples in the target class can avoid density estimation for HSDR location. Based on the above facts, we verified that adding perturbations towards easy samples in the target class improves targeted adversarial transferability of existing attack methods. A generative targeted attack strategy named Easy Sample Matching Attack (ESMA) is proposed, which has a higher success rate for targeted attacks and outperforms the SOTA generative method. Moreover, ESMA requires only 5% of the storage space and much less computation time compared to the current SOTA, as ESMA attacks all classes with only one model instead of separate models for each class. Our code is available at https://github.com/gjq100/ESMA.
Submitted 8 June, 2024;
originally announced June 2024.
-
Fighting Against the Repetitive Training and Sample Dependency Problem in Few-shot Named Entity Recognition
Authors:
Chang Tian,
Wenpeng Yin,
Dan Li,
Marie-Francine Moens
Abstract:
Few-shot named entity recognition (NER) systems recognize entities using a few labeled training examples. The general pipeline consists of a span detector to identify entity spans in text and an entity-type classifier to assign types to entities. Current span detectors rely on extensive manual labeling to guide training. Almost every span detector requires initial training on basic span features followed by adaptation to task-specific features. This process leads to repetitive training of the basic span features among span detectors. Additionally, metric-based entity-type classifiers, such as prototypical networks, typically employ a specific metric that gauges the distance between the query sample and entity-type referents, ultimately assigning the most probable entity type to the query sample. However, these classifiers encounter the sample dependency problem, primarily stemming from the limited samples available for each entity-type referent. To address these challenges, we propose an improved few-shot NER pipeline. First, we introduce a steppingstone span detector that is pre-trained on open-domain Wikipedia data. It can be used to initialize the pipeline span detector to reduce the repetitive training of basic features. Second, we leverage a large language model (LLM) to set reliable entity-type referents, eliminating reliance on few-shot samples of each type. Our model exhibits superior performance with fewer training steps and human-labeled data compared with baselines, as demonstrated through extensive experiments on various datasets. Particularly in fine-grained few-shot NER settings, our model outperforms strong baselines, including ChatGPT. We will publicly release the code, datasets, LLM outputs, and model checkpoints.
Submitted 18 June, 2024; v1 submitted 8 June, 2024;
originally announced June 2024.
-
Sparse Multi-baseline SAR Cross-modal 3D Reconstruction of Vehicle Targets
Authors:
Da Li,
Guoqiang Zhao,
Houjun Sun,
Jiacheng Bao
Abstract:
Multi-baseline SAR 3D imaging faces significant challenges due to data sparsity. In recent years, deep learning techniques have achieved notable success in enhancing the quality of sparse SAR 3D imaging. However, previous work typically relies on full-aperture high-resolution radar images to supervise the training of deep neural networks (DNNs), utilizing only single-modal information from radar data. Consequently, imaging performance is limited, and acquiring full-aperture data for multi-baseline SAR is costly and sometimes impractical in real-world applications. In this paper, we propose a Cross-Modal Reconstruction Network (CMR-Net), which integrates differentiable rendering and cross-modal supervision with optical images to reconstruct highly sparse multi-baseline SAR 3D images of vehicle targets into visually structured and high-resolution images. We meticulously designed the network architecture and training strategies to enhance network generalization capability. Remarkably, CMR-Net, trained solely on simulated data, demonstrates high-resolution reconstruction capabilities on both publicly available simulation datasets and real measured datasets, outperforming traditional sparse reconstruction algorithms based on compressed sensing and other learning-based methods. Additionally, using optical images as supervision provides a cost-effective way to build training datasets, reducing the difficulty of method dissemination. Our work showcases the broad prospects of deep learning in multi-baseline SAR 3D imaging and offers a novel path for researching radar imaging based on cross-modal learning theory.
Submitted 6 June, 2024;
originally announced June 2024.
-
Exploiting Global Graph Homophily for Generalized Defense in Graph Neural Networks
Authors:
Duanyu Li,
Huijun Wu,
Min Xie,
Xugang Wu,
Zhenwei Wu,
Wenzhe Zhang
Abstract:
Graph neural network (GNN) models play a pivotal role in numerous tasks involving graph-related data analysis. Despite their efficacy, similar to other deep learning models, GNNs are susceptible to adversarial attacks. Even minor perturbations in graph data can induce substantial alterations in model predictions. While existing research has explored various adversarial defense techniques for GNNs, the challenge of defending against adversarial attacks on real-world scale graph data remains largely unresolved. On one hand, methods reliant on graph purification and preprocessing tend to excessively emphasize local graph information, leading to sub-optimal defensive outcomes. On the other hand, approaches rooted in graph structure learning entail significant time overheads, rendering them impractical for large-scale graphs. In this paper, we propose a new defense method named Talos, which enhances the global, rather than local, homophily of graphs as a defense. Experiments show that the proposed approach notably outperforms state-of-the-art defense approaches, while imposing little computational overhead.
Submitted 6 June, 2024;
originally announced June 2024.
-
Measurements of the branching fractions of the $P$-wave charmonium spin-singlet state $h_c(^1P_1) \to h^+ h^-π^0/η$
Authors:
BESIII Collaboration,
M. Ablikim,
M. N. Achasov,
P. Adlarson,
O. Afedulidis,
X. C. Ai,
R. Aliberti,
A. Amoroso,
Q. An,
Y. Bai,
O. Bakina,
I. Balossino,
Y. Ban,
H. -R. Bao,
V. Batozskaya,
K. Begzsuren,
N. Berger,
M. Berlowski,
M. Bertani,
D. Bettoni,
F. Bianchi,
E. Bianco,
A. Bortone,
I. Boyko,
R. A. Briere
, et al. (643 additional authors not shown)
Abstract:
Based on $(2712.4\pm 14.3)\times10^{6}$ $ψ(3686)$ events, we investigate four hadronic decay modes of the $P$-wave charmonium spin-singlet state $h_c(^1P_1) \to h^+ h^- π^0/η$ ($h=π$ or $K$) via the process $ψ(3686) \to π^{0}h_c$ at BESIII. The $h_c \to π^+ π^- π^0$ decay is observed with a significance of $9.6σ$ after taking into account systematic uncertainties. Evidence for $h_c \to K^+ K^- π^0$ and $h_c \to K^+ K^- η$ is found with significances of $3.5σ$ and $3.3σ$, respectively, after considering the systematic uncertainties. The branching fractions of these decays are measured to be $\mathcal{B}(h_c \to π^+ π^- π^0)=(1.36\pm0.16\pm0.14)\times10^{-3}$, $\mathcal{B}(h_c \to K^+ K^- π^0)=(3.26\pm0.84\pm0.36)\times10^{-4}$, and $\mathcal{B}(h_c \to K^+ K^- η)=(3.13\pm1.08\pm0.38)\times10^{-4}$, where the first uncertainties are statistical and the second are systematic. No significant signal of $h_c\toπ^+π^-η$ is found, and the upper limit of its decay branching fraction is determined to be $\mathcal{B}(h_c\toπ^+π^-η) < 4.0 \times 10^{-4}$ at 90% confidence level.
Submitted 5 June, 2024;
originally announced June 2024.
-
Mitigate Position Bias in Large Language Models via Scaling a Single Dimension
Authors:
Yijiong Yu,
Huiqiang Jiang,
Xufang Luo,
Qianhui Wu,
Chin-Yew Lin,
Dongsheng Li,
Yuqing Yang,
Yongfeng Huang,
Lili Qiu
Abstract:
Large Language Models (LLMs) are increasingly applied in various real-world scenarios due to their excellent generalization capabilities and robust generative abilities. However, they exhibit position bias, also known as "lost in the middle", a phenomenon that is especially pronounced in long-context scenarios: the placement of the key information in different positions of a prompt can significantly affect accuracy. This paper first explores the micro-level manifestations of position bias, concluding that attention weights are a micro-level expression of position bias. It further identifies that, in addition to position embeddings, the causal attention mask also contributes to position bias by creating position-specific hidden states. Based on these insights, we propose a method to mitigate position bias by scaling these positional hidden states. Experiments on the NaturalQuestions Multi-document QA, KV retrieval, LongBench and timeline reorder tasks, using various models including RoPE models, context-window-extended models, and Alibi models, demonstrate the effectiveness and generalizability of our approach. Our method can improve performance by up to 15.2% by modifying just one dimension of hidden states. Our code is available at https://aka.ms/PositionalHidden.
Submitted 4 June, 2024;
originally announced June 2024.
-
Harnessing Neural Unit Dynamics for Effective and Scalable Class-Incremental Learning
Authors:
Depeng Li,
Tianqi Wang,
Junwei Chen,
Wei Dai,
Zhigang Zeng
Abstract:
Class-incremental learning (CIL) aims to train a model to learn new classes from non-stationary data streams without forgetting old ones. In this paper, we propose a new kind of connectionist model by tailoring neural unit dynamics that adapt the behavior of neural networks for CIL. In each training session, it introduces a supervisory mechanism to guide network expansion whose growth size is compactly commensurate with the intrinsic complexity of a newly arriving task. This constructs a near-minimal network while allowing the model to expand its capacity when it cannot sufficiently hold new classes. At inference time, it automatically reactivates the required neural units to retrieve knowledge and leaves the remaining units inactivated to prevent interference. We name our model AutoActivator, which is effective and scalable. To gain insights into the neural unit dynamics, we theoretically analyze the model's convergence property via a universal approximation theorem on learning sequential mappings, which is under-explored in the CIL community. Experiments show that our method achieves strong CIL performance in rehearsal-free and minimal-expansion settings with different backbones.
Submitted 4 June, 2024;
originally announced June 2024.
-
Noninvasive magnetic detection of 2D van der Waals room-temperature ferromagnet Fe3GaTe2 using divacancy spins in SiC
Authors:
Xia Chen,
Qin-Yue Luo,
Pei-Jie Guo,
Hao-Jie Zhou,
Qi-Cheng Hu,
Hong-Peng Wu,
Xiao-Wen Shen,
Ru-Yue Cui,
Lei Dong,
Tian-Xing Wei,
Yu-Hang Xiao,
De-Ren Li,
Li Lei,
Xi Zhang,
Jun-Feng Wang,
Gang Xiang
Abstract:
Room-temperature (RT) two-dimensional (2D) van der Waals (vdW) ferromagnets hold immense promise for next-generation spintronic devices for information storage and processing. To achieve high-density energy-efficient spintronic devices, it is essential to understand local magnetic properties of RT 2D vdW magnets. In this work, we realize noninvasive in situ magnetic detection in the vdW-layered ferromagnet Fe3GaTe2 using a divacancy spin quantum sensor in silicon carbide (SiC) at RT. The structural features and magnetic properties of Fe3GaTe2 are characterized utilizing Raman spectrum, magnetization and magneto-transport measurements. Further detailed analysis of temperature- and magnetic field-dependent optically detected magnetic resonances of the PL6 divacancy near the Fe3GaTe2 reveals that the Curie temperature (Tc) of Fe3GaTe2 is ~360 K, and that the magnetization increases with external magnetic fields. Additionally, spin relaxometry technology is employed to probe the magnetic fluctuations of Fe3GaTe2, revealing a peak in the spin relaxation rate around Tc. These experiments give insights into the intriguing local magnetic properties of the 2D vdW RT ferromagnet Fe3GaTe2 and pave the way for the application of SiC quantum sensors in noninvasive in situ magnetic detection of related 2D vdW magnets.
Submitted 4 June, 2024;
originally announced June 2024.
-
Measurements of the branching fractions of semileptonic $D^{+}_s$ decays via $e^+e^-\to D_s^{*+}D_s^{*-}$
Authors:
BESIII Collaboration,
M. Ablikim,
M. N. Achasov,
P. Adlarson,
O. Afedulidis,
X. C. Ai,
R. Aliberti,
A. Amoroso,
Q. An,
Y. Bai,
O. Bakina,
I. Balossino,
Y. Ban,
H. -R. Bao,
V. Batozskaya,
K. Begzsuren,
N. Berger,
M. Berlowski,
M. Bertani,
D. Bettoni,
F. Bianchi,
E. Bianco,
A. Bortone,
I. Boyko,
R. A. Briere
, et al. (638 additional authors not shown)
Abstract:
We measure the absolute branching fractions of semileptonic $D^+_s$ decays via the $e^+e^-\to D_s^{*+}D_s^{*-}$ process using $e^+e^-$ collision data corresponding to an integrated luminosity of $10.64~\mathrm{fb}^{-1}$ collected by the BESIII detector at center-of-mass energies between 4.237 and 4.699 GeV. The branching fractions are ${\mathcal B}(D_s^+\to ηe^+ν_e)=(2.35\pm0.11_{\rm stat}\pm 0.10_{\rm syst})\%,$ ${\mathcal B}(D_s^+\to η^\prime e^+ν_e)=(0.82\pm0.09_{\rm stat}\pm 0.04_{\rm syst})\%,$ ${\mathcal B}(D_s^+\to φe^+ν_e)=(2.21\pm0.16_{\rm stat}\pm 0.11_{\rm syst})\%,$ ${\mathcal B}(D_s^+\to f_0(980) e^+ν_e,f_0(980)\toπ^+π^-)=(0.15\pm0.02_{\rm stat}\pm 0.01_{\rm syst})\%,$ ${\mathcal B}(D_s^+\to K^0 e^+ν_e)=(0.24\pm0.04_{\rm stat}\pm 0.01_{\rm syst})\%,$ and ${\mathcal B}(D_s^+\to K^{*0} e^+ν_e)=(0.19\pm0.03_{\rm stat}\pm 0.01_{\rm syst})\%.$ These results are consistent with those measured via the $e^+e^-\to D_s^{*\pm}D_s^{\mp}$ process by BESIII and CLEO. The hadronic transition form factors $D^+_s\to ηe^+ν_e$, $D^+_s\to η^\prime e^+ν_e$, and $D^+_s\to K^0 e^+ν_e$ at four-momentum transfer squared $q^2$ = 0 are determined to be $f^η_+(0) = 0.482 \pm 0.011_{\rm stat} \pm 0.009_{\rm syst}\pm0.004_{\rm input},$ $f^{η^{\prime}}_+(0) = 0.562 \pm 0.031_{\rm stat} \pm 0.014_{\rm syst}\pm0.003_{\rm input},$ and $f^{K^0}_+(0) = 0.624 \pm 0.052_{\rm stat} \pm 0.013_{\rm syst}\pm0.002_{\rm input}.$
Submitted 4 June, 2024; v1 submitted 3 June, 2024;
originally announced June 2024.
-
Extraction of Maternal and fetal ECG in a non-invasive way from abdominal ECG recordings using modified Progressive FastICA Peel-off
Authors:
Yao Li,
Xuanyu Luo,
Haowen Zhao,
Jiawen Cui,
Yangfan She,
Dongfang Li,
Lai Jiang,
Xu Zhang
Abstract:
The abdominal electrocardiogram (AECG) offers a non-invasive way to monitor fetal well-being during pregnancy. Due to the overlap with maternal ECG (MECG) as well as potential noise from other sources, it is challenging to extract weak fetal ECG (FECG) using surface electrodes. Taking advantage of the precise source separation capability of the FastICA approach combined with its constrained version specific to FECG, with weak source extraction capability warranted by the peel-off strategy and FECG waveform reconstruction ability ensured by the singular value decomposition (SVD) method, a novel framework for FECG extraction from AECG recordings is presented in this paper. Specifically, a periodic constrained FastICA (pcFastICA) was developed to improve the precision of examining and correcting FECG source signals, based on the statistical characteristics of continuous and repetitive ECG emissions. Additionally, a successive judgement algorithm is designed to select the optimal maternal and fetal ECG. The performance of the proposed method was examined on public datasets, synthetic data and clinical data, with F1-scores for FECG extraction of 99.71% and 99.36% on the ADFECG and NIFECGA datasets, 98.77% on synthetic data with the highest noise level, and 98.09% on clinical data, all superior to the other compared methods. The results indicate that our proposed method has the potential to separate weak FECG from multichannel AECG with high precision in high-noise conditions, which is of vital importance for ensuring the safety of both the fetus and the mother, as well as for the advancement of artificial-intelligence-based clinical monitoring.
Submitted 3 June, 2024;
originally announced June 2024.
-
Theoretical study of $N(1535)$ and $Σ^*(1/2^-)$ in the Cabibbo-favored process $Λ_c^+ \to p \bar{K}^0η$
Authors:
Ying Li,
Si-Wei Liu,
En Wang,
De-Min Li,
Li-Sheng Geng,
Ju-Jun Xie
Abstract:
Motivated by the recent experimental measurements, we have investigated the Cabibbo-favored process $Λ_c^+ \to p \bar{K}^0η$, where the $N(1535)$ resonance is dynamically generated from the $S$-wave pseudoscalar meson-octet baryon interactions within the chiral unitary approach. The contributions from the intermediate $N(1650)$ and the predicted low-lying baryon $Σ^*(1/2^-)$ are also considered. In addition, a Breit-Wigner amplitude for the $N(1535)$ resonance is checked. By comparing with the measured $ηp$, $\bar{K}^0 η$, and $p \bar{K}^0$ invariant mass squared distributions, our results support the interpretation of $N(1535)$ as a dynamically generated state. Furthermore, we demonstrate that, with the contribution from $Σ^*(1/2^-)$ taken into account, the calculated invariant mass spectrum agrees with the Belle measurements. Future precise measurements of the $Λ_c^+\to p \bar{K}^0η$ process can further elucidate the existence of the low-lying baryon $Σ^*(1/2^-)$.
Submitted 3 June, 2024;
originally announced June 2024.
-
Dimba: Transformer-Mamba Diffusion Models
Authors:
Zhengcong Fei,
Mingyuan Fan,
Changqian Yu,
Debang Li,
Youqiang Zhang,
Junshi Huang
Abstract:
This paper unveils Dimba, a new text-to-image diffusion model that employs a distinctive hybrid architecture combining Transformer and Mamba elements. Specifically, Dimba's sequentially stacked blocks alternate between Transformer and Mamba layers and integrate conditional information through the cross-attention layer, thus capitalizing on the advantages of both architectural paradigms. We investigate several optimization strategies, including quality tuning and resolution adaption, and identify critical configurations necessary for large-scale image generation. The model's flexible design supports scenarios that cater to specific resource constraints and objectives. When scaled appropriately, Dimba offers substantial throughput and a reduced memory footprint relative to conventional pure Transformers-based benchmarks. Extensive experiments indicate that Dimba achieves performance comparable to benchmarks in terms of image quality, artistic rendering, and semantic control. We also report several intriguing properties of the architecture discovered during evaluation and release checkpoints from the experiments. Our findings emphasize the promise of large-scale hybrid Transformer-Mamba architectures in the foundational stage of diffusion models, suggesting a bright future for text-to-image generation.
Submitted 3 June, 2024;
originally announced June 2024.
-
Joint Frame Structure, Beamwidth, and Power Allocation for UAV-Aided Localization and Communication
Authors:
Tianhao Liang,
Tingting Zhang,
Sheng Zhou,
Wentao Liu,
Dong Li,
Qinyu Zhang
Abstract:
In wireless sensor networks, integrating localization and communication techniques is crucial for efficient spectrum and hardware utilization. In this paper, we present a novel framework of unmanned aerial vehicle (UAV)-aided localization and communication for a ground node (GN), where the average spectral efficiency (SE) is used to reveal the intricate relationship among frame structure, channel estimation error, and localization accuracy. In particular, we first derive the lower bounds for the channel estimation error and the three-dimensional location prediction error. Leveraging these analyses, we formulate a problem to maximize the average SE in UAV-GN communication, where the frame structure, beamwidth and power allocation are jointly optimized. Subsequently, we propose an efficient iterative algorithm to address this non-convex problem with closed-form expressions for beamwidth and power allocation. Numerical results demonstrate that the performance of our proposed method can approach the upper bound with much lower complexity, and achieve over 70% performance gain compared to non-localization benchmarks. Additionally, the analysis highlights the dominant impact of the Doppler effect over noise on the average SE.
Submitted 3 June, 2024;
originally announced June 2024.
-
Measurement of Electron Antineutrino Oscillation Amplitude and Frequency via Neutron Capture on Hydrogen at Daya Bay
Authors:
Daya Bay collaboration,
F. P. An,
W. D. Bai,
A. B. Balantekin,
M. Bishai,
S. Blyth,
G. F. Cao,
J. Cao,
J. F. Chang,
Y. Chang,
H. S. Chen,
H. Y. Chen,
S. M. Chen,
Y. Chen,
Y. X. Chen,
Z. Y. Chen,
J. Cheng,
J. Cheng,
Y. -C. Cheng,
Z. K. Cheng,
J. J. Cherwinka,
M. C. Chu,
J. P. Cummings,
O. Dalager,
F. S. Deng
, et al. (177 additional authors not shown)
Abstract:
This Letter reports the first measurement of the oscillation amplitude and frequency of reactor antineutrinos at Daya Bay via neutron capture on hydrogen using 1958 days of data. With over 3.6 million signal candidates, an optimized candidate selection, improved treatment of backgrounds and efficiencies, refined energy calibration, and an energy response model for the capture-on-hydrogen sensitive region, the relative $\overline{ν}_{e}$ rates and energy spectra variation among the near and far detectors give $\mathrm{sin}^22θ_{13} = 0.0759_{-0.0049}^{+0.0050}$ and $Δm^2_{32} = (2.72^{+0.14}_{-0.15})\times10^{-3}$ eV$^2$ assuming the normal neutrino mass ordering, and $Δm^2_{32} = (-2.83^{+0.15}_{-0.14})\times10^{-3}$ eV$^2$ for the inverted neutrino mass ordering. This estimate of $\sin^2 2θ_{13}$ is consistent with and essentially independent from the one obtained using the capture-on-gadolinium sample at Daya Bay. The combination of these two results yields $\mathrm{sin}^22θ_{13}= 0.0833\pm0.0022$, which represents an 8% relative improvement in precision regarding the Daya Bay full 3158-day capture-on-gadolinium result.
Submitted 3 June, 2024;
originally announced June 2024.
-
An Efficient Trajectory Generation for Bi-copter Flight in Tight Space
Authors:
Xin Dong,
Yangjie Cui,
**gwu Xiang,
Daochun Li,
Zhan Tu
Abstract:
Unlike squared (or alike) quadrotors, elongated bi-copters leverage natural superiority in crossing tight spaces. To date, extensive works have focused on the design, modeling, and control of bi-copters. Besides, a proper motion planner utilizing bi-copters' shape characteristics is essential to efficiently and safely traverse tight spaces, yet it has rarely been studied. Current motion planning methods will significantly compromise their ability to traverse narrow spaces if the map is inflated based on the long dimension of the bi-copter. In this paper, we propose an efficient motion planning method that enables the safe navigation of bi-copters through narrow spaces. We first adapt a dynamic, feasible path-finding algorithm with whole-body collision checks to generate a collision-free path. Subsequently, we jointly optimize the position and rotation of the bi-copter to produce a trajectory that is safe, dynamically feasible, and smooth. Extensive simulations and real-world experiments have been conducted to verify the reliability and robustness of the proposed method.
Submitted 2 June, 2024;
originally announced June 2024.
-
Revisiting Energy Distribution and Formation Rate of CHIME Fast Radio Bursts
Authors:
K. J. Zhang,
X. F. Dong,
A. E. Rodin,
V. A. Fedorova,
Y. F. Huang,
D. Li,
P. Wang,
Q. M. Li,
C. Du,
F. Xu,
Z. B. Zhang
Abstract:
Using a large sample of fast radio bursts (FRBs) from the first CHIME/FRB catalog, we apply Lynden-Bell's c$^-$ method to study the evolution of their energy function and formation rate with redshift. It is found with the non-parametric Kendall's $τ$ statistics that the FRB energy strongly evolves with the cosmological redshift as $E(z)\propto(1 + z)^{5.23}$. After removing the redshift dependence, the local energy distribution can be described by a broken power-law form of $Ψ(E_{0})\propto E_{0}^{-0.38}$ for the low-energy segment and $Ψ(E_{0})\propto E_{0}^{-2.01}$ for the high-energy segment with a dividing line of $\sim2.1\times10^{40} \rm erg$. Interestingly, we find that the formation rate of CHIME FRBs also evolves with redshift as $ρ(z)\propto(1+z)^{-4.73\pm0.08}$. The local formation rate $ρ(0)$ of the CHIME FRBs is constrained to be about $ 1.25\times 10^4\rm{\,Gpc^{-3}yr^{-1}}$, which is comparable with some previous estimates. In addition, we notice that the formation rate not only exceeds the star formation rate at lower redshifts but also always declines with increasing redshift, which does not match the star formation history at all. Consequently, we suggest that most FRBs could originate from older stellar populations.
Submitted 1 June, 2024;
originally announced June 2024.
-
Performance Evaluation of Damping Systems in Civil Engineering Structures Via Minimal Sensor
Authors:
Xinhao He,
Dan Li
Abstract:
To control structural responses under various actions, the growing use of supplementary damping systems in modern civil engineering structures necessitates inspecting and evaluating their operational performance postinstallation. However, due to the dispersed placement and complex nonlinearities of these devices, difficulties arise in determining minimal sensor configuration. This is inherently connected to a pivotal challenge: establishing a reliable input-output mapping, which comprises both the mathematical model and sensor arrangements. Prior work indicates this can be achieved through theoretical observability analysis or Lie symmetries analysis, both of which provide different perspectives on the existence of a way to access the solutions of a system identification problem uniquely (at least locally). The present study introduces a unified framework, enhanced by algorithm realization as an application guide, for analyzing the observability and Lie symmetries of a given input-output mapping. We demonstrate its implementation via examples of a building structure with various damping systems under different conditions such as seismic loads, wind loads, and operational vibrations. Finally, we present a case study for an isolation building with an inerter damper and minimal sensor arrangement under seismic action. The results demonstrate that the unscented Kalman filter, a system identification method, can precisely estimate structural responses and assess damping device performance once a reliable input-output mapping is established.
Submitted 1 June, 2024;
originally announced June 2024.
-
A Free-fermion Formulation of Two-dimensional Ising Models
Authors:
De-Zhang Li,
Xin Wang,
Xiao-Bao Yang
Abstract:
The Ising model is a famous toy model in condensed matter and statistical physics. In this work we present a free-fermion formulation of the two-dimensional classical Ising models on the honeycomb, triangular and Kagomé lattices. Each Ising model is studied in zero field and in an imaginary field $i(π/2)k_BT$. We employ the decorated lattice technique, the star-triangle transformation and the weak-graph expansion method to map each Ising model exactly, in both cases, onto an eight-vertex model on the square lattice. The resulting vertex weights are shown to satisfy the free-fermion condition. In the zero-field case, each Ising model is an even free-fermion model. In the imaginary-field case, the Ising model on the honeycomb lattice is an even free-fermion model, while those on the triangular and Kagomé lattices are odd free-fermion models. We obtain the exact solution of the Kagomé lattice Ising model in the imaginary field $i(π/2)k_BT$, which has not been reported in previous works. The frustrated Ising models on the triangular and Kagomé lattices in the imaginary field still exhibit a non-zero residual entropy.
Submitted 31 May, 2024;
originally announced June 2024.
-
You Only Scan Once: Efficient Multi-dimension Sequential Modeling with LightNet
Authors:
Zhen Qin,
Yuxin Mao,
Xuyang Shen,
Dong Li,
Jing Zhang,
Yuchao Dai,
Yiran Zhong
Abstract:
Linear attention mechanisms have gained prominence in causal language models due to their linear computational complexity and enhanced speed. However, the inherent decay mechanism in linear attention presents challenges when applied to multi-dimensional sequence modeling tasks, such as image processing and multi-modal learning. In these scenarios, the utilization of sequential scanning to establish a global receptive field necessitates multiple scans for multi-dimensional data, thereby leading to inefficiencies. This paper identifies the inefficiency caused by a multiplicative linear recurrence and proposes an efficient alternative additive linear recurrence to avoid the issue, as it can handle multi-dimensional data within a single scan. We further develop an efficient multi-dimensional sequential modeling framework called LightNet based on the new recurrence. Moreover, we present two new multi-dimensional linear relative positional encoding methods, MD-TPE and MD-LRPE to enhance the model's ability to discern positional information in multi-dimensional scenarios. Our empirical evaluations across various tasks, including image classification, image generation, bidirectional language modeling, and autoregressive language modeling, demonstrate the efficacy of LightNet, showcasing its potential as a versatile and efficient solution for multi-dimensional sequential modeling.
Submitted 31 May, 2024;
originally announced May 2024.
-
Aligning Multiclass Neural Network Classifier Criterion with Task Performance via $F_β$-Score
Authors:
Nathan Tsoi,
Deyuan Li,
Taesoo Daniel Lee,
Marynel Vázquez
Abstract:
Multiclass neural network classifiers are typically trained using cross-entropy loss. Following training, the performance of this same neural network is evaluated using an application-specific metric based on the multiclass confusion matrix, such as the Macro $F_β$-Score. It is questionable whether the use of cross-entropy will yield a classifier that aligns with the intended application-specific performance criteria, particularly in scenarios where there is a need to emphasize one aspect of classifier performance. For example, if greater precision is preferred over recall, the $β$ value in the $F_β$ evaluation metric can be adjusted accordingly, but the cross-entropy objective remains unaware of this preference during training. We propose a method that addresses this training-evaluation gap for multiclass neural network classifiers such that users can train these models informed by the desired final $F_β$-Score. Following prior work in binary classification, we utilize the concepts of the soft-set confusion matrices and a piecewise-linear approximation of the Heaviside step function. Our method extends the $2 \times 2$ binary soft-set confusion matrix to a multiclass $d \times d$ confusion matrix and proposes dynamic adaptation of the threshold value $τ$, which parameterizes the piecewise-linear Heaviside approximation during run-time. We present a theoretical analysis that shows that our method can be used to optimize for a soft-set based approximation of Macro-$F_β$ that is a consistent estimator of Macro-$F_β$, and our extensive experiments show the practical effectiveness of our approach.
Submitted 31 May, 2024;
originally announced May 2024.
-
Spectroscopy of bumpy BHs: non-rotating case
Authors:
Colin Weller,
Dongjun Li,
Yanbei Chen
Abstract:
Recent detections of gravitational waves have made black hole quasinormal modes a powerful tool in testing predictions of general relativity. Understanding the spectrum of these quasinormal modes in a broad class of theories beyond general relativity and a variety of astrophysical environments around black holes remains vital. In this work, we study the quasinormal mode spectrum of parametrized deformations of a non-rotating black hole in the vacuum. Following Vigeland and Hughes, we model these parametrized deformations as axisymmetric multipole moments in the Weyl coordinates with amplitudes much less than the amplitude of the Schwarzschild potential. These tiny bumps in the black hole geometry satisfy the linearized vacuum Einstein equations and are asymptotically flat. We use the recently developed modified Teukolsky formalism to derive one decoupled differential equation for the radiative Weyl scalar $Ψ_0$. We then use the eigenvalue perturbation method to compute the quasinormal mode frequency shifts of both even- and odd-parity modes with $\ell=2,3$ and up to the overtone number $n=2$ for the Weyl multipoles with $\ell_W=2,3$. Our calculation provides an avenue to directly connect the multipole moments of a modified black hole spacetime to the QNM frequency shifts in a parametric way.
Submitted 31 May, 2024;
originally announced May 2024.
-
Search for $e^{+}e^{-}\toη'ψ(2S)$ at center-of-mass energies from 4.66 to 4.95 GeV
Authors:
BESIII Collaboration,
M. Ablikim,
M. N. Achasov,
P. Adlarson,
O. Afedulidis,
X. C. Ai,
R. Aliberti,
A. Amoroso,
Q. An,
Y. Bai,
O. Bakina,
I. Balossino,
Y. Ban,
H. -R. Bao,
V. Batozskaya,
K. Begzsuren,
N. Berger,
M. Berlowski,
M. Bertani,
D. Bettoni,
F. Bianchi,
E. Bianco,
A. Bortone,
I. Boyko,
R. A. Briere,
et al. (638 additional authors not shown)
Abstract:
Using data samples with an integrated luminosity of $4.67~\mathrm{fb}^{-1}$ collected by the BESIII detector operating at the BEPCII collider, we search for the process $e^+e^- \rightarrow η' ψ(2S)$ at center-of-mass energies from $4.66$ to $4.95~\mathrm{GeV}$. No significant signal is observed, and upper limits for the Born cross sections $σ^B(e^+e^-\rightarrowη'ψ(2S))$ at the 90\% confidence level are determined.
Submitted 31 May, 2024;
originally announced May 2024.
-
Query Provenance Analysis for Robust and Efficient Query-based Black-box Attack Defense
Authors:
Shaofei Li,
Ziqi Zhang,
Haomin Jia,
Ding Li,
Yao Guo,
Xiangqun Chen
Abstract:
Query-based black-box attacks have emerged as a significant threat to machine learning systems, where adversaries can manipulate the input queries to generate adversarial examples that can cause misclassification of the model. To counter these attacks, researchers have proposed Stateful Defense Models (SDMs) for detecting adversarial query sequences and rejecting queries that are "similar" to the history queries. Existing state-of-the-art (SOTA) SDMs (e.g., BlackLight and PIHA) have shown great effectiveness in defending against these attacks. However, recent studies have shown that they are vulnerable to Oracle-guided Adaptive Rejection Sampling (OARS) attacks, which is a stronger adaptive attack strategy. It can be easily integrated with existing attack algorithms to evade the SDMs by generating queries with fine-tuned direction and step size of perturbations utilizing the leaked decision information from the SDMs.
In this paper, we propose a novel approach, Query Provenance Analysis (QPA), for more robust and efficient SDMs. QPA encapsulates the historical relationships among queries as the sequence feature to capture the fundamental difference between benign and adversarial query sequences. To utilize the query provenance, we propose an efficient query provenance analysis algorithm with dynamic management. We evaluate QPA against two baselines, BlackLight and PIHA, on four widely used datasets with six query-based black-box attack algorithms. The results show that QPA outperforms the baselines in defense effectiveness and efficiency against both non-adaptive and adaptive attacks. Specifically, QPA reduces the Attack Success Rate (ASR) of OARS to 4.08%, compared with 77.63% and 87.72% for BlackLight and PIHA, respectively. Moreover, QPA achieves 7.67x and 2.25x higher throughput than BlackLight and PIHA, respectively.
Submitted 31 May, 2024;
originally announced May 2024.
-
Study of the decays $χ_{cJ} \rightarrow Λ\barΛφ$
Authors:
BESIII Collaboration,
M. Ablikim,
M. N. Achasov,
P. Adlarson,
O. Afedulidis,
X. C. Ai,
R. Aliberti,
A. Amoroso,
Q. An,
Y. Bai,
O. Bakina,
I. Balossino,
Y. Ban,
H. -R. Bao,
V. Batozskaya,
K. Begzsuren,
N. Berger,
M. Berlowski,
M. Bertani,
D. Bettoni,
F. Bianchi,
E. Bianco,
A. Bortone,
I. Boyko,
R. A. Briere,
et al. (637 additional authors not shown)
Abstract:
Based on $(2712.4 \pm 14.3) \times 10^{6}$ $ e^{+}e^{-}\toψ(3686)$ events collected with the BESIII detector operating at the BEPCII collider, we report the first evidence of $χ_{c0}\to Λ\bar Λφ$ decays and the first observation of $χ_{c1,2}\to Λ\bar Λφ$ decays, with significances of $4.5σ$, $11.3σ$ and $13.0σ$, respectively. The decay branching fractions of $χ_{c0,1,2}\to Λ\bar Λφ$ are measured to be $( 2.99\pm1.24\pm0.19) \times 10^{-5}$, $(6.01\pm0.90\pm0.40 )\times 10^{-5}$, and $(7.13\pm0.81\pm0.36) \times 10^{-5}$, where the first uncertainties are statistical and the second systematic. No obvious enhancement near the $Λ\barΛ$ production threshold or excited $Λ$ state is found in the $Λφ$ (or $\barΛφ$) system.
Submitted 31 May, 2024;
originally announced May 2024.
-
DAFNet: Dynamic Auxiliary Fusion for Sequential Model Editing in Large Language Models
Authors:
Taolin Zhang,
Qizhou Chen,
Dongyang Li,
Chengyu Wang,
Xiaofeng He,
Longtao Huang,
Hui Xue,
Jun Huang
Abstract:
Recently, while large language models (LLMs) have demonstrated impressive results, they still suffer from hallucination, i.e., the generation of false information. Model editing is the task of fixing factual mistakes in LLMs; yet, most previous works treat it as a one-time task, paying little attention to ever-emerging mistakes generated by LLMs. We address the task of sequential model editing (SME) that aims to rectify mistakes continuously. A Dynamic Auxiliary Fusion Network (DAFNet) is designed to enhance the semantic interaction among the factual knowledge within the entire sequence, preventing catastrophic forgetting during the editing process of multiple knowledge triples. Specifically, (1) for semantic fusion within a relation triple, we aggregate the intra-editing attention flow into auto-regressive self-attention with token-level granularity in LLMs. We further leverage multi-layer diagonal inter-editing attention flow to update the weighted representations of the entire sequence-level granularity. (2) Considering that auxiliary parameters are required to store the knowledge for sequential editing, we construct a new dataset named DAFSet, fulfilling recent, popular, long-tail and robust properties to enhance the generality of sequential editing. Experiments show DAFNet significantly outperforms strong baselines in single-turn and sequential editing. The usage of DAFSet also consistently improves the performance of other auxiliary network-based methods in various scenarios.
Submitted 30 May, 2024;
originally announced May 2024.
-
A unified approach to the spectral radius, connectivity and edge-connectivity of graphs
Authors:
Yu Wang,
Dan Li,
Huiqiu Lin
Abstract:
For two integers $r\geq 2$ and $h\geq 0$, the $h$-extra $r$-component connectivity $κ^h_r(G)$ of a graph $G$ is defined as the minimum size of a vertex subset $S$ whose removal disconnects $G$ such that $G-S$ has at least $r$ connected components, each with at least $h+1$ vertices. Denote by $\mathcal{G}_{n,δ}^{κ_r^h}$ the set of graphs with $h$-extra $r$-component connectivity $κ^h_r(G)$ and minimum degree $δ$. The following problem concerning spectral radius was proposed by Brualdi and Solheid [On the spectral radius of complementary acyclic matrices of zeros and one, SIAM J. Algebra Discrete Methods 7 (1986) 265-272]: Given a set of graphs $\mathscr{S}$, find an upper bound for the spectral radius of graphs in $\mathscr{S}$ and characterize the graphs in which the maximal spectral radius is attained. We study this question for $\mathscr{S}=\mathcal{G}_{n,δ}^{κ_r^h}$, where $r\geq 2$ and $h\geq 0$. Fan, Gu and Lin [$l$-connectivity, $l$-edge-connectivity and spectral radius of graphs, arXiv:2309.05247] answered the case $r\geq 2$ and $h=0$. In this paper, we solve the problem completely for $r\geq 2$ and $h\geq1$. Moreover, we investigate the analogous problems for the edge version. Our results can break the restriction of the extremum structure of the conditional connectivity, and they imply some previous results on connectivity and edge-connectivity.
Submitted 3 July, 2024; v1 submitted 30 May, 2024;
originally announced May 2024.
-
DiffPhysBA: Diffusion-based Physical Backdoor Attack against Person Re-Identification in Real-World
Authors:
Wenli Sun,
Xinyang Jiang,
Dongsheng Li,
Cairong Zhao
Abstract:
Person Re-Identification (ReID) systems face a significant security risk from backdoor attacks, which allow adversaries to evade tracking or impersonate others. Beyond recognizing this issue, we investigate how backdoor attacks can be deployed in real-world scenarios, where a ReID model is typically trained on data collected in the digital domain and then deployed in a physical environment. This attack scenario requires an attack flow that embeds backdoor triggers in the digital domain realistically enough to also activate the buried backdoor in person ReID models in the physical domain. This paper realizes this attack flow by leveraging a diffusion model to generate realistic accessories on pedestrian images (e.g., bags and hats) as backdoor triggers. However, the noticeable domain gap between the triggers generated by the off-the-shelf diffusion model and their physical counterparts results in a low attack success rate (ASR). Therefore, we introduce a novel diffusion-based physical backdoor attack (DiffPhysBA) method that adopts a training-free similarity-guided sampling process to enhance the resemblance between generated and physical triggers. Consequently, DiffPhysBA can generate realistic attributes as semantic-level triggers in the digital domain and improves the physical ASR over the direct paste method by 25.6% on the real-world test set. In evaluations on newly proposed real-world and synthetic ReID test sets, DiffPhysBA achieves a success rate exceeding 90% in both the digital and physical domains. Notably, it excels in digital stealth metrics and can effectively evade state-of-the-art defense methods.
Submitted 30 May, 2024;
originally announced May 2024.
-
Correlated Electronic Structure and Density-Wave Gap in Trilayer Nickelate La4Ni3O10
Authors:
X. Du,
Y. D. Li,
Y. T. Cao,
C. Y. Pei,
M. X. Zhang,
W. X. Zhao,
K. Y. Zhai,
R. Z. Xu,
Z. K. Liu,
Z. W. Li,
J. K. Zhao,
G. Li,
Y. L. Chen,
Y. P. Qi,
H. J. Guo,
L. X. Yang
Abstract:
The discovery of pressurized superconductivity at 80 K in La3Ni2O7 officially brings nickelates into the family of high-temperature superconductors, which gives rise to not only new insights but also mysteries in the strongly correlated superconductivity. More recently, the sibling compound La4Ni3O10 was also shown to be superconducting below about 25 K under pressure, further boosting the popularity of nickelates in the Ruddlesden-Popper phase. In this study, combining high-resolution angle-resolved photoemission spectroscopy and ab initio calculation, we systematically investigate the electronic structures of La4Ni3O10 at ambient pressure. We reveal a high resemblance of La4Ni3O10 with La3Ni2O7 in the orbital-dependent fermiology and electronic structure, suggesting a similar electronic correlation between the two compounds. The temperature-dependent measurements imply an orbital-dependent energy gap related to the density-wave transition in La4Ni3O10. By comparing the theoretical pressure-dependent electronic structure, clues about the superconducting high-pressure phase can be deduced from the ambient measurements, providing crucial information for deciphering the unconventional superconductivity in nickelates.
Submitted 30 May, 2024;
originally announced May 2024.
-
Optical Extinctions of Inter-Arm Molecular Clouds in M31: A Pilot Study for the Upcoming CSST Observations
Authors:
Cailing Chen,
Zheng Zheng,
Chao-Wei Tsai,
Sihan Jiao,
Jing Tang,
Jingwen Wu,
Di Li,
Yun Zheng,
Linjing Feng,
Yujiao Yang,
Yuan Liang
Abstract:
Recent sub-millimeter dust thermal emission observations have unveiled a significant number of inter-arm massive molecular clouds in M31. However, the effectiveness of this technique is limited to its sensitivity, making it challenging to study more distant galaxies. This study introduces an alternative approach, utilizing optical extinctions derived from space-based telescopes, with a focus on the forthcoming China Space Station Telescope (CSST). We first demonstrate the capability of this method by constructing dust extinction maps for 17 inter-arm massive molecular clouds in M31 using the Panchromatic Hubble Andromeda Treasury (PHAT) data. Our analysis reveals that inter-arm massive molecular clouds with an optical extinction ($A_V$) greater than 1.6 mag exhibit a notable $A_V$ excess, facilitating their identification. The majority of these inter-arm massive molecular clouds show an $A_V$ around 1 mag, aligning with measurements from our JCMT data. Further validation using a mock CSST RGB star catalog confirms the method's effectiveness. We show that the derived $A_V$ values using CSST z and y photometries align more closely with the input values. Molecular clouds with $A_V>1.6$ mag can also be identified using the CSST mock data. We thus claim that future CSST observation could provide an effective way for the detection of inter-arm massive molecular clouds with significant optical extinction in nearby galaxies.
Submitted 30 May, 2024;
originally announced May 2024.
-
FourierMamba: Fourier Learning Integration with State Space Models for Image Deraining
Authors:
Dong Li,
Yidi Liu,
Xueyang Fu,
Senyan Xu,
Zheng-Jun Zha
Abstract:
Image deraining aims to remove rain streaks from rainy images and restore clear backgrounds. Recent research employing the Fourier transform has proved effective for image deraining, because the transform acts as an effective frequency prior for capturing rain streaks. However, although low and high frequencies in images depend on each other, these Fourier-based methods rarely exploit the correlation between different frequencies to couple their learning procedures, which limits the full utilization of frequency information for image deraining. Meanwhile, the recently emerged Mamba technique has shown effectiveness and efficiency in modeling correlation in various domains (e.g., spatial, temporal), and we argue that introducing Mamba into the unexplored Fourier space to correlate different frequencies would help improve image deraining. This motivates us to propose a new framework termed FourierMamba, which performs image deraining with Mamba in the Fourier space. Owing to the unique arrangement of frequency orders in Fourier space, the core of FourierMamba lies in the scanning encoding of different frequencies, where the low-to-high frequency ordering differs between the spatial dimension (not arranged along an axis) and the channel dimension (arranged along an axis). Therefore, we design FourierMamba to correlate Fourier space information in the spatial and channel dimensions with distinct designs. Specifically, in the spatial-dimension Fourier space, we introduce zigzag coding to scan the frequencies and rearrange them from low to high, thereby correlating the frequencies in order; in the channel-dimension Fourier space, where the frequencies are already ordered along the axis, we directly use Mamba to perform frequency correlation and improve the channel information representation.
Submitted 29 May, 2024;
originally announced May 2024.
-
ConceptPrune: Concept Editing in Diffusion Models via Skilled Neuron Pruning
Authors:
Ruchika Chavhan,
Da Li,
Timothy Hospedales
Abstract:
While large-scale text-to-image diffusion models have demonstrated impressive image-generation capabilities, there are significant concerns about their potential misuse for generating unsafe content, violating copyright, and perpetuating societal biases. Recently, the text-to-image generation community has begun addressing these concerns by editing or unlearning undesired concepts from pre-trained models. However, these methods often involve data-intensive and inefficient fine-tuning or utilize various forms of token remapping, rendering them susceptible to adversarial jailbreaks. In this paper, we present a simple and effective training-free approach, ConceptPrune, wherein we first identify critical regions within pre-trained models responsible for generating undesirable concepts, thereby facilitating straightforward concept unlearning via weight pruning. Experiments across a range of concepts including artistic styles, nudity, object erasure, and gender debiasing demonstrate that target concepts can be efficiently erased by pruning a tiny fraction, approximately 0.12% of total weights, enabling multi-concept erasure and robustness against various white-box and black-box adversarial attacks.
Submitted 29 May, 2024;
originally announced May 2024.
-
Can Graph Learning Improve Task Planning?
Authors:
Xixi Wu,
Yifei Shen,
Caihua Shan,
Kaitao Song,
Siwei Wang,
Bohang Zhang,
Jiarui Feng,
Hong Cheng,
Wei Chen,
Yun Xiong,
Dongsheng Li
Abstract:
Task planning is emerging as an important research topic alongside the development of large language models (LLMs). It aims to break down complex user requests into solvable sub-tasks, thereby fulfilling the original requests. In this context, the sub-tasks can be naturally viewed as a graph, where the nodes represent the sub-tasks, and the edges denote the dependencies among them. Consequently, task planning is a decision-making problem that involves selecting a connected path or subgraph within the corresponding graph and invoking it. In this paper, we explore graph learning-based methods for task planning, a direction that is orthogonal to the prevalent focus on prompt design. Our interest in graph learning stems from a theoretical discovery: the biases of attention and auto-regressive loss impede LLMs' ability to effectively navigate decision-making on graphs, which is adeptly addressed by graph neural networks (GNNs). This theoretical insight led us to integrate GNNs with LLMs to enhance overall performance. Extensive experiments demonstrate that GNN-based methods surpass existing solutions even without training, and minimal training can further enhance their performance. Additionally, our approach complements prompt engineering and fine-tuning techniques, with performance further enhanced by improved prompts or a fine-tuned model.
Submitted 29 May, 2024;
originally announced May 2024.
-
Empowering Embodied Manipulation: A Bimanual-Mobile Robot Manipulation Dataset for Household Tasks
Authors:
Tianle Zhang,
Dongjiang Li,
Yihang Li,
Zecui Zeng,
Lin Zhao,
Lei Sun,
Yue Chen,
Xuelong Wei,
Yibing Zhan,
Lusong Li,
Xiaodong He
Abstract:
The advancements in embodied AI are increasingly enabling robots to tackle complex real-world tasks, such as household manipulation. However, the deployment of robots in these environments remains constrained by the lack of comprehensive bimanual-mobile robot manipulation data that can be learned. Existing datasets predominantly focus on single-arm manipulation tasks, while the few dual-arm datasets available often lack mobility features, task diversity, comprehensive sensor data, and robust evaluation metrics; they fail to capture the intricate and dynamic nature of household manipulation tasks that bimanual-mobile robots are expected to perform. To overcome these limitations, we propose BRMData, a Bimanual-mobile Robot Manipulation Dataset specifically designed for household applications. BRMData encompasses 10 diverse household tasks, including single-arm and dual-arm tasks, as well as both tabletop and mobile manipulations, utilizing multi-view and depth-sensing data information. Moreover, BRMData features tasks of increasing difficulty, ranging from single-object to multi-object grasping, non-interactive to human-robot interactive scenarios, and rigid-object to flexible-object manipulation, closely simulating real-world household applications. Additionally, we introduce a novel Manipulation Efficiency Score (MES) metric to evaluate both the precision and efficiency of robot manipulation methods in household tasks. We thoroughly evaluate and analyze the performance of advanced robot manipulation learning methods using our BRMData, aiming to drive the development of bimanual-mobile robot manipulation technologies. The dataset is now open-sourced and available at https://embodiedrobot.github.io/.
Submitted 6 June, 2024; v1 submitted 29 May, 2024;
originally announced May 2024.
-
Kinetic temperature of massive star-forming molecular clumps measured with formaldehyde V. The massive filament DR21
Authors:
X. Zhao,
X. D. Tang,
C. Henkel,
Y. Gong,
Y. Lin,
D. L. Li,
Y. X. He,
Y. P. Ao,
X. Lu,
T. Liu,
Y. Sun,
K. Wang,
X. P. Chen,
J. Esimbek,
J. J. Zhou,
J. W. Wu,
J. J. Qiu,
X. W. Zheng,
J. S. Li,
C. S. Luo,
Q. Zhao
Abstract:
The kinetic temperature structure of the massive filament DR21 has been mapped using the IRAM 30 m telescope. This mapping employed the para-H$_2$CO triplet ($J_{\rm K_aK_c}$ = 3$_{03}$--2$_{02}$, 3$_{22}$--2$_{21}$, and 3$_{21}$--2$_{20}$) on a scale of $\sim$0.1 pc. By modeling the averaged line ratios of para-H$_{2}$CO with RADEX under non-LTE assumptions, the kinetic temperature of the dense gas was derived at a density of $n$(H$_{2}$) = 10$^{5}$ cm$^{-3}$. The para-H$_2$CO lines reveal significantly higher temperatures than NH$_3$ (1,1)/(2,2) and FIR wavelengths. The dense clumps appear to correlate with the notable kinetic temperature. Among the four dense cores (N44, N46, N48, and N54), temperature gradients are observed on a scale of $\sim$0.1-0.3 pc. This suggests that the warm dense gas is influenced by internal star formation activity. With the exception of N54, the temperature profiles of these cores were fitted with power-law indices ranging from $-$0.3 to $-$0.5. This indicates that the warm dense gas is heated by radiation emitted from internally embedded protostar(s) and/or clusters. While there is no direct evidence supporting the idea that the dense gas is heated by shocks resulting from a past explosive event in the DR21 region, our measurements toward the DR21W1 region provide compelling evidence that the dense gas is indeed heated by shocks originating from the western DR21 flow. Higher temperatures appear to be associated with turbulence. The physical parameters of the dense gas in the DR21 filament exhibit a remarkable similarity to the results obtained in OMC-1 and N113. This may imply that the physical mechanisms governing the dynamics and thermodynamics of dense gas traced by H$_{2}$CO in diverse star formation regions may be dominated by common underlying principles despite variations in specific environmental conditions. (abbreviated)
Submitted 29 May, 2024;
originally announced May 2024.