-
Merging history of massive galaxies at 3<z<6
Authors:
Kemeng Li,
Zhen Jiang,
** He,
Qi Guo,
Jie Wang
Abstract:
Observational data on high-redshift galaxies have become increasingly abundant, especially since the James Webb Space Telescope (JWST) began operating, which allows us to verify and optimize galaxy formation models at high redshifts. In this work, we investigate the merging history of massive galaxies at $3 < z < 6$ using a well-developed semi-analytic galaxy formation catalogue. We find that the major merger rate increases with redshift up to $z = 3$ and then flattens. The fraction of wet mergers, during which the sum of the cold gas mass is higher than the sum of the stellar mass in two merging galaxies, also increases from $\sim$ 34\% at $z = 0$ to 96\% at $z = 3$. Interestingly, almost all major mergers are wet at $z > 3$. This can be attributed to the high fraction ($> 50\%$) of cold gas at $z > 3$. In addition, we study some special systems of massive merging galaxies at $3 < z < 6$, including the massive gas-rich major merging systems and extremely dense proto-clusters, and investigate the supermassive black hole-dark matter halo mass relation and dual AGNs. We find that the galaxy formation model reproduces the incidence of those observed massive galaxies, but fails to reproduce the relation between the supermassive black hole mass and the dark matter halo mass at $z \sim 6$. The latter requires more careful observational estimates of the supermassive black hole masses. Otherwise, it could suggest modifications to the modeling of supermassive black hole growth at high redshifts.
Submitted 6 December, 2023;
originally announced December 2023.
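The wet-merger criterion stated in the abstract above (summed cold gas mass exceeding summed stellar mass of the two merging galaxies) can be written as a small classifier. This is only a minimal sketch of that one definition, not the authors' pipeline: the `Galaxy` record, the function names, and the example masses are hypothetical, and masses are assumed to share one consistent unit.

```python
from dataclasses import dataclass


@dataclass
class Galaxy:
    """Hypothetical record; both masses in the same unit (e.g. solar masses)."""
    stellar_mass: float
    cold_gas_mass: float


def is_wet_merger(a: Galaxy, b: Galaxy) -> bool:
    """Wet if the summed cold gas mass exceeds the summed stellar mass."""
    total_gas = a.cold_gas_mass + b.cold_gas_mass
    total_stars = a.stellar_mass + b.stellar_mass
    return total_gas > total_stars


def wet_fraction(pairs):
    """Fraction of merging pairs classified as wet."""
    if not pairs:
        raise ValueError("need at least one merging pair")
    return sum(is_wet_merger(a, b) for a, b in pairs) / len(pairs)


# Made-up illustrative masses, not values from the catalogue.
gas_rich = (Galaxy(1.0e10, 3.0e10), Galaxy(2.0e10, 2.5e10))
gas_poor = (Galaxy(5.0e10, 1.0e10), Galaxy(4.0e10, 0.5e10))
fraction = wet_fraction([gas_rich, gas_poor])
```

Applied to a list of merging pairs per redshift bin, `wet_fraction` would give the kind of quantity the abstract quotes (34\% at $z = 0$, 96\% at $z = 3$), assuming the catalogue exposes both masses for each progenitor.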
-
C-NERF: Representing Scene Changes as Directional Consistency Difference-based NeRF
Authors:
Rui Huang,
Binbin Jiang,
Qingyi Zhao,
William Wang,
Yuxiang Zhang,
Qing Guo
Abstract:
In this work, we aim to detect the changes caused by object variations in a scene represented by the neural radiance fields (NeRFs). Given an arbitrary view and two sets of scene images captured at different timestamps, we can predict the scene changes in that view, which has significant potential applications in scene monitoring and measuring. We conducted preliminary studies and found that such an exciting task cannot be easily achieved by utilizing existing NeRFs and 2D change detection methods with many false or missing detections. The main reason is that the 2D change detection is based on the pixel appearance difference between spatial-aligned image pairs and neglects the stereo information in the NeRF. To address the limitations, we propose the C-NERF to represent scene changes as directional consistency difference-based NeRF, which mainly contains three modules. We first perform the spatial alignment of two NeRFs captured before and after changes. Then, we identify the change points based on the direction-consistent constraint; that is, real change points have similar change representations across view directions, but fake change points do not. Finally, we design the change map rendering process based on the built NeRFs and can generate the change map of an arbitrarily specified view direction. To validate the effectiveness, we build a new dataset containing ten scenes covering diverse scenarios with different changing objects. Our approach surpasses state-of-the-art 2D change detection and NeRF-based methods by a significant margin.
Submitted 23 December, 2023; v1 submitted 5 December, 2023;
originally announced December 2023.
-
Amplitude Analysis of the Decays $D^0\toπ^+π^-π^+π^-$ and $π^+π^-π^0π^0$
Authors:
BESIII Collaboration,
M. Ablikim,
M. N. Achasov,
P. Adlarson,
O. Afedulidis,
X. C. Ai,
R. Aliberti,
A. Amoroso,
Q. An,
Y. Bai,
O. Bakina,
I. Balossino,
Y. Ban,
H. -R. Bao,
V. Batozskaya,
K. Begzsuren,
N. Berger,
M. Berlowski,
M. Bertani,
D. Bettoni,
F. Bianchi,
E. Bianco,
A. Bortone,
I. Boyko,
R. A. Briere
, et al. (620 additional authors not shown)
Abstract:
Using $e^+e^-$ annihilation data corresponding to an integrated luminosity of 2.93 $\rm fb^{-1}$ taken at the center-of-mass energy $\sqrt{s}=3.773$~GeV with the BESIII detector, a joint amplitude analysis is performed on the decays $D^0\toπ^+π^-π^+π^-$ and $D^0\toπ^+π^-π^0π^0$(non-$η$). The fit fractions of individual components are obtained, and large interferences among the dominant components of $D^{0}\to a_{1}(1260)π$, $D^{0}\toπ(1300)π$, $D^{0}\toρ(770)ρ(770)$ and $D^{0}\to2(ππ)_{S}$ are found in both channels. With the obtained amplitude model, the $CP$-even fractions of $D^0\to π^+π^-π^+π^-$ and $D^0\toπ^+π^-π^0π^0$(non-$η$) are determined to be $(75.2\pm1.1_{\rm stat.}\pm1.5_{\rm syst.})\%$ and $(68.9\pm1.5_{\rm stat.}\pm 2.4_{\rm syst.})\%$, respectively. The branching fractions of $D^0\to π^+π^-π^+π^-$ and $D^0\toπ^+π^-π^0π^0$(non-$η$) are measured to be $(0.688\pm0.010_{\rm stat.}\pm 0.010_{\rm syst.})\%$ and $(0.951\pm0.025_{\rm stat.}\pm 0.021_{\rm syst.})\%$, respectively. The amplitude analysis provides an important model for binning strategy in the measurements of the strong phase parameters of $D^0 \to 4π$ when used to determine the CKM angle $γ(φ_{3})$ via the $B^{-}\to D K^{-}$ decay.
Submitted 3 April, 2024; v1 submitted 5 December, 2023;
originally announced December 2023.
-
TranSegPGD: Improving Transferability of Adversarial Examples on Semantic Segmentation
Authors:
Xiaojun Jia,
Jindong Gu,
Yihao Huang,
Simeng Qin,
Qing Guo,
Yang Liu,
Xiaochun Cao
Abstract:
Transferability of adversarial examples on image classification has been systematically explored, which generates adversarial examples in black-box mode. However, the transferability of adversarial examples on semantic segmentation has been largely overlooked. In this paper, we propose an effective two-stage adversarial attack strategy to improve the transferability of adversarial examples on semantic segmentation, dubbed TranSegPGD. Specifically, at the first stage, every pixel in an input image is divided into different branches based on its adversarial property. Different branches are assigned different weights for optimization to improve the adversarial performance of all pixels. We assign high weights to the loss of the hard-to-attack pixels to misclassify all pixels. At the second stage, the pixels are divided into different branches based on their transferable property, which depends on the Kullback-Leibler divergence. Different branches are assigned different weights for optimization to improve the transferability of the adversarial examples. We assign high weights to the loss of the high-transferability pixels to improve the transferability of adversarial examples. Extensive experiments with various segmentation models are conducted on PASCAL VOC 2012 and Cityscapes datasets to demonstrate the effectiveness of the proposed method. The proposed adversarial attack method can achieve state-of-the-art performance.
Submitted 2 December, 2023;
originally announced December 2023.
-
Exchange-of-Thought: Enhancing Large Language Model Capabilities through Cross-Model Communication
Authors:
Zhangyue Yin,
Qiushi Sun,
Cheng Chang,
Qipeng Guo,
Junqi Dai,
Xuanjing Huang,
Xipeng Qiu
Abstract:
Large Language Models (LLMs) have recently made significant strides in complex reasoning tasks through the Chain-of-Thought technique. Despite this progress, their reasoning is often constrained by their intrinsic understanding, lacking external insights. To address this, we propose Exchange-of-Thought (EoT), a novel framework that enables cross-model communication during problem-solving. Drawing inspiration from network topology, EoT integrates four unique communication paradigms: Memory, Report, Relay, and Debate. This paper delves into the communication dynamics and volume associated with each paradigm. To counterbalance the risks of incorrect reasoning chains, we implement a robust confidence evaluation mechanism within these communications. Our experiments across diverse complex reasoning tasks demonstrate that EoT significantly surpasses established baselines, underscoring the value of external insights in enhancing LLM performance. Furthermore, we show that EoT achieves these superior results in a cost-effective manner, marking a promising advancement for efficient and collaborative AI problem-solving.
Submitted 4 December, 2023;
originally announced December 2023.
-
CoLLiE: Collaborative Training of Large Language Models in an Efficient Way
Authors:
Kai Lv,
Shuo Zhang,
Tianle Gu,
Shuhao Xing,
Jiawei Hong,
Keyu Chen,
Xiaoran Liu,
Yuqing Yang,
Honglin Guo,
Tengxiao Liu,
Yu Sun,
Qipeng Guo,
Hang Yan,
Xipeng Qiu
Abstract:
Large language models (LLMs) are increasingly pivotal in a wide range of natural language processing tasks. Access to pre-trained models, courtesy of the open-source community, has made it possible to adapt these models to specific applications for enhanced performance. However, the substantial resources required for training these models necessitate efficient solutions. This paper introduces CoLLiE, an efficient library that facilitates collaborative training of large language models using 3D parallelism, parameter-efficient fine-tuning (PEFT) methods, and optimizers such as Lion, Adan, Sophia, LOMO and AdaLomo. With its modular design and comprehensive functionality, CoLLiE offers a balanced blend of efficiency, ease of use, and customization. CoLLiE has demonstrated superior training efficiency compared with prevalent solutions in pre-training and fine-tuning scenarios. Furthermore, we provide an empirical evaluation of the correlation between model size and GPU memory consumption under different optimization methods, as well as an analysis of the throughput. Lastly, we carry out a comprehensive comparison of various optimizers and PEFT methods within the instruction-tuning context. CoLLiE is available at https://github.com/OpenLMLab/collie.
Submitted 1 December, 2023;
originally announced December 2023.
-
Can Protective Perturbation Safeguard Personal Data from Being Exploited by Stable Diffusion?
Authors:
Zhengyue Zhao,
Jinhao Duan,
Kaidi Xu,
Chenan Wang,
Rui Zhang,
Zidong Du,
Qi Guo,
Xing Hu
Abstract:
Stable Diffusion has established itself as a foundation model in generative AI artistic applications, receiving widespread research and application. Some recent fine-tuning methods have made it feasible for individuals to implant personalized concepts onto the basic Stable Diffusion model with minimal computational costs on small datasets. However, these innovations have also given rise to issues like facial privacy forgery and artistic copyright infringement. In recent studies, researchers have explored the addition of imperceptible adversarial perturbations to images to prevent potential unauthorized exploitation and infringements when personal data is used for fine-tuning Stable Diffusion. Although these studies have demonstrated the ability to protect images, it is essential to consider that these methods may not be entirely applicable in real-world scenarios. In this paper, we systematically evaluate the use of perturbations to protect images within a practical threat model. The results suggest that these approaches may not be sufficient to safeguard image privacy and copyright effectively. Furthermore, we introduce a purification method capable of removing protected perturbations while preserving the original image structure to the greatest extent possible. Experiments reveal that Stable Diffusion can effectively learn from purified images over all protective methods.
Submitted 24 June, 2024; v1 submitted 30 November, 2023;
originally announced December 2023.
-
Layered Rendering Diffusion Model for Zero-Shot Guided Image Synthesis
Authors:
Zipeng Qi,
Guoxi Huang,
Zebin Huang,
Qin Guo,
Jinwen Chen,
Junyu Han,
Jian Wang,
Gang Zhang,
Lufei Liu,
Errui Ding,
Jingdong Wang
Abstract:
This paper introduces innovative solutions to enhance spatial controllability in diffusion models reliant on text queries. We present two key innovations: Vision Guidance and the Layered Rendering Diffusion (LRDiff) framework. Vision Guidance, a spatial layout condition, acts as a clue in the perturbed distribution, greatly narrowing down the search space, to focus on the image sampling process adhering to the spatial layout condition. The LRDiff framework constructs an image-rendering process with multiple layers, each of which applies the vision guidance to instructively estimate the denoising direction for a single object. Such a layered rendering strategy effectively prevents issues like unintended conceptual blending or mismatches, while allowing for more coherent and contextually accurate image synthesis. The proposed method provides a more efficient and accurate means of synthesising images that align with specific spatial and contextual requirements. We demonstrate through our experiments that our method provides better results than existing techniques both quantitatively and qualitatively. We apply our method to three practical applications: bounding box-to-image, semantic mask-to-image and image editing.
Submitted 30 November, 2023;
originally announced November 2023.
-
Measurement of Branching Fractions for $Λ_{c}^{+} \rightarrow n K_{S}^{0} π^{+}$ and $Λ_{c}^{+} \rightarrow n K_{S}^{0} K^{+}$
Authors:
BESIII Collaboration,
M. Ablikim,
M. N. Achasov,
P. Adlarson,
O. Afedulidis,
X. C. Ai,
R. Aliberti,
A. Amoroso,
M. R. An,
Q. An,
Y. Bai,
O. Bakina,
I. Balossino,
Y. Ban,
V. Batozskaya,
K. Begzsuren,
N. Berger,
M. Berlowski,
M. Bertani,
D. Bettoni,
F. Bianchi,
E. Bianco,
A. Bortone,
I. Boyko,
R. A. Briere
, et al. (603 additional authors not shown)
Abstract:
Based on 4.5 fb$^{-1}$ of $e^{+}e^{-}$ collision data accumulated at center-of-mass energies between $4.600\,\mathrm{GeV}$ and $4.699\,\mathrm{GeV}$ with the BESIII detector, we measure the absolute branching fraction of the Cabibbo-favored decay $Λ_{c}^{+} \rightarrow n K_{S}^{0} π^{+}$ with the precision improved by a factor of 2.8 and report the first evidence for the singly-Cabibbo-suppressed decay $Λ_{c}^{+} \rightarrow n K_{S}^{0} K^{+}$. The branching fractions for $Λ_{c}^{+} \rightarrow n K_{S}^{0} π^{+}$ and $Λ_{c}^{+} \rightarrow n K_{S}^{0} K^{+}$ are determined to be $(1.86\pm0.08\pm0.04)\times10^{-2}$ and $\left(4.3^{+1.9}_{-1.5}\pm0.3\right)\times10^{-4}$, respectively, where the first uncertainties are statistical and the second ones are systematic.
Submitted 28 November, 2023;
originally announced November 2023.
-
$Ξ(1620)$ production in $K^- p$ scattering process
Authors:
Quan-Yun Guo,
Zi-Li Yue,
Dian-Yong Chen
Abstract:
In the present work, the production of $Ξ(1620)$ in the $K^- p$ scattering process is investigated by using an effective Lagrangian approach, where $Ξ(1620)$ is considered as a $\bar{K} Λ$ molecular state. Our estimations indicate that the cross sections for $K^-p\to K^+ Ξ(1620)^-$ are $(1.48 ^{+ 1.12}_{-0.69}) \ \mathrm{μb}$ at $P_K=2.8 \ \mathrm{GeV}$, where the uncertainties result from the variation of the model parameter. As for the $K^-p\to K^+ π^0 Ξ^-$ process, the cross sections are estimated to be $(0.61 ^{+0.47}_{-0.29})\ \mathrm{μb}$ at $P_K =2.8 \ \mathrm{GeV}$, which is consistent with the experimental measurements.
Submitted 28 November, 2023;
originally announced November 2023.
-
Sample as You Infer: Predictive Coding With Langevin Dynamics
Authors:
Umais Zahid,
Qinghai Guo,
Zafeirios Fountas
Abstract:
We present a novel algorithm for parameter learning in generic deep generative models that builds upon the predictive coding (PC) framework of computational neuroscience. Our approach modifies the standard PC algorithm to bring performance on par with or exceeding that obtained from standard variational auto-encoder (VAE) training. By injecting Gaussian noise into the PC inference procedure, we re-envision it as an overdamped Langevin sampling, which facilitates optimisation with respect to a tight evidence lower bound (ELBO). We improve the resultant encoder-free training method by incorporating an encoder network to provide an amortised warm-start to our Langevin sampling and test three different objectives for doing so. Finally, to increase robustness to the sampling step size and reduce sensitivity to curvature, we validate a lightweight and easily computable form of preconditioning, inspired by Riemann Manifold Langevin and adaptive optimizers from the SGD literature. We compare against VAEs by training like-for-like generative models using our technique against those trained with standard reparameterisation-trick-based ELBOs. We observe that our method outperforms or matches performance across a number of metrics, including sample quality, while converging in a fraction of the number of SGD training iterations.
Submitted 4 February, 2024; v1 submitted 22 November, 2023;
originally announced November 2023.
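The abstract above frames noisy inference as overdamped Langevin sampling. As a minimal sketch of that generic idea only (the textbook unadjusted Langevin update, not the authors' predictive-coding algorithm, encoder warm-start, or preconditioner), the update $x \leftarrow x + ε\,\nabla \log p(x) + \sqrt{2ε}\,ξ$ with $ξ \sim \mathcal{N}(0, 1)$ can be run on a toy one-dimensional Gaussian target. The target, step size, and all function names here are assumptions chosen for illustration.

```python
import math
import random


def langevin_samples(grad_log_p, x0, step, n_steps, burn_in, rng):
    """Unadjusted overdamped Langevin: gradient drift plus Gaussian noise."""
    x = x0
    kept = []
    for t in range(n_steps):
        noise = rng.gauss(0.0, 1.0)
        x = x + step * grad_log_p(x) + math.sqrt(2.0 * step) * noise
        if t >= burn_in:  # discard the initial transient
            kept.append(x)
    return kept


def grad_log_gaussian(x, mean=2.0, std=1.0):
    """Gradient of log N(x; mean, std^2) with respect to x (toy target)."""
    return -(x - mean) / (std * std)


rng = random.Random(0)  # fixed seed so the run is repeatable
samples = langevin_samples(grad_log_gaussian, x0=0.0, step=0.05,
                           n_steps=50_000, burn_in=1_000, rng=rng)
sample_mean = sum(samples) / len(samples)
```

With a small step the chain's samples should concentrate around the target mean of 2.0; a finite step introduces a small bias in the variance, which is one reason step-size robustness matters in this setting.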
-
Enhancing Uncertainty-Based Hallucination Detection with Stronger Focus
Authors:
Tianhang Zhang,
Lin Qiu,
Qipeng Guo,
Cheng Deng,
Yue Zhang,
Zheng Zhang,
Chenghu Zhou,
Xinbing Wang,
Luoyi Fu
Abstract:
Large Language Models (LLMs) have gained significant popularity for their impressive performance across diverse fields. However, LLMs are prone to hallucinate untruthful or nonsensical outputs that fail to meet user expectations in many real-world applications. Existing works for detecting hallucinations in LLMs either rely on external knowledge for reference retrieval or require sampling multiple responses from the LLM for consistency verification, making these methods costly and inefficient. In this paper, we propose a novel reference-free, uncertainty-based method for detecting hallucinations in LLMs. Our approach imitates human focus in factuality checking from three aspects: 1) focus on the most informative and important keywords in the given text; 2) focus on the unreliable tokens in historical context which may lead to a cascade of hallucinations; and 3) focus on the token properties such as token type and token frequency. Experimental results on relevant datasets demonstrate the effectiveness of our proposed method, which achieves state-of-the-art performance across all the evaluation metrics and eliminates the need for additional information.
Submitted 22 November, 2023;
originally announced November 2023.
-
First observation of $Λ_c^+\rightarrowΛK^+π^0$ and evidence of $Λ_c^+\rightarrowΛK^+π^+π^-$
Authors:
BESIII Collaboration,
M. Ablikim,
M. N. Achasov,
P. Adlarson,
X. C. Ai,
R. Aliberti,
A. Amoroso,
M. R. An,
Q. An,
Y. Bai,
O. Bakina,
I. Balossino,
Y. Ban,
H. -R. Bao,
V. Batozskaya,
K. Begzsuren,
N. Berger,
M. Berlowski,
M. Bertani,
D. Bettoni,
F. Bianchi,
E. Bianco,
A. Bortone,
I. Boyko,
R. A. Briere
, et al. (608 additional authors not shown)
Abstract:
We present the first observation of the singly Cabibbo-suppressed decay $Λ_c^+ \rightarrow ΛK^+π^0$ with a significance of $5.7σ$ and the first evidence of $Λ_c^+ \rightarrow ΛK^+π^+π^-$ decay with a significance of $3.1σ$, based on $e^+e^-$ annihilation data recorded by the BESIII detector at the BEPCII collider. The data correspond to an integrated luminosity of $6.4~{\rm fb^{-1}}$, in the center-of-mass energy range from $4.600~{\rm GeV}$ to $4.950~{\rm GeV}$. We determine the branching fractions of $Λ_c^+ \rightarrow ΛK^+π^0$ and $Λ_c^+ \rightarrow ΛK^+π^+π^-$ relative to their Cabibbo-favored counterparts to be $\frac{\mathcal{B}(Λ_c^+ \rightarrow ΛK^+π^0)}{\mathcal{B}(Λ_c^+ \rightarrow Λπ^+π^0)} = (2.09\pm0.39_{\mathrm{stat.}}\pm0.07_{\mathrm{syst.}}) \times 10^{-2}$ and $\frac{\mathcal{B}(Λ_c^+ \rightarrow ΛK^+π^+π^-)}{\mathcal{B}(Λ_c^+ \rightarrow Λπ^+π^+π^-)} = (1.13\pm0.41_{\mathrm{stat.}}\pm0.06_{\mathrm{syst.}}) \times 10^{-2}$, respectively. Moreover, by combining our measured result with the world average of $\mathcal{B}(Λ^+_c\to Λπ^+π^0)$, we obtain the branching fraction $\mathcal{B}(Λ_c^+ \to ΛK^+π^0) = (1.49\pm0.27_{\mathrm{stat.}}\pm0.05_{\mathrm{syst.}}\pm0.08_{\mathrm{ref.}}) \times 10^{-3}$. This result significantly departs from theoretical predictions based on quark $SU(3)$ flavor symmetry, which is underpinned by the presumption of meson pair $S$-wave amplitude dominance.
Submitted 25 February, 2024; v1 submitted 21 November, 2023;
originally announced November 2023.
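The abstract above converts a measured ratio into an absolute branching fraction by multiplying by a reference branching fraction. The arithmetic can be checked with a short sketch that uses only the numbers quoted in the abstract; the reference value itself is not given there, so the code backs out the value the quoted numbers imply rather than assuming one. Variable and function names are hypothetical, and the third ("ref.") uncertainty is not reproduced because the reference's uncertainty is not stated.

```python
# Values quoted in the abstract for Lambda_c+ -> Lambda K+ pi0.
RATIO = 2.09e-2       # B(Lambda K+ pi0) / B(Lambda pi+ pi0)
RATIO_STAT = 0.39e-2  # statistical uncertainty on the ratio
RATIO_SYST = 0.07e-2  # systematic uncertainty on the ratio
B_ABSOLUTE = 1.49e-3  # quoted absolute branching fraction


def scale_by_reference(ratio, stat, syst, reference):
    """Absolute branching fraction = ratio * reference; errors scale alike."""
    return ratio * reference, stat * reference, syst * reference


# Reference branching fraction implied by the two quoted central values.
implied_reference = B_ABSOLUTE / RATIO

central, stat_abs, syst_abs = scale_by_reference(
    RATIO, RATIO_STAT, RATIO_SYST, implied_reference
)
```

The scaled statistical and systematic uncertainties land close to the quoted $0.27 \times 10^{-3}$ and $0.05 \times 10^{-3}$; small differences are expected because the inputs here are already rounded.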
-
Improved measurement of the decays $η' \to π^{+}π^{-}π^{+(0)}π^{-(0)}$ and search for the rare decay $η' \to 4π^{0}$
Authors:
BESIII Collaboration,
M. Ablikim,
M. N. Achasov,
P. Adlarson,
X. C. Ai,
R. Aliberti,
A. Amoroso,
M. R. An,
Q. An,
Y. Bai,
O. Bakina,
I. Balossino,
Y. Ban,
H. -R. Bao,
V. Batozskaya,
K. Begzsuren,
N. Berger,
M. Berlowski,
M. Bertani,
D. Bettoni,
F. Bianchi,
E. Bianco,
A. Bortone,
I. Boyko,
R. A. Briere
, et al. (606 additional authors not shown)
Abstract:
Using a sample of 10 billion $J/ψ$ events collected with the BESIII detector, the decays $η' \to π^{+}π^{-}π^{+}π^{-}$, $η' \to π^{+}π^{-}π^{0}π^{0}$ and $η' \to 4 π^{0}$ are studied via the process $J/ψ\toγη'$. The branching fractions of $η' \to π^{+}π^{-}π^{+}π^{-}$ and $η' \to π^{+}π^{-}π^{0}π^{0}$ are measured to be $( 8.56 \pm 0.25({\rm stat.}) \pm 0.23({\rm syst.}) ) \times {10^{ - 5}}$ and $(2.12 \pm 0.12({\rm stat.}) \pm 0.10({\rm syst.})) \times {10^{ - 4}}$, respectively, which are consistent with previous measurements but with improved precision. No significant $η' \to 4 π^{0}$ signal is observed, and the upper limit on the branching fraction of this decay is determined to be $1.24 \times {10^{-5}}$ at the $90\%$ confidence level. In addition, an amplitude analysis of $η' \to π^{+}π^{-}π^{+}π^{-}$ is performed to extract the doubly virtual isovector form factor $α$ for the first time. The measured value, $α=1.22 \pm 0.33({\rm stat.}) \pm 0.04({\rm syst.})$, is in agreement with the prediction of the VMD model.
Submitted 21 November, 2023;
originally announced November 2023.
-
Engage Wider Audience or Facilitate Quality Answers? a Mixed-methods Analysis of Questioning Strategies for Research Sensemaking on a Community Q&A Site
Authors:
Changyang He,
Yue Deng,
Lu He,
Qingyu Guo,
Yu Zhang,
Zhicong Lu,
Bo Li
Abstract:
Discussing research-sensemaking questions on Community Question and Answering (CQA) platforms has been an increasingly common practice for the public to participate in science communication. Nonetheless, how users strategically craft research-sensemaking questions to engage public participation and facilitate knowledge construction is a significant yet less understood problem. To fill this gap, we collected 837 science-related questions and 157,684 answers from Zhihu, and conducted a mixed-methods study to explore user-developed strategies in proposing research-sensemaking questions, and their potential effects on public engagement and knowledge construction. Through open coding, we captured a comprehensive taxonomy of question-crafting strategies, such as eyecatching narratives with counter-intuitive claims and rigorous descriptions with data use. Regression analysis indicated that these strategies correlated with user engagement and answer construction in different ways (e.g., emotional questions attracted more views and answers), yet there existed a general divergence between wide participation and quality knowledge establishment, when most questioning strategies could not ensure both. Based on log analysis, we further found that collaborative editing afforded unique values in refining research-sensemaking questions regarding accuracy, rigor, comprehensiveness and attractiveness. We propose design implications to facilitate accessible, accurate and engaging science communication on CQA platforms.
Submitted 18 November, 2023;
originally announced November 2023.
-
Detecting Cosmic 21 cm Global Signal Using an Improved Polynomial Fitting Algorithm
Authors:
Tianyang Liu,
Junhua Gu,
Quan Guo,
Huanyuan Shan,
Qian Zheng,
Jingying Wang
Abstract:
Detecting the cosmic 21 cm signal from the Epoch of Reionization (EoR) has always been a difficult task. Although the Galactic foreground can be regarded as a smooth power-law spectrum, the chromaticity of the antenna introduces additional structure into the global spectrum, making the polynomial fitting algorithm perform poorly. In this paper, we introduce an improved polynomial fitting algorithm, the Vari-Zeroth-Order Polynomial (VZOP) fitting, and use it to fit simulated data. This algorithm is developed for the upcoming Low-frequency Anechoic Chamber Experiment (LACE), yet it is a general method suitable for application in any single antenna-based global 21 cm signal experiment. VZOP defines a 24-hour averaged beam model that brings information about the antenna beam into the polynomial model. Assuming that the beam can be measured, VZOP can successfully recover the 21 cm absorption feature, even if the beam is extremely frequency-dependent. In real observations, due to various systematics, the corrected measured beam contains residual errors that are not completely random. Assuming the errors are frequency-dependent, VZOP is capable of recovering the 21 cm absorption feature even when the error reaches 10%. Even in the most extreme scenario where the errors are completely random, VZOP can at least give a fitting result that is not worse than the common polynomial fitting. In conclusion, the fitting performance of VZOP depends on the structure of the error and the accuracy of the beam measurement.
Submitted 17 November, 2023;
originally announced November 2023.
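As an illustration of the "common polynomial fitting" that the abstract above uses as its point of comparison, here is a minimal sketch. It is not the VZOP algorithm and includes no beam model. The foreground amplitude, spectral index, trough shape, polynomial order, and the function name `fit_log_polynomial` are all assumptions chosen only for the example.

```python
import numpy as np

def fit_log_polynomial(freq_mhz, temp_k, order=4, ref_mhz=75.0):
    """Fit log10(T) as a polynomial in log10(nu / ref) and return the model in kelvin."""
    x = np.log10(freq_mhz / ref_mhz)
    coeffs = np.polyfit(x, np.log10(temp_k), order)
    return 10.0 ** np.polyval(coeffs, x)

# Toy sky (illustrative values only): a power-law foreground plus a Gaussian
# absorption trough standing in for the global 21 cm feature.
freq = np.linspace(50.0, 100.0, 256)                        # MHz
foreground = 1750.0 * (freq / 75.0) ** -2.5                 # K
signal = -0.5 * np.exp(-0.5 * ((freq - 78.0) / 5.0) ** 2)   # K
sky = foreground + signal

# A pure power law is a straight line in log-log space, so the fit absorbs it.
residual_fg_only = foreground - fit_log_polynomial(freq, foreground)
# With the trough added, the smooth polynomial cannot absorb all of it.
residual_sky = sky - fit_log_polynomial(freq, sky)

print(np.max(np.abs(residual_fg_only)), np.max(np.abs(residual_sky)))
```

In this idealized case the residual that survives the fit comes from the injected trough. A chromatic beam would add further structure that a low-order polynomial also fails to absorb, which is the situation the abstract describes.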
-
Study of the decay $J/ψ\to φπ^{0}η$
Authors:
BESIII Collaboration,
M. Ablikim,
M. N. Achasov,
P. Adlarson,
X. C. Ai,
R. Aliberti,
A. Amoroso,
M. R. An,
Q. An,
Y. Bai,
O. Bakina,
I. Balossino,
Y. Ban,
H. -R. Bao,
V. Batozskaya,
K. Begzsuren,
N. Berger,
M. Berlowski,
M. Bertani,
D. Bettoni,
F. Bianchi,
E. Bianco,
A. Bortone,
I. Boyko,
R. A. Briere
, et al. (604 additional authors not shown)
Abstract:
Based on $(10.09 \pm 0.04) \times 10^9$ $J/ψ$ events collected with the BESIII detector operating at the BEPCII collider, a partial wave analysis of the decay $J/ψ\to φπ^{0}η$ is performed. We observe for the first time two new structures on the $φη$ invariant mass distribution, with statistical significances of $24.0σ$ and $16.9σ$; the first with $J^{\rm PC}$ = $1^{+-}$, mass M = (1911 $\pm$ 6 (stat.) $\pm$ 14 (sys.))~MeV/$c^{2}$, and width $Γ= $ (149 $\pm$ 12 (stat.) $\pm$ 23 (sys.))~MeV, the second with $J^{\rm PC}$ = $1^{--}$, mass M = (1996 $\pm$ 11 (stat.) $\pm$ 30 (sys.))~MeV/$c^{2}$, and width $Γ$ = (148 $\pm$ 16 (stat.) $\pm$ 66 (sys.))~MeV. These measurements provide important input for the strangeonium spectrum. In addition, the $f_0(980)-a_0(980)^0$ mixing signal in $J/ψ\to φf_0(980) \to φa_0(980)^0$ and the corresponding electromagnetic decay $J/ψ\to φa_0(980)^0$ are measured with improved precision, providing crucial information to understand the nature of $a_0(980)^0$ and $f_0(980)$.
Submitted 14 November, 2023; v1 submitted 12 November, 2023;
originally announced November 2023.
-
Flames: Benchmarking Value Alignment of LLMs in Chinese
Authors:
Kexin Huang,
Xiangyang Liu,
Qianyu Guo,
Tianxiang Sun,
Jiawei Sun,
Yaru Wang,
Zeyang Zhou,
Yixu Wang,
Yan Teng,
Xipeng Qiu,
Yingchun Wang,
Dahua Lin
Abstract:
The widespread adoption of large language models (LLMs) across various regions underscores the urgent need to evaluate their alignment with human values. Current benchmarks, however, fall short of effectively uncovering safety vulnerabilities in LLMs. Despite numerous models achieving high scores and 'topping the chart' in these evaluations, there is still a significant gap in LLMs' deeper alignment with human values and in achieving genuine harmlessness. To this end, this paper proposes a value alignment benchmark named Flames, which encompasses both common harmlessness principles and a unique morality dimension that integrates specific Chinese values such as harmony. Accordingly, we carefully design adversarial prompts that incorporate complex scenarios and jailbreaking methods, mostly with implicit malice. By prompting 17 mainstream LLMs, we obtain model responses and rigorously annotate them for detailed evaluation. Our findings indicate that all the evaluated LLMs demonstrate relatively poor performance on Flames, particularly in the safety and fairness dimensions. We also develop a lightweight specified scorer capable of scoring LLMs across multiple dimensions to efficiently evaluate new models on the benchmark. The complexity of Flames far exceeds that of existing benchmarks, setting a new challenge for contemporary LLMs and highlighting the need for further alignment of LLMs. Our benchmark is publicly available at https://github.com/AIFlames/Flames.
Submitted 22 May, 2024; v1 submitted 12 November, 2023;
originally announced November 2023.
-
Evidence of the Singly Cabibbo Suppressed decay $Λ_c^+\to pπ^0$
Authors:
BESIII Collaboration,
M. Ablikim,
M. N. Achasov,
P. Adlarson,
X. C. Ai,
R. Aliberti,
A. Amoroso,
M. R. An,
Q. An,
Y. Bai,
O. Bakina,
I. Balossino,
Y. Ban,
V. Batozskaya,
K. Begzsuren,
N. Berger,
M. Berlowski,
M. Bertani,
D. Bettoni,
F. Bianchi,
E. Bianco,
A. Bortone,
I. Boyko,
R. A. Briere,
A. Brueggemann
, et al. (600 additional authors not shown)
Abstract:
Evidence for the singly Cabibbo suppressed decay $Λ_c^+\to pπ^0$ is reported for the first time with a statistical significance of $3.7σ$ based on 6.0 $\mathrm{fb}^{-1}$ of $e^+e^-$ collision data collected at center-of-mass energies between 4.600 and 4.843 GeV with the BESIII detector at the BEPCII collider. The absolute branching fraction of $Λ_c^+\to pπ^0$ is measured to be $(1.56^{+0.72}_{-0.58}\pm0.20)\times 10^{-4}$. Combining with the branching fraction of $Λ_c^+\to nπ^+$, $(6.6\pm1.3)\times10^{-4}$, the ratio of the branching fractions of $Λ_c^+\to nπ^+$ and $Λ_c^+\to pπ^0$ is calculated to be $3.2^{+2.2}_{-1.2}$. As an important input for the theoretical models describing the decay mechanisms of charmed baryons, our result indicates that the non-factorizable contributions play an essential role and their interference with the factorizable contributions should not be significant. In addition, the absolute branching fraction of $Λ_c^+\to pη$ is measured to be $(1.63\pm0.31_{\rm stat}\pm0.11_{\rm syst}) \times10^{-3}$.
Submitted 3 June, 2024; v1 submitted 12 November, 2023;
originally announced November 2023.
-
Observation and branching fraction measurement of the decay $J\!/\!ψ\rightarrow \bar{p} Σ^{+} K_{S}^{0} + c.c.$
Authors:
M. Ablikim,
M. N. Achasov,
P. Adlarson,
X. C. Ai,
R. Aliberti,
A. Amoroso,
M. R. An,
Q. An,
Y. Bai,
O. Bakina,
I. Balossino,
Y. Ban,
V. Batozskaya,
K. Begzsuren,
N. Berger,
M. Berlowski,
M. Bertani,
D. Bettoni,
F. Bianchi,
E. Bianco,
J. Bloms,
A. Bortone,
I. Boyko,
R. A. Briere,
A. Brueggemann
, et al. (602 additional authors not shown)
Abstract:
The first observation of the decays $J\!/\!ψ\rightarrow \bar{p} Σ^{+} K_{S}^{0}$ and $J\!/\!ψ\rightarrow p \barΣ^{-} K_{S}^{0}$ is reported using $(10087\pm44)\times10^{6}$ $J\!/\!ψ$ events recorded by the BESIII detector at the BEPCII storage ring. The branching fractions of each channel are determined to be $\mathcal{B}(J\!/\!ψ\rightarrow \bar{p} Σ^{+} K_{S}^{0})=(1.361 \pm 0.006 \pm 0.025) \times 10^{-4}$ and $\mathcal{B}(J\!/\!ψ\rightarrow p \barΣ^{-} K_{S}^{0})=(1.352 \pm 0.006 \pm 0.025) \times 10^{-4}$. The combined result is $\mathcal{B}(J\!/\!ψ\rightarrow \bar{p} Σ^{+} K_{S}^{0} +c.c.)=(2.725 \pm 0.009 \pm 0.050) \times 10^{-4}$, where the first uncertainty is statistical and the second systematic. The results presented are in good agreement with the branching fractions of the isospin partner decay $J\!/\!ψ\rightarrow p K^- \barΣ^0 + c.c.$.
Submitted 14 November, 2023; v1 submitted 10 November, 2023;
originally announced November 2023.
-
Emergent Communication for Rules Reasoning
Authors:
Yuxuan Guo,
Yifan Hao,
Rui Zhang,
Enshuai Zhou,
Zidong Du,
Xishan Zhang,
Xinkai Song,
Yuanbo Wen,
Yongwei Zhao,
Xuehai Zhou,
Jiaming Guo,
Qi Yi,
Shaohui Peng,
Di Huang,
Ruizhi Chen,
Qi Guo,
Yunji Chen
Abstract:
Research on emergent communication between deep-learning-based agents has received extensive attention due to its inspiration for linguistics and artificial intelligence. However, previous attempts have hovered around emerging communication under perception-oriented environmental settings that force agents to describe low-level perceptual features within image or symbol contexts. In this work, inspired by the classic human reasoning test (namely Raven's Progressive Matrix), we propose the Reasoning Game, a cognition-oriented environment that encourages agents to reason and communicate high-level rules, rather than perceived low-level contexts. Moreover, we propose 1) an unbiased dataset (namely rule-RAVEN) as a benchmark to avoid overfitting, and 2) a two-stage curriculum agent training method as a baseline for more stable convergence in the Reasoning Game, where contexts and semantics are bilaterally drifting. Experimental results show that, in the Reasoning Game, a semantically stable and compositional language emerges to solve reasoning problems. The emerged language helps agents apply the extracted rules to the generalization of unseen context attributes, and to the transfer between different context attributes or even tasks.
Submitted 8 November, 2023;
originally announced November 2023.
-
RDGCN: Reinforced Dependency Graph Convolutional Network for Aspect-based Sentiment Analysis
Authors:
Xusheng Zhao,
Hao Peng,
Qiong Dai,
Xu Bai,
Huailiang Peng,
Yanbing Liu,
Qinglang Guo,
Philip S. Yu
Abstract:
Aspect-based sentiment analysis (ABSA) is dedicated to forecasting the sentiment polarity of aspect terms within sentences. Employing graph neural networks to capture structural patterns from syntactic dependency parsing has been confirmed as an effective approach for boosting ABSA. In most works, the topology of dependency trees or dependency-based attention coefficients is often loosely regarded as edges between aspects and opinions, which can result in insufficient and ambiguous syntactic utilization. To address these problems, we propose a new reinforced dependency graph convolutional network (RDGCN) that improves the importance calculation of dependencies in both distance and type views. Initially, we propose an importance calculation criterion for the minimum distances over dependency trees. Under the criterion, we design a distance-importance function that leverages reinforcement learning for weight distribution search and dissimilarity control. Since dependency types often do not have explicit syntax like tree distances, we use global attention and mask mechanisms to design type-importance functions. Finally, we merge these weights and implement feature aggregation and classification. Comprehensive experiments on three popular datasets demonstrate the effectiveness of the criterion and importance functions. RDGCN outperforms state-of-the-art GNN-based baselines in all validations.
Submitted 8 November, 2023;
originally announced November 2023.
-
Context Shift Reduction for Offline Meta-Reinforcement Learning
Authors:
Yunkai Gao,
Rui Zhang,
Jiaming Guo,
Fan Wu,
Qi Yi,
Shaohui Peng,
Siming Lan,
Ruizhi Chen,
Zidong Du,
Xing Hu,
Qi Guo,
Ling Li,
Yunji Chen
Abstract:
Offline meta-reinforcement learning (OMRL) utilizes pre-collected offline datasets to enhance the agent's generalization ability on unseen tasks. However, the context shift problem arises due to the distribution discrepancy between the contexts used for training (from the behavior policy) and testing (from the exploration policy). The context shift problem leads to incorrect task inference and further deteriorates the generalization ability of the meta-policy. Existing OMRL methods either overlook this problem or attempt to mitigate it with additional information. In this paper, we propose a novel approach called Context Shift Reduction for OMRL (CSRO) to address the context shift problem with only offline datasets. The key insight of CSRO is to minimize the influence of policy in context during both the meta-training and meta-test phases. During meta-training, we design a max-min mutual information representation learning mechanism to diminish the impact of the behavior policy on task representation. In the meta-test phase, we introduce the non-prior context collection strategy to reduce the effect of the exploration policy. Experimental results demonstrate that CSRO significantly reduces the context shift and improves the generalization ability, surpassing previous methods across various challenging domains.
Submitted 6 November, 2023;
originally announced November 2023.
-
Measurement of the absolute branching fraction of the three-body decay $Λ_{c}^+ \to Ξ^{0}K^{+}π^{0}$ and search for $Λ_{c}^+ \to nK^+π^0$, $Σ^{0}K^{+}π^{0}$ and $ΛK^{+}π^{0}$
Authors:
BESIII Collaboration,
M. Ablikim,
M. N. Achasov,
P. Adlarson,
O. Afedulidis,
X. C. Ai,
R. Aliberti,
A. Amoroso,
M. R. An,
Q. An,
Y. Bai,
O. Bakina,
I. Balossino,
Y. Ban,
V. Batozskaya,
K. Begzsuren,
N. Berger,
M. Berlowski,
M. Bertani,
D. Bettoni,
F. Bianchi,
E. Bianco,
A. Bortone,
I. Boyko,
R. A. Briere
, et al. (600 additional authors not shown)
Abstract:
The Cabibbo-favored decay $Λ_{c}^+ \to Ξ^{0}K^{+}π^{0}$ is studied for the first time using 6.1 fb$^{-1}$ of $e^+e^-$ collision data at center-of-mass energies between 4.600 and 4.840 GeV, collected with the BESIII detector at the BEPCII collider. With a double-tag method, the branching fraction of the three-body decay $Λ_{c}^+ \to Ξ^{0}K^{+}π^{0}$ is measured to be $(7.79 \pm 1.46 \pm 0.71) \times 10^{-3}$, where the first and second uncertainties are statistical and systematic, respectively. The branching fraction of the two-body decay $Λ_{c}^+ \to Ξ(1530)^{0}K^+$ is $(5.99\pm1.04\pm0.29)\times10^{-3}$, which is consistent with the previous result of $(5.02\pm0.99\pm0.31)\times 10^{-3}$. In addition, the upper limit on the branching fraction of the doubly Cabibbo-suppressed decay $Λ_{c}^+ \to nK^+π^0$ is $7.1 \times 10^{-4}$ at the 90$\%$ confidence level. The upper limits on the branching fractions of $Λ_{c}^+ \to Σ^{0}K^{+}π^{0}$ and $ΛK^{+}π^{0}$ are also determined to be $1.8\times 10^{-3}$ and $2.0 \times 10^{-3}$, respectively.
Submitted 8 May, 2024; v1 submitted 4 November, 2023;
originally announced November 2023.
-
Efficient Symbolic Policy Learning with Differentiable Symbolic Expression
Authors:
Jiaming Guo,
Rui Zhang,
Shaohui Peng,
Qi Yi,
Xing Hu,
Ruizhi Chen,
Zidong Du,
Xishan Zhang,
Ling Li,
Qi Guo,
Yunji Chen
Abstract:
Deep reinforcement learning (DRL) has led to a wide range of advances in sequential decision-making tasks. However, the complexity of neural network policies makes it difficult to understand and deploy with limited computational resources. Currently, employing compact symbolic expressions as symbolic policies is a promising strategy to obtain simple and interpretable policies. Previous symbolic policy methods usually involve complex training processes and pre-trained neural network policies, which are inefficient and limit the application of symbolic policies. In this paper, we propose an efficient gradient-based learning method named Efficient Symbolic Policy Learning (ESPL) that learns the symbolic policy from scratch in an end-to-end way. We introduce a symbolic network as the search space and employ a path selector to find the compact symbolic policy. By doing so we represent the policy with a differentiable symbolic expression and train it in an off-policy manner which further improves the efficiency. In addition, in contrast with previous symbolic policies which only work in single-task RL because of complexity, we expand ESPL on meta-RL to generate symbolic policies for unseen tasks. Experimentally, we show that our approach generates symbolic policies with higher performance and greatly improves data efficiency for single-task RL. In meta-RL, we demonstrate that compared with neural network policies the proposed symbolic policy achieves higher performance and efficiency and shows the potential to be interpretable.
Submitted 1 November, 2023;
originally announced November 2023.
-
Hint-enhanced In-Context Learning wakes Large Language Models up for knowledge-intensive tasks
Authors:
Yifan Wang,
Qingyan Guo,
Xinzhe Ni,
Chufan Shi,
Lemao Liu,
Haiyun Jiang,
Yujiu Yang
Abstract:
In-context learning (ICL) ability has emerged with the increasing scale of large language models (LLMs), enabling them to learn input-label mappings from demonstrations and perform well on downstream tasks. However, under the standard ICL setting, LLMs may sometimes neglect query-related information in demonstrations, leading to incorrect predictions. To address this limitation, we propose a new paradigm called Hint-enhanced In-Context Learning (HICL) to explore the power of ICL in open-domain question answering, an important form in knowledge-intensive tasks. HICL leverages LLMs' reasoning ability to extract query-related knowledge from demonstrations, then concatenates the knowledge to prompt LLMs in a more explicit way. Furthermore, we track the source of this knowledge to identify specific examples, and introduce a Hint-related Example Retriever (HER) to select informative examples for enhanced demonstrations. We evaluate HICL with HER on 3 open-domain QA benchmarks, and observe average performance gains of 2.89 EM score and 2.52 F1 score on gpt-3.5-turbo, and 7.62 EM score and 7.27 F1 score on LLaMA-2-Chat-7B, compared with the standard setting.
Submitted 18 April, 2024; v1 submitted 3 November, 2023;
originally announced November 2023.
-
Collaborative Large Language Model for Recommender Systems
Authors:
Yaochen Zhu,
Liang Wu,
Qi Guo,
Liangjie Hong,
Jundong Li
Abstract:
Recently, there has been growing interest in developing the next-generation recommender systems (RSs) based on pretrained large language models (LLMs). However, the semantic gap between natural language and recommendation tasks is still not well addressed, leading to multiple issues such as spuriously correlated user/item descriptors, ineffective language modeling on user/item data, inefficient recommendations via auto-regression, etc. In this paper, we propose CLLM4Rec, the first generative RS that tightly integrates the LLM paradigm and ID paradigm of RSs, aiming to address the above challenges simultaneously. We first extend the vocabulary of pretrained LLMs with user/item ID tokens to faithfully model user/item collaborative and content semantics. Accordingly, a novel soft+hard prompting strategy is proposed to effectively learn user/item collaborative/content token embeddings via language modeling on RS-specific corpora, where each document is split into a prompt consisting of heterogeneous soft (user/item) tokens and hard (vocab) tokens and a main text consisting of homogeneous item tokens or vocab tokens to facilitate stable and effective language modeling. In addition, a novel mutual regularization strategy is introduced to encourage CLLM4Rec to capture recommendation-related information from noisy user/item content. Finally, we propose a novel recommendation-oriented finetuning strategy for CLLM4Rec, where an item prediction head with multinomial likelihood is added to the pretrained CLLM4Rec backbone to predict hold-out items based on soft+hard prompts established from masked user-item interaction history, where recommendations of multiple items can be generated efficiently without hallucination. Codes are released at https://github.com/yaochenzhu/llm4rec.
Submitted 21 February, 2024; v1 submitted 2 November, 2023;
originally announced November 2023.
-
Search for a muonphilic scalar $X_{0}$ or vector $X_{1}$ via $J/ψ\toμ^+μ^-+\rm{invisible}$ decays at BESIII
Authors:
BESIII Collaboration,
M. Ablikim,
M. N. Achasov,
P. Adlarson,
X. C. Ai,
R. Aliberti,
A. Amoroso,
M. R. An,
Q. An,
Y. Bai,
O. Bakina,
I. Balossino,
Y. Ban,
H. -R. Bao,
V. Batozskaya,
K. Begzsuren,
N. Berger,
M. Berlowski,
M. Bertani,
D. Bettoni,
F. Bianchi,
E. Bianco,
A. Bortone,
I. Boyko,
R. A. Briere
, et al. (608 additional authors not shown)
Abstract:
A light scalar $X_{0}$ or vector $X_{1}$ particle has been introduced as a possible explanation for the $(g-2)_μ$ anomaly and dark matter phenomena.
Using $(8.998\pm 0.039)\times10^9$ $J/ψ$ events collected by the BESIII detector, we search for a light muonphilic scalar $X_{0}$ or vector $X_{1}$ in the processes $J/ψ\toμ^+μ^- X_{0,1}$ with $X_{0,1}$ invisible decays. No obvious signal is found, and the upper limits on the coupling $g_{0,1}'$ between the muon and the $X_{0,1}$ particles are set to be between $1.1\times10^{-3}$ and $1.0\times10^{-2}$ for the $X_{0,1}$ mass in the range of $1<M(X_{0,1})<1000$ MeV$/c^2$ at the 90$\%$ confidence level.
Submitted 18 February, 2024; v1 submitted 2 November, 2023;
originally announced November 2023.
-
Annotator: A Generic Active Learning Baseline for LiDAR Semantic Segmentation
Authors:
Binhui Xie,
Shuang Li,
Qingju Guo,
Chi Harold Liu,
Xinjing Cheng
Abstract:
Active learning, a label-efficient paradigm, empowers models to interactively query an oracle for labeling new data. In the realm of LiDAR semantic segmentation, the challenges stem from the sheer volume of point clouds, rendering annotation labor-intensive and cost-prohibitive. This paper presents Annotator, a general and efficient active learning baseline, in which a voxel-centric online selection strategy is tailored to efficiently probe and annotate the salient and exemplar voxel grids within each LiDAR scan, even under distribution shift. Concretely, we first execute an in-depth analysis of several common selection strategies such as Random, Entropy, and Margin, and then develop voxel confusion degree (VCD) to exploit the local topology relations and structures of point clouds. Annotator excels in diverse settings, with a particular focus on active learning (AL), active source-free domain adaptation (ASFDA), and active domain adaptation (ADA). It consistently delivers exceptional performance across LiDAR semantic segmentation benchmarks, spanning both simulation-to-real and real-to-real scenarios. Surprisingly, Annotator exhibits remarkable efficiency, requiring significantly fewer annotations, e.g., just labeling five voxels per scan in the SynLiDAR-to-SemanticKITTI task. This results in impressive performance, achieving 87.8% fully-supervised performance under AL, 88.5% under ASFDA, and 94.4% under ADA. We envision that Annotator will offer a simple, general, and efficient solution for label-efficient 3D applications. Project page: https://binhuixie.github.io/annotator-web
Submitted 31 October, 2023;
originally announced October 2023.
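The Entropy and Margin strategies named in the abstract above are standard uncertainty scores in active learning. A minimal sketch of both follows, assuming per-sample softmax probabilities are already available. It does not implement the paper's voxel confusion degree (VCD), and the function names and toy probabilities are hypothetical.

```python
import numpy as np

def entropy_score(probs):
    """Shannon entropy of each row; higher means the model is more uncertain."""
    p = np.clip(probs, 1e-12, 1.0)  # avoid log(0)
    return -np.sum(p * np.log(p), axis=-1)

def margin_score(probs):
    """Gap between the two largest class probabilities; smaller means more uncertain."""
    top2 = np.sort(probs, axis=-1)[..., -2:]
    return top2[..., 1] - top2[..., 0]

# Toy predictions for three candidates (rows) over three classes (columns).
probs = np.array([
    [0.98, 0.01, 0.01],   # confident
    [0.40, 0.35, 0.25],   # spread over all classes
    [0.50, 0.49, 0.01],   # torn between two classes
])

by_entropy = np.argsort(-entropy_score(probs))  # most uncertain first
by_margin = np.argsort(margin_score(probs))     # smallest margin first
print(by_entropy, by_margin)
```

On this toy input the two scores pick different first candidates: entropy favors the row spread over all classes, while margin favors the row split between its top two classes.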
-
Solitary solutions of a nonlinear Dirac equation with different frequencies
Authors:
Qi Guo,
Yuanyuan Ke
Abstract:
We study the existence and nonexistence of solitary solutions with different frequencies for a type of nonlinear extension of the Dirac-Slater model. There are three main ingredients in this paper. The first is the Pohozaev identity for nonlinear Dirac equations. Combining it with a variational identity, we obtain nonexistence results when the frequency $ω$ is greater than $m$. The second is a critical point theorem for strongly indefinite functionals. With this, we obtain an existence result for $ω\in (-m,m)$. The third, which is the new main ingredient of this paper, is a perturbation of the functional from the second ingredient. With it, we can show the existence of solitary solutions when $ω=-m$. An interesting outcome of our result is that the left and right are completely different in the {\it Spectrum Zero Problem}, which implies a new phenomenon in quantum theory.
Submitted 13 November, 2023; v1 submitted 29 October, 2023;
originally announced October 2023.
-
Observation of the Anomalous Shape of $X(1840)$ in $J/ψ\rightarrow γ3(π^+ π^-)$ Indicating a Second Resonance Near $p\bar{p}$ Threshold
Authors:
BESIII Collaboration,
M. Ablikim,
M. N. Achasov,
P. Adlarson,
X. C. Ai,
R. Aliberti,
A. Amoroso,
M. R. An,
Q. An,
Y. Bai,
O. Bakina,
I. Balossino,
Y. Ban,
V. Batozskaya,
K. Begzsuren,
N. Berger,
M. Berlowski,
M. Bertani,
D. Bettoni,
F. Bianchi,
E. Bianco,
A. Bortone,
I. Boyko,
R. A. Briere,
A. Brueggemann
, et al. (604 additional authors not shown)
Abstract:
Using a sample of $(10087\pm44)\times 10^6$ $J/ψ$ events, which is about 45 times larger than the one previously analyzed, a further investigation of the $J/ψ\rightarrow γ3(π^+π^-)$ decay is performed. A significant distortion at 1.84 GeV/$c^2$ in the line-shape of the $3(π^+π^-)$ invariant mass spectrum is observed for the first time, which could be resolved by two overlapping resonant structures, $X(1840)$ and $X(1880)$. The new state $X(1880)$ is observed with a statistical significance larger than $10σ$. The mass and width of $X(1880)$ are determined to be $1882.1\pm1.7\pm0.7$ MeV/$c^2$ and $30.7\pm5.5 \pm2.4$ MeV, respectively, which indicates the existence of a $p\bar{p}$ bound state.
Submitted 15 April, 2024; v1 submitted 27 October, 2023;
originally announced October 2023.
-
Does or did the supernova remnant Cassiopeia A operate as a PeVatron?
Authors:
Zhen Cao,
F. Aharonian,
Q. An,
Axikegu,
Y. X. Bai,
Y. W. Bao,
D. Bastieri,
X. J. Bi,
Y. J. Bi,
J. T. Cai,
Q. Cao,
W. Y. Cao,
Zhe Cao,
J. Chang,
J. F. Chang,
A. M. Chen,
E. S. Chen,
Liang Chen,
Lin Chen,
Long Chen,
M. J. Chen,
M. L. Chen,
Q. H. Chen,
S. H. Chen,
S. Z. Chen
, et al. (255 additional authors not shown)
Abstract:
For decades, supernova remnants (SNRs) have been considered the prime sources of Galactic cosmic rays (CRs). But whether SNRs can accelerate CR protons to PeV energies and thus dominate CR flux up to the knee is currently under intensive theoretical and phenomenological debate. The direct test of the ability of SNRs to operate as CR PeVatrons can be provided by ultrahigh-energy (UHE; $E_γ\geq 100$~TeV) $γ$-rays. In this context, the historical SNR Cassiopeia A (Cas A) is considered one of the most promising targets for UHE observations. This paper presents the observation of Cas A and its vicinity by the LHAASO KM2A detector. The exceptional sensitivity of LHAASO KM2A in the UHE band, combined with the young age of Cas A, enabled us to derive stringent model-independent limits on the energy budget of UHE protons and nuclei accelerated by Cas A at any epoch after the explosion. The results challenge the prevailing paradigm that Cas A-type SNRs are major suppliers of PeV CRs in the Milky Way.
Submitted 25 October, 2023;
originally announced October 2023.
-
Study of the doubly Cabibbo-suppressed decays $D^+_s\to K^+K^+π^-$ and $D^+_s\to K^+K^+π^-π^0$
Authors:
BESIII Collaboration,
M. Ablikim,
M. N. Achasov,
P. Adlarson,
O. Afedulidis,
X. C. Ai,
R. Aliberti,
A. Amoroso,
M. R. An,
Q. An,
Y. Bai,
O. Bakina,
I. Balossino,
Y. Ban,
H. -R. Bao,
V. Batozskaya,
K. Begzsuren,
N. Berger,
M. Berlowski,
M. Bertani,
D. Bettoni,
F. Bianchi,
E. Bianco,
A. Bortone,
I. Boyko
, et al. (604 additional authors not shown)
Abstract:
Based on 7.33 fb$^{-1}$ of $e^+e^-$ collision data collected at center-of-mass energies between 4.128 and 4.226 GeV with the BESIII detector, the experimental studies of the doubly Cabibbo-suppressed decays $D^+_s\to K^+K^+π^-$ and $D^+_s\to K^+K^+π^-π^0$ are reported. We determine the absolute branching fraction of $D^+_s\to K^+K^+π^-$ to be (${1.23^{+0.28}_{-0.25}}({\rm stat})\pm0.06({\rm syst})$) $\times 10^{-4}$. No significant signal of $D^+_s\to K^+K^+π^-π^0$ is observed and the upper limit on its decay branching fraction at 90\% confidence level is set to be $1.7\times10^{-4}$.
Submitted 24 October, 2023;
originally announced October 2023.
-
Plan, Verify and Switch: Integrated Reasoning with Diverse X-of-Thoughts
Authors:
Tengxiao Liu,
Qipeng Guo,
Yuqing Yang,
Xiangkun Hu,
Yue Zhang,
Xipeng Qiu,
Zheng Zhang
Abstract:
As large language models (LLMs) have shown effectiveness with different prompting methods, such as Chain of Thought and Program of Thought, we find that these methods complement each other well on math reasoning tasks. In this work, we propose XoT, an integrated problem solving framework that prompts LLMs with diverse reasoning thoughts. For each question, XoT always begins by selecting the most suitable method and then executes each method iteratively. Within each iteration, XoT actively checks the validity of the generated answer and incorporates the feedback from external executors, allowing it to dynamically switch among different prompting methods. Through extensive experiments on 10 popular math reasoning datasets, we demonstrate the effectiveness of our proposed approach and thoroughly analyze the strengths of each module. Moreover, empirical results suggest that our framework is orthogonal to recent work that makes improvements on single reasoning methods and can further generalise to the logical reasoning domain. By allowing method switching, XoT provides a fresh perspective on the collaborative integration of diverse reasoning thoughts in a unified framework. The code is available at https://github.com/tengxiaoliu/XoT.
Submitted 27 December, 2023; v1 submitted 23 October, 2023;
originally announced October 2023.
-
Observation of the $ψ(3686)$ decays into $Σ^{+}\barΣ^{-}ω$ and $Σ^{+}\barΣ^{-}{\mathcalφ}$
Authors:
BESIII Collaboration,
M. Ablikim,
M. N. Achasov,
P. Adlarson,
X. C. Ai,
R. Aliberti,
A. Amoroso,
M. R. An,
Q. An,
Y. Bai,
O. Bakina,
I. Balossino,
Y. Ban,
V. Batozskaya,
K. Begzsuren,
N. Berger,
M. Berlowski,
M. Bertani,
D. Bettoni,
F. Bianchi,
E. Bianco,
A. Bortone,
I. Boyko,
R. A. Briere,
A. Brueggemann
, et al. (604 additional authors not shown)
Abstract:
Based on $(27.08\pm 0.14)\times10^{8}$ $ψ(3686)$ events collected with the BESIII detector operating at the BEPCII collider, the $ψ(3686)\toΣ^{+}\barΣ^{-}ω$ and $Σ^{+}\barΣ^{-}φ$ decays are observed for the first time with statistical significances of 13.8$σ$ and 7.6$σ$, respectively. The corresponding branching fractions are measured to be $\mathcal{B}(ψ(3686)\toΣ^{+}\barΣ^{-}ω)=(1.90 \pm 0.18 \pm 0.21) \times 10^{-5}$ and $\mathcal{B}(ψ(3686)\toΣ^{+}\barΣ^{-}φ)=(2.96 \pm 0.54 \pm 0.41) \times 10^{-6}$, where the first uncertainties are statistical and the second systematic.
Submitted 23 October, 2023;
originally announced October 2023.
-
A global product of fine-scale urban building height based on spaceborne lidar
Authors:
Xiao Ma,
Guang Zheng,
Chi Xu,
L. Monika Moskal,
Peng Gong,
Qinghua Guo,
Huabing Huang,
Xuecao Li,
Yong Pang,
Cheng Wang,
Huan Xie,
Bailang Yu,
Bo Zhao,
Yuyu Zhou
Abstract:
Characterizing urban environments with broad coverage and high precision is more important than ever for achieving the UN's Sustainable Development Goals (SDGs), as half of the world's population lives in cities. Urban building height, as a fundamental 3D urban structural feature, has far-reaching applications. However, producing readily available datasets of recent urban building heights with fine spatial resolution and global coverage remains a challenging task. Here, we provide an up-to-date global product of urban building heights on a fine grid size of 150 m around 2020 by combining the spaceborne lidar instrument GEDI with multi-sourced data including remotely sensed images (i.e., Landsat-8, Sentinel-2, and Sentinel-1) and topographic data. Our results revealed that the method for estimating building height samples from the GEDI data was effective, with a Pearson's r of 0.78 and an RMSE of 3.67 m in comparison to the reference data. The mapping product also demonstrated good performance, as indicated by its strong correlation with the reference data (i.e., Pearson's r = 0.71, RMSE = 4.60 m). Compared with currently existing products, our global urban building height map provides a higher spatial resolution (i.e., 150 m) with a great level of inherent detail about spatial heterogeneity, and it can be updated flexibly using the GEDI samples as inputs. This work will boost future urban studies across many fields including climate, environmental, ecological, and social sciences.
Submitted 22 October, 2023;
originally announced October 2023.
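The building-height abstract above reports agreement with reference data as Pearson's r and RMSE. A minimal sketch of how these two metrics are computed is given below; the function names and the sample heights are illustrative assumptions, not values or code from the paper.

```python
import math

def pearson_r(pred, ref):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(pred)
    mean_p = sum(pred) / n
    mean_r = sum(ref) / n
    cov = sum((p - mean_p) * (r - mean_r) for p, r in zip(pred, ref))
    spread_p = math.sqrt(sum((p - mean_p) ** 2 for p in pred))
    spread_r = math.sqrt(sum((r - mean_r) ** 2 for r in ref))
    return cov / (spread_p * spread_r)

def rmse(pred, ref):
    """Root-mean-square error between predictions and reference values."""
    return math.sqrt(sum((p - r) ** 2 for p, r in zip(pred, ref)) / len(pred))

# Made-up building heights in metres, for illustration only.
estimated = [12.0, 30.5, 8.2, 45.0, 20.1]
reference = [10.0, 33.0, 9.0, 40.0, 22.0]
print(pearson_r(estimated, reference), rmse(estimated, reference))
```

A high r with a large RMSE would indicate that the estimates track the reference heights but carry a systematic offset or scale error, which is why both numbers are usually reported together.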
-
The Intensity of Diffuse Galactic Emission Reflected by Meteor Trails
Authors:
Feiyu Zhao,
Ruxi Liang,
Zepei Yang,
Huanyuan Shan,
Qian Zheng,
Qiqian Zhang,
Quan Guo
Abstract:
We calculate the reflection of diffuse galactic emission by meteor trails and investigate its potential relationship to Meteor Radio Afterglow (MRA). The formula to calculate the reflection of diffuse galactic emission is derived from a simplified case, assuming that the signals are mirrored by the cylindrical over-dense ionization trail of meteors. The overall observed reflection is simulated through a ray tracing algorithm together with the diffuse galactic emission modelled by the GSM sky model. We demonstrate that the spectrum of the reflected signal is broadband and follows a power law with a negative spectral index of around -1.3. The intensity of the reflected signal varies with local sidereal time and the brightness of the meteor and can reach 2000 Jy. These results agree with some previous observations of MRAs. Therefore, we think that the reflection of galactic emission by meteor trails can be a possible mechanism causing MRAs, which is worthy of further research.
Submitted 15 November, 2023; v1 submitted 21 October, 2023;
originally announced October 2023.
-
StoryAnalogy: Deriving Story-level Analogies from Large Language Models to Unlock Analogical Understanding
Authors:
Cheng Jiayang,
Lin Qiu,
Tsz Ho Chan,
Tianqing Fang,
Weiqi Wang,
Chunkit Chan,
Dongyu Ru,
Qipeng Guo,
Hongming Zhang,
Yangqiu Song,
Yue Zhang,
Zheng Zhang
Abstract:
Analogy-making between narratives is crucial for human reasoning. In this paper, we evaluate the ability to identify and generate analogies by constructing a first-of-its-kind large-scale story-level analogy corpus, \textsc{StoryAnalogy}, which contains 24K story pairs from diverse domains with human annotations on two similarities from the extended Structure-Mapping Theory. We design a set of tests on \textsc{StoryAnalogy}, presenting the first evaluation of story-level analogy identification and generation. Interestingly, we find that the analogy identification tasks are incredibly difficult not only for sentence embedding models but also for the recent large language models (LLMs) such as ChatGPT and LLaMa. ChatGPT, for example, only achieved around 30% accuracy in multiple-choice questions (compared to over 85% accuracy for humans). Furthermore, we observe that the data in \textsc{StoryAnalogy} can improve the quality of analogy generation in LLMs, where a fine-tuned FlanT5-xxl model achieves comparable performance to zero-shot ChatGPT.
Submitted 23 October, 2023; v1 submitted 19 October, 2023;
originally announced October 2023.
-
Micro-seismic Elastic Reflection Full Waveform Inversion with An Equivalent Source
Authors:
Hanchen Wang,
Qiang Guo,
Tariq Alkhalifah
Abstract:
In micro-seismic event measurements, pinpointing the passive source's exact spatial and temporal location is paramount. This research advocates for the combined use of both P- and S-wave data, captured by geophone monitoring systems, to improve source inversion accuracy. Drawing inspiration from the secondary source concept in Elastic Reflection Full Waveform Inversion (ERFWI), we introduce an equivalent source term. This term combines source functions and source images. Our optimization strategy iteratively refines the spatial locations of the source, its temporal functions, and associated velocities using a full waveform inversion framework. Under the premise of an isotropic medium with consistent density, the source is defined by two spatial and three temporal components. This offers a nuanced source representation in contrast to the conventional seismic moment tensor. To address gradient computation, we employ the adjoint-state method. However, we encountered pronounced non-linearity in waveform inversion of micro-seismic events, primarily due to the unknown source origin time, resulting in cycle-skipping challenges. To counteract this, we devised an objective function that is decoupled from the source origin time. This function is formulated by convolving reference traces with both observed and predicted data. Through the concurrent inversion of the source image, source time function, and velocity model, our method offers precise estimations of these parameters, as validated by a synthetic 2D example based on a modified Marmousi model. This nested inversion approach promises enhanced accuracy in determining the source image, time function, and velocity model.
Submitted 18 October, 2023;
originally announced October 2023.
-
IRAD: Implicit Representation-driven Image Resampling against Adversarial Attacks
Authors:
Yue Cao,
Tianlin Li,
Xiaofeng Cao,
Ivor Tsang,
Yang Liu,
Qing Guo
Abstract:
We introduce a novel approach to counter adversarial attacks, namely, image resampling. Image resampling transforms a discrete image into a new one, simulating the process of scene recapturing or rerendering as specified by a geometrical transformation. The underlying rationale behind our idea is that image resampling can alleviate the influence of adversarial perturbations while preserving essential semantic information, thereby conferring an inherent advantage in defending against adversarial attacks. To validate this concept, we present a comprehensive study on leveraging image resampling to defend against adversarial attacks. We have developed basic resampling methods that employ interpolation strategies and coordinate shifting magnitudes. Our analysis reveals that these basic methods can partially mitigate adversarial attacks. However, they come with apparent limitations: the accuracy of clean images noticeably decreases, while the improvement in accuracy on adversarial examples is not substantial. We propose implicit representation-driven image resampling (IRAD) to overcome these limitations. First, we construct an implicit continuous representation that enables us to represent any input image within a continuous coordinate space. Second, we introduce SampleNet, which automatically generates pixel-wise shifts for resampling in response to different inputs. Furthermore, we can extend our approach to the state-of-the-art diffusion-based method, accelerating it with fewer time steps while preserving its defense capability. Extensive experiments demonstrate that our method significantly enhances the adversarial robustness of diverse deep models against various attacks while maintaining high accuracy on clean images.
Submitted 13 April, 2024; v1 submitted 18 October, 2023;
originally announced October 2023.
-
Measurement of the cross sections for $e^+e^-\toηπ^+π^-$ at center-of-mass energies between 2.00 and 3.08 GeV
Authors:
BESIII Collaboration,
M. Ablikim,
M. N. Achasov,
P. Adlarson,
X. C. Ai,
R. Aliberti,
A. Amoroso,
M. R. An,
Q. An,
Y. Bai,
O. Bakina,
I. Balossino,
Y. Ban,
H. -R. Bao,
V. Batozskaya,
K. Begzsuren,
N. Berger,
M. Berlowski,
M. Bertani,
D. Bettoni,
F. Bianchi,
E. Bianco,
A. Bortone,
I. Boyko,
R. A. Briere
, et al. (605 additional authors not shown)
Abstract:
Using data samples collected at center-of-mass energies between 2.000 and 3.080 GeV with the BESIII detector operating at the BEPCII collider, a partial-wave analysis is performed on the process $e^+e^-\toηπ^+π^-$. In addition to the dominant $e^+e^-\toρη$ component, the $e^+e^-\to a_2(1320)π$ process is also sizeable, contributing up to 24% of the total reaction. The measured cross sections of the process $e^+e^-\toηπ^+π^-$ are systematically higher than those of BaBar by more than $3σ$ at center-of-mass energies between 2.000 and 2.300 GeV. In the cross section lineshape for $e^+e^-\to a_2(1320)π$, a resonant structure is observed with a significance of $5.5σ$, with $M=(2044\pm31\pm4)$ MeV/$c^2$, $Γ=(163\pm69\pm24)$ MeV and $\mathcal{B_{R}}\cdotΓ_{e^+e^-}^{R}=(34.6\pm17.1\pm6.0)$ eV or $(137.1\pm73.3\pm2.1)$ eV. In the cross section lineshape for $e^+e^-\toρη$, evidence of a dip structure around 2180 MeV/$c^2$ is observed with a statistical significance of $3.0σ$.
Submitted 28 November, 2023; v1 submitted 16 October, 2023;
originally announced October 2023.
-
AdaLomo: Low-memory Optimization with Adaptive Learning Rate
Authors:
Kai Lv,
Hang Yan,
Qipeng Guo,
Haijun Lv,
Xipeng Qiu
Abstract:
Large language models have achieved remarkable success, but their extensive parameter size necessitates substantial memory for training, thereby setting a high threshold. While the recently proposed low-memory optimization (LOMO) reduces memory footprint, its optimization technique, akin to stochastic gradient descent, is sensitive to hyper-parameters and exhibits suboptimal convergence, failing to match the performance of the prevailing optimizer for large language models, AdamW. Through empirical analysis of the Adam optimizer, we found that, compared to momentum, the adaptive learning rate is more critical for bridging the gap. Building on this insight, we introduce the low-memory optimization with adaptive learning rate (AdaLomo), which offers an adaptive learning rate for each parameter. To maintain memory efficiency, we employ non-negative matrix factorization for the second-order moment estimation in the optimizer state. Additionally, we suggest the use of a grouped update normalization to stabilize convergence. Our experiments with instruction-tuning and further pre-training demonstrate that AdaLomo achieves results on par with AdamW, while significantly reducing memory requirements, thereby lowering the hardware barrier to training large language models. The code is accessible at https://github.com/OpenLMLab/LOMO.
Submitted 6 June, 2024; v1 submitted 16 October, 2023;
originally announced October 2023.
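The AdaLomo abstract above mentions factorizing the second-order moment estimate to keep the optimizer state small. The sketch below illustrates the general idea with an Adafactor-style rank-1 factorization (row sums and column sums of the squared gradient); this specific scheme, the function names, and the toy matrix are assumptions made for illustration and are not taken from the AdaLomo paper or its repository.

```python
def factor_second_moment(v):
    """Compress a non-negative m x n matrix into m row sums and n column sums.

    v is a list of rows, e.g. element-wise squared gradients.
    Storage drops from m * n numbers to m + n numbers.
    """
    row_sums = [sum(row) for row in v]
    col_sums = [sum(col) for col in zip(*v)]
    return row_sums, col_sums

def reconstruct_second_moment(row_sums, col_sums):
    """Rank-1 estimate v_hat[i][j] = row_sums[i] * col_sums[j] / total."""
    total = sum(row_sums)
    if total == 0:
        # All-zero input: the estimate is all zeros as well.
        return [[0.0 for _ in col_sums] for _ in row_sums]
    return [[r * c / total for c in col_sums] for r in row_sums]

# Toy squared-gradient matrix (hypothetical values).
grad = [[1.0, -2.0, 0.5], [0.3, 1.5, -1.0]]
v = [[g * g for g in row] for row in grad]
rows, cols = factor_second_moment(v)
v_hat = reconstruct_second_moment(rows, cols)
print(v_hat)
```

The reconstruction is exact when the matrix is rank 1 and approximate otherwise; the trade-off is accuracy of the per-parameter scale against the memory saved by not storing the full matrix.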
-
Efficient and Effective Deep Multi-view Subspace Clustering
Authors:
Yuxiu Lin,
Hui Liu,
Ren Wang,
Qiang Guo,
Caiming Zhang
Abstract:
Recent multi-view subspace clustering achieves impressive results utilizing deep networks, where the self-expressive correlation is typically modeled by a fully connected (FC) layer. However, they still suffer from two limitations. i) The parameter scale of the FC layer is quadratic to sample numbers, resulting in high time and memory costs that significantly degrade their feasibility in large-scale datasets. ii) It is under-explored to extract a unified representation that simultaneously satisfies minimal sufficiency and discriminability. To this end, we propose a novel deep framework, termed Efficient and Effective deep Multi-View Subspace Clustering (E$^2$MVSC). Instead of a parameterized FC layer, we design a Relation-Metric Net that decouples network parameter scale from sample numbers for greater computational efficiency. Most importantly, the proposed method devises a multi-type auto-encoder to explicitly decouple consistent, complementary, and superfluous information from every view, which is supervised by a soft clustering assignment similarity constraint. Following information bottleneck theory and the maximal coding rate reduction principle, a sufficient yet minimal unified representation can be obtained, as well as pursuing intra-cluster aggregation and inter-cluster separability within it. Extensive experiments show that E$^2$MVSC yields comparable results to existing methods and achieves state-of-the-art performance in various types of multi-view datasets.
Submitted 3 December, 2023; v1 submitted 14 October, 2023;
originally announced October 2023.
-
Causality and Independence Enhancement for Biased Node Classification
Authors:
Guoxin Chen,
Yongqing Wang,
Fangda Guo,
Qinglang Guo,
Jiangli Shao,
Huawei Shen,
Xueqi Cheng
Abstract:
Most existing methods that address out-of-distribution (OOD) generalization for node classification on graphs primarily focus on a specific type of data bias, such as label selection bias or structural bias. However, anticipating the type of bias in advance is extremely challenging, and designing models solely for one specific type may not necessarily improve overall generalization performance. Moreover, limited research has focused on the impact of mixed biases, which are more prevalent and demanding in real-world scenarios. To address these limitations, we propose a novel Causality and Independence Enhancement (CIE) framework, applicable to various graph neural networks (GNNs). Our approach estimates causal and spurious features at the node representation level and mitigates the influence of spurious correlations through backdoor adjustment. Meanwhile, an independence constraint is introduced to improve the discriminability and stability of causal and spurious features in complex biased environments. Essentially, CIE eliminates different types of data biases from a unified perspective, without the need to design separate methods for each bias as before. To evaluate the performance under specific types of data biases, mixed biases, and low-resource scenarios, we conducted comprehensive experiments on five publicly available datasets. Experimental results demonstrate that CIE not only significantly enhances the performance of GNNs but also outperforms state-of-the-art debiased node classification methods.
Submitted 4 November, 2023; v1 submitted 14 October, 2023;
originally announced October 2023.
-
Plug-and-Play Feature Generation for Few-Shot Medical Image Classification
Authors:
Qianyu Guo,
Huifang Du,
Xing Jia,
Shuyong Gao,
Yan Teng,
Haofen Wang,
Wenqiang Zhang
Abstract:
Few-shot learning (FSL) presents immense potential in enhancing model generalization and practicality for medical image classification with limited training data; however, it still faces the challenge of severe overfitting in classifier training due to distribution bias caused by the scarce training samples. To address the issue, we propose MedMFG, a flexible and lightweight plug-and-play method designed to generate sufficient class-distinctive features from limited samples. Specifically, MedMFG first re-represents the limited prototypes to assign higher weights to more important information features. Then, the prototypes are variationally generated into abundant effective features. Finally, the generated features and prototypes are used together to train a more generalized classifier. Experiments demonstrate that MedMFG outperforms the previous state-of-the-art methods on cross-domain benchmarks involving the transition from natural images to medical images, as well as medical images with different lesions. Notably, our method achieves over 10% performance improvement compared to several baselines. Fusion experiments further validate the adaptability of MedMFG, as it seamlessly integrates into various backbones and baselines, consistently yielding improvements of over 2.9% across all results.
Submitted 13 October, 2023;
originally announced October 2023.
-
SAIR: Learning Semantic-aware Implicit Representation
Authors:
Canyu Zhang,
Xiaoguang Li,
Qing Guo,
Song Wang
Abstract:
Implicit representation of an image can map arbitrary coordinates in the continuous domain to their corresponding color values, presenting a powerful capability for image reconstruction. Nevertheless, existing implicit representation approaches only focus on building a continuous appearance mapping, ignoring the continuity of the semantic information across pixels. As a result, they can hardly achieve the desired reconstruction results when the semantic information within input images is corrupted, for example, when a large region is missing. To address the issue, we propose to learn a semantic-aware implicit representation (SAIR), that is, we make the implicit representation of each pixel rely on both its appearance and semantic information (e.g., which object the pixel belongs to). To this end, we propose a framework with two modules: (1) building a semantic implicit representation (SIR) for a corrupted image with large missing regions. Given an arbitrary coordinate in the continuous domain, we can obtain its respective text-aligned embedding indicating the object the pixel belongs to. (2) building an appearance implicit representation (AIR) based on the SIR. Given an arbitrary coordinate in the continuous domain, we can reconstruct its color whether or not the pixel is missing in the input. We validate the novel semantic-aware implicit representation method on the image inpainting task, and the extensive experiments demonstrate that our method surpasses state-of-the-art approaches by a significant margin.
Submitted 13 October, 2023;
originally announced October 2023.
-
Very high energy gamma-ray emission beyond 10 TeV from GRB 221009A
Authors:
Zhen Cao,
F. Aharonian,
Q. An,
A. Axikegu,
Y. X. Bai,
Y. W. Bao,
D. Bastieri,
X. J. Bi,
Y. J. Bi,
J. T. Cai,
Q. Cao,
W. Y. Cao,
Zhe Cao,
J. Chang,
J. F. Chang,
A. M. Chen,
E. S. Chen,
Liang Chen,
Lin Chen,
Long Chen,
M. J. Chen,
M. L. Chen,
Q. H. Chen,
S. H. Chen,
S. Z. Chen
, et al. (255 additional authors not shown)
Abstract:
The highest energy gamma-rays from gamma-ray bursts (GRBs) have important implications for their radiation mechanism. Here we report for the first time the detection of gamma-rays up to 13 TeV from the brightest GRB 221009A by the Large High Altitude Air-shower Observatory (LHAASO). The LHAASO-KM2A detector registered more than 140 gamma-rays with energies above 3 TeV during 230$-$900 s after the trigger. The intrinsic energy spectrum of gamma-rays can be described by a power-law after correcting for extragalactic background light (EBL) absorption. Such a hard spectrum challenges the synchrotron self-Compton (SSC) scenario of relativistic electrons for the afterglow emission above several TeV. Observations of gamma-rays up to 13 TeV from a source with a measured redshift of z=0.151 hint at more transparency in intergalactic space than previously expected. Alternatively, one may invoke new physics such as Lorentz Invariance Violation (LIV) or an axion origin of very high energy (VHE) signals.
Submitted 22 November, 2023; v1 submitted 13 October, 2023;
originally announced October 2023.
-
Nonrelativistic Limit of Normalized Solutions to a class of nonlinear Dirac equations
Authors:
Pan Chen,
Yanheng Ding,
Qi Guo,
Huayang Wang
Abstract:
In this paper, we investigate the nonrelativistic limit of normalized solutions to a nonlinear Dirac equation as given below: \begin{equation*}
\begin{cases} &-i c\sum\limits_{k=1}^3α_k\partial_k u +mc^2 β{u}- Γ* (K |{u}|^κ) K|{u}|^{κ-2}{u}- P |{u}|^{s-2}{u}=ω{u}, \\ &\displaystyle\int_{\mathbb{R}^3}\vert u \vert^2 dx =1.
\end{cases} \end{equation*} Here, $c>0$ represents the speed of light, $m > 0$ is the mass of the Dirac particle, $ω\in\mathbb{R}$ emerges as an indeterminate Lagrange multiplier, and $Γ$, $K$, $P$ are real-valued functions defined on $\mathbb{R}^3$, also known as potential functions. Our research first confirms the presence of normalized solutions to the Dirac equation under high-speed light conditions. We then illustrate that these solutions progress to become the ground states of a system of nonlinear Schrödinger equations with a normalized constraint, exhibiting uniform boundedness and exponential decay irrespective of the light speed. Our results form the first discussion on the nonrelativistic limit of normalized solutions to nonlinear Dirac equations. This not only aids in the study of normalized solutions of the nonlinear Schrödinger equations, but also physically explains that the normalized ground states of high-speed particles and low-speed motion particles are consistent.
Submitted 16 October, 2023; v1 submitted 12 October, 2023;
originally announced October 2023.
-
Search for $J/ψ$ weak decays containing $D$ meson
Authors:
BESIII Collaboration,
M. Ablikim,
M. N. Achasov,
P. Adlarson,
X. C. Ai,
R. Aliberti,
A. Amoroso,
M. R. An,
Q. An,
Y. Bai,
O. Bakina,
I. Balossino,
Y. Ban,
V. Batozskaya,
K. Begzsuren,
N. Berger,
M. Berlowski,
M. Bertani,
D. Bettoni,
F. Bianchi,
E. Bianco,
A. Bortone,
I. Boyko,
R. A. Briere,
A. Brueggemann
, et al. (600 additional authors not shown)
Abstract:
Using a sample of about 10 billion $J/ψ$ events with the BESIII detector, we search for the weak decays of $J/ψ\to \bar{D}^0π^0 + c.c.$, $J/ψ\to \bar{D}^0η+ c.c.$, $J/ψ\to \bar{D}^0ρ^0 + c.c.$, $J/ψ\to D^-π^+ + c.c.$, and $J/ψ\to D^-ρ^+ + c.c.$. Since no significant signal is observed, we set the upper limits of the branching fractions of these decays to be $\mathcal{B}(J/ψ\to \bar{D}^0π^0 + c.c.) < 4.7 \times 10^{-7}$, $\mathcal{B}(J/ψ\to \bar{D}^0η+ c.c.) < 6.8 \times 10^{-7}$, $\mathcal{B}(J/ψ\to \bar{D}^0ρ^0 + c.c.) < 5.2 \times 10^{-7}$, $\mathcal{B}(J/ψ\to D^-π^+ + c.c.) < 7.0 \times 10^{-8}$, and $\mathcal{B}(J/ψ\to D^-ρ^+ + c.c.) < 6.0 \times 10^{-7}$ at the 90\% confidence level.
Submitted 11 October, 2023;
originally announced October 2023.
-
Retromorphic Testing: A New Approach to the Test Oracle Problem
Authors:
Boxi Yu,
Qiuyang Mang,
Qingshuo Guo,
Pinjia He
Abstract:
A test oracle serves as a criterion or mechanism to assess the correspondence between software output and the anticipated behavior for a given input set. In automated testing, black-box techniques, known for their non-intrusive nature in test oracle construction, are widely used, including notable methodologies like differential testing and metamorphic testing. Inspired by the mathematical concept of inverse function, we present Retromorphic Testing, a novel black-box testing methodology. It leverages an auxiliary program in conjunction with the program under test, which establishes a dual-program structure consisting of a forward program and a backward program. The input data is first processed by the forward program and then its program output is reversed to its original input format using the backward program. In particular, the auxiliary program can operate as either the forward or backward program, leading to different testing modes. The process concludes by examining the relationship between the initial input and the transformed output within the input domain. For example, to test the implementation of the sine function $\sin(x)$, we can employ its inverse function, $\arcsin(x)$, and validate the equation $x = \sin(\arcsin(x)+2kπ), \forall k \in \mathbb{Z}$. In addition to the high-level concept of Retromorphic Testing, this paper presents its three testing modes with illustrative use cases across diverse programs, including algorithms, traditional software, and AI applications.
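The sine example in this abstract can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' implementation: the helper `retromorphic_sin_check`, the tolerance `tol`, the range of `k` values, and the deliberately faulty `buggy_sin` are all assumptions introduced here. The input `x` is first mapped through `math.asin` (the auxiliary program), then through the sine implementation under test, and the result is compared with the original `x`.

```python
import math


def retromorphic_sin_check(sin_impl, x, ks=range(-3, 4), tol=1e-9):
    """Return True if sin_impl(arcsin(x) + 2*k*pi) reproduces x for every k in ks.

    sin_impl is the program under test; math.asin is the auxiliary program.
    ks and tol are illustrative choices, not values taken from the paper.
    """
    y = math.asin(x)  # auxiliary program, applied first; requires -1 <= x <= 1
    return all(abs(sin_impl(y + 2 * k * math.pi) - x) <= tol for k in ks)


def buggy_sin(t):
    # Hypothetical faulty implementation: a Taylor series truncated too early.
    return t - t**3 / 6


# Sample inputs from the domain of arcsin, [-1, 1].
inputs = [i / 10 for i in range(-10, 11)]

reference_ok = all(retromorphic_sin_check(math.sin, x) for x in inputs)
buggy_ok = all(retromorphic_sin_check(buggy_sin, x) for x in inputs)

print("math.sin passes:", reference_ok)
print("buggy_sin passes:", buggy_ok)
```

No expected output for `sin` is needed as an oracle: the check relies only on the round-trip relation between the input and the transformed output, which is what lets the truncated series be flagged as faulty.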
△ Less
Submitted 10 October, 2023;
originally announced October 2023.