Search | arXiv e-print repository

arXiv:2403.19091 [pdf, other]

Observation of the semileptonic decays $D^0\rightarrow K_S^0π^-π^0 e^+ ν_e$ and $D^+\rightarrow K_S^0π^+π^- e^+ ν_e$

Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, X. C. Ai, R. Aliberti, A. Amoroso, M. R. An, Q. An, Y. Bai, O. Bakina, I. Balossino, Y. Ban, V. Batozskaya, K. Begzsuren, N. Berger, M. Berlowski, M. Bertani, D. Bettoni, F. Bianchi, E. Bianco, A. Bortone, I. Boyko, R. A. Briere, A. Brueggemann, et al. (600 additional authors not shown)

Abstract: By analyzing $e^+e^-$ annihilation data corresponding to an integrated luminosity of 2.93 $\rm fb^{-1}$ collected at a center-of-mass energy of 3.773 GeV with the \text{BESIII} detector, the first observation of the semileptonic decays $D^0\rightarrow K_S^0π^-π^0 e^+ ν_e$ and $D^+\rightarrow K_S^0π^+π^- e^+ ν_e$ is reported. With a dominant hadronic contribution from $K_1(1270)$, the branching fractions are measured to be $\mathcal{B}(D^0\rightarrow {K}_1(1270)^-(\to K^0_Sπ^-π^0)e^+ν_e)=(1.69^{+0.53}_{-0.46}\pm0.15)\times10^{-4}$ and $\mathcal{B}(D^+\to \bar{K}_1(1270)^0(\to K^0_Sπ^+π^-)e^+ν_e)=(1.47^{+0.45}_{-0.40}\pm0.20)\times10^{-4}$ with statistical significance of 5.4$σ$ and 5.6$σ$, respectively. When combined with measurements of the $K_1(1270)\to K^+π^-π$ decays, the absolute branching fractions are determined to be $\mathcal{B}(D^0\to K_1(1270)^-e^+ν_e)=(1.05^{+0.33}_{-0.28}\pm0.12\pm0.12)\times10^{-3}$ and $\mathcal{B}(D^+\to \bar{K}_1(1270)^0e^+ν_e)=(1.29^{+0.40}_{-0.35}\pm0.18\pm0.15)\times10^{-3}$. The first and second uncertainties are statistical and systematic, respectively, and the third uncertainties originate from the assumed branching fractions of the $K_1(1270)\to Kππ$ decays.

Submitted 27 March, 2024; originally announced March 2024.

Comments: 19 pages

arXiv:2403.18923 [pdf, other]

Nature-Guided Cognitive Evolution for Predicting Dissolved Oxygen Concentrations in North Temperate Lakes

Authors: Runlong Yu, Robert Ladwig, Xiang Xu, Peijun Zhu, Paul C. Hanson, Yiqun Xie, Xiaowei Jia

Abstract: Predicting dissolved oxygen (DO) concentrations in north temperate lakes requires a comprehensive study of phenological patterns across various ecosystems, which highlights the significance of selecting phenological features and feature interactions. Process-based models are limited by partial process knowledge or oversimplified feature representations, while machine learning models face challenges in efficiently selecting relevant feature interactions for different lake types and tasks, especially under the infrequent nature of DO data collection. In this paper, we propose a Nature-Guided Cognitive Evolution (NGCE) strategy, which represents a multi-level fusion of adaptive learning with natural processes. Specifically, we utilize metabolic process-based models to generate simulated DO labels. Using these simulated labels, we implement a multi-population cognitive evolutionary search, where models, mirroring natural organisms, adaptively evolve to select relevant feature interactions within populations for different lake types and tasks. These models are not only capable of undergoing crossover and mutation mechanisms within intra-populations but also, albeit infrequently, engage in inter-population crossover. The second stage involves refining these models by retraining them with real observed labels. We have tested the performance of our NGCE strategy in predicting daily DO concentrations across a wide range of lakes in the Midwest, USA. These lakes, varying in size, depth, and trophic status, represent a broad spectrum of north temperate lakes.
Our findings demonstrate that NGCE not only produces accurate predictions with few observed labels but also, through gene maps of models, reveals sophisticated phenological patterns of different lakes.

Submitted 15 February, 2024; originally announced March 2024.

arXiv:2403.18871 [pdf]

doi 10.1016/j.jbi.2024.104673

Clinical Domain Knowledge-Derived Template Improves Post Hoc AI Explanations in Pneumothorax Classification

Authors: Han Yuan, Chuan Hong, Pengtao Jiang, Gangming Zhao, Nguyen Tuan Anh Tran, Xinxing Xu, Yet Yen Yan, Nan Liu

Abstract: Background: Pneumothorax is an acute thoracic disease caused by abnormal air collection between the lungs and chest wall. To address the opaqueness often associated with deep learning (DL) models, explainable artificial intelligence (XAI) methods have been introduced to outline regions related to pneumothorax diagnoses made by DL models. However, these explanations sometimes diverge from actual lesion areas, highlighting the need for further improvement. Method: We propose a template-guided approach to incorporate the clinical knowledge of pneumothorax into model explanations generated by XAI methods, thereby enhancing the quality of these explanations. Utilizing one lesion delineation created by radiologists, our approach first generates a template that represents potential areas of pneumothorax occurrence. This template is then superimposed on model explanations to filter out extraneous explanations that fall outside the template's boundaries. To validate its efficacy, we carried out a comparative analysis of three XAI methods with and without our template guidance when explaining two DL models in two real-world datasets. Results: The proposed approach consistently improved baseline XAI methods across twelve benchmark scenarios built on three XAI methods, two DL models, and two datasets. The average incremental percentages, calculated by the performance improvements over the baseline performance, were 97.8% in Intersection over Union (IoU) and 94.1% in Dice Similarity Coefficient (DSC) when comparing model explanations and ground-truth lesion areas.
Conclusions: In the context of pneumothorax diagnoses, we proposed a template-guided approach for improving AI explanations. We anticipate that our template guidance will forge a fresh approach to elucidating AI models by integrating clinical domain expertise.
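
The filtering and scoring steps described in this abstract can be illustrated with a minimal sketch. It assumes masks are represented as sets of (row, col) pixel coordinates; the function names and the toy masks are hypothetical and are not taken from the paper, whose template is derived from a radiologist's lesion delineation rather than chosen by hand.

```python
from typing import Set, Tuple

Pixel = Tuple[int, int]  # (row, col) of a highlighted pixel


def apply_template(explanation: Set[Pixel], template: Set[Pixel]) -> Set[Pixel]:
    """Keep only the explanation pixels that fall inside the template."""
    return explanation & template


def iou(pred: Set[Pixel], truth: Set[Pixel]) -> float:
    """Intersection over Union of two pixel sets."""
    union = pred | truth
    return len(pred & truth) / len(union) if union else 1.0


def dice(pred: Set[Pixel], truth: Set[Pixel]) -> float:
    """Dice Similarity Coefficient of two pixel sets."""
    total = len(pred) + len(truth)
    return 2 * len(pred & truth) / total if total else 1.0


# Toy data (hypothetical): a 2x2 lesion, a 3x3 template, and an
# explanation that partly falls outside the template.
truth = {(0, 0), (0, 1), (1, 0), (1, 1)}
template = {(r, c) for r in range(3) for c in range(3)}
explanation = {(0, 0), (0, 1), (4, 4), (5, 5)}

filtered = apply_template(explanation, template)
print(iou(explanation, truth), iou(filtered, truth))
print(dice(explanation, truth), dice(filtered, truth))
```

In this toy case, removing the two pixels outside the template raises both scores, which mirrors the direction of the improvement the abstract reports; the magnitudes here carry no meaning.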

Submitted 26 March, 2024; originally announced March 2024.

arXiv:2403.18299 [pdf, other]

Scaling Enhancement of Photon Blockade in Output Fields

Authors: Zhi-Hao Liu, Xun-Wei Xu

Abstract: Photon blockade enhancement is an exciting and promising subject that has been well studied for photons in cavities. However, whether photon blockade can be enhanced in the output fields remains largely unexplored. We show that photon blockade can be greatly enhanced in the mixing output field of a nonlinear cavity and an auxiliary (linear) cavity, where no direct coupling between the nonlinear and auxiliary cavities is needed. We uncover a biquadratic scaling relation between the second-order correlation of the photons in the output field and intracavity nonlinear interaction strength, in contrast to a quadratic scaling relation for the photons in a nonlinear cavity. We identify that this scaling enhancement of photon blockade in the output field is induced by the destructive interference between two of the paths for two photons passing through the two cavities. We then extend the theory to the experimentally feasible Jaynes-Cummings model consisting of a two-level system strongly coupled to one of the two uncoupled cavities, and also predict a biquadratic scaling law in the mixing output field. Our proposed scheme is universal and can be extended to enhance blockade in other bosonic systems.

Submitted 27 March, 2024; originally announced March 2024.

Comments: 7 pages, 4 figures

arXiv:2403.18189 [pdf]

doi 10.1038/s41467-024-45318-8

Interfacial magnetic spin Hall effect in van der Waals Fe3GeTe2/MoTe2 heterostructure

Authors: Yudi Dai, Junlin Xiong, Yanfeng Ge, Bin Cheng, Lizheng Wang, Pengfei Wang, Zenglin Liu, Shengnan Yan, Cuiwei Zhang, Xianghan Xu, Youguo Shi, Sang-Wook Cheong, Cong Xiao, Shengyuan A. Yang, Shi-Jun Liang, Feng Miao

Abstract: The spin Hall effect (SHE) allows efficient generation of spin polarization or spin current through charge current and plays a crucial role in the development of spintronics. While SHE typically occurs in non-magnetic materials and is time-reversal even, exploring time-reversal-odd (T-odd) SHE, which couples SHE to magnetization in ferromagnetic materials, offers a new charge-spin conversion mechanism with new functionalities. Here, we report the observation of giant T-odd SHE in Fe3GeTe2/MoTe2 van der Waals heterostructure, representing a previously unidentified interfacial magnetic spin Hall effect (interfacial-MSHE). Through rigorous symmetry analysis and theoretical calculations, we attribute the interfacial-MSHE to a symmetry-breaking induced spin current dipole at the vdW interface. Furthermore, we show that this linear effect can be used for implementing multiply-accumulate operations and binary convolutional neural networks with cascaded multi-terminal devices. Our findings uncover an interfacial T-odd charge-spin conversion mechanism with promising potential for energy-efficient in-memory computing.

Submitted 26 March, 2024; originally announced March 2024.

Journal ref: Nature Communications 15, 1129 (2024)

arXiv:2403.17974 [pdf]

Coherent Modulation of Two-Dimensional Moiré States with On-Chip THz Waves

Authors: Yiliu Li, Eric A. Arsenault, Birui Yang, Xi Wang, Heonjoon Park, Yinjie Guo, Takashi Taniguchi, Kenji Watanabe, Daniel Gamelin, James C. Hone, Cory R. Dean, Sebastian F. Maehrlein, Xiaodong Xu, Xiaoyang Zhu

Abstract: Van der Waals (vdW) structures of two-dimensional materials host a broad range of physical phenomena. New opportunities arise if different functional layers may be remotely modulated or coupled in a device structure. Here we demonstrate the in-situ coherent modulation of moiré excitons and correlated Mott insulators in transition metal dichalcogenide (TMD) homo- or hetero-bilayers with on-chip terahertz (THz) waves. Using common dual-gated device structures, each consisting of a TMD moiré bilayer sandwiched between two few-layer graphene (fl-Gr) gates with hexagonal boron nitride (h-BN) spacers, we launch coherent phonon wavepackets at ~0.4-1 THz from the fl-Gr gates by femtosecond laser excitation. The waves travel through the h-BN spacer, arrive at the TMD bilayer with precise timing, and coherently modulate the moiré excitons or the Mott states. These results demonstrate that the fl-Gr gates, often used for electrical control of the material properties, can serve as effective on-chip opto-elastic transducers to generate THz waves for the coherent control and vibrational entanglement of functional layers in commonly used moiré devices.

Submitted 20 March, 2024; originally announced March 2024.

Comments: 11 pages, 4 figures, 12 pages SI. arXiv admin note: substantial text overlap with arXiv:2307.16563

arXiv:2403.17906 [pdf, other]

WKB asymptotics of Stokes matrices, spectral curves and rhombus inequalities

Authors: Anton Alekseev, Andrew Neitzke, Xiaomeng Xu, Yan Zhou

Abstract: We consider an $n\times n$ system of ODEs on $\mathbb{P}^1$ with a simple pole $A$ at $z=0$ and a double pole $u={\rm diag}(u_1, \dots, u_n)$ at $z=\infty$. This is the simplest situation in which the monodromy data of the system are described by upper and lower triangular Stokes matrices $S_\pm$, and we impose reality conditions which imply $S_-=S_+^\dagger$. We study leading WKB exponents of Stokes matrices in parametrizations given by generalized minors and by spectral coordinates, and we show that for $u$ on the caterpillar line (which corresponds to the limit $(u_{j+1}-u_j)/(u_j - u_{j-1}) \to \infty$ for $j=2, \cdots, n-1$), the real parts of these exponents are given by periods of certain cycles on the degenerate spectral curve $Γ(u_{\rm cat}(t), A)$. These cycles admit unique deformations for $u$ near the caterpillar line. Using the spectral network theory, we give for $n=2$, and $n=3$ exact WKB predictions for asymptotics of generalized minors in terms of periods of these cycles. Boalch's theorem from Poisson geometry implies that real parts of leading WKB exponents satisfy the rhombus (or interlacing) inequalities. We show that these inequalities are in correspondence with finite webs of the canonical foliation on the root curve $Γ^r(u, A)$, and that they follow from the positivity of the corresponding periods. We conjecture that a similar mechanism applies for $n>3$.
We also outline the relation of the spectral coordinates with the cluster structures considered by Goncharov-Shen, and with ${\mathcal N}=2$ supersymmetric quantum field theories in dimension four associated with some simple quivers.

Submitted 26 March, 2024; originally announced March 2024.

arXiv:2403.17277 [pdf, other]

Relational Network Verification

Authors: Xieyang Xu, Yifei Yuan, Zachary Kincaid, Arvind Krishnamurthy, Ratul Mahajan, David Walker, Ennan Zhai

Abstract: Relational network verification is a new approach to validating network changes. In contrast to traditional network verification, which analyzes specifications for a single network snapshot, relational network verification analyzes specifications concerning two network snapshots (e.g., pre- and post-change snapshots) and captures their similarities and differences. Relational change specifications are compact and precise because they specify the flows or paths that change between snapshots and then simply mandate that other behaviors of the network "stay the same", without enumerating them. To achieve similar guarantees, single-snapshot specifications need to enumerate all flow and path behaviors that are not expected to change, so we can check that nothing has accidentally changed. Thus, precise single-snapshot specifications are proportional to network size, which makes them impractical to generate for many real-world networks. To demonstrate the value of relational reasoning, we develop a high-level relational specification language and a tool called Rela to validate network changes. Rela first compiles input specifications and network snapshot representations to finite state transducers. It then checks compliance using decision procedures for automaton equivalence. Our experiments using data on complex changes to a global backbone (with over 10^3 routers) find that Rela specifications need fewer than 10 terms for 93% of them and it validates 80% of them within 20 minutes.

Submitted 25 March, 2024; originally announced March 2024.

arXiv:2403.17188 [pdf, other]

LOTUS: Evasive and Resilient Backdoor Attacks through Sub-Partitioning

Authors: Siyuan Cheng, Guanhong Tao, Yingqi Liu, Guangyu Shen, Shengwei An, Shiwei Feng, Xiangzhe Xu, Kaiyuan Zhang, Shiqing Ma, Xiangyu Zhang

Abstract: Backdoor attacks pose a significant security threat to Deep Learning applications. Existing attacks are often not evasive to established backdoor detection techniques. This susceptibility primarily stems from the fact that these attacks typically leverage a universal trigger pattern or transformation function, such that the trigger can cause misclassification for any input. In response to this, recent papers have introduced attacks using sample-specific invisible triggers crafted through special transformation functions. While these approaches manage to evade detection to some extent, they reveal vulnerability to existing backdoor mitigation techniques. To address and enhance both evasiveness and resilience, we introduce a novel backdoor attack LOTUS. Specifically, it leverages a secret function to separate samples in the victim class into a set of partitions and applies unique triggers to different partitions. Furthermore, LOTUS incorporates an effective trigger focusing mechanism, ensuring only the trigger corresponding to the partition can induce the backdoor behavior. Extensive experimental results show that LOTUS can achieve a high attack success rate across 4 datasets and 7 model structures, and effectively evade 13 backdoor detection and mitigation techniques. The code is available at https://github.com/Megum1/LOTUS.

Submitted 25 March, 2024; originally announced March 2024.

Comments: IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR 2024)

arXiv:2403.17042 [pdf, other]

Provably Robust Score-Based Diffusion Posterior Sampling for Plug-and-Play Image Reconstruction

Authors: Xingyu Xu, Yuejie Chi

Abstract: In a great number of tasks in science and engineering, the goal is to infer an unknown image from a small number of measurements collected from a known forward model describing a certain sensing or imaging modality. Due to resource constraints, this task is often extremely ill-posed, which necessitates the adoption of expressive prior information to regularize the solution space. Score-based diffusion models, due to their impressive empirical success, have emerged as an appealing candidate for an expressive prior in image reconstruction. In order to accommodate diverse tasks at once, it is of great interest to develop efficient, consistent and robust algorithms that incorporate unconditional score functions of an image prior distribution in conjunction with flexible choices of forward models. This work develops an algorithmic framework for employing score-based diffusion models as an expressive data prior in general nonlinear inverse problems. Motivated by the plug-and-play framework in the imaging community, we introduce a diffusion plug-and-play method (DPnP) that alternately calls two samplers, a proximal consistency sampler based solely on the likelihood function of the forward model, and a denoising diffusion sampler based solely on the score functions of the image prior. The key insight is that denoising under white Gaussian noise can be solved rigorously via both stochastic (i.e., DDPM-type) and deterministic (i.e., DDIM-type) samplers using the unconditional score functions.
We establish both asymptotic and non-asymptotic performance guarantees of DPnP, and provide numerical experiments to illustrate its promise in solving both linear and nonlinear image reconstruction tasks. To the best of our knowledge, DPnP is the first provably-robust posterior sampling method for nonlinear inverse problems using unconditional diffusion priors.

Submitted 11 June, 2024; v1 submitted 25 March, 2024; originally announced March 2024.

arXiv:2403.17010 [pdf, other]

Calib3D: Calibrating Model Preferences for Reliable 3D Scene Understanding

Authors: Lingdong Kong, Xiang Xu, Jun Cen, Wenwei Zhang, Liang Pan, Kai Chen, Ziwei Liu

Abstract: Safety-critical 3D scene understanding tasks necessitate not only accurate but also confident predictions from 3D perception models. This study introduces Calib3D, a pioneering effort to benchmark and scrutinize the reliability of 3D scene understanding models from an uncertainty estimation viewpoint. We comprehensively evaluate 28 state-of-the-art models across 10 diverse 3D datasets, uncovering insightful phenomena that cope with both the aleatoric and epistemic uncertainties in 3D scene understanding. We discover that despite achieving impressive levels of accuracy, existing models frequently fail to provide reliable uncertainty estimates -- a pitfall that critically undermines their applicability in safety-sensitive contexts. Through extensive analysis of key factors such as network capacity, LiDAR representations, rasterization resolutions, and 3D data augmentation techniques, we correlate these aspects directly with the model calibration efficacy. Furthermore, we introduce DeptS, a novel depth-aware scaling approach aimed at enhancing 3D model calibration. Extensive experiments across a wide range of configurations validate the superiority of our method. We hope this work could serve as a cornerstone for fostering reliable 3D scene understanding. Code and benchmark toolkits are publicly available.
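
To make the notion of model calibration in this abstract concrete, here is a minimal sketch of the expected calibration error (ECE), a widely used calibration metric. The abstract does not name the metric the benchmark uses, so treating ECE as the yardstick is an assumption; the function name, the bin count, and the toy inputs are hypothetical, and the sketch does not implement DeptS.

```python
from typing import List, Sequence, Tuple


def expected_calibration_error(
    confidences: Sequence[float],
    correct: Sequence[bool],
    n_bins: int = 10,
) -> float:
    """Weighted average gap between confidence and accuracy over equal-width bins."""
    n = len(confidences)
    if n == 0:
        return 0.0

    # Group predictions by confidence; a confidence of exactly 1.0 goes in the last bin.
    bins: List[List[Tuple[float, bool]]] = [[] for _ in range(n_bins)]
    for conf, ok in zip(confidences, correct):
        idx = min(int(conf * n_bins), n_bins - 1)
        bins[idx].append((conf, ok))

    ece = 0.0
    for members in bins:
        if not members:
            continue
        avg_conf = sum(c for c, _ in members) / len(members)
        accuracy = sum(1 for _, ok in members if ok) / len(members)
        ece += len(members) / n * abs(avg_conf - accuracy)
    return ece


# Toy example (hypothetical): four predictions at 90% confidence, three correct.
ece = expected_calibration_error([0.9, 0.9, 0.9, 0.9], [True, True, True, False])
print(ece)
```

A model that is accurate but overconfident, as the abstract describes, would show a large gap between average confidence and accuracy in its high-confidence bins.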

Submitted 25 March, 2024; originally announced March 2024.

Comments: Preprint; 37 pages, 8 figures, 11 tables; Code at https://github.com/ldkong1205/Calib3D

arXiv:2403.17009 [pdf, other]

Optimizing LiDAR Placements for Robust Driving Perception in Adverse Conditions

Authors: Ye Li, Lingdong Kong, Hanjiang Hu, Xiaohao Xu, Xiaonan Huang

Abstract: The robustness of driving perception systems under unprecedented conditions is crucial for safety-critical usages. Latest advancements have prompted increasing interest in multi-LiDAR perception. However, prevailing driving datasets predominantly utilize single-LiDAR systems and collect data devoid of adverse conditions, failing to capture the complexities of real-world environments accurately. Addressing these gaps, we propose Place3D, a full-cycle pipeline that encompasses LiDAR placement optimization, data generation, and downstream evaluations. Our framework makes three appealing contributions. 1) To identify the most effective configurations for multi-LiDAR systems, we introduce a Surrogate Metric of the Semantic Occupancy Grids (M-SOG) to evaluate LiDAR placement quality. 2) Leveraging the M-SOG metric, we propose a novel optimization strategy to refine multi-LiDAR placements. 3) Centered around the theme of multi-condition multi-LiDAR perception, we collect a 364,000-frame dataset from both clean and adverse conditions. Extensive experiments demonstrate that LiDAR placements optimized using our approach outperform various baselines. We showcase exceptional robustness in both 3D object detection and LiDAR semantic segmentation tasks, under diverse adverse weather and sensor failure conditions. Code and benchmark toolkit are publicly available.

Submitted 25 March, 2024; originally announced March 2024.

Comments: Preprint; 40 pages, 11 figures, 15 tables; Code at https://github.com/ywyeli/Place3D

arXiv:2403.16811 [pdf, ps, other]

Cross section measurement of $e^+e^-\to ηψ(2S)$ and search for $e^+e^-\toη\tilde{X}(3872)$

Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, O. Afedulidis, X. C. Ai, R. Aliberti, A. Amoroso, Q. An, Y. Bai, O. Bakina, I. Balossino, Y. Ban, H. -R. Bao, V. Batozskaya, K. Begzsuren, N. Berger, M. Berlowski, M. Bertani, D. Bettoni, F. Bianchi, E. Bianco, A. Bortone, I. Boyko, R. A. Briere, et al. (639 additional authors not shown)

Abstract: The energy-dependent cross section for $e^+e^-\to ηψ(2S)$ is measured at eighteen center of mass energies from 4.288 GeV to 4.951 GeV using the BESIII detector. Using the same data samples, we also perform the first search for the reaction $e^+e^-\toη\tilde{X}(3872)$, but no evidence is found for the $\tilde{X}(3872)$ in the $π^+π^- J/ψ$ mass distribution. At each of the eighteen center of mass energies, upper limits at the 90\% confidence level on the cross section for $e^+e^-\toηψ(2S)$ and on the product of the $e^+e^-\toη\tilde{X}(3872)$ cross section with the branching fraction of $\tilde{X}(3872)\toπ^+π^- J/ψ$ are reported.

Submitted 25 March, 2024; originally announced March 2024.

arXiv:2403.16583 [pdf]

Amino Acids and Their Biological Derivatives Modulate Protein-Protein Interactions In an Additive Way

Authors: Xufeng Xu, Francesco Stellacci

Abstract: Protein-protein interactions (PPI) differ when measured in test tubes and cells due to the complexity of the intracellular environment. Free amino acids (AAs) and their derivatives constitute a significant fraction of the intracellular volume and mass. Recently, we have found that AAs have a general property of rendering protein dispersions more stable by reducing the net attractive part of PPI. Here, we study the effects on PPI of different AA derivatives, AA mixtures, and short peptides. We find that all the tested AA derivatives modulate PPI in solution as well as AAs. Furthermore, we show that the modulation effect is additive when AAs form mixtures or are bound into short peptides. Therefore, this study demonstrates the universal effect of a class of small molecules (i.e. AAs and their biological derivatives) on the modulation of PPI and provides insights into rationally designing biocompatible molecules for stabilizing protein interactions and consequently tuning protein functions.

Submitted 25 March, 2024; originally announced March 2024.

Comments: 13 pages, 5 figures

arXiv:2403.16482 [pdf, other]

Determined Multi-Label Learning via Similarity-Based Prompt

Authors: Meng Wei, Zhongnian Li, Peng Ying, Yong Zhou, Xinzheng Xu

Abstract: In multi-label classification, each training instance is associated with multiple class labels simultaneously. Unfortunately, collecting the fully precise class labels for each training instance is time- and labor-consuming for real-world applications. To alleviate this problem, a novel labeling setting termed \textit{Determined Multi-Label Learning} (DMLL) is proposed, aiming to effectively alleviate the labeling cost inherent in multi-label tasks. In this novel labeling setting, each training instance is associated with a \textit{determined label} (either "Yes" or "No"), which indicates whether the training instance contains the provided class label. The provided class label is randomly and uniformly selected from the whole candidate label set. Besides, each training instance only needs to be determined once, which significantly reduces the annotation cost of the labeling task for multi-label datasets. In this paper, we theoretically derive a risk-consistent estimator to learn a multi-label classifier from these determined-labeled training data. Additionally, we introduce a similarity-based prompt learning method for the first time, which minimizes the risk-consistent loss of large-scale pre-trained models to learn a supplemental prompt with richer semantic information. Extensive experimental validation underscores the efficacy of our approach, demonstrating superior performance compared to existing state-of-the-art methods.
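
The labeling setting described in this abstract can be sketched in a few lines: one class label is drawn uniformly at random from the candidate set, and the annotator answers "Yes" or "No" once. The function name, the candidate labels, and the example instance are hypothetical; the sketch covers only how a determined label is produced, not the risk-consistent estimator or the prompt learning method.

```python
import random
from typing import Sequence, Set, Tuple


def determined_label(
    true_labels: Set[str],
    candidates: Sequence[str],
    rng: random.Random,
) -> Tuple[str, str]:
    """Draw one candidate label uniformly at random and answer Yes/No for it."""
    queried = rng.choice(candidates)
    answer = "Yes" if queried in true_labels else "No"
    return queried, answer


# Toy example (hypothetical): an image whose full label set is {"cat", "sofa"}.
candidates = ["cat", "dog", "sofa", "car"]
rng = random.Random(0)
queried, answer = determined_label({"cat", "sofa"}, candidates, rng)
print(queried, answer)
```

Each instance costs one binary judgment instead of a pass over every candidate class, which is the source of the annotation savings the abstract claims.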

Submitted 25 March, 2024; originally announced March 2024.

Comments: 10 pages, 4 figures

arXiv:2403.16469 [pdf, other]

Learning from Reduced Labels for Long-Tailed Data

Authors: Meng Wei, Zhongnian Li, Yong Zhou, Xinzheng Xu

Abstract: Long-tailed data is prevalent in real-world classification tasks and heavily relies on supervised information, which makes the annotation process exceptionally labor-intensive and time-consuming. Unfortunately, despite being a common approach to mitigate labeling costs, existing weakly supervised learning methods struggle to adequately preserve supervised information for tail samples, resulting in a decline in accuracy for the tail classes. To alleviate this problem, we introduce a novel weakly supervised labeling setting called Reduced Label. The proposed labeling setting not only avoids the decline of supervised information for the tail samples, but also decreases the labeling costs associated with long-tailed data. Additionally, we propose a straightforward and highly efficient unbiased framework with strong theoretical guarantees to learn from these Reduced Labels. Extensive experiments conducted on benchmark datasets including ImageNet validate the effectiveness of our approach, surpassing the performance of state-of-the-art weakly supervised methods.

Submitted 25 March, 2024; originally announced March 2024.

Comments: 12 pages, 3 figures

arXiv:2403.15805 [pdf, other]

AirCrab: A Hybrid Aerial-Ground Manipulator with An Active Wheel

Authors: Muqing Cao, Jiayan Zhao, Xinhang Xu, Lihua Xie

Abstract: Inspired by the behavior of birds, we present AirCrab, a hybrid aerial-ground manipulator (HAGM) with a single active wheel and a 3-degree-of-freedom (3-DoF) manipulator. AirCrab leverages a single point of contact with the ground to reduce position drift and improve manipulation accuracy. The single active wheel enables locomotion on narrow surfaces without adding significant weight to the robot. To realize accurate attitude maintenance using propellers on the ground, we design a control allocation method for AirCrab that prioritizes attitude control and dynamically adjusts the thrust input to reduce energy consumption. Experiments verify the effectiveness of the proposed control method and the gain in manipulation accuracy with ground contact. A series of operations to complete the letters 'NTU' demonstrates the capability of the robot to perform challenging hybrid aerial-ground manipulation missions.

Submitted 23 March, 2024; originally announced March 2024.

arXiv:2403.14998 [pdf, other]

Precise measurement of the $e^+e^-\to D_s^+D_s^-$ cross sections at center-of-mass energies from threshold to 4.95 GeV

Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, O. Afedulidis, X. C. Ai, R. Aliberti, A. Amoroso, Q. An, Y. Bai, O. Bakina, I. Balossino, Y. Ban, H. -R. Bao, V. Batozskaya, K. Begzsuren, N. Berger, M. Berlowski, M. Bertani, D. Bettoni, F. Bianchi, E. Bianco, A. Bortone, I. Boyko, R. A. Briere , et al. (638 additional authors not shown)

Abstract: Using the $e^+e^-$ collision data collected with the BESIII detector operating at the BEPCII collider, at center-of-mass energies from the threshold to $4.95$~GeV, we present precise measurements of the cross sections for the process $e^+e^-\to D_s^+D_s^-$ using a single tag method. The resulting cross section lineshape exhibits several new structures, thereby offering an input for coupled channel analysis and model tests, which are critical to understand vector charmonium-like states with masses between 4 and 5~GeV.

Submitted 22 March, 2024; originally announced March 2024.

Comments: 9 pages, 4 figures, published to PRL

arXiv:2403.14447 [pdf, other]

Exploring 3D Human Pose Estimation and Forecasting from the Robot's Perspective: The HARPER Dataset

Authors: Andrea Avogaro, Andrea Toaiari, Federico Cunico, Xiangmin Xu, Haralambos Dafas, Alessandro Vinciarelli, Emma Li, Marco Cristani

Abstract: We introduce HARPER, a novel dataset for 3D body pose estimation and forecast in dyadic interactions between users and Spot, the quadruped robot manufactured by Boston Dynamics. The key novelty is the focus on the robot's perspective, i.e., on the data captured by the robot's sensors. These make 3D body pose analysis challenging because sensors close to the ground capture humans only partially. The scenario underlying HARPER includes 15 actions, of which 10 involve physical contact between the robot and users. The Corpus contains not only the recordings of the built-in stereo cameras of Spot, but also those of a 6-camera OptiTrack system (all recordings are synchronized). This leads to ground-truth skeletal representations with a precision lower than a millimeter. In addition, the Corpus includes reproducible benchmarks on 3D Human Pose Estimation, Human Pose Forecasting, and Collision Prediction, all based on publicly available baseline approaches. This enables future HARPER users to rigorously compare their results with those we provide in this work.

Submitted 23 March, 2024; v1 submitted 21 March, 2024; originally announced March 2024.

arXiv:2403.14376 [pdf, other]

InfNeRF: Towards Infinite Scale NeRF Rendering with O(log n) Space Complexity

Authors: Jiabin Liang, Lanqing Zhang, Zhuoran Zhao, Xiangyu Xu

Abstract: The conventional mesh-based Level of Detail (LoD) technique, exemplified by applications such as Google Earth and many game engines, exhibits the capability to holistically represent a large scene, even the Earth, and achieves rendering with a space complexity of O(log n). This constrained data requirement not only enhances rendering efficiency but also facilitates dynamic data fetching, thereby enabling a seamless 3D navigation experience for users. In this work, we extend this proven LoD technique to Neural Radiance Fields (NeRF) by introducing an octree structure to represent the scenes in different scales. This innovative approach provides a mathematically simple and elegant representation with a rendering space complexity of O(log n), aligned with the efficiency of mesh-based LoD techniques. We also present a novel training strategy that maintains a complexity of O(n). This strategy allows for parallel training with minimal overhead, ensuring the scalability and efficiency of our proposed method. Our contribution is not only in extending the capabilities of existing techniques but also in establishing a foundation for scalable and efficient large-scale scene representation using NeRF and octree structures.

Submitted 21 March, 2024; originally announced March 2024.

arXiv:2403.14171 [pdf, other]

MMIDR: Teaching Large Language Model to Interpret Multimodal Misinformation via Knowledge Distillation

Authors: Longzheng Wang, Xiaohan Xu, Lei Zhang, Jiarui Lu, Yongxiu Xu, Hongbo Xu, Minghao Tang, Chuang Zhang

Abstract: Automatic detection of multimodal misinformation has gained widespread attention recently. However, the potential of powerful Large Language Models (LLMs) for multimodal misinformation detection remains underexplored. Besides, how to teach LLMs to interpret multimodal misinformation in a cost-effective and accessible way is still an open question. To address that, we propose MMIDR, a framework designed to teach LLMs to provide fluent and high-quality textual explanations for their decision-making process of multimodal misinformation. To convert multimodal misinformation into an appropriate instruction-following format, we present a data augmentation perspective and pipeline. This pipeline consists of a visual information processing module and an evidence retrieval module. Subsequently, we prompt the proprietary LLMs with processed contents to extract rationales for interpreting the authenticity of multimodal misinformation. Furthermore, we design an efficient knowledge distillation approach to distill the capability of proprietary LLMs in explaining multimodal misinformation into open-source LLMs. To explore several research questions regarding the performance of LLMs in multimodal misinformation detection tasks, we construct an instruction-following multimodal misinformation dataset and conduct comprehensive experiments. The experimental findings reveal that our MMIDR exhibits sufficient detection performance and possesses the capacity to provide compelling rationales to support its assessments.

Submitted 8 April, 2024; v1 submitted 21 March, 2024; originally announced March 2024.

Comments: 10 pages, 3 figures

arXiv:2403.13846 [pdf, other]

A Clustering Method with Graph Maximum Decoding Information

Authors: Xinrun Xu, Manying Lv, Zhanbiao Lian, Yurong Wu, ** Yan, Shan Jiang, Zhiming Ding

Abstract: The clustering method based on graph models has garnered increased attention for its widespread applicability across various knowledge domains. Its adaptability to integrate seamlessly with other relevant applications endows the graph model-based clustering analysis with the ability to robustly extract "natural associations" or "graph structures" within datasets, facilitating the modelling of relationships between data points. Despite its efficacy, the current clustering method utilizing the graph-based model overlooks the uncertainty associated with random walk access between nodes and the embedded structural information in the data. To address this gap, we present a novel Clustering method for Maximizing Decoding Information within graph-based models, named CMDI. CMDI innovatively incorporates two-dimensional structural information theory into the clustering process, consisting of two phases: graph structure extraction and graph vertex partitioning. Within CMDI, graph partitioning is reformulated as an abstract clustering problem, leveraging maximum decoding information to minimize uncertainty associated with random visits to vertices. Empirical evaluations on three real-world datasets demonstrate that CMDI outperforms classical baseline methods, exhibiting a superior decoding information ratio (DI-R). Furthermore, CMDI showcases heightened efficiency, particularly when considering prior knowledge (PK). These findings underscore the effectiveness of CMDI in enhancing decoding information quality and computational efficiency, positioning it as a valuable tool in graph-based clustering analyses.

Submitted 18 April, 2024; v1 submitted 18 March, 2024; originally announced March 2024.

Comments: 9 pages, 9 figures, IJCNN 2024

arXiv:2403.13844 [pdf, other]

Scheduled Knowledge Acquisition on Lightweight Vector Symbolic Architectures for Brain-Computer Interfaces

Authors: Yejia Liu, Shi** Duan, Xiaolin Xu, Shaolei Ren

Abstract: Brain-Computer interfaces (BCIs) are typically designed to be lightweight and responsive in real-time to provide users timely feedback. Classical feature engineering is computationally efficient but has low accuracy, whereas recent deep neural networks (DNNs) improve accuracy but are computationally expensive and incur high latency. As a promising alternative, the low-dimensional computing (LDC) classifier based on vector symbolic architecture (VSA) achieves a small model size yet higher accuracy than classical feature engineering methods. However, its accuracy still lags behind that of modern DNNs, making it challenging to process complex brain signals. To improve the accuracy of a small model, knowledge distillation is a popular method. However, maintaining a constant level of distillation between the teacher and student models may not be the best way for a growing student during its progressive learning stages. In this work, we propose a simple scheduled knowledge distillation method based on curriculum data order to enable the student to gradually build knowledge from the teacher model, controlled by an $α$ scheduler. Meanwhile, we employ the LDC/VSA as the student model to enhance the on-device inference efficiency for tiny BCI devices that demand low latency. The empirical results have demonstrated that our approach achieves a better tradeoff between accuracy and hardware efficiency compared to other methods.

Submitted 17 March, 2024; originally announced March 2024.

Comments: Accepted as a full paper by the tinyML Research Symposium 2024

arXiv:2403.13437 [pdf, other]

Search for $ΔS=2$ nonleptonic hyperon decays $Ω^-\toΣ^{0}π^{-}$ and $Ω^-\to nK^{-}$

Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, O. Afedulidis, X. C. Ai, R. Aliberti, A. Amoroso, Q. An, Y. Bai, O. Bakina, I. Balossino, Y. Ban, H. -R. Bao, V. Batozskaya, K. Begzsuren, N. Berger, M. Berlowski, M. Bertani, D. Bettoni, F. Bianchi, E. Bianco, A. Bortone, I. Boyko, R. A. Briere , et al. (638 additional authors not shown)

Abstract: Using $(27.12 \pm 0.14) \times 10^{8}$ $ψ(3686)$ events collected by the BESIII detector at the center-of-mass energy of $\sqrt{s} = 3.686$ GeV, we search for the first time for two nonleptonic hyperon decays that change strangeness by two units, $Ω^-\toΣ^{0}π^-$ and $Ω^-\to nK^{-}$. No significant signal is observed. The upper limits on their decay branching fractions are determined to be $\mathcal{B}(Ω^-\toΣ^{0}π^-) < 5.4\times 10^{-4}$ and $\mathcal{B}(Ω^-\to nK^{-}) < 2.4\times 10^{-4}$ at the $90\%$ confidence level.

Submitted 14 April, 2024; v1 submitted 20 March, 2024; originally announced March 2024.

arXiv:2403.13436 [pdf, other]

Multi-photon super-linear image scanning microscopy using upconversion nanoparticles

Authors: Yao Wang, Baolei Liu, Lei Ding, Chaohao Chen, Xuchen Shan, Da**g Wang, Menghan Tian, Jiaqi Song, Ze Zheng, Xiaoxue Xu, Xiaolan Zhong, Fan Wang

Abstract: Super-resolution fluorescence microscopy is of great interest in life science studies for visualizing subcellular structures at the nanometer scale. Among various kinds of super-resolution approaches, image scanning microscopy (ISM) offers a doubled resolution enhancement in a simple and straightforward manner, based on the commonly used confocal microscopes. ISM is also suitable to be integrated with multi-photon microscopy techniques, such as two-photon excitation and second-harmonic generation imaging, for deep tissue imaging, but its resolution enhancement remains limited to twofold and it requires expensive femtosecond lasers. Here, we present and experimentally demonstrate the super-linear ISM (SL-ISM) to push the resolution enhancement beyond the factor of two, with a single low-power, continuous-wave, and near-infrared laser, by harnessing the emission nonlinearity within the multiphoton excitation process of lanthanide-doped upconversion nanoparticles (UCNPs). Based on a modified confocal microscope, we achieve a resolution of about 120 nm, 1/8th of the excitation wavelength. Furthermore, we demonstrate a parallel detection strategy of SL-ISM with the multifocal structured excitation pattern, to speed up the acquisition frame rate. This method suggests a new perspective for super-resolution imaging or sensing, multi-photon imaging, and deep-tissue imaging with simple, low-cost, and straightforward implementations.

Submitted 20 March, 2024; originally announced March 2024.

Comments: 9 pages, 4 figures

arXiv:2403.13212 [pdf, ps, other]

Stability for a multi-frequency inverse random source problem

Authors: Tianjiao Wang, Xiang Xu, Yue Zhao

Abstract: We present stability estimates for the inverse source problem of the stochastic Helmholtz equation in two and three dimensions by either near-field or far-field data. The random source is assumed to be a microlocally isotropic generalized Gaussian random function. For the direct problem, by exploring the regularity of the Green function, we demonstrate that the direct problem admits a unique bounded solution with an explicit integral representation, which enhances the existing regularity result. For the case using near-field data, the analysis of the inverse problem employs microlocal analysis to achieve an estimate for the Fourier transform of the micro-correlation strength by the near-field correlation data and a high-frequency tail. The stability follows by showing the analyticity of the data and applying a novel analytic continuation principle. The stability estimate by far-field data is derived by investigating the correlation of the far-field data. The stability estimate consists of the Lipschitz type data discrepancy and the logarithmic stability. The latter decreases as the upper bound of the frequency increases, which exhibits the phenomenon of increasing stability.

Submitted 19 March, 2024; originally announced March 2024.

Comments: 19 pages

MSC Class: 35Q74; 35R30; 78A46

arXiv:2403.13018 [pdf, other]

Invisible Backdoor Attack Through Singular Value Decomposition

Authors: Wenmin Chen, Xiaowei Xu

Abstract: With the widespread application of deep learning across various domains, concerns about its security have grown significantly. Among these, backdoor attacks pose a serious security threat to deep neural networks (DNNs). In recent years, backdoor attacks on neural networks have become increasingly sophisticated, aiming to compromise the security and trustworthiness of models by implanting hidden, unauthorized functionalities or triggers, leading to misleading predictions or behaviors. To make triggers less perceptible, various invisible backdoor attacks have been proposed. However, most of them only consider invisibility in the spatial domain, making it easy for recent defense methods to detect the generated toxic images. To address these challenges, this paper proposes an invisible backdoor attack called DEBA. DEBA leverages the mathematical properties of Singular Value Decomposition (SVD) to embed imperceptible backdoors into models during the training phase, thereby causing them to exhibit predefined malicious behavior under specific trigger conditions. Specifically, we first perform SVD on images, and then replace the minor features of trigger images with those of clean images, using them as triggers to ensure the effectiveness of the attack. As minor features are scattered throughout the entire image, the major features of clean images are preserved, making poisoned images visually indistinguishable from clean ones. Extensive experimental evaluations demonstrate that DEBA is highly effective, maintaining high perceptual quality and a high attack success rate for poisoned images. Furthermore, we assess the performance of DEBA under existing defense measures, showing that it is robust and capable of significantly evading and resisting the effects of these defense measures.

Submitted 18 March, 2024; originally announced March 2024.

arXiv:2403.12845 [pdf, ps, other]

doi 10.1103/PhysRevB.109.205409

Giant electrode effect on tunneling magnetoresistance and electroresistance in van der Waals intrinsic multiferroic tunnel junctions using VS2

Authors: Zhi Yan, Ruixia Yang, Cheng Fang, Wentian Lu, Xiaohong Xu

Abstract: Van der Waals multiferroic tunnel junctions (vdW-MFTJs) with multiple nonvolatile resistive states are highly suitable for new physics and next-generation storage electronics. However, currently reported vdW-MFTJs are based on two types of materials, i.e., vdW ferromagnetic and ferroelectric materials, forming a multiferroic system. This undoubtedly introduces additional interfaces, increasing the complexity of experimental preparation. Herein, we engineer vdW intrinsic MFTJs utilizing bilayer VS$_2$. By employing the nonequilibrium Green's function combined with density functional theory, we systematically investigate the influence of three types of electrodes (including non-vdW pure metal Ag/Au, vdW metallic 1T-MoS$_2$/2H-PtTe$_2$, and vdW ferromagnetic metallic Fe$_3$GaTe$_2$/Fe$_3$GeTe$_2$) on the electronic transport properties of VS$_2$-based intrinsic MFTJs. We demonstrate that these MFTJs manifest a giant electrode-dependent electronic transport characteristic effect. Comprehensively comparing these electrode pairs, the Fe$_3$GaTe$_2$/Fe$_3$GeTe$_2$ electrode combination exhibits optimal transport properties: the maximum TMR (TER) can reach 10949\% (69\%) and the minimum resistance-area product (RA) is 0.45 $Ω$$μ$m$^{2}$, along with perfect spin filtering and negative differential resistance effects. More intriguingly, TMR (TER) can be further enhanced to 34000\% (380\%) by applying an external bias voltage (0.1 V), while RA can be reduced to 0.16 $Ω$$μ$m$^{2}$ under the influence of biaxial stress (-3\%). Our proposed concept of designing vdW-MFTJs using intrinsic multiferroic materials points towards new avenues in experimental exploration.

Submitted 7 May, 2024; v1 submitted 19 March, 2024; originally announced March 2024.

Comments: 13 pages, 9 figures

arXiv:2403.12778 [pdf, other]

ViTGaze: Gaze Following with Interaction Features in Vision Transformers

Authors: Yuehao Song, Xinggang Wang, **gfeng Yao, Wenyu Liu, **glin Zhang, Xiangmin Xu

Abstract: Gaze following aims to interpret human-scene interactions by predicting the person's focal point of gaze. Prevailing approaches often use multi-modality inputs, most of which adopt a two-stage framework. Hence their performance highly depends on the previous prediction accuracy. Others use a single-modality approach with complex decoders, increasing network computational load. Inspired by the remarkable success of pre-trained plain Vision Transformers (ViTs), we introduce a novel single-modality gaze following framework, ViTGaze. In contrast to previous methods, ViTGaze creates a brand new gaze following framework based mainly on powerful encoders (dec. param. less than 1%). Our principal insight lies in that the inter-token interactions within self-attention can be transferred to interactions between humans and scenes. Leveraging this presumption, we formulate a framework consisting of a 4D interaction encoder and a 2D spatial guidance module to extract human-scene interaction information from self-attention maps. Furthermore, our investigation reveals that ViT with self-supervised pre-training exhibits an enhanced ability to extract correlated information. A large number of experiments have been conducted to demonstrate the performance of the proposed method. Our method achieves state-of-the-art (SOTA) performance among all single-modality methods (3.4% improvement on AUC, 5.1% improvement on AP) and very comparable performance against multi-modality methods with 59% fewer parameters.

Submitted 19 March, 2024; originally announced March 2024.

arXiv:2403.12388 [pdf, other]

Interpretable User Satisfaction Estimation for Conversational Systems with Large Language Models

Authors: Ying-Chun Lin, Jennifer Neville, Jack W. Stokes, Longqi Yang, Tara Safavi, Mengting Wan, Scott Counts, Siddharth Suri, Reid Andersen, Xiaofeng Xu, Deepak Gupta, Sujay Kumar Jauhar, Xia Song, Georg Buscher, Saurabh Tiwary, Brent Hecht, Jaime Teevan

Abstract: Accurate and interpretable user satisfaction estimation (USE) is critical for understanding, evaluating, and continuously improving conversational systems. Users express their satisfaction or dissatisfaction with diverse conversational patterns in both general-purpose (ChatGPT and Bing Copilot) and task-oriented (customer service chatbot) conversational systems. Existing approaches based on featurized ML models or text embeddings fall short in extracting generalizable patterns and are hard to interpret. In this work, we show that LLMs can extract interpretable signals of user satisfaction from their natural language utterances more effectively than embedding-based approaches. Moreover, an LLM can be tailored for USE via an iterative prompting framework using supervision from labeled examples. The resulting method, Supervised Prompting for User satisfaction Rubrics (SPUR), not only has higher accuracy but is more interpretable as it scores user satisfaction via learned rubrics with a detailed breakdown.

Submitted 8 June, 2024; v1 submitted 18 March, 2024; originally announced March 2024.

arXiv:2403.11990 [pdf, other]

GetMesh: A Controllable Model for High-quality Mesh Generation and Manipulation

Authors: Zhaoyang Lyu, Ben Fei, **yi Wang, Xudong Xu, Ya Zhang, Weidong Yang, Bo Dai

Abstract: Mesh is a fundamental representation of 3D assets in various industrial applications, and is widely supported by professional software. However, due to its irregular structure, mesh creation and manipulation are often time-consuming and labor-intensive. In this paper, we propose a highly controllable generative model, GetMesh, for mesh generation and manipulation across different categories. By taking a varying number of points as the latent representation, and re-organizing them as triplane representation, GetMesh generates meshes with rich and sharp details, outperforming both single-category and multi-category counterparts. Moreover, it also enables fine-grained control over the generation process that previous mesh generative models cannot achieve, where changing global/local mesh topologies, adding/removing mesh parts, and combining mesh parts across categories can be intuitively, efficiently, and robustly accomplished by adjusting the number, positions or features of latent points. Project page is https://getmesh.github.io.

Submitted 18 March, 2024; originally announced March 2024.

arXiv:2403.11635 [pdf, other]

doi 10.1093/mnras/stae888

The FAST Galactic Plane Pulsar Snapshot Survey -- V. PSR J1901+0658 in a double neutron star system

Authors: W. Q. Su, J. L. Han, Z. L. Yang, P. F. Wang, J. P. Yuan, C. Wang, D. J. Zhou, T. Wang, Y. Yan, W. C. Jing, N. N. Cai, L. Xie, J. Xu, H. G. Wang, R. X. Xu, X. P. You

Abstract: Double neutron star (DNS) systems offer excellent opportunities to test gravity theories. We report the timing results of PSR J1901+0658, the first pulsar discovered in the FAST Galactic Plane Pulsar Snapshot (GPPS) Survey. Based on timing observations by FAST over 5 yr, we obtain the phase-coherent timing solutions and derive precise measurements of its position, spin parameters, orbital parameters, and dispersion measure. It has a period of 75.7 ms, a period derivative of 2.169(6)$\times 10^{-19}$ s s$^{-1}$, and a characteristic age of 5.5 Gyr. This pulsar is in an orbit with a period of 14.45 d and an eccentricity of 0.366. One post-Keplerian parameter, periastron advance, has been well measured as 0.00531(9) deg yr$^{-1}$, from which the total mass of this system is derived to be 2.79(7) M$_{\odot}$. The pulsar mass has an upper limit of 1.68 M$_{\odot}$, so the lower limit for the companion mass is 1.11 M$_{\odot}$. Because PSR J1901+0658 is a partially recycled pulsar in an eccentric binary orbit with such a large companion mass, it should be in a DNS system according to the evolution history of the binary system.

Submitted 24 April, 2024; v1 submitted 18 March, 2024; originally announced March 2024.

Comments: 6 pages, 6 figures

Journal ref: Monthly Notices of the Royal Astronomical Society, Volume 530, Issue 2, May 2024, Pages 1506-1511

arXiv:2403.11083 [pdf, other]

Customizing Visual-Language Foundation Models for Multi-modal Anomaly Detection and Reasoning

Authors: Xiaohao Xu, Yunkang Cao, Yongqi Chen, Weiming Shen, Xiaonan Huang

Abstract: Anomaly detection is vital in various industrial scenarios, including the identification of unusual patterns in production lines and the detection of manufacturing defects for quality control. Existing techniques tend to be specialized in individual scenarios and lack generalization capacities. In this study, we aim to develop a generic anomaly detection model applicable across multiple scenarios. To achieve this, we customize generic visual-language foundation models that possess extensive knowledge and robust reasoning abilities into anomaly detectors and reasoners. Specifically, we introduce a multi-modal prompting strategy that incorporates domain knowledge from experts as conditions to guide the models. Our approach considers multi-modal prompt types, including task descriptions, class context, normality rules, and reference images. In addition, we unify the input representation of multi-modality into a 2D image format, enabling multi-modal anomaly detection and reasoning. Our preliminary studies demonstrate that combining visual and language prompts as conditions for customizing the models enhances anomaly detection performance. The customized models showcase the ability to detect anomalies across different data modalities such as images and point clouds. Qualitative case studies further highlight the anomaly detection and reasoning capabilities, particularly for multi-object scenes and temporal data. Our code is available at https://github.com/Xiaohao-Xu/Customizable-VLM.

Submitted 17 March, 2024; originally announced March 2024.

arXiv:2403.10877 [pdf, ps, other]

Test of lepton universality and measurement of the form factors of $D^0\to K^{*}(892)^-μ^+ν_μ$

Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, O. Afedulidis, X. C. Ai, R. Aliberti, A. Amoroso, Q. An, Y. Bai, O. Bakina, I. Balossino, Y. Ban, H. -R. Bao, V. Batozskaya, K. Begzsuren, N. Berger, M. Berlowski, M. Bertani, D. Bettoni, F. Bianchi, E. Bianco, A. Bortone, I. Boyko, R. A. Briere , et al. (637 additional authors not shown)

Abstract: We report a first study of the semileptonic decay $D^0\rightarrow K^-π^0μ^{+}ν_μ$ by analyzing an $e^+e^-$ annihilation data sample of $7.9~\mathrm{fb}^{-1}$ collected at the center-of-mass energy of 3.773 GeV with the BESIII detector. The absolute branching fraction of $D^0\to K^-π^0μ^{+}ν_μ$ is measured for the first time to be $(0.729 \pm 0.014_{\rm stat} \pm 0.011_{\rm syst})\%$. Based on an amplitude analysis, the $S\text{-}{\rm wave}$ contribution is determined to be $(5.76 \pm 0.35_{\rm stat} \pm 0.29_{\rm syst})\%$ of the total decay rate in addition to the dominant $K^{*}(892)^-$ component. The branching fraction of $D^0\to K^{*}(892)^-μ^+ν_μ$ is given to be $(2.062 \pm 0.039_{\rm stat} \pm 0.032_{\rm syst})\%$, which improves the precision of the world average by a factor of 5. Combining with the world average of ${\mathcal B}(D^0\to K^{*}(892)^-e^+ν_e)$, the ratio of the branching fractions obtained is $\frac{{\mathcal B}(D^0\to K^{*}(892)^-μ^+ν_μ)}{{\mathcal B}(D^0\to K^{*}(892)^-e^+ν_e)} = 0.96\pm0.08$, in agreement with lepton flavor universality. Furthermore, assuming single-pole dominance parameterization, the most precise hadronic form factor ratios for $D^0\to K^{*}(892)^{-} μ^+ν_μ$ are extracted to be $r_{V}=V(0)/A_1(0)=1.37 \pm 0.09_{\rm stat} \pm 0.03_{\rm syst}$ and $r_{2}=A_2(0)/A_1(0)=0.76 \pm 0.06_{\rm stat} \pm 0.02_{\rm syst}$.

Submitted 16 March, 2024; originally announced March 2024.

Comments: 9 pages, 3 figures

arXiv:2403.10754 [pdf, other]

doi 10.1093/mnras/stae762

CSST large-scale structure analysis pipeline: I. constructing reference mock galaxy redshift surveys

Authors: Yizhou Gu, Xiaohu Yang, Jiaxin Han, Yirong Wang, Qingyang Li, Zhenlin Tan, Wenkang Jiang, Yaru Wang, Jiaqi Wang, Antonios Katsianis, Xiaoju Xu, Haojie Xu, Wensheng Hong, Houjun Mo, Run Wen, Xianzhong Zheng, Feng Shi, Pengjie Zhang, Zhongxu Zhai, Chengze Liu, Wenting Wang, Ying Zu, Hong Guo, Youcai Zhang, Yi Lu , et al. (7 additional authors not shown)

Abstract: In this paper, we set out to construct a set of reference mock galaxy redshift surveys (MGRSs) for the future Chinese Space-station Survey Telescope (CSST) observation, where subsequent survey selection effects can be added and evaluated. This set of MGRSs is generated using the dark matter subhalos extracted from a high-resolution Jiutian $N$-body simulation of the standard $Λ$CDM cosmogony with $Ω_m=0.3111$, $Ω_Λ=0.6889$, and $σ_8=0.8102$. The simulation has a boxsize of $1~h^{-1} {\rm Gpc}$, and consists of $6144^3$ particles with mass resolution $3.723 \times 10^{8} h^{-1} M_\odot $. In order to take into account the effect of redshift evolution, we first use all 128 snapshots in the Jiutian simulation to generate a light-cone halo/subhalo catalog. Next, galaxy luminosities are assigned to the main and subhalo populations using the subhalo abundance matching (SHAM) method with the DESI $z$-band luminosity functions at different redshifts. Multi-band photometries, as well as images, are then assigned to each mock galaxy using a 3-dimensional parameter space nearest neighbor sampling of the DESI LS observational galaxies and groups. Finally, the CSST and DESI LS survey geometry and magnitude limit cuts are applied to generate the required MGRSs. As we have checked, this set of MGRSs can generally reproduce the observed galaxy luminosity/mass functions within 0.1 dex for galaxies with $L > 10^8 L_\odot$ (or $M_* > 10^{8.5} M_\odot$) and within the 1-$σ$ level for galaxies with $L < 10^8L_\odot$ (or $M_* < 10^{8.5} M_\odot$). Together with the CSST slitless spectra and redshifts for our DESI LS seed galaxies that are under construction, we will set out to test various slitless observational selection effects in subsequent probes.

Submitted 15 March, 2024; originally announced March 2024.

Comments: 13 pages, 9 figures, accepted for publication in MNRAS

arXiv:2403.10553 [pdf, other]

Learning to Watermark LLM-generated Text via Reinforcement Learning

Authors: Xiaojun Xu, Yuanshun Yao, Yang Liu

Abstract: We study how to watermark LLM outputs, i.e. embedding algorithmically detectable signals into LLM-generated text to track misuse. Unlike the current mainstream methods that work with a fixed LLM, we expand the watermark design space by including the LLM tuning stage in the watermark pipeline. While prior works focus on token-level watermark that embeds signals into the output, we design a model-level watermark that embeds signals into the LLM weights, and such signals can be detected by a paired detector. We propose a co-training framework based on reinforcement learning that iteratively (1) trains a detector to detect the generated watermarked text and (2) tunes the LLM to generate text easily detectable by the detector while keeping its normal utility. We empirically show that our watermarks are more accurate, robust, and adaptable (to new attacks). It also allows watermarked model open-sourcing. In addition, if used together with alignment, the extra overhead introduced is low - only training an extra reward model (i.e. our detector). We hope our work can bring more effort into studying a broader watermark design that is not limited to working with a fixed LLM. We open-source the code: https://github.com/xiaojunxu/learning-to-watermark-llm .

Submitted 12 March, 2024; originally announced March 2024.

arXiv:2403.10483 [pdf, ps, other]

Central limit theorems for the derivatives of self-intersection local time for $d$-dimensional Brownian motion

Authors: Xiaoyan Xu, Xianye Yu

Abstract: Let $\{B_t,t\geq0\}$ be a $d$-dimensional Brownian motion. We prove that the approximation of the higher derivative of renormalized self-intersection local time $$ \int_{0}^{1}\int_{0}^{s}\left(p^{(|k|)}_{d,ε}(B_{s}-B_{r})-E[p^{(|k|)}_{d,ε}(B_{s}-B_{r})]\right)drds, $$ where the multiindex $k=(k_{1},\cdots,k_{d})$, $p_{d,ε}^{(|k|)}(x_1,x_2,\cdots,x_d):=\partial^{k_1}_{x_1}\partial^{k_2}_{x_2}\cdots\partial^{k_d}_{x_d}p_{d,ε}(x_1,x_2,\cdots,x_d)$ and $p_{d,ε}(x)=\frac{1}{(2πε)^{d/2}}e^{-\frac{|x|^{2}}{2ε}}, x\in\mathbb{R}^d$, satisfies the central limit theorems when renormalized by $(\log\frac{1}ε)^{-1}$ in the case $d=2$, $|k|=1$ and by $ε^{\frac{d+|k|-3}{2}}$ in the case $d\geq 3$, $|k|\geq 1$, which gives a complete answer to the conjecture of Markowsky [In Séminaire de Probabilités XLIV (2012) 141-148 Springer]. We also prove that its $m$-th Wiener chaotic component satisfies the central limit theorems when renormalized by a multiplicative factor in different cases.

Submitted 15 March, 2024; originally announced March 2024.

arXiv:2403.10418 [pdf]

doi 10.1021/acs.analchem.3c05006

Versatile Capillary Cells for Handling Concentrated Samples in Analytical Ultracentrifugation

Authors: Quy Ong, Xufeng Xu, Francesco Stellacci

Abstract: In concentrated macromolecular dispersions, far-from-ideal intermolecular interactions determine the dispersion behaviors including phase transition, crystallization, and liquid-liquid phase separation. Here, we present a novel versatile capillary-cell design for analytical ultracentrifugation-sedimentation equilibrium (AUC-SE), ideal for studying samples at high concentrations. Current setups for such studies are difficult and unreliable to handle, leading to a low experimental success rate. The design presented here is easy to use, robust, and reusable for samples in both aqueous and organic solvents while requiring no special tools or chemical modification of AUC cells. The key and unique feature is the fabrication of liquid reservoirs directly on the bottom window of AUC cells, which can be easily realized by laser ablation or mechanical drilling. The channel length and optical path length are therefore tunable. The success rate for assembling this new cell is close to 100%. We demonstrate the practicality of this cell by studying: 1) the equation of state and second virial coefficients of concentrated gold nanoparticle dispersions in water and bovine serum albumin (BSA) as well as lysozyme solution in aqueous buffers, 2) the gelation phase transition of DNA and BSA solutions, and 3) liquid-liquid phase separation of concentrated BSA/polyethylene glycol (PEG) droplets.

Submitted 15 March, 2024; originally announced March 2024.

Comments: 19 pages, 6 figures

Journal ref: Analytical Chemistry(2024)96,6,2567

arXiv:2403.10299 [pdf, other]

A Multi-constraint and Multi-objective Allocation Model for Emergency Rescue in IoT Environment

Authors: Xinrun Xu, Zhanbiao Lian, Yurong Wu, Manying Lv, Zhiming Ding, Jian Yan, Shang Jiang

Abstract: Emergency relief operations are essential in disaster aftermaths, necessitating effective resource allocation to minimize negative impacts and maximize benefits. In prolonged crises or extensive disasters, a systematic, multi-cycle approach is key for timely and informed decision-making. Leveraging advancements in IoT and spatio-temporal data analytics, we've developed the Multi-Objective Shuffled Gray-Wolf Frog Leaping Model (MSGW-FLM). This multi-constraint, multi-objective resource allocation model has been rigorously tested against 28 diverse challenges, showing superior performance in comparison to established models such as NSGA-II, IBEA, and MOEA/D. MSGW-FLM's effectiveness is particularly notable in complex, multi-cycle emergency rescue scenarios, which involve numerous constraints and objectives. This model represents a significant step forward in optimizing resource distribution in emergency response situations.

Submitted 15 March, 2024; originally announced March 2024.

Comments: 5 pages, 5 figures, ISCAS 2024

arXiv:2403.10249 [pdf, other]

A Survey on Game Playing Agents and Large Models: Methods, Applications, and Challenges

Authors: Xinrun Xu, Yuxin Wang, Chaoyi Xu, Ziluo Ding, Jiechuan Jiang, Zhiming Ding, Börje F. Karlsson

Abstract: The swift evolution of Large-scale Models (LMs), either language-focused or multi-modal, has garnered extensive attention in both academia and industry. But despite the surge in interest in this rapidly evolving area, systematic reviews of their capabilities and potential in distinct impactful scenarios are scarce. This paper endeavours to help bridge this gap, offering a thorough examination of the current landscape of LM usage with regard to complex game playing scenarios and the challenges still open. Here, we seek to systematically review the existing architectures of LM-based Agents (LMAs) for games and summarize their commonalities, challenges, and any other insights. Furthermore, we present our perspective on promising future research avenues for the advancement of LMs in games. We hope to assist researchers in gaining a clear understanding of the field and to generate more interest in this highly impactful research direction. A corresponding resource, continuously updated, can be found in our GitHub repository.

Submitted 15 March, 2024; originally announced March 2024.

Comments: 13 pages, 3 figures

arXiv:2403.10066 [pdf, other]

Contrastive Pre-Training with Multi-View Fusion for No-Reference Point Cloud Quality Assessment

Authors: Ziyu Shan, Yujie Zhang, Qi Yang, Haichen Yang, Yiling Xu, Jenq-Neng Hwang, Xiaozhong Xu, Shan Liu

Abstract: No-reference point cloud quality assessment (NR-PCQA) aims to automatically evaluate the perceptual quality of distorted point clouds without available reference, and has achieved tremendous improvements due to the utilization of deep neural networks. However, learning-based NR-PCQA methods suffer from the scarcity of labeled data and usually perform suboptimally in terms of generalization. To solve the problem, we propose a novel contrastive pre-training framework tailored for PCQA (CoPA), which enables the pre-trained model to learn quality-aware representations from unlabeled data. To obtain anchors in the representation space, we project point clouds with different distortions into images and randomly mix their local patches to form mixed images with multiple distortions. Utilizing the generated anchors, we constrain the pre-training process via a quality-aware contrastive loss following the philosophy that perceptual quality is closely related to both content and distortion. Furthermore, in the model fine-tuning stage, we propose a semantic-guided multi-view fusion module to effectively integrate the features of projected images from multiple perspectives. Extensive experiments show that our method outperforms the state-of-the-art PCQA methods on popular benchmarks. Further investigations demonstrate that CoPA can also benefit existing learning-based PCQA models.

Submitted 26 March, 2024; v1 submitted 15 March, 2024; originally announced March 2024.

arXiv:2403.10010 [pdf, other]

doi 10.1103/PhysRevLett.132.131002

Measurements of All-Particle Energy Spectrum and Mean Logarithmic Mass of Cosmic Rays from 0.3 to 30 PeV with LHAASO-KM2A

Authors: The LHAASO Collaboration, Zhen Cao, F. Aharonian, Q. An, A. Axikegu, Y. X. Bai, Y. W. Bao, D. Bastieri, X. J. Bi, Y. J. Bi, J. T. Cai, Q. Cao, W. Y. Cao, Zhe Cao, J. Chang, J. F. Chang, A. M. Chen, E. S. Chen, Liang Chen, Lin Chen, Long Chen, M. J. Chen, M. L. Chen, Q. H. Chen, S. H. Chen , et al. (256 additional authors not shown)

Abstract: We present the measurements of all-particle energy spectrum and mean logarithmic mass of cosmic rays in the energy range of 0.3-30 PeV using data collected from LHAASO-KM2A between September 2021 and December 2022, which is based on a nearly composition-independent energy reconstruction method, achieving unprecedented accuracy. Our analysis reveals the position of the knee at $3.67 \pm 0.05 \pm 0.15$ PeV. Below the knee, the spectral index is found to be -$2.7413 \pm 0.0004 \pm 0.0050$, while above the knee, it is -$3.128 \pm 0.005 \pm 0.027$, with the sharpness of the transition measured with a statistical error of 2%. The mean logarithmic mass of cosmic rays is almost heavier than helium in the whole measured energy range. It decreases from 1.7 at 0.3 PeV to 1.3 at 3 PeV, representing a 24% decline following a power law with an index of -$0.1200 \pm 0.0003 \pm 0.0341$. This is equivalent to an increase in abundance of light components. Above the knee, the mean logarithmic mass exhibits a power law trend towards heavier components, which is the reverse of the behavior observed in the all-particle energy spectrum. Additionally, the knee position and the change in power-law index are approximately the same. These findings suggest that the knee observed in the all-particle spectrum corresponds to the knee of the light component, rather than the medium-heavy components.

Submitted 26 March, 2024; v1 submitted 15 March, 2024; originally announced March 2024.

Comments: 8 pages, 3 figures

Journal ref: Physical Review Letters 132, 131002 (2024)

arXiv:2403.10004 [pdf, other]

ST-LDM: A Universal Framework for Text-Grounded Object Generation in Real Images

Authors: Xiangtian Xue, Jiasong Wu, Youyong Kong, Lotfi Senhadji, Huazhong Shu

Abstract: We present a novel image editing scenario termed Text-grounded Object Generation (TOG), defined as generating a new object in the real image spatially conditioned by textual descriptions. Existing diffusion models exhibit limitations of spatial perception in complex real-world scenes, relying on additional modalities to enforce constraints, and TOG imposes heightened challenges on scene comprehension under the weak supervision of linguistic information. We propose a universal framework ST-LDM based on Swin-Transformer, which can be integrated into any latent diffusion model with training-free backward guidance. ST-LDM encompasses a global-perceptual autoencoder with adaptable compression scales and hierarchical visual features, in parallel with a deformable multimodal transformer to generate region-wise guidance for the subsequent denoising process. We transcend the limitation of traditional attention mechanisms that only focus on existing visual features by introducing deformable feature alignment to hierarchically refine spatial positioning fused with multi-scale visual and linguistic information. Extensive experiments demonstrate that our model enhances the localization of attention mechanisms while preserving the generative capabilities inherent to diffusion models.

Submitted 15 March, 2024; originally announced March 2024.

arXiv:2403.09783 [pdf, other]

Impact of the cosmic neutrino background on long-range force searches

Authors: Garv Chauhan, Xun-Jie Xu

Abstract: Light bosons can mediate long-range forces. We show that light bosonic mediators interacting with a background medium, in particular, with the cosmic neutrino background (C$ν$B), may induce medium-dependent masses which could effectively screen long-range forces from detection. This leads to profound implications for long-range force searches in e.g. the Eöt-Wash, MICROSCOPE, and lunar laser-ranging (LLR) experiments. For instance, we find that when the coupling of the mediator to neutrinos is above $3\times10^{-10}$ or $5\times10^{-13}$, bounds from LLR and experiments employing the Sun as an attractor, respectively, would be entirely eliminated. Larger values of the coupling can also substantially alleviate bounds from searches conducted at shorter distances.

Submitted 14 March, 2024; originally announced March 2024.

Comments: 18 pages, 2 figures

arXiv:2403.09718 [pdf]

Comprehensive Implementation of TextCNN for Enhanced Collaboration between Natural Language Processing and System Recommendation

Authors: Xiaonan Xu, Zheng Xu, Zhipeng Ling, Zhengyu Jin, ShuQian Du

Abstract: Natural Language Processing (NLP) is an important branch of artificial intelligence that studies how to enable computers to understand, process, and generate human language. Text classification is a fundamental task in NLP, which aims to classify text into different predefined categories. Text classification is the most basic and classic task in natural language processing, and most of the tasks in natural language processing can be regarded as classification tasks. In recent years, deep learning has achieved great success in many research fields, and today, it has also become a standard technology in the field of NLP, which is widely integrated into text classification tasks. Unlike numbers and images, text processing emphasizes fine-grained processing ability. Traditional text classification methods generally require preprocessing the input model's text data. Additionally, they also need to obtain good sample features through manual annotation and then use classical machine learning algorithms for classification. Therefore, this paper analyzes the application status of deep learning in the three core tasks of NLP (including text representation, word order modeling, and knowledge representation). This content explores the improvement and synergy achieved through natural language processing in the context of text classification, while also taking into account the challenges posed by adversarial techniques in text generation, text classification, and semantic parsing. An empirical study on text classification tasks demonstrates the effectiveness of interactive integration training, particularly in conjunction with TextCNN, highlighting the significance of these advancements in text classification augmentation and enhancement.

Submitted 12 March, 2024; originally announced March 2024.

arXiv:2403.09129 [pdf, other]

All-pay Auction Based Profit Maximization in End-to-End Computation Offloading System

Authors: Hai Xue, Yun Xia, Di Zhang, Honghua Wei, Xiaolong Xu

Abstract: Pricing is an important issue in mobile edge computing. How to appropriately determine the bid of an end user (EU) is an incentive factor for the edge cloud (EC) to offer service. In this letter, we propose an equilibrium pricing scheme based on the all-pay auction model in an end-to-end collaboration environment, wherein all EUs can acquire the service at a lower price than their own value of the required resource. In addition, we propose a set allocation algorithm to divide all the bidders into different sets according to the price, and the EUs in each set get the service, which averts the case of getting no service due to the low price. Extensive simulation results demonstrate that the proposed scheme can effectively maximize the total profit of the edge offloading system, and guarantee that all EUs can access the service.

Submitted 14 March, 2024; originally announced March 2024.

arXiv:2403.09128 [pdf, other]

Rethinking Referring Object Removal

Authors: Xiangtian Xue, Jiasong Wu, Youyong Kong, Lotfi Senhadji, Huazhong Shu

Abstract: Referring object removal refers to removing the specific object in an image referred by natural language expressions and filling the missing region with reasonable semantics. To address this task, we construct the ComCOCO, a synthetic dataset consisting of 136,495 referring expressions for 34,615 objects in 23,951 image pairs. Each pair contains an image with referring expressions and the ground truth after elimination. We further propose an end-to-end syntax-aware hybrid mapping network with an encoding-decoding structure. Linguistic features are hierarchically extracted at the syntactic level and fused in the downsampling process of visual features with multi-head attention. The feature-aligned pyramid network is leveraged to generate segmentation masks and replace internal pixels with region affinity learned from external semantics in high-level feature maps. Extensive experiments demonstrate that our model outperforms diffusion models and two-stage methods which process the segmentation and inpainting task separately by a significant margin.

Submitted 14 March, 2024; originally announced March 2024.

arXiv:2403.09051 [pdf]

doi 10.1364/OE.519235

Dual-polarization RF Channelizer Based on Kerr Soliton Microcomb Sources

Authors: Xingyuan Xu, David J. Moss

Abstract: We report a dual-polarization radio frequency (RF) channelizer based on microcombs. With the tailored mismatch between the FSRs of the active and passive MRRs, wideband RF spectra can be channelized into multiple segments featuring digital compatible bandwidths via the Vernier effect. Due to the use of dual polarization states, the number of channelized spectral segments, and thus the RF instantaneous bandwidth (with a certain spectral resolution), can be doubled. In our experiments, we used 20 microcomb lines with 49 GHz FSR to achieve 20 channels for each polarization, with high RF spectra slicing resolutions at 144 MHz (TE) and 163 MHz (TM), respectively; achieving an instantaneous RF operation bandwidth of 3.1 GHz (TE) and 2.2 GHz (TM). Our approach paves the path towards monolithically integrated photonic RF receivers (the key components active and passive MRRs are all fabricated on the same platform) with reduced complexity, size, and unprecedented performance, which is important for wide RF applications with digital compatible signal detection.

Submitted 13 March, 2024; originally announced March 2024.

Comments: 7 pages 4 figures, 82 references

Journal ref: Optics Express Volume 32, No. 7, page 11281 (2024)

arXiv:2403.08008 [pdf, other]

Distribution and Properties of Molecular Gas Toward the Monoceros OB1 Region

Authors: Zi Zhuang, Yang Su, Shiyu Zhang, Xuepeng Chen, Qing-Zeng Yan, Haoran Feng, Li Sun, Xiaoyun Xu, Yan Sun, Xin Zhou, Hongchi Wang, Ji Yang

Abstract: We perform a comprehensive CO study toward the Monoceros OB1 (Mon OB1) region based on the MWISP survey at an angular resolution of about $50''$. The high-sensitivity data, together with the high dynamic range, shows that molecular gas in the $\rm 8^{\circ}\times4^{\circ}$ region displays complicated hierarchical structures and various morphology (e.g., filamentary, cavity-like, shell-like, and other irregular structures). Based on Gaussian decomposition and clustering for $\mathrm{^{13}CO}$ data, a total of 263 $\mathrm{^{13}CO}$ structures are identified in the whole region, and 88% of raw data flux is recovered. The dense gas with relatively high column density from the integrated CO emission is mainly concentrated in the region where multiple $\rm ^{13}CO$ structures are overlapped. Combining the results of 32 large $\mathrm{^{13}CO}$ structures with distances from Gaia DR3, we estimate an average distance of $\rm 729^{+45}_{-45}~pc$ for the GMC complex. The total mass of the GMC Complex traced by $\mathrm{^{12}CO}$, $\mathrm{^{13}CO}$, and $\mathrm{C^{18}O}$ are $1.1\times10^5~M_\odot$, $4.3\times10^4~M_\odot$, and $8.4\times10^3~M_\odot$, respectively. The dense gas fraction shows a clear difference between Mon OB1 GMC East (12.4%) and Mon OB1 GMC West (3.3%). Our results show that the dense gas environment is closely linked to the nearby star-forming regions. On the other hand, star-forming activities have a great influence on the physical properties of the surrounding molecular gas (e.g., greater velocity dispersion, higher temperatures, and more complex velocity structures, etc.). We also discuss the distribution/kinematics of molecular gas associated with nearby star-forming activities.

Submitted 14 May, 2024; v1 submitted 12 March, 2024; originally announced March 2024.

Comments: 28 pages, 17 figures, match to the version of APJ, 966, 202. The dataset has been released on https://doi.org/10.57760/sciencedb.17451

arXiv:2403.07997 [pdf, other]

doi 10.1145/3613904.3642158

Fast-Forward Reality: Authoring Error-Free Context-Aware Policies with Real-Time Unit Tests in Extended Reality

Authors: Xun Qian, Tianyi Wang, Xuhai Xu, Tanya R Jonker, Kashyap Todi

Abstract: Advances in ubiquitous computing have enabled end-user authoring of context-aware policies (CAPs) that control smart devices based on specific contexts of the user and environment. However, authoring CAPs accurately and avoiding run-time errors is challenging for end-users as it is difficult to foresee CAP behaviors under complex real-world conditions. We propose Fast-Forward Reality, an Extended Reality (XR) based authoring workflow that enables end-users to iteratively author and refine CAPs by validating their behaviors via simulated unit test cases. We develop a computational approach to automatically generate test cases based on the authored CAP and the user's context history. Our system delivers each test case with immersive visualizations in XR, facilitating users to verify the CAP behavior and identify necessary refinements. We evaluated Fast-Forward Reality in a user study (N=12). Our authoring and validation process improved the accuracy of CAPs and the users provided positive feedback on the system usability.

Submitted 12 March, 2024; originally announced March 2024.

Comments: 17 pages, 7 figures, ACM CHI 2024 Full Paper

ACM Class: H.5.2

Showing 301–350 of 5,369 results for author: Xu, X