Search | arXiv e-print repository

Spin-Orbit Locked Coupling of Localized Microwaves to Magnons

Authors: Chengyuan Cai, Zubiao Zhang, Ji Zou, Gerrit E. W. Bauer, Tao Yu

Abstract: We address the photonic spin-orbit coupling known from nano-optics and plasmonics in the microwave regime. The transverse spin $\mathbf{S}$ and momentum $\mathbf{q}$ of microwaves emitted by an excited magnetic dot are locked by $\mathbf{q}\cdot\mathbf{S}=0$ with a fixed chirality $\hat{\mathbf{n}}\cdot(\hat{\mathbf{S}}\times\hat{\mathbf{q}})=1$ when evanescent along $\hat{\mathbf{n}}\perp \mathbf{q}$. This field excites magnons in a nearby magnetic film in the form of directional beams that rotate with the magnetization direction. The exchange of these magnons through a magnetic substrate leads to a highly tunable strong coupling and entanglement between two distant nanomagnets.

Submitted 4 June, 2024; originally announced June 2024.

Comments: 7 pages, 4 figures

arXiv:2406.02374 [pdf]

Direct measurement of the viscocapillary lift force near a liquid interface

Authors: Hao Zhang, Zaicheng Zhang, Aditya Jha, Yacine Amarouchene, Thomas Salez, Thomas Guérin, Chaouqi Misbah, Abdelhamid Maali

Abstract: Lift force of viscous origin is widespread across disciplines, from mechanics to biology. Here, we present the first direct measurement of the lift force acting on a particle moving in a viscous fluid along the liquid interface that separates two liquids. The force arises from the coupling between the viscous flow induced by the particle motion and the capillary deformation of the interface. The measurements show that the lift force increases as the distance between the sphere and the interface decreases, reaching saturation at small distances. The experimental results are in good agreement with the model and numerical calculation developed within the framework of the soft lubrication theory.

Submitted 4 June, 2024; originally announced June 2024.

arXiv:2406.02370 [pdf, other]

Query-based Semantic Gaussian Field for Scene Representation in Reinforcement Learning

Authors: Jiaxu Wang, Ziyi Zhang, Qiang Zhang, Jia Li, Jingkai Sun, Mingyuan Sun, Junhao He, Renjing Xu

Abstract: Latent scene representation plays a significant role in training reinforcement learning (RL) agents. To obtain good latent vectors describing the scenes, recent works incorporate the 3D-aware latent-conditioned NeRF pipeline into scene representation learning. However, these NeRF-related methods struggle to perceive 3D structural information due to the inefficient dense sampling in volumetric rendering. Moreover, their scene representation vectors lack fine-grained semantic information because they treat free and occupied spaces equally. Both issues can degrade the performance of downstream RL tasks. To address these challenges, we propose a novel framework that, for the first time, adopts the efficient 3D Gaussian Splatting (3DGS) to learn 3D scene representation. In brief, we present the Query-based Generalizable 3DGS to bridge the 3DGS technique and scene representations with more geometrical awareness than those in NeRFs. Moreover, we present the Hierarchical Semantics Encoding to ground the fine-grained semantic features to 3D Gaussians and further distill them into the scene representation vectors. We conduct extensive experiments on two RL platforms, Maniskill2 and Robomimic, across 10 different tasks. The results show that our method outperforms the other 5 baselines by a large margin. We achieve the best success rates on 8 tasks and the second-best on the other two tasks.

Submitted 9 June, 2024; v1 submitted 4 June, 2024; originally announced June 2024.

arXiv:2406.02159 [pdf, other]

Quantum Statistical Effects on Warm Dark Matter and the Mass Constraint from the Cosmic Large Scale Structure

Authors: Zhijian Zhang, Weikang Lin

Abstract: The suppression of small-scale matter power spectrum is a distinct feature of Warm Dark Matter (WDM), which permits a constraint on the WDM mass from galaxy surveys. In the thermal relic WDM scenario, quantum statistical effects are not manifest. In a unified framework, we investigate the quantum statistical effects for a fermion case with a degenerate pressure and a boson case with a Bose-Einstein condensation (BEC). Compared to the thermal relic case, the degenerate fermion case only slightly lowers the mass bound while the boson case with a high initial BEC fraction ($\gtrsim90\%$) significantly lowers it. On the other hand, the BEC fraction drops during the relativistic-to-nonrelativistic transition and completely disappears if the initial fraction is below $\sim64\%$. Given the rising interest in resolving the late-time galaxy-scale problems with boson condensation, a question is posed on how a high initial BEC fraction can be dynamically created so that a DM condensed component remains today.

Submitted 4 June, 2024; originally announced June 2024.

Comments: 10 pages, 4 figures, comments welcome!

arXiv:2406.02148 [pdf, other]

Synergetic Event Understanding: A Collaborative Approach to Cross-Document Event Coreference Resolution with Large Language Models

Authors: Qingkai Min, Qipeng Guo, Xiangkun Hu, Songfang Huang, Zheng Zhang, Yue Zhang

Abstract: Cross-document event coreference resolution (CDECR) involves clustering event mentions across multiple documents that refer to the same real-world events. Existing approaches utilize fine-tuning of small language models (SLMs) like BERT to address the compatibility among the contexts of event mentions. However, due to the complexity and diversity of contexts, these models are prone to learning simple co-occurrences. Recently, large language models (LLMs) like ChatGPT have demonstrated impressive contextual understanding, yet they encounter challenges in adapting to specific information extraction (IE) tasks. In this paper, we propose a collaborative approach for CDECR, leveraging the capabilities of both a universally capable LLM and a task-specific SLM. The collaborative strategy begins with the LLM accurately and comprehensively summarizing events through prompting. Then, the SLM refines its learning of event representations based on these insights during fine-tuning. Experimental results demonstrate that our approach surpasses the performance of both the large and small language models individually, forming a complementary advantage. Across various datasets, our approach achieves state-of-the-art performance, underscoring its effectiveness in diverse scenarios.

Submitted 4 June, 2024; originally announced June 2024.

Comments: Accepted to ACL-24 Main

arXiv:2406.02119 [pdf, other]

A novel model reduction method to solve inverse problems of parabolic type

Authors: Wenlong Zhang, Zhiwen Zhang

Abstract: In this paper, we propose novel proper orthogonal decomposition (POD)-based model reduction methods that effectively address the issue of inverse crime in solving parabolic inverse problems. Both the inverse initial value problems and inverse source problems are studied. By leveraging the inherent low-dimensional structures present in the data, our approach enables a reduction in the forward model complexity without compromising the accuracy of the inverse problem solution. In addition, we provide a convergence analysis of the proposed methods for solving parabolic inverse problems. Through extensive experimentation and comparative analysis, we demonstrate the effectiveness of our method in overcoming inverse crime and achieving improved inverse problem solutions. The proposed POD model reduction method offers a promising direction for improving the reliability and applicability of inverse problem-solving techniques in various domains.

Submitted 4 June, 2024; originally announced June 2024.

arXiv:2406.01977 [pdf, other]

What Improves the Generalization of Graph Transformers? A Theoretical Dive into the Self-attention and Positional Encoding

Authors: Hongkang Li, Meng Wang, Tengfei Ma, Sijia Liu, Zaixi Zhang, Pin-Yu Chen

Abstract: Graph Transformers, which incorporate self-attention and positional encoding, have recently emerged as a powerful architecture for various graph learning tasks. Despite their impressive performance, the complex non-convex interactions across layers and the recursive graph structure have made it challenging to establish a theoretical foundation for learning and generalization. This study introduces the first theoretical investigation of a shallow Graph Transformer for semi-supervised node classification, comprising a self-attention layer with relative positional encoding and a two-layer perceptron. Focusing on a graph data model with discriminative nodes that determine node labels and non-discriminative nodes that are class-irrelevant, we characterize the sample complexity required to achieve a desirable generalization error by training with stochastic gradient descent (SGD). This paper provides the quantitative characterization of the sample complexity and number of iterations for convergence dependent on the fraction of discriminative nodes, the dominant patterns, and the initial model errors. Furthermore, we demonstrate that self-attention and positional encoding enhance generalization by making the attention map sparse and promoting the core neighborhood during training, which explains the superior feature representation of Graph Transformers. Our theoretical results are supported by empirical experiments on synthetic and real-world benchmarks.

Submitted 4 June, 2024; originally announced June 2024.

Comments: ICML 2024

arXiv:2406.01934 [pdf, other]

Optimal Transport Guided Correlation Assignment for Multimodal Entity Linking

Authors: Zefeng Zhang, Jiawei Sheng, Chuang Zhang, Yunzhi Liang, Wenyuan Zhang, Siqi Wang, Tingwen Liu

Abstract: Multimodal Entity Linking (MEL) aims to link ambiguous mentions in multimodal contexts to entities in a multimodal knowledge graph. A pivotal challenge is to fully leverage multi-element correlations between mentions and entities to bridge the modality gap and enable fine-grained semantic matching. Existing methods attempt several local correlative mechanisms, relying heavily on the automatically learned attention weights, which may over-concentrate on partial correlations. To mitigate this issue, we formulate the correlation assignment problem as an optimal transport (OT) problem, and propose a novel MEL framework, namely OT-MEL, with OT-guided correlation assignment. Thereby, we exploit the correlation between multimodal features to enhance multimodal fusion, and the correlation between mentions and entities to enhance fine-grained matching. To accelerate model prediction, we further leverage knowledge distillation to transfer OT assignment knowledge to the attention mechanism. Experimental results show that our model significantly outperforms previous state-of-the-art baselines and confirm the effectiveness of the OT-guided correlation assignment.
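To make the idea of casting correlation assignment as an optimal transport problem concrete, here is a minimal sketch of entropic-regularized OT solved with Sinkhorn iterations. It is a generic illustration, not the OT-MEL formulation: the cost matrix, the marginals `a` and `b`, the regularization strength `eps`, and the function name `sinkhorn` are all assumptions made up for this example.

```python
import math

def sinkhorn(cost, a, b, eps=0.5, n_iters=500):
    """Entropic-regularized OT plan between histograms a and b (illustrative)."""
    n, m = len(a), len(b)
    # Gibbs kernel K_ij = exp(-C_ij / eps)
    K = [[math.exp(-cost[i][j] / eps) for j in range(m)] for i in range(n)]
    u = [1.0] * n
    v = [1.0] * m
    for _ in range(n_iters):
        # Rescale rows to match a, then columns to match b
        u = [a[i] / sum(K[i][j] * v[j] for j in range(m)) for i in range(n)]
        v = [b[j] / sum(K[i][j] * u[i] for i in range(n)) for j in range(m)]
    # Transport plan P = diag(u) K diag(v)
    return [[u[i] * K[i][j] * v[j] for j in range(m)] for i in range(n)]

# Hypothetical costs between 2 mention elements (rows) and 3 entity elements (columns)
cost = [[0.1, 0.9, 0.8],
        [0.7, 0.2, 0.6]]
a = [0.5, 0.5]        # assumed mass on mention elements
b = [0.4, 0.4, 0.2]   # assumed mass on entity elements

plan = sinkhorn(cost, a, b)
row_sums = [sum(row) for row in plan]
col_sums = [sum(col) for col in zip(*plan)]
```

Unlike independently normalized attention weights, the plan must satisfy both marginals at once, so mass cannot pile onto a few pairs without being accounted for elsewhere.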

Submitted 5 June, 2024; v1 submitted 3 June, 2024; originally announced June 2024.

Comments: Findings of ACL 2024

arXiv:2406.01931 [pdf, other]

Dishonesty in Helpful and Harmless Alignment

Authors: Youcheng Huang, Jingkun Tang, Duanyu Feng, Zheng Zhang, Wenqiang Lei, Jiancheng Lv, Anthony G. Cohn

Abstract: People tell lies when seeking rewards. Large language models (LLMs) are aligned to human values with reinforcement learning where they get rewards if they satisfy human preference. We find that this also induces dishonesty in helpful and harmless alignment where LLMs tell lies in generating harmless responses. Using the latest interpreting tools, we detect dishonesty, show how LLMs can be harmful if their honesty is increased, and analyze such conflicts at the parameter level. Given these preliminaries and the hypothesis that reward-seeking stimulates dishonesty, we theoretically show that the dishonesty can in turn decrease the alignment performance, and we augment reward-seeking alignment with representation regularization. Extensive results, including GPT-4 annotated win-rates, perplexities, and case studies, demonstrate that we can train more honest, helpful, and harmless LLMs. We will open-source all our code and results upon this paper's acceptance.

Submitted 5 June, 2024; v1 submitted 3 June, 2024; originally announced June 2024.

arXiv:2406.01916 [pdf, other]

FastLGS: Speeding up Language Embedded Gaussians with Feature Grid Mapping

Authors: Yuzhou Ji, He Zhu, Junshu Tang, Wuyi Liu, Zhizhong Zhang, Yuan Xie, Lizhuang Ma, Xin Tan

Abstract: The semantically interactive radiance field has always been an appealing task for its potential to facilitate user-friendly and automated real-world 3D scene understanding applications. However, it is a challenging task to achieve high quality, efficiency and zero-shot ability at the same time with semantics in radiance fields. In this work, we present FastLGS, an approach that supports real-time open-vocabulary query within 3D Gaussian Splatting (3DGS) under high resolution. We propose the semantic feature grid to save multi-view CLIP features which are extracted based on Segment Anything Model (SAM) masks, and map the grids to low dimensional features for semantic field training through 3DGS. Once trained, we can restore pixel-aligned CLIP embeddings through feature grids from rendered features for open-vocabulary queries. Comparisons with other state-of-the-art methods prove that FastLGS can achieve the first place performance concerning both speed and accuracy, where FastLGS is 98x faster than LERF and 4x faster than LangSplat. Meanwhile, experiments show that FastLGS is adaptive and compatible with many downstream tasks, such as 3D segmentation and 3D object inpainting, which can be easily applied to other 3D manipulation systems.

Submitted 3 June, 2024; originally announced June 2024.

arXiv:2406.01555 [pdf, other]

Towards Flexible Interactive Reflection Removal with Human Guidance

Authors: Xiao Chen, Xudong Jiang, Yunkang Tao, Zhen Lei, Qing Li, Chenyang Lei, Zhaoxiang Zhang

Abstract: Single image reflection removal is inherently ambiguous, as both the reflection and transmission components requiring separation may follow natural image statistics. Existing methods attempt to address the issue by using various types of low-level and physics-based cues as sources of reflection signals. However, these cues are not universally applicable, since they are only observable in specific capture scenarios. This leads to a significant performance drop when test images do not align with their assumptions. In this paper, we aim to explore a novel flexible interactive reflection removal approach that leverages various forms of sparse human guidance, such as points and bounding boxes, as an auxiliary high-level prior to achieve robust reflection removal. However, incorporating the raw user guidance naively into the existing reflection removal network does not result in performance gains. To this end, we innovatively transform raw user input into a unified form -- reflection masks using an Interactive Segmentation Foundation Model. Such a design absorbs the quintessence of the foundational segmentation model and flexible human guidance, thereby mitigating the challenges of reflection separations. Furthermore, to fully utilize user guidance and reduce user annotation costs, we design a mask-guided reflection removal network, comprising our proposed self-adaptive prompt block. This block adaptively incorporates user guidance as anchors and refines transmission features via cross-attention mechanisms. Extensive results on real-world images validate that our method demonstrates state-of-the-art performance on various datasets with the help of flexible and sparse user guidance. Our code and dataset will be publicly available at https://github.com/ShawnChenn/FlexibleReflectionRemoval.

Submitted 3 June, 2024; originally announced June 2024.

arXiv:2406.01421 [pdf]

doi 10.14627/537752069

Problematizing AI Omnipresence in Landscape Architecture

Authors: Phillip Fernberg, Zihao Zhang

Abstract: This position paper argues for, and offers, a critical lens through which to examine the current AI frenzy in the landscape architecture profession. In it, the authors propose five archetypes or mental modes that landscape architects might inhabit when thinking about AI. Rather than limiting judgments of AI use to a single axis of acceleration, these archetypes and corresponding narratives exist along a relational spectrum and are permeable, allowing LAs to take on and switch between them according to context. We model these relationships between the archetypes and their contributions to AI advancement using a causal loop diagram (CLD), and with those interactions argue that more nuanced ways of approaching AI might also open new modes of practice in the new digital economy.

Submitted 3 June, 2024; originally announced June 2024.

Journal ref: Journal of Digital Landscape Architecture, 2024

arXiv:2406.01416 [pdf, other]

Adapting Conformal Prediction to Distribution Shifts Without Labels

Authors: Kevin Kasa, Zhiyu Zhang, Heng Yang, Graham W. Taylor

Abstract: Conformal prediction (CP) enables machine learning models to output prediction sets with guaranteed coverage rate, assuming exchangeable data. Unfortunately, the exchangeability assumption is frequently violated due to distribution shifts in practice, and the challenge is often compounded by the lack of ground truth labels at test time. Focusing on classification in this paper, our goal is to improve the quality of CP-generated prediction sets using only unlabeled data from the test domain. This is achieved by two new methods called ECP and EACP, that adjust the score function in CP according to the base model's uncertainty on the unlabeled test data. Through extensive experiments on a number of large-scale datasets and neural network architectures, we show that our methods provide consistent improvement over existing baselines and nearly match the performance of supervised algorithms.
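For readers unfamiliar with the baseline being adapted, the following is a minimal sketch of standard split conformal prediction for classification, which relies on the exchangeability assumption mentioned above. It does not implement ECP or EACP; the score function (one minus the probability of the true class), the calibration data, and all function names are assumptions chosen for illustration.

```python
import math

def nonconformity(probs, label):
    """Score of a labeled example: 1 minus the predicted probability of its label."""
    return 1.0 - probs[label]

def conformal_quantile(scores, alpha):
    """Finite-sample-corrected (1 - alpha) quantile of the calibration scores."""
    n = len(scores)
    k = math.ceil((n + 1) * (1 - alpha))
    if k > n:
        return float("inf")  # too few calibration points: every label is kept
    return sorted(scores)[k - 1]

def prediction_set(probs, qhat):
    """All labels whose score does not exceed the calibrated threshold."""
    return [label for label, p in enumerate(probs) if 1.0 - p <= qhat]

# Hypothetical calibration set: (predicted class probabilities, true label)
calibration = [
    ([0.90, 0.05, 0.05], 0), ([0.10, 0.80, 0.10], 1), ([0.20, 0.10, 0.70], 2),
    ([0.60, 0.30, 0.10], 0), ([0.03, 0.95, 0.02], 1), ([0.10, 0.05, 0.85], 2),
    ([0.50, 0.30, 0.20], 0), ([0.35, 0.40, 0.25], 1), ([0.15, 0.10, 0.75], 2),
]
scores = [nonconformity(p, y) for p, y in calibration]
qhat = conformal_quantile(scores, alpha=0.2)

confident = prediction_set([0.6, 0.3, 0.1], qhat)
ambiguous = prediction_set([0.5, 0.5, 0.0], qhat)
```

With these made-up numbers the threshold is 0.5, so the confident input yields a single label and the ambiguous one yields two. Under distribution shift the threshold calibrated on source data may no longer give the target coverage, which is the gap the abstract's label-free methods aim to close.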

Submitted 3 June, 2024; originally announced June 2024.

arXiv:2406.01349 [pdf, other]

Unleashing Generalization of End-to-End Autonomous Driving with Controllable Long Video Generation

Authors: Enhui Ma, Lijun Zhou, Tao Tang, Zhan Zhang, Dong Han, Junpeng Jiang, Kun Zhan, Peng Jia, Xianpeng Lang, Haiyang Sun, Di Lin, Kaicheng Yu

Abstract: Using generative models to synthesize new data has become a de-facto standard in autonomous driving to address the data scarcity issue. Though existing approaches are able to boost perception models, we discover that these approaches fail to improve the performance of planning of end-to-end autonomous driving models as the generated videos are usually less than 8 frames and the spatial and temporal inconsistencies are not negligible. To this end, we propose Delphi, a novel diffusion-based long video generation method with a shared noise modeling mechanism across the multi-views to increase spatial consistency, and a feature-aligned module to achieve both precise controllability and temporal consistency. Our method can generate up to 40 frames of video without loss of consistency, which is about 5 times longer compared with state-of-the-art methods. Instead of randomly generating new data, we further design a sampling policy to let Delphi generate new data that are similar to those failure cases to improve the sample efficiency. This is achieved by building a failure-case driven framework with the help of pre-trained visual language models. Our extensive experiments demonstrate that Delphi generates higher-quality long videos, surpassing previous state-of-the-art methods. Consequently, by generating data amounting to only 4% of the training dataset size, our framework goes beyond perception and prediction tasks and, for the first time to the best of our knowledge, boosts the planning performance of the end-to-end autonomous driving model by a margin of 25%.

Submitted 6 June, 2024; v1 submitted 3 June, 2024; originally announced June 2024.

Comments: Project Page: https://westlake-autolab.github.io/delphi.github.io/, 8 figures

arXiv:2406.01332 [pdf, ps, other]

Measurements of the branching fractions of semileptonic $D^{+}_s$ decays via $e^+e^-\to D_s^{*+}D_s^{*-}$

Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, O. Afedulidis, X. C. Ai, R. Aliberti, A. Amoroso, Q. An, Y. Bai, O. Bakina, I. Balossino, Y. Ban, H. -R. Bao, V. Batozskaya, K. Begzsuren, N. Berger, M. Berlowski, M. Bertani, D. Bettoni, F. Bianchi, E. Bianco, A. Bortone, I. Boyko, R. A. Briere, et al. (638 additional authors not shown)

Abstract: We measure the absolute branching fractions of semileptonic $D^+_s$ decays via the $e^+e^-\to D_s^{*+}D_s^{*-}$ process using $e^+e^-$ collision data corresponding to an integrated luminosity of $10.64~\mathrm{fb}^{-1}$ collected by the BESIII detector at center-of-mass energies between 4.237 and 4.699 GeV. The branching fractions are ${\mathcal B}(D_s^+\to \eta e^+\nu_e)=(2.35\pm0.11_{\rm stat}\pm 0.10_{\rm syst})\%,$ ${\mathcal B}(D_s^+\to \eta^\prime e^+\nu_e)=(0.82\pm0.09_{\rm stat}\pm 0.04_{\rm syst})\%,$ ${\mathcal B}(D_s^+\to \phi e^+\nu_e)=(2.21\pm0.16_{\rm stat}\pm 0.11_{\rm syst})\%,$ ${\mathcal B}(D_s^+\to f_0(980) e^+\nu_e,f_0(980)\to\pi^+\pi^-)=(0.15\pm0.02_{\rm stat}\pm 0.01_{\rm syst})\%,$ ${\mathcal B}(D_s^+\to K^0 e^+\nu_e)=(0.24\pm0.04_{\rm stat}\pm 0.01_{\rm syst})\%,$ and ${\mathcal B}(D_s^+\to K^{*0} e^+\nu_e)=(0.19\pm0.03_{\rm stat}\pm 0.01_{\rm syst})\%.$ These results are consistent with those measured via the $e^+e^-\to D_s^{*\pm}D_s^{\mp}$ process by BESIII and CLEO. The hadronic transition form factors $D^+_s\to \eta e^+\nu_e$, $D^+_s\to \eta^\prime e^+\nu_e$, and $D^+_s\to K^0 e^+\nu_e$ at four-momentum transfer squared $q^2 = 0$ are determined to be $f^\eta_+(0) = 0.482 \pm 0.011_{\rm stat} \pm 0.009_{\rm syst}\pm0.004_{\rm input},$ $f^{\eta^{\prime}}_+(0) = 0.562 \pm 0.031_{\rm stat} \pm 0.014_{\rm syst}\pm0.003_{\rm input},$ and $f^{K^0}_+(0) = 0.624 \pm 0.052_{\rm stat} \pm 0.013_{\rm syst}\pm0.002_{\rm input}.$

Submitted 4 June, 2024; v1 submitted 3 June, 2024; originally announced June 2024.

Comments: 14 pages, 3 figures

arXiv:2406.01234 [pdf, other]

Achieving Tractable Minimax Optimal Regret in Average Reward MDPs

Authors: Victor Boone, Zihan Zhang

Abstract: In recent years, significant attention has been directed towards learning average-reward Markov Decision Processes (MDPs). However, existing algorithms either suffer from sub-optimal regret guarantees or computational inefficiencies. In this paper, we present the first tractable algorithm with minimax optimal regret of $\widetilde{\mathrm{O}}(\sqrt{\mathrm{sp}(h^*) S A T})$, where $\mathrm{sp}(h^*)$ is the span of the optimal bias function $h^*$, $S \times A$ is the size of the state-action space and $T$ the number of learning steps. Remarkably, our algorithm does not require prior information on $\mathrm{sp}(h^*)$. Our algorithm relies on a novel subroutine, Projected Mitigated Extended Value Iteration (PMEVI), to compute bias-constrained optimal policies efficiently. This subroutine can be applied to various previous algorithms to improve regret bounds.

Submitted 3 June, 2024; originally announced June 2024.

arXiv:2406.01154 [pdf, other]

UniUSNet: A Promptable Framework for Universal Ultrasound Disease Prediction and Tissue Segmentation

Authors: Zehui Lin, Zhuoneng Zhang, Xindi Hu, Zhifan Gao, Xin Yang, Yue Sun, Dong Ni, Tao Tan

Abstract: Ultrasound is a widely used imaging modality in clinical practice due to its low cost, portability, and safety. Current research in general AI for healthcare focuses on large language models and general segmentation models, with insufficient attention to solutions addressing both disease prediction and tissue segmentation. In this study, we propose a novel universal framework for ultrasound, namely UniUSNet, which is a promptable framework for ultrasound image classification and segmentation. The universality of this model is derived from its versatility across various aspects. It handles any ultrasound nature, any anatomical position, and any input type, and excels not only in segmentation tasks but also in classification tasks. We introduce a novel module that incorporates this information as a prompt and seamlessly embeds it within the model's learning process. To train and validate our proposed model, we curated a comprehensive ultrasound dataset from publicly accessible sources, encompassing up to 7 distinct anatomical positions with over 9.7K annotations. Experimental results demonstrate that our model achieves performance comparable to state-of-the-art models, and surpasses both a model trained on a single dataset and an ablated version of the network lacking prompt guidance. Additionally, we conducted zero-shot and fine-tuning experiments on new datasets, which proved that our model possesses strong generalization capabilities and can be effectively adapted to new data at low cost through its adapter module. We will continuously expand the dataset and optimize the task-specific prompting mechanism towards universality in medical ultrasound. Model weights, data processing workflows, and code will be open source to the public (https://github.com/Zehui-Lin/UniUSNet).

Submitted 20 June, 2024; v1 submitted 3 June, 2024; originally announced June 2024.

arXiv:2406.01080 [pdf, other]

No Vandalism: Privacy-Preserving and Byzantine-Robust Federated Learning

Authors: Zhibo Xing, Zijian Zhang, Zi'ang Zhang, Jiamou Liu, Liehuang Zhu, Giovanni Russello

Abstract: Federated learning allows several clients to train one machine learning model jointly without sharing private data, providing privacy protection. However, traditional federated learning is vulnerable to poisoning attacks, which can not only decrease the model performance, but also implant malicious backdoors. In addition, direct submission of local model parameters can also lead to the privacy leakage of the training dataset. In this paper, we aim to build a privacy-preserving and Byzantine-robust federated learning scheme to provide an environment with no vandalism (NoV) against attacks from malicious participants. Specifically, we construct a model filter for poisoned local models, protecting the global model from data and model poisoning attacks. This model filter combines zero-knowledge proofs to provide further privacy protection. Then, we adopt secret sharing to provide verifiable secure aggregation, removing malicious clients that disrupt the aggregation process. Our formal analysis proves that NoV can protect data privacy and weed out Byzantine attackers. Our experiments illustrate that NoV can effectively address data and model poisoning attacks, including PGD, and outperforms other related schemes.

Submitted 3 June, 2024; originally announced June 2024.

arXiv:2406.01007 [pdf, other]

Measurement of Electron Antineutrino Oscillation Amplitude and Frequency via Neutron Capture on Hydrogen at Daya Bay

Authors: Daya Bay collaboration, F. P. An, W. D. Bai, A. B. Balantekin, M. Bishai, S. Blyth, G. F. Cao, J. Cao, J. F. Chang, Y. Chang, H. S. Chen, H. Y. Chen, S. M. Chen, Y. Chen, Y. X. Chen, Z. Y. Chen, J. Cheng, J. Cheng, Y. -C. Cheng, Z. K. Cheng, J. J. Cherwinka, M. C. Chu, J. P. Cummings, O. Dalager, F. S. Deng, et al. (177 additional authors not shown)

Abstract: This Letter reports the first measurement of the oscillation amplitude and frequency of reactor antineutrinos at Daya Bay via neutron capture on hydrogen using 1958 days of data. With over 3.6 million signal candidates, an optimized candidate selection, improved treatment of backgrounds and efficiencies, refined energy calibration, and an energy response model for the capture-on-hydrogen sensitive region, the relative $\overline{ν}_{e}$ rates and energy spectra variation among the near and far detectors gives $\mathrm{sin}^22θ_{13} = 0.0759_{-0.0049}^{+0.0050}$ and $Δm^2_{32} = (2.72^{+0.14}_{-0.15})\times10^{-3}$ eV$^2$ assuming the normal neutrino mass ordering, and $Δm^2_{32} = (-2.83^{+0.15}_{-0.14})\times10^{-3}$ eV$^2$ for the inverted neutrino mass ordering. This estimate of $\sin^2 2θ_{13}$ is consistent with and essentially independent from the one obtained using the capture-on-gadolinium sample at Daya Bay. The combination of these two results yields $\mathrm{sin}^22θ_{13}= 0.0833\pm0.0022$, which represents an 8% relative improvement in precision regarding the Daya Bay full 3158-day capture-on-gadolinium result.

Submitted 3 June, 2024; originally announced June 2024.

arXiv:2406.00986 [pdf, ps, other]

Non-reductive special cycles and Twisted Arithmetic Fundamental Lemma

Authors: Zhiyu Zhang

Abstract: We consider arithmetic analogs of the relative Langlands program and applications of new non-reductive geometry. Firstly, we introduce mirabolic special cycles, which produce special cycles on many Hodge type Rapoport-Zink spaces via pullbacks e.g. Kudla--Rapoport cycles. Secondly, we formulate arithmetic intersection problems for these cycles and formulate a method of arithmetic induction. As a main example, we formulate arithmetic twisted Gan--Gross--Prasad conjectures on unitary Shimura varieties and prove a key twisted arithmetic fundamental lemma using mirabolic special cycles, arithmetic inductions, and Weil type relative trace formulas.

Submitted 3 June, 2024; originally announced June 2024.

Comments: 42 pages, comments welcome!

MSC Class: 11F67; 11G40; 14G35

arXiv:2406.00970 [pdf, ps, other]

Limits of manifolds with boundary

Authors: Takao Yamaguchi, Zhilang Zhang

Abstract: In this paper, as a continuation of \cite{YZ:inrdius}, we develop the geometry of the limit spaces of compact Riemannian manifolds with boundary, where we assume a lower sectional curvature bound, two-sided bounds on the second fundamental forms of boundaries and an upper diameter bound. We mainly focus on the general case of non inradius collapse/convergence, where inradii of manifolds are uniformly bounded away from zero. In this case, many limit spaces have wild geometry, which arises from the boundary behavior of manifolds. Therefore the study of boundary singular points is a key to understanding such limit spaces. We also present some global convergence/collapsing results.

Submitted 2 June, 2024; originally announced June 2024.

Comments: 6 figures

MSC Class: 53C20

arXiv:2406.00946 [pdf, other]

Higgs boson decays $h\rightarrow MZ$ in the TNMSSM

Authors: Huai-cong Hu, Zhao-Yang Zhang, Ning-Yu Zhu, Hai-Xiang Chen

Abstract: We study the SM-like Higgs boson decays $h\rightarrow MZ$ in the Triplet extended NMSSM (TNMSSM), where $M$ is a vector meson ($ρ$, $ω$, $φ$, $J/Ψ$, $Υ$). Compared to the minimal supersymmetric standard model (MSSM), the TNMSSM includes two new SU(2) triplets with hypercharge $\pm 1$ and a SM gauge singlet which are coupled to each other. The indirect contributions to the decays $h \rightarrow MZ$ are produced from the effective $hγZ$ vertex, and they are more important than the direct contributions. The results of this work would encourage a detection of $h \rightarrow Zγ$ at the future high energy colliders for exploring new physics beyond the SM.

Submitted 2 June, 2024; originally announced June 2024.

Comments: Accepted for publication in Chinese Physics C

arXiv:2406.00905 [pdf, other]

Exploration of mass splitting and muon/tau mixing parameters for an eV-scale sterile neutrino with IceCube

Authors: R. Abbasi, M. Ackermann, J. Adams, S. K. Agarwalla, J. A. Aguilar, M. Ahlers, J. M. Alameddine, N. M. Amin, K. Andeen, C. Argüelles, Y. Ashida, S. Athanasiadou, L. Ausborm, S. N. Axani, X. Bai, A. Balagopal V., M. Baricevic, S. W. Barwick, S. Bash, V. Basu, R. Bay, J. J. Beatty, J. Becker Tjus, J. Beise, C. Bellenghi, et al. (400 additional authors not shown)

Abstract: We present the first three-parameter fit to a 3+1 sterile neutrino model using 7.634 years of data from the IceCube Neutrino Observatory on $ν_μ+\overline{ν}_μ$ charged-current interactions in the energy range 500-9976 GeV. Our analysis is sensitive to the mass-squared splitting between the heaviest and lightest mass state ($Δm_{41}^2$), the mixing matrix element connecting muon flavor to the fourth mass state ($|U_{\mu4}|^2$), and the element connecting tau flavor to the fourth mass state ($|U_{\tau4}|^2$). Predicted propagation effects in matter enhance the signature through a resonance as atmospheric neutrinos from the Northern Hemisphere traverse the Earth to the IceCube detector at the South Pole. The result is consistent with the no-sterile neutrino hypothesis with a probability of 4.3%. Profiling the likelihood of each parameter yields the 90% confidence levels: $2.4\,\mathrm{eV}^{2} < Δm_{41}^2 < 9.6\,\mathrm{eV}^{2}$, $0.0081 < |U_{\mu4}|^2 < 0.10$, and $|U_{\tau4}|^2 < 0.035$, which narrows the allowed parameter-space for $|U_{\tau4}|^2$. However, the primary result of this analysis is the first map of the 3+1 parameter space exploring the interdependence of $Δm_{41}^2$, $|U_{\mu4}|^2$, and $|U_{\tau4}|^2$.

Submitted 2 June, 2024; originally announced June 2024.

arXiv:2406.00476 [pdf, other]

Revisiting Energy Distribution and Formation Rate of CHIME Fast Radio Bursts

Authors: K. J. Zhang, X. F. Dong, A. E. Rodin, V. A. Fedorova, Y. F. Huang, D. Li, P. Wang, Q. M. Li, C. Du, F. Xu, Z. B. Zhang

Abstract: Using a large sample of fast radio bursts (FRBs) from the first CHIME/FRB catalog, we apply Lynden-Bell's $c^-$ method to study their energy function and formation rate evolutions with redshift. It is found with the non-parametric Kendall's $τ$ statistics that the FRB energy strongly evolves with the cosmological redshift as $E(z)\propto(1 + z)^{5.23}$. After removing the redshift dependence, the local energy distribution can be described by a broken power-law form of $Ψ(E_{0})\propto E_{0}^{-0.38}$ for the low-energy segment and $Ψ(E_{0})\propto E_{0}^{-2.01}$ for the high-energy segment with a dividing line of $\sim2.1\times10^{40}\,\rm erg$. Interestingly, we find that the formation rate of CHIME FRBs also evolves with redshift as $ρ(z)\propto(1+z)^{-4.73\pm0.08}$. The local formation rate $ρ(0)$ of the CHIME FRBs is constrained to be about $1.25\times 10^4\rm{\,Gpc^{-3}yr^{-1}}$, which is comparable with some previous estimations. In addition, we notice the formation rate not only exceeds the star formation rate at the lower redshifts but also always declines with the increase of redshift, which does not match the star formation history at all. Consequently, we suggest that most FRBs could originate from the older stellar populations.

Submitted 1 June, 2024; originally announced June 2024.

arXiv:2406.00468 [pdf, other]

Molecular Modelling of Aqueous Batteries

Authors: Alicia van Hees, Zhan-Yun Zhang, Aishwarya Sudhama, Chao Zhang

Abstract: Aqueous batteries play an increasingly important role for the development of sustainable and safety-prioritised energy storage solutions. Compared to conventional lithium-ion batteries, the cell chemistry in aqueous batteries shares many common features with those of electrolyzer and pseudo-capacitor systems because of the involvement of aqueous electrolyte and proton activity. This imposes the need for a better understanding of the corresponding ion solvation, intercalation and electron transfer processes at atomistic scale. Therefore, this chapter provides an up-to-date overview of molecular modelling techniques and their applications in aqueous batteries. In particular, we emphasize the dynamical and reactive description of aqueous battery systems brought in by density functional theory-based molecular dynamics simulation (DFTMD) and its machine-learning (ML) accelerated counterpart. Moreover, we also cover the recent advancement of generative artificial intelligence (AI) in molecular and materials design of aqueous batteries. Case studies presented here include popular aqueous battery systems, such as water-in-salt electrolytes, proton-coupled cathode materials, Zn-ion batteries as well as organic redox flow batteries.

Submitted 1 June, 2024; originally announced June 2024.

arXiv:2406.00444 [pdf, other]

Exploring Channel Estimation and Signal Detection for ODDM-based ISAC Systems

Authors: Dezhi Wang, Chongwen Huang, Lei Liu, Xiaoming Chen, Wei Wang, Zhaoyang Zhang, Chau Yuen, Mérouane Debbah

Abstract: Inspired by providing reliable communications for high-mobility scenarios, in this letter, we investigate the channel estimation and signal detection in integrated sensing and communication (ISAC) systems based on the orthogonal delay-Doppler multiplexing (ODDM) modulation, which consists of a pulse-train that can achieve the orthogonality with respect to the resolution of the delay-Doppler (DD) plane. To enhance the communication performance in the ODDM-based ISAC systems, we first propose a low-complexity approximation algorithm for channel estimation, which addresses the challenge of the high complexity from high resolution in the ODDM modulation, and achieves performance close to that of the maximum likelihood estimator scheme. Then, we employ the orthogonal approximate message-passing scheme to detect the symbols in the communication process based on the estimated channel information. Finally, simulation results show that the detection performance of ODDM is better than other multi-carrier modulation schemes. Specifically, the ODDM outperforms the orthogonal time frequency space scheme by 2.3 dB when the bit error ratio is $10^{-6}$.

Submitted 1 June, 2024; originally announced June 2024.

Comments: accepted by IEEE Wireless Communications Letters

arXiv:2406.00346 [pdf, other]

Details Enhancement in Unsigned Distance Field Learning for High-fidelity 3D Surface Reconstruction

Authors: Cheng Xu, Fei Hou, Wencheng Wang, Hong Qin, Zhebin Zhang, Ying He

Abstract: While Signed Distance Fields (SDF) are well-established for modeling watertight surfaces, Unsigned Distance Fields (UDF) broaden the scope to include open surfaces and models with complex inner structures. Despite their flexibility, UDFs encounter significant challenges in high-fidelity 3D reconstruction, such as non-differentiability at the zero level set, difficulty in achieving the exact zero value, numerous local minima, vanishing gradients, and oscillating gradient directions near the zero level set. To address these challenges, we propose Details Enhanced UDF (DEUDF) learning that integrates normal alignment and the SIREN network for capturing fine geometric details, adaptively weighted Eikonal constraints to address vanishing gradients near the target surface, unconditioned MLP-based UDF representation to relax non-negativity constraints, and a UDF-tailored method for extracting iso-surface with non-constant iso-values. These strategies collectively stabilize the learning process from unoriented point clouds and enhance the accuracy of UDFs. Our computational results demonstrate that DEUDF outperforms existing UDF learning methods in both accuracy and the quality of reconstructed surfaces. We will make the source code publicly available.

Submitted 1 June, 2024; originally announced June 2024.

arXiv:2406.00268 [pdf, other]

doi 10.1103/PhysRevB.109.184314

Magnetization in a non-equilibrium quantum spin system

Authors: X. Z. Zhang

Abstract: The dynamics described by the non-Hermitian Hamiltonian typically capture the short-term behavior of open quantum systems before quantum jumps occur. In contrast, the long-term dynamics, characterized by the Lindblad master equation (LME), drive the system towards a non-equilibrium steady state (NESS), which is an eigenstate with zero energy of the Liouvillian superoperator, denoted as $\mathcal{L}$. Conventionally, these two types of evolutions exhibit distinct dynamical behaviors. However, in this study, we challenge this common belief and demonstrate that the effective non-Hermitian Hamiltonian can accurately represent the long-term dynamics of a critical two-level open quantum system. The criticality of the system arises from the exceptional point (EP) of the effective non-Hermitian Hamiltonian. Additionally, the NESS is identical to the coalescent state of the effective non-Hermitian Hamiltonian. We apply this finding to a series of critical open quantum systems and show that a local dissipation channel can induce collective alignment of all spins in the same direction. This direction can be well controlled by modulating the quantum jump operator. The corresponding NESS is a product state and maintains long-time coherence, facilitating quantum control in open many-body systems. This discovery paves the way for a better understanding of the long-term dynamics of critical open quantum systems.

Submitted 31 May, 2024; originally announced June 2024.

Comments: 13 pages, 7 figures

Journal ref: Phys. Rev. B 109, 184314 (2024)

arXiv:2406.00235 [pdf, other]

Amplitude analysis of the radiative decay $B^0_s\to K^+K^-γ$

Authors: LHCb collaboration, R. Aaij, A. S. W. Abdelmotteleb, C. Abellan Beteta, F. Abudinén, T. Ackernley, A. A. Adefisoye, B. Adeva, M. Adinolfi, P. Adlarson, C. Agapopoulou, C. A. Aidala, Z. Ajaltouni, S. Akar, K. Akiba, P. Albicocco, J. Albrecht, F. Alessio, M. Alexander, Z. Aliouche, P. Alvarez Cartelle, R. Amalric, S. Amato, J. L. Amey, Y. Amhis, et al. (1061 additional authors not shown)

Abstract: A search for radiative decay of $B^0_s$ mesons to orbitally excited $K^+K^-$ states is performed using proton-proton collisions recorded by the LHCb experiment, corresponding to an integrated luminosity of 9 fb$^{-1}$. The dikaon spectrum in the mass range $m_{KK}<2400$ MeV/$c^2$ is dominated by the $φ(1020)$ resonance that accounts for almost 70% of the decay rate. Considering the possible contributions of $f_2(1270)$, $f'_2(1525)$ and $f_2(2010)$ meson states, the overall tensor contribution to the amplitude is measured to be ${\cal F}_{\{f_2\}}=16.8\pm 0.5\mathrm{~(stat.)}\pm0.7\mathrm{~(syst.)}\%$, mostly dominated by the $f'_2(1525)$ state. Several statistically equivalent solutions are obtained for the detailed resonant structure depending on whether the smaller amplitudes interfere destructively or constructively with the dominant amplitude. The preferred solution that corresponds to the lowest values of the fit fractions along with constructive interference leads to the relative branching ratio measurement $\frac{{\cal B}(B^0_s\to f'_2γ)}{{\cal B}(B^0_s\toφγ)}= 19.4^{+0.9}_{-0.8}\mathrm{~(stat.)}{}^{+1.4}_{-0.5}\mathrm{~(syst.)}\pm0.5\mathrm{~({\cal B})}\%$, where the last uncertainty is due to the ratio of measured branching fractions to the $K^+K^-$ final state. This result represents the first observation of the radiative $B^0_s\to f'_2(1525)γ$ decay, which is the second radiative transition observed in the $B^0_s$ sector.

Submitted 31 May, 2024; originally announced June 2024.

Comments: All figures and tables, along with any supplementary material and additional information, are available at https://cern.ch/lhcbproject/Publications/p/LHCb-PAPER-2024-002.html (LHCb public pages)

Report number: LHCb-PAPER-2024-002, CERN-EP-2024-115

arXiv:2406.00234 [pdf, other]

Learning to Stabilize Unknown LTI Systems on a Single Trajectory under Stochastic Noise

Authors: Ziyi Zhang, Yorie Nakahira, Guannan Qu

Abstract: We study the problem of learning to stabilize unknown noisy Linear Time-Invariant (LTI) systems on a single trajectory. It is well known in the literature that the learn-to-stabilize problem suffers from exponential blow-up in which the state norm blows up in the order of $Θ(2^n)$ where $n$ is the state space dimension. This blow-up is due to the open-loop instability when exploring the $n$-dimensional state space. To address this issue, we develop a novel algorithm that decouples the unstable subspace of the LTI system from the stable subspace, based on which the algorithm only explores and stabilizes the unstable subspace, the dimension of which can be much smaller than $n$. With a new singular-value-decomposition(SVD)-based analytical framework, we prove that the system is stabilized before the state norm reaches $2^{O(k \log n)}$, where $k$ is the dimension of the unstable subspace. Critically, this bound avoids exponential blow-up in state dimension in the order of $Θ(2^n)$ as in the previous works, and to the best of our knowledge, this is the first paper to avoid exponential blow-up in dimension for stabilizing LTI systems with noise.

Submitted 31 May, 2024; originally announced June 2024.

arXiv:2406.00164 [pdf, other]

DYNA: Disease-Specific Language Model for Variant Pathogenicity

Authors: Huixin Zhan, Zijun Zhang

Abstract: Clinical variant classification of pathogenic versus benign genetic variants remains a challenge in clinical genetics. Recently, the proposition of genomic foundation models has improved the generic variant effect prediction (VEP) accuracy via weakly-supervised or unsupervised training. However, these VEPs are not disease-specific, limiting their adaptation at the point of care. To address this problem, we propose DYNA: Disease-specificity fine-tuning via a Siamese neural network broadly applicable to all genomic foundation models for more effective variant effect predictions in disease-specific contexts. We evaluate DYNA in two distinct disease-relevant tasks. For coding VEPs, we focus on various cardiovascular diseases, where gene-disease relationships of loss-of-function vs. gain-of-function dictate disease-specific VEP. For non-coding VEPs, we apply DYNA to an essential post-transcriptional regulatory axis of RNA splicing, the most common non-coding pathogenic mechanism in established clinical VEP guidelines. In both cases, DYNA fine-tunes various pre-trained genomic foundation models on small, rare variant sets. The DYNA fine-tuned models show superior performance in the held-out rare variant testing set and are further replicated in large, clinically-relevant variant annotations in ClinVAR. Thus, DYNA offers a potent disease-specific variant effect prediction method, excelling in intra-gene generalization and generalization to unseen genetic variants, making it particularly valuable for disease associations and clinical applicability.

Submitted 31 May, 2024; originally announced June 2024.

arXiv:2405.20764 [pdf, other]

CoMoFusion: Fast and High-quality Fusion of Infrared and Visible Image with Consistency Model

Authors: Zhiming Meng, Hui Li, Zeyang Zhang, Zhongwei Shen, Yunlong Yu, Xiaoning Song, Xiaojun Wu

Abstract: Generative models are widely utilized to model the distribution of fused images in the field of infrared and visible image fusion. However, current generative-model-based fusion methods often suffer from unstable training and slow inference speed. To tackle this problem, a novel fusion method based on the consistency model is proposed, termed CoMoFusion, which can generate high-quality images and achieve fast image inference speed. Specifically, the consistency model is used to construct multi-modal joint features in the latent space with the forward and reverse process. Then, the infrared and visible features extracted by the trained consistency model are fed into a fusion module to generate the final fused image. In order to enhance the texture and salient information of fused images, a novel loss based on pixel value selection is also designed. Extensive experiments on public datasets illustrate that our method obtains the SOTA fusion performance compared with the existing fusion methods.

Submitted 11 June, 2024; v1 submitted 31 May, 2024; originally announced May 2024.

arXiv:2405.20676 [pdf, other]

Search for $e^{+}e^{-}\toη'ψ(2S)$ at center-of-mass energies from 4.66 to 4.95 GeV

Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, O. Afedulidis, X. C. Ai, R. Aliberti, A. Amoroso, Q. An, Y. Bai, O. Bakina, I. Balossino, Y. Ban, H. -R. Bao, V. Batozskaya, K. Begzsuren, N. Berger, M. Berlowski, M. Bertani, D. Bettoni, F. Bianchi, E. Bianco, A. Bortone, I. Boyko, R. A. Briere, et al. (638 additional authors not shown)

Abstract: Using data samples with an integrated luminosity of $4.67~\mathrm{fb}^{-1}$ collected by the BESIII detector operating at the BEPCII collider, we search for the process $e^+e^- \rightarrow η' ψ(2S)$ at center-of-mass energies from $4.66$ to $4.95~\mathrm{GeV}$. No significant signal is observed, and upper limits for the Born cross sections $σ^B(e^+e^-\rightarrowη'ψ(2S))$ at the 90% confidence level are determined.

Submitted 31 May, 2024; originally announced May 2024.

arXiv:2405.20646 [pdf, other]

Large Language Models Enhanced Sequential Recommendation for Long-tail User and Item

Authors: Qidong Liu, Xian Wu, Xiangyu Zhao, Yejing Wang, Zijian Zhang, Feng Tian, Yefeng Zheng

Abstract: Sequential recommendation systems (SRS) serve the purpose of predicting users' subsequent preferences based on their past interactions and have been applied across various domains such as e-commerce and social networking platforms. However, practical SRS encounters challenges due to the fact that most users engage with only a limited number of items, while the majority of items are seldom consumed. These challenges, termed as the long-tail user and long-tail item dilemmas, often create obstacles for traditional SRS methods. Mitigating these challenges is crucial as they can significantly impact user satisfaction and business profitability. While some research endeavors have alleviated these issues, they still grapple with issues such as seesaw or noise stemming from the scarcity of interactions. The emergence of large language models (LLMs) presents a promising avenue to address these challenges from a semantic standpoint. In this study, we introduce the Large Language Models Enhancement framework for Sequential Recommendation (LLM-ESR), which leverages semantic embeddings from LLMs to enhance SRS performance without increasing computational overhead. To combat the long-tail item challenge, we propose a dual-view modeling approach that fuses semantic information from LLMs with collaborative signals from traditional SRS. To address the long-tail user challenge, we introduce a retrieval augmented self-distillation technique to refine user preference representations by incorporating richer interaction data from similar users. Through comprehensive experiments conducted on three authentic datasets using three widely used SRS models, our proposed enhancement framework demonstrates superior performance compared to existing methodologies.

Submitted 31 May, 2024; originally announced May 2024.

arXiv:2405.20641 [pdf, other]

Query Provenance Analysis for Robust and Efficient Query-based Black-box Attack Defense

Authors: Shaofei Li, Ziqi Zhang, Haomin Jia, Ding Li, Yao Guo, Xiangqun Chen

Abstract: Query-based black-box attacks have emerged as a significant threat to machine learning systems, where adversaries can manipulate the input queries to generate adversarial examples that can cause misclassification of the model. To counter these attacks, researchers have proposed Stateful Defense Models (SDMs) for detecting adversarial query sequences and rejecting queries that are "similar" to the history queries. Existing state-of-the-art (SOTA) SDMs (e.g., BlackLight and PIHA) have shown great effectiveness in defending against these attacks. However, recent studies have shown that they are vulnerable to Oracle-guided Adaptive Rejection Sampling (OARS) attacks, which is a stronger adaptive attack strategy. It can be easily integrated with existing attack algorithms to evade the SDMs by generating queries with fine-tuned direction and step size of perturbations utilizing the leaked decision information from the SDMs. In this paper, we propose a novel approach, Query Provenance Analysis (QPA), for more robust and efficient SDMs. QPA encapsulates the historical relationships among queries as the sequence feature to capture the fundamental difference between benign and adversarial query sequences. To utilize the query provenance, we propose an efficient query provenance analysis algorithm with dynamic management. We evaluate QPA compared with two baselines, BlackLight and PIHA, on four widely used datasets with six query-based black-box attack algorithms. The results show that QPA outperforms the baselines in terms of defense effectiveness and efficiency on both non-adaptive and adaptive attacks. Specifically, QPA reduces the Attack Success Rate (ASR) of OARS to 4.08%, comparing to 77.63% and 87.72% for BlackLight and PIHA, respectively. Moreover, QPA also achieves 7.67x and 2.25x higher throughput than BlackLight and PIHA.

Submitted 31 May, 2024; originally announced May 2024.

arXiv:2405.20638 [pdf, other]

Study of the decays $χ_{cJ} \rightarrow Λ\barΛφ$

Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, O. Afedulidis, X. C. Ai, R. Aliberti, A. Amoroso, Q. An, Y. Bai, O. Bakina, I. Balossino, Y. Ban, H. -R. Bao, V. Batozskaya, K. Begzsuren, N. Berger, M. Berlowski, M. Bertani, D. Bettoni, F. Bianchi, E. Bianco, A. Bortone, I. Boyko, R. A. Briere, et al. (637 additional authors not shown)

Abstract: Based on $(2712.4 \pm 14.3) \times 10^{6}$ $ e^{+}e^{-}\toψ(3686)$ events collected with the BESIII detector operating at the BEPCII collider, we report the first evidence of $χ_{c0}\to Λ\bar Λφ$ decays and the first observation of $χ_{c1,2}\to Λ\bar Λφ$ decays, with significances of $4.5σ$, $11.3σ$ and $13.0σ$, respectively. The decay branching fractions of $χ_{c0,1,2}\to Λ\bar Λφ$ are measured to be $( 2.99\pm1.24\pm0.19) \times 10^{-5}$, $(6.01\pm0.90\pm0.40 )\times 10^{-5}$, and $(7.13\pm0.81\pm0.36) \times 10^{-5}$, where the first uncertainties are statistical and the second systematic. No obvious enhancement near the $Λ\barΛ$ production threshold or excited $Λ$ state is found in the $Λφ$ (or $\barΛφ$) system.

Submitted 31 May, 2024; originally announced May 2024.

Comments: 10 pages, 9 figures

arXiv:2405.20637 [pdf, ps, other]

Boundedness in a two-dimensional doubly degenerate nutrient taxis system

Authors: Zhiguang Zhang, Yuxiang Li

Abstract: In this work, we study the no-flux initial-boundary value problem for the doubly degenerate nutrient taxis system \begin{align} \begin{cases}\tag{$\star$}\label{eq 0.1} u_t=\nabla \cdot(u v \nabla u)-χ\nabla \cdot\left(u^{2} v \nabla v\right)+\ell u v, & x \in Ω, t>0, \\ v_t=Δv-u v, & x \in Ω, t>0 \end{cases} \end{align} in a smoothly bounded convex domain $Ω\subset \mathbb{R}^2$, where $χ>0$ and $\ell \geq 0$. In this paper, we present that for all reasonably regular initial data, the model \eqref{eq 0.1} possesses a global bounded weak solution which is continuous in its first and essentially smooth in its second component.

Submitted 31 May, 2024; originally announced May 2024.

arXiv:2405.20554 [pdf, ps, other]

Three approaches to a categorical Torelli theorem for cubic threefolds of non-Eckardt type via the equivariant Kuznetsov components

Authors: Sebastian Casalaina-Martin, Xianyu Hu, Xun Lin, Shizhuo Zhang, Zheng Zhang

Abstract: Let $Y$ be a cubic threefold with a non-Eckardt type involution $τ$. Our first main result is that the $τ$-equivariant category of the Kuznetsov component $\mathcal{K}u_{\mathbb{Z}_2}(Y)$ determines the isomorphism class of $Y$ for general $(Y,τ)$. We shall prove this categorical Torelli theorem via three approaches: a noncommutative Hodge theoretical one (using a generalization of the intermediate Jacobian construction in [perry2020integral]), a Bridgeland moduli theoretical one (using equivariant stability conditions), and a Chow theoretical one (using some techniques in [kuznetsovnonclodedfield2021]). The remaining part of the paper is devoted to proving an equivariant infinitesimal categorical Torelli for non-Eckardt cubic threefolds $(Y,τ)$. To accomplish it, we prove a compatibility theorem on the algebra structures of the Hochschild cohomology of the bounded derived category $D^b(X)$ of a smooth projective variety $X$ and on the Hochschild cohomology of a semi-orthogonal component of $D^b(X)$. Another key ingredient is a generalization of a result in [macri2009infinitesimal] which shows that the twisted Hochschild-Kostant-Rosenberg isomorphism is compatible with the actions on the Hochschild cohomology and on the singular cohomology induced by an automorphism of $X$.

Submitted 30 May, 2024; originally announced May 2024.

Comments: 37 pages, comments are welcome

MSC Class: 14F05; 14J45; 14D20; 14D23

arXiv:2405.20351 [pdf, other]

ADR-BC: Adversarial Density Weighted Regression Behavior Cloning

Authors: Ziqi Zhang, Zifeng Zhuang, Donglin Wang, Jingzehua Xu, Miao Liu, Shuai Zhang

Abstract: Typically, traditional Imitation Learning (IL) methods first shape a reward or Q function and then use this shaped function within a reinforcement learning (RL) framework to optimize the empirical policy. However, if the shaped reward/Q function does not adequately represent the ground truth reward/Q function, updating the policy within a multi-step RL framework may result in cumulative bias, further impacting policy learning. Although utilizing behavior cloning (BC) to learn a policy by directly mimicking a few demonstrations in a single-step updating manner can avoid cumulative bias, BC tends to greedily imitate demonstrated actions, limiting its capacity to generalize to unseen state action pairs. To address these challenges, we propose ADR-BC, which aims to enhance behavior cloning through augmented density-based action support, optimizing the policy with this augmented support. Specifically, the objective of ADR-BC shares the similar physical meanings that matching expert distribution while diverging the sub-optimal distribution. Therefore, ADR-BC can achieve more robust expert distribution matching. Meanwhile, as a one-step behavior cloning framework, ADR-BC avoids the cumulative bias associated with multi-step RL frameworks. To validate the performance of ADR-BC, we conduct extensive experiments. Specifically, ADR-BC showcases a 10.5% improvement over the previous state-of-the-art (SOTA) generalized IL baseline, CEIL, across all tasks in the Gym-Mujoco domain. Additionally, it achieves an 89.5% improvement over Implicit Q Learning (IQL) using real rewards across all tasks in the Adroit and Kitchen domains. On the other hand, we conduct extensive ablations to further demonstrate the effectiveness of ADR-BC.

Submitted 28 May, 2024; originally announced May 2024.

arXiv:2405.20335 [pdf, other]

Xwin-LM: Strong and Scalable Alignment Practice for LLMs

Authors: Bolin Ni, JingCheng Hu, Yixuan Wei, Houwen Peng, Zheng Zhang, Gaofeng Meng, Han Hu

Abstract: In this work, we present Xwin-LM, a comprehensive suite of alignment methodologies for large language models (LLMs). This suite encompasses several key techniques, including supervised finetuning (SFT), reward modeling (RM), rejection sampling finetuning (RS), and direct preference optimization (DPO). The key components are as follows: (1) Xwin-LM-SFT, models initially finetuned with high-quality instruction data; (2) Xwin-Pair, a large-scale, multi-turn preference dataset meticulously annotated using GPT-4; (3) Xwin-RM, reward models trained on Xwin-Pair, developed at scales of 7B, 13B, and 70B parameters; (4) Xwin-Set, a multiwise preference dataset in which each prompt is linked to 64 unique responses generated by Xwin-LM-SFT and scored by Xwin-RM; (5) Xwin-LM-RS, models finetuned with the highest-scoring responses from Xwin-Set; (6) Xwin-LM-DPO, models further optimized on Xwin-Set using the DPO algorithm. Our evaluations on AlpacaEval and MT-bench demonstrate consistent and significant improvements across the pipeline, demonstrating the strength and scalability of Xwin-LM. The repository https://github.com/Xwin-LM/Xwin-LM will be continually updated to foster community research.

Submitted 30 May, 2024; originally announced May 2024.

arXiv:2405.20325 [pdf, other]

MotionFollower: Editing Video Motion via Lightweight Score-Guided Diffusion

Authors: Shuyuan Tu, Qi Dai, Zihao Zhang, Sicheng Xie, Zhi-Qi Cheng, Chong Luo, Xintong Han, Zuxuan Wu, Yu-Gang Jiang

Abstract: Despite impressive advancements in diffusion-based video editing models in altering video attributes, there has been limited exploration into modifying motion information while preserving the original protagonist's appearance and background. In this paper, we propose MotionFollower, a lightweight score-guided diffusion model for video motion editing. To introduce conditional controls to the denoising process, MotionFollower leverages two of our proposed lightweight signal controllers, one for poses and the other for appearances, both of which consist of convolution blocks without involving heavy attention calculations. Further, we design a score guidance principle based on a two-branch architecture, including the reconstruction and editing branches, which significantly enhance the modeling capability of texture details and complicated backgrounds. Concretely, we enforce several consistency regularizers and losses during the score estimation. The resulting gradients thus inject appropriate guidance to the intermediate latents, forcing the model to preserve the original background details and protagonists' appearances without interfering with the motion modification. Experiments demonstrate the competitive motion editing ability of MotionFollower qualitatively and quantitatively. Compared with MotionEditor, the most advanced motion editing model, MotionFollower achieves an approximately 80% reduction in GPU memory while delivering superior motion editing performance and exclusively supporting large camera movements and actions.

Submitted 30 May, 2024; originally announced May 2024.

Comments: 23 pages, 18 figures. Project page at https://francis-rings.github.io/MotionFollower/

MSC Class: 68T45; 68T10

arXiv:2405.20068 [pdf, other]

An Efficient Network with Novel Quantization Designed for Massive MIMO CSI Feedback

Authors: Xinran Sun, Zhengming Zhang, Luxi Yang

Abstract: The efficacy of massive multiple-input multiple-output (MIMO) techniques heavily relies on the accuracy of channel state information (CSI) in frequency division duplexing (FDD) systems. Many works focus on CSI compression and quantization methods to enhance CSI reconstruction accuracy with lower feedback overhead. In this letter, we propose CsiConformer, a novel CSI feedback network that combines convolutional operations and self-attention mechanisms to improve CSI feedback accuracy. Additionally, a new quantization module is developed to improve encoding efficiency. Experiment results show that CsiConformer outperforms previous state-of-the-art networks, achieving an average accuracy improvement of 17.67% with lower computational overhead.

Submitted 30 May, 2024; originally announced May 2024.

arXiv:2405.19893 [pdf, other]

Similarity is Not All You Need: Endowing Retrieval Augmented Generation with Multi Layered Thoughts

Authors: Chunjing Gan, Dan Yang, Binbin Hu, Hanxiao Zhang, Siyuan Li, Ziqi Liu, Yue Shen, Lin Ju, Zhiqiang Zhang, Jinjie Gu, Lei Liang, Jun Zhou

Abstract: In recent years, large language models (LLMs) have made remarkable achievements in various domains. However, the untimeliness and cost of knowledge updates coupled with hallucination issues of LLMs have curtailed their applications in knowledge intensive tasks, where retrieval augmented generation (RAG) can be of help. Nevertheless, existing retrieval augmented models typically use similarity as a bridge between queries and documents and follow a retrieve then read procedure. In this work, we argue that similarity is not always the panacea and totally relying on similarity would sometimes degrade the performance of retrieval augmented generation. To this end, we propose MetRag, a Multi layEred Thoughts enhanced Retrieval Augmented Generation framework. To begin with, beyond existing similarity oriented thought, we embrace a small scale utility model that draws supervision from an LLM for utility oriented thought and further come up with a smarter model by comprehensively combining the similarity and utility oriented thoughts. Furthermore, given the fact that the retrieved document set tends to be huge and using them in isolation makes it difficult to capture the commonalities and characteristics among them, we propose to make an LLM as a task adaptive summarizer to endow retrieval augmented generation with compactness-oriented thought. Finally, with multi layered thoughts from the precedent stages, an LLM is called for knowledge augmented generation. Extensive experiments on knowledge-intensive tasks have demonstrated the superiority of MetRag.

Submitted 30 May, 2024; originally announced May 2024.

Comments: 12 pages

arXiv:2405.19788 [pdf]

Unidirectional charge orders induced by oxygen vacancies on SrTiO$_3$(001)

Authors: Cui Ding, Wenfeng Dong, Xiaotong Jiao, Zhiyu Zhang, Guanming Gong, Zhongxu Wei, Lili Wang, Jin-Feng Jia, Qi-Kun Xue

Abstract: The discovery of high-mobility two-dimensional electron gas and low carrier density superconductivity in multiple SrTiO$_3$-based heterostructures has stimulated intense interest in the surface properties of SrTiO$_3$. The recent discovery of high-T$_c$ superconductivity in the monolayer FeSe/SrTiO$_3$ aroused the upsurge and underscored the atomic precision probe of the surface structure. By performing atomically resolved cryogenic scanning tunneling microscopy/spectroscopy characterization on dual-TiO$_{2}$-$δ$-terminated SrTiO$_3$(001) surfaces with ($\sqrt{13}$ $\times$ $\sqrt{13}$), c(4 $\times$ 2), mixed (2 $\times$ 1), and (2 $\times$ 2) reconstructions, we disclosed universally broken rotational symmetry and contrasting bias- and temperature-dependent electronic states for apical and equatorial oxygen sites. With the sequentially evolved surface reconstructions and simultaneously increasing equatorial oxygen vacancies, the surface anisotropy reduces, and the work function lowers. Intriguingly, unidirectional stripe orders appear on the c(4 $\times$ 2) surface, whereas local (4 $\times$ 4) order emerges and eventually forms long-range unidirectional c(4 $\times$ 4) charge order on the (2 $\times$ 2) surface. This work reveals robust unidirectional charge orders induced by oxygen vacancies due to strong and delicate electronic-lattice interaction under broken rotational symmetry, providing insights into understanding the complex behaviors in perovskite oxide-based heterostructures.

Submitted 30 May, 2024; originally announced May 2024.

arXiv:2405.19677 [pdf, other]

Large Language Model Watermark Stealing With Mixed Integer Programming

Authors: Zhaoxi Zhang, Xiaomei Zhang, Yanjun Zhang, Leo Yu Zhang, Chao Chen, Shengshan Hu, Asif Gill, Shirui Pan

Abstract: The Large Language Model (LLM) watermark is a newly emerging technique that shows promise in addressing concerns surrounding LLM copyright, monitoring AI-generated text, and preventing its misuse. The LLM watermark scheme commonly includes generating secret keys to partition the vocabulary into green and red lists, applying a perturbation to the logits of tokens in the green list to increase their sampling likelihood, thus facilitating watermark detection to identify AI-generated text if the proportion of green tokens exceeds a threshold. However, recent research indicates that watermarking methods using numerous keys are susceptible to removal attacks, such as token editing, synonym substitution, and paraphrasing, with robustness declining as the number of keys increases. Therefore, the state-of-the-art watermark schemes that employ fewer or single keys have been demonstrated to be more robust against text editing and paraphrasing. In this paper, we propose a novel green list stealing attack against the state-of-the-art LLM watermark scheme and systematically examine its vulnerability to this attack. We formalize the attack as a mixed integer programming problem with constraints. We evaluate our attack under a comprehensive threat model, including an extreme scenario where the attacker has no prior knowledge, lacks access to the watermark detector API, and possesses no information about the LLM's parameter settings or watermark injection/detection scheme. Extensive experiments on LLMs, such as OPT and LLaMA, demonstrate that our attack can successfully steal the green list and remove the watermark across all settings.

Submitted 30 May, 2024; originally announced May 2024.

Comments: 12 pages

arXiv:2405.19674 [pdf, ps, other]

On blow-up for the supercritical defocusing nonlinear wave equation

Authors: Feng Shao, Dongyi Wei, Zhifei Zhang

Abstract: In this paper, we consider the defocusing nonlinear wave equation $-\partial_t^2u+Δu=|u|^{p-1}u$ in $\mathbb R\times \mathbb R^d$. Building on our companion work ({\it \small Self-similar imploding solutions of the relativistic Euler equations}), we prove that for $d=4, p\geq 29$ and $d\geq 5, p\geq 17$, there exists a smooth complex-valued solution that blows up in finite time.

Submitted 30 May, 2024; originally announced May 2024.

Comments: 56 pages

arXiv:2405.19661 [pdf, other]

MGCP: A Multi-Grained Correlation based Prediction Network for Multivariate Time Series

Authors: Zhicheng Chen, Xi Xiao, Ke Xu, Zhong Zhang, Yu Rong, Qing Li, Guojun Gan, Zhiqiang Xu, Peilin Zhao

Abstract: Multivariate time series prediction is widely used in daily life, which poses significant challenges due to the complex correlations that exist at multi-grained levels. Unfortunately, the majority of current time series prediction models fail to simultaneously learn the correlations of multivariate time series at multi-grained levels, resulting in suboptimal performance. To address this, we propose a Multi-Grained Correlations-based Prediction (MGCP) Network, which simultaneously considers the correlations at three granularity levels to enhance prediction performance. Specifically, MGCP utilizes Adaptive Fourier Neural Operators and Graph Convolutional Networks to learn the global spatiotemporal correlations and inter-series correlations, enabling the extraction of potential features from multivariate time series at fine-grained and medium-grained levels. Additionally, MGCP employs adversarial training with an attention mechanism-based predictor and conditional discriminator to optimize prediction results at coarse-grained level, ensuring high fidelity between the generated forecast results and the actual data distribution. Finally, we compare MGCP with several state-of-the-art time series prediction algorithms on real-world benchmark datasets, and our results demonstrate the generality and effectiveness of the proposed model.

Submitted 29 May, 2024; originally announced May 2024.

arXiv:2405.19621 [pdf, ps, other]

Riesz potential estimates for double obstacle problems with Orlicz growth

Authors: Qi Xiong, Zhenqiu Zhang, Lingwei Ma

Abstract: In this paper, we consider the solutions to the non-homogeneous double obstacle problems with Orlicz growth involving measure data. After establishing the existence of the solutions to this problem in the Orlicz-Sobolev space, we derive a pointwise gradient estimate for these solutions by Riesz potential, which leads to the result on the $C^1$ regularity criterion.

Submitted 29 May, 2024; originally announced May 2024.

arXiv:2405.19580 [pdf, other]

Facilitating Mixed-Methods Analysis with Computational Notebooks

Authors: Jiawen Stefanie Zhu, Zibo Zhang, Jian Zhao

Abstract: Data exploration is an important aspect of the workflow of mixed-methods researchers, who conduct both qualitative and quantitative analysis. However, there currently exists few tools that adequately support both types of analysis simultaneously, forcing researchers to context-switch between different tools and increasing their mental burden when integrating the results. To address this gap, we propose a unified environment that facilitates mixed-methods analysis in a computational notebook-based settings. We conduct a scenario study with three HCI mixed-methods researchers to gather feedback on our design concept and to understand our users' needs and requirements.

Submitted 29 May, 2024; originally announced May 2024.

Comments: Appeared at 1st ACM CHI Workshop on Human-Notebook Interactions

ACM Class: H.5; H.5.2

arXiv:2405.19528 [pdf, other]

Predicting Long-Term Human Behaviors in Discrete Representations via Physics-Guided Diffusion

Authors: Zhitian Zhang, Anjian Li, Angelica Lim, Mo Chen

Abstract: Long-term human trajectory prediction is a challenging yet critical task in robotics and autonomous systems. Prior work that studied how to predict accurate short-term human trajectories with only unimodal features often failed in long-term prediction. Reinforcement learning provides a good solution for learning human long-term behaviors but can suffer from challenges in data efficiency and optimization. In this work, we propose a long-term human trajectory forecasting framework that leverages a guided diffusion model to generate diverse long-term human behaviors in a high-level latent action space, obtained via a hierarchical action quantization scheme using a VQ-VAE to discretize continuous trajectories and the available context. The latent actions are predicted by our guided diffusion model, which uses physics-inspired guidance at test time to constrain generated multimodal action distributions. Specifically, we use reachability analysis during the reverse denoising process to guide the diffusion steps toward physically feasible latent actions. We evaluate our framework on two publicly available human trajectory forecasting datasets: SFU-Store-Nav and JRDB, and extensive experimental results show that our framework achieves superior performance in long-term human trajectory forecasting.

Submitted 29 May, 2024; originally announced May 2024.

Showing 201–250 of 10,323 results for author: Zhang, Z