Search | arXiv e-print repository

Breaking Language Barriers: Cross-Lingual Continual Pre-Training at Scale

Authors: Wenzhen Zheng, Wenbo Pan, Xu Xu, Libo Qin, Li Yue, Ming Zhou

Abstract: In recent years, Large Language Models (LLMs) have made significant strides towards Artificial General Intelligence. However, training these models from scratch requires substantial computational resources and vast amounts of text data. In this paper, we explore an alternative approach to constructing an LLM for a new language by continually pretraining (CPT) from existing pretrained LLMs, instead of using randomly initialized parameters. Based on parallel experiments on 40 model sizes ranging from 40M to 5B parameters, we find that 1) CPT converges faster and saves significant resources in a scalable manner; 2) CPT adheres to an extended scaling law derived from Hoffmann et al. (2022) with a joint data-parameter scaling term; 3) The compute-optimal data-parameter allocation for CPT markedly differs based on our estimated scaling factors; 4) The effectiveness of transfer at scale is influenced by training duration and linguistic properties, while robust to data replaying, a method that effectively mitigates catastrophic forgetting in CPT. We hope our findings provide deeper insights into the transferability of LLMs at scale for the research community.

Submitted 2 July, 2024; originally announced July 2024.

Comments: 8 pages
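
For orientation, the baseline law from Hoffmann et al. (2022) that the abstract above extends can be written as below. This is the standard published form only; the abstract does not give the joint data-parameter term or the fitted factors, so none are reproduced here. The symbols $N$ (parameters), $D$ (training tokens), $C$ (compute), and the approximation $C \approx 6ND$ follow the usual convention and are not taken from the abstract.

```latex
% Baseline parametric loss (Hoffmann et al., 2022):
% E is the irreducible loss; A, B, \alpha, \beta are fitted constants.
L(N, D) = E + \frac{A}{N^{\alpha}} + \frac{B}{D^{\beta}}

% Minimizing L subject to a fixed compute budget C \approx 6ND gives
% the compute-optimal allocation as power laws in C:
N_{\mathrm{opt}}(C) \propto C^{\frac{\beta}{\alpha+\beta}}, \qquad
D_{\mathrm{opt}}(C) \propto C^{\frac{\alpha}{\alpha+\beta}}
```

Any change in the fitted exponents shifts this allocation, which is consistent with finding 3) in the abstract that the compute-optimal split differs for CPT.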

arXiv:2406.19767 [pdf, other]

Subgraph Matching via Partial Optimal Transport

Authors: Wen-Xin Pan, Isabel Haasler, Pascal Frossard

Abstract: In this work, we propose a novel approach for subgraph matching, the problem of finding a given query graph in a large source graph, based on the fused Gromov-Wasserstein distance. We formulate the subgraph matching problem as a partial fused Gromov-Wasserstein problem, which allows us to build on existing theory and computational methods in order to solve this challenging problem. We extend our method by employing a subgraph sliding approach, which makes it efficient even for large graphs. In numerical experiments, we showcase that our new algorithms have the ability to outperform state-of-the-art methods for subgraph matching on synthetic as well as real-world datasets. In particular, our methods exhibit robustness with respect to noise in the datasets and achieve very fast query times.

Submitted 28 June, 2024; originally announced June 2024.

arXiv:2406.19593 [pdf, other]

SK-VQA: Synthetic Knowledge Generation at Scale for Training Context-Augmented Multimodal LLMs

Authors: Xin Su, Man Luo, Kris W Pan, Tien Pei Chou, Vasudev Lal, Phillip Howard

Abstract: Synthetic data generation has gained significant attention recently for its utility in training large vision and language models. However, the application of synthetic data to the training of multimodal context-augmented generation systems has been relatively unexplored. This gap in existing work is important because existing vision and language models (VLMs) are not trained specifically for context-augmented generation. Resources for adapting such models are therefore crucial for enabling their use in retrieval-augmented generation (RAG) settings, where a retriever is used to gather relevant information that is then provided to a generative model via context augmentation. To address this challenging problem, we generate SK-VQA: a large synthetic multimodal dataset containing over 2 million question-answer pairs which require external knowledge to determine the final answer. Our dataset is both larger and significantly more diverse than existing resources of its kind, possessing over 11x more unique questions and containing images from a greater variety of sources than previously-proposed datasets. Through extensive experiments, we demonstrate that our synthetic dataset can not only serve as a challenging benchmark, but is also highly effective for adapting existing generative multimodal models for context-augmented generation.

Submitted 27 June, 2024; originally announced June 2024.

arXiv:2406.19247 [pdf, other]

Local Manifold Learning for No-Reference Image Quality Assessment

Authors: Timin Gao, Wensheng Pan, Yan Zhang, Sicheng Zhao, Shengchuan Zhang, Xiawu Zheng, Ke Li, Liujuan Cao, Rongrong Ji

Abstract: Contrastive learning has considerably advanced the field of Image Quality Assessment (IQA), emerging as a widely adopted technique. The core mechanism of contrastive learning involves minimizing the distance between quality-similar (positive) examples while maximizing the distance between quality-dissimilar (negative) examples. Despite its successes, current contrastive learning methods often neglect the importance of preserving the local manifold structure. This oversight can result in a high degree of similarity among hard examples within the feature space, thereby impeding effective differentiation and assessment. To address this issue, we propose an innovative framework that integrates local manifold learning with contrastive learning for No-Reference Image Quality Assessment (NR-IQA). Our method begins by sampling multiple crops from a given image, identifying the most visually salient crop. This crop is then used to cluster other crops from the same image as the positive class, while crops from different images are treated as negative classes to increase inter-class distance. Uniquely, our approach also considers non-saliency crops from the same image as intra-class negative classes to preserve their distinctiveness. Additionally, we employ a mutual learning framework, which further enhances the model's ability to adaptively learn and identify visual saliency regions. Our approach demonstrates better performance than state-of-the-art methods on 7 standard datasets, achieving PLCC values of 0.942 (compared to 0.908 in TID2013) and 0.914 (compared to 0.894 in LIVEC).

Submitted 27 June, 2024; originally announced June 2024.

arXiv:2406.12215 [pdf, other]

Discrete Variable Topology Optimization Using Multi-Cut Formulation and Adaptive Trust Regions

Authors: Zisheng Ye, Wenxiao Pan

Abstract: We present a new framework for solving general topology optimization (TO) problems that find an optimal material distribution within a design space to maximize the performance of a structure while satisfying design constraints. These problems involve state variables that nonlinearly depend on the design variables, with objective functions that can be convex or non-convex, and may include multiple candidate materials. The framework is designed to greatly enhance computational efficiency, primarily by diminishing optimization iteration counts and thereby reducing the solving of associated state-equilibrium partial differential equations (PDEs). It maintains binary design variables and addresses the large-scale mixed integer nonlinear programming (MINLP) problem that arises from discretizing the design space and PDEs. The core of this framework is the integration of the generalized Benders' decomposition and adaptive trust regions. The trust-region radius adapts based on a merit function. To mitigate ill-conditioning due to extreme parameter values, we further introduce a parameter relaxation scheme where two parameters are relaxed in stages at different paces. Numerical tests validate the framework's superior performance, including minimum compliance and compliant mechanism problems in single-material and multi-material designs. We compare our results with those of other methods and demonstrate significant reductions in optimization iterations by about one order of magnitude, while maintaining comparable optimal objective function values. As the design variables and constraints increase, the framework maintains consistent solution quality and efficiency, underscoring its good scalability. We anticipate this framework will be especially advantageous for TO applications involving substantial design variables and constraints and requiring significant computational resources for PDE solving.

Submitted 17 June, 2024; originally announced June 2024.

arXiv:2406.10744

Technique Report of CVPR 2024 PBDL Challenges

Authors: Ying Fu, Yu Li, Shaodi You, Boxin Shi, Jose Alvarez, Coert van Gemeren, Linwei Chen, Yunhao Zou, Zichun Wang, Yichen Li, Yuze Han, Yingkai Zhang, Jianan Wang, Qinglin Liu, Wei Yu, Xiaoqian Lv, Jianing Li, Shengping Zhang, Xiangyang Ji, Yuanpei Chen, Yuhan Zhang, Weihang Peng, Liwen Zhang, Zhe Xu, Dingyong Gou, et al. (77 additional authors not shown)

Abstract: The intersection of physics-based vision and deep learning presents an exciting frontier for advancing computer vision technologies. By leveraging the principles of physics to inform and enhance deep learning models, we can develop more robust and accurate vision systems. Physics-based vision aims to invert the processes to recover scene properties such as shape, reflectance, light distribution, and medium properties from images. In recent years, deep learning has shown promising improvements for various vision tasks, and when combined with physics-based vision, these approaches can enhance the robustness and accuracy of vision systems. This technical report summarizes the outcomes of the Physics-Based Vision Meets Deep Learning (PBDL) 2024 challenge, held at the CVPR 2024 workshop. The challenge consisted of eight tracks, focusing on Low-Light Enhancement and Detection as well as High Dynamic Range (HDR) Imaging. This report details the objectives, methodologies, and results of each track, highlighting the top-performing solutions and their innovative approaches.

Submitted 27 June, 2024; v1 submitted 15 June, 2024; originally announced June 2024.

Comments: The author list and contents need to be verified by all authors

arXiv:2406.03598 [pdf]

Time reversal symmetry breaking and zero magnetic field Josephson diode effect in Dirac semimetal Cd3As2-mediated asymmetric SQUIDs

Authors: W. Yu, J. J. Cuozzo, K. Sapkota, E. Rossi, D. X. Rademacher, T. M. Nenoff, W. Pan

Abstract: A zero-magnetic-field Josephson diode effect (JDE) is observed in an asymmetric superconducting quantum interference device (SQUID) mediated by Dirac semimetal Cd3As2. Herein it is shown that phase coupling between the surface and bulk superconducting channels, a unique phenomenon recently identified in the observations of fractional Josephson effect and Leggett modes in Cd3As2, can break time reversal symmetry (TRS) and, therefore, give rise to the zero-field JDE. It is identified that the efficiency of the JDE can be readily controlled by varying the geometry of the Josephson junction (JJ) arms in the SQUIDs, thus providing an explanation of different JDE behaviors in two SQUIDs examined in this work. Our results are anticipated to have important implications in superconducting electronic circuit applications.

Submitted 5 June, 2024; originally announced June 2024.

arXiv:2406.02272 [pdf, other]

Computation-Aware Learning for Stable Control with Gaussian Process

Authors: Wenhan Cao, Alexandre Capone, Rishabh Yadav, Sandra Hirche, Wei Pan

Abstract: In Gaussian Process (GP) dynamical model learning for robot control, particularly for systems constrained by computational resources like small quadrotors equipped with low-end processors, analyzing stability and designing a stable controller present significant challenges. This paper distinguishes between two types of uncertainty within the posteriors of GP dynamical models: the well-documented mathematical uncertainty stemming from limited data and computational uncertainty arising from constrained computational capabilities, which has been largely overlooked in prior research. Our work demonstrates that computational uncertainty, quantified through a probabilistic approximation of the inverse covariance matrix in GP dynamical models, is essential for stable control under computational constraints. We show that incorporating computational uncertainty can prevent overestimating the region of attraction, a safe subset of the state space with asymptotic stability, thus improving system safety. Building on these insights, we propose an innovative controller design methodology that integrates computational uncertainty within a second-order cone programming framework. Simulations of canonical stable control tasks and experiments on quadrotor tracking demonstrate the effectiveness of our method under computational constraints.

Submitted 4 June, 2024; originally announced June 2024.

arXiv:2406.00116 [pdf, other]

A Sim2Real Approach for Identifying Task-Relevant Properties in Interpretable Machine Learning

Authors: Eura Nofshin, Esther Brown, Brian Lim, Weiwei Pan, Finale Doshi-Velez

Abstract: Existing user studies suggest that different tasks may require explanations with different properties. However, user studies are expensive. In this paper, we introduce a generalizable, cost-effective method for identifying task-relevant explanation properties in silico, which can guide the design of more expensive user studies. We use our approach to identify relevant proxies for three example tasks and validate our simulation with real user studies.

Submitted 31 May, 2024; originally announced June 2024.

arXiv:2405.20195 [pdf, other]

Using Large Language Models for Humanitarian Frontline Negotiation: Opportunities and Considerations

Authors: Zilin Ma, Susannah, Su, Nathan Zhao, Linn Bieske, Blake Bullwinkel, Yanyi Zhang, Sophia, Yang, Ziqing Luo, Siyao Li, Gekai Liao, Boxiang Wang, Jinglun Gao, Zihan Wen, Claude Bruderlein, Weiwei Pan

Abstract: Humanitarian negotiations in conflict zones, called frontline negotiation, are often highly adversarial, complex, and high-risk. Several best practices have emerged over the years that help negotiators extract insights from large datasets to navigate nuanced and rapidly evolving scenarios. Recent advances in large language models (LLMs) have sparked interest in the potential for AI to aid decision making in frontline negotiation. Through in-depth interviews with 13 experienced frontline negotiators, we identified their needs for AI-assisted case analysis and creativity support, as well as concerns surrounding confidentiality and model bias. We further explored the potential for AI augmentation of three standard tools used in frontline negotiation planning. We evaluated the quality and stability of our ChatGPT-based negotiation tools in the context of two real cases. Our findings highlight the potential for LLMs to enhance humanitarian negotiations and underscore the need for careful ethical and practical considerations.

Submitted 30 May, 2024; v1 submitted 30 May, 2024; originally announced May 2024.

arXiv:2405.19909 [pdf, other]

Adaptive Advantage-Guided Policy Regularization for Offline Reinforcement Learning

Authors: Tenglong Liu, Yang Li, Yixing Lan, Hao Gao, Wei Pan, Xin Xu

Abstract: In offline reinforcement learning, the challenge of out-of-distribution (OOD) is pronounced. To address this, existing methods often constrain the learned policy through policy regularization. However, these methods often suffer from the issue of unnecessary conservativeness, hampering policy improvement. This occurs due to the indiscriminate use of all actions from the behavior policy that generates the offline dataset as constraints. The problem becomes particularly noticeable when the quality of the dataset is suboptimal. Thus, we propose Adaptive Advantage-guided Policy Regularization (A2PR), obtaining high-advantage actions from an augmented behavior policy combined with VAE to guide the learned policy. A2PR can select high-advantage actions that differ from those present in the dataset, while still effectively maintaining conservatism from OOD actions. This is achieved by harnessing the VAE capacity to generate samples matching the distribution of the data points. We theoretically prove that the improvement of the behavior policy is guaranteed. Besides, it effectively mitigates value overestimation with a bounded performance gap. Empirically, we conduct a series of experiments on the D4RL benchmark, where A2PR demonstrates state-of-the-art performance. Furthermore, experimental results on additional suboptimal mixed datasets reveal that A2PR exhibits superior performance. Code is available at https://github.com/ltlhuuu/A2PR.

Submitted 1 June, 2024; v1 submitted 30 May, 2024; originally announced May 2024.

Comments: ICML 2024, 19 pages

arXiv:2405.18194 [pdf, other]

Delving into Differentially Private Transformer

Authors: Youlong Ding, Xueyang Wu, Yining Meng, Yonggang Luo, Hao Wang, Weike Pan

Abstract: Deep learning with differential privacy (DP) has garnered significant attention over the past years, leading to the development of numerous methods aimed at enhancing model accuracy and training efficiency. This paper delves into the problem of training Transformer models with differential privacy. Our treatment is modular: the logic is to 'reduce' the problem of training DP Transformer to the more basic problem of training DP vanilla neural nets. The latter is better understood and amenable to many model-agnostic methods. Such 'reduction' is done by first identifying the hardness unique to DP Transformer training: the attention distraction phenomenon and a lack of compatibility with existing techniques for efficient gradient clipping. To deal with these two issues, we propose the Re-Attention Mechanism and Phantom Clipping, respectively. We believe that our work not only casts new light on training DP Transformers but also promotes a modular treatment to advance research in the field of differentially private deep learning.

Submitted 29 May, 2024; v1 submitted 28 May, 2024; originally announced May 2024.

Comments: ICML 2024
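
As background for the gradient clipping mentioned in the abstract above, here is a minimal sketch of the standard per-sample clip-and-noise step used in DP-SGD. It is not the paper's Phantom Clipping, which the abstract names but does not describe. The function name `clip_and_noise` and its parameters are hypothetical, chosen for illustration only.

```python
import math
import random


def clip_and_noise(per_sample_grads, clip_norm, noise_multiplier, rng):
    """Generic DP-SGD aggregation (illustrative, not Phantom Clipping).

    Each per-sample gradient is rescaled so its L2 norm is at most
    clip_norm, the clipped gradients are summed, Gaussian noise with
    standard deviation noise_multiplier * clip_norm is added to every
    coordinate, and the result is divided by the batch size.
    """
    dim = len(per_sample_grads[0])
    total = [0.0] * dim
    for grad in per_sample_grads:
        norm = math.sqrt(sum(x * x for x in grad))
        # Scale down only when the norm exceeds the clipping bound.
        scale = min(1.0, clip_norm / norm) if norm > 0 else 1.0
        for i, x in enumerate(grad):
            total[i] += x * scale
    sigma = noise_multiplier * clip_norm
    batch_size = len(per_sample_grads)
    return [(t + rng.gauss(0.0, sigma)) / batch_size for t in total]


if __name__ == "__main__":
    rng = random.Random(0)
    grads = [[3.0, 4.0], [0.3, 0.4]]  # L2 norms 5.0 and 0.5
    print(clip_and_noise(grads, clip_norm=1.0, noise_multiplier=1.0, rng=rng))
```

The per-sample loop is what makes naive clipping expensive, since it needs one gradient per example rather than one per batch; the abstract identifies compatibility with efficient clipping techniques as one of the two Transformer-specific difficulties.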

arXiv:2405.17250 [pdf, ps, other]

"Pass the butter": A study on desktop-classic multitasking robotic arm based on advanced YOLOv7 and BERT

Authors: Haohua Que, Wenbin Pan, Jie Xu, Hao Luo, Pei Wang, Li Zhang

Abstract: In recent years, various intelligent autonomous robots have begun to appear in daily life and production. Desktop-level robots are characterized by their flexible deployment, rapid response, and suitability for light workload environments. In order to meet the current societal demand for service robot technology, this study proposes using a miniaturized desktop-level robot (by ROS) as a carrier, locally deploying a natural language model (NLP-BERT), and integrating visual recognition (CV-YOLO) and speech recognition technology (ASR-Whisper) as inputs to achieve autonomous decision-making and rational action by the desktop robot. Three comprehensive experiments were designed to validate the robotic arm, and the results demonstrate excellent performance using this approach across all three experiments. In Task 1, the execution rates for speech recognition and action performance were 92.6% and 84.3%, respectively. In Task 2, the highest execution rates under the given conditions reached 92.1% and 84.6%, while in Task 3, the highest execution rates were 95.2% and 80.8%, respectively. Therefore, it can be concluded that the proposed solution integrating ASR, NLP, and other technologies on edge devices is feasible and provides a technical and engineering foundation for realizing multimodal desktop-level robots.

Submitted 27 May, 2024; originally announced May 2024.

arXiv:2405.16413 [pdf, other]

Augmented Risk Prediction for the Onset of Alzheimer's Disease from Electronic Health Records with Large Language Models

Authors: Jiankun Wang, Sumyeong Ahn, Taykhoom Dalal, Xiaodan Zhang, Weishen Pan, Qiannan Zhang, Bin Chen, Hiroko H. Dodge, Fei Wang, Jiayu Zhou

Abstract: Alzheimer's disease (AD) is the fifth-leading cause of death among Americans aged 65 and older. Screening and early detection of AD and related dementias (ADRD) are critical for timely intervention and for identifying clinical trial participants. The widespread adoption of electronic health records (EHRs) offers an important resource for developing ADRD screening tools such as machine learning based predictive models. Recent advancements in large language models (LLMs) demonstrate their unprecedented capability of encoding knowledge and performing reasoning, which offers them strong potential for enhancing risk prediction. This paper proposes a novel pipeline that augments risk prediction by leveraging the few-shot inference power of LLMs to make predictions on cases where traditional supervised learning methods (SLs) may not excel. Specifically, we develop a collaborative pipeline that combines SLs and LLMs via a confidence-driven decision-making mechanism, leveraging the strengths of SLs in clear-cut cases and LLMs in more complex scenarios. We evaluate this pipeline using a real-world EHR data warehouse from Oregon Health & Science University (OHSU) Hospital, encompassing EHRs from over 2.5 million patients and more than 20 million patient encounters. Our results show that our proposed approach effectively combines the power of SLs and LLMs, offering significant improvements in predictive performance. This advancement holds promise for revolutionizing ADRD screening and early detection practices, with potential implications for better strategies of patient management and thus improving healthcare.

Submitted 25 May, 2024; originally announced May 2024.
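
The confidence-driven combination of supervised models (SLs) and LLMs described in the abstract above could look roughly like the sketch below. The abstract does not specify how confidence is measured or where the cutoff lies, so the confidence definition, the default `threshold` of 0.8, and the names `route_prediction` and `llm_predict` are all assumptions made for illustration.

```python
def route_prediction(sl_prob, llm_predict, threshold=0.8):
    """Route one case to the supervised model or to the LLM (hypothetical).

    sl_prob     -- supervised model's predicted probability of the
                   positive class for this case.
    llm_predict -- zero-argument callable returning the LLM's 0/1 label;
                   it is only invoked for low-confidence cases.
    threshold   -- assumed confidence cutoff, not taken from the paper.

    Returns a (label, source) pair, where source is "SL" or "LLM".
    """
    # Assumed confidence measure: distance of the probability from 0.5,
    # expressed as the probability of the predicted class.
    confidence = max(sl_prob, 1.0 - sl_prob)
    if confidence >= threshold:
        return int(sl_prob >= 0.5), "SL"
    return llm_predict(), "LLM"


if __name__ == "__main__":
    # Clear-cut case: the supervised model decides.
    print(route_prediction(0.95, llm_predict=lambda: 0))
    # Ambiguous case: deferred to the (stubbed) LLM.
    print(route_prediction(0.55, llm_predict=lambda: 1))
```

Passing the LLM as a callable keeps the expensive call lazy, so it runs only for the cases the supervised model is unsure about.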

arXiv:2405.11280 [pdf, other]

Joint Analysis of Single-Cell Data across Cohorts with Missing Modalities

Authors: Marianne Arriola, Weishen Pan, Manqi Zhou, Qiannan Zhang, Chang Su, Fei Wang

Abstract: Joint analysis of multi-omic single-cell data across cohorts has significantly enhanced the comprehensive analysis of cellular processes. However, most of the existing approaches for this purpose require access to samples with complete modality availability, which is impractical in many real-world scenarios. In this paper, we propose (Single-Cell Cross-Cohort Cross-Category) integration, a novel framework that learns unified cell representations under domain shift without requiring full-modality reference samples. Our generative approach learns rich cross-modal and cross-domain relationships that enable imputation of these missing modalities. Through experiments on real-world multi-omic datasets, we demonstrate that it offers a robust solution to single-cell tasks such as cell type clustering, cell type classification, and feature imputation.

Submitted 18 May, 2024; originally announced May 2024.

Comments: 10 pages, 7 figures, 5 tables

arXiv:2405.09202 [pdf, other]

Strain-Induced Intrinsic Antiferromagnetic Skyrmions in Two-Dimensional Janus Magnets

Authors: Weiyi Pan, Zhiming Xu

Abstract: Antiferromagnetic (AFM) skyrmions, which are resistant to both the skyrmion Hall effect and external magnetic perturbations, are expected to be promising candidates for next-generation spintronics devices. Despite being observed in bulk materials and synthetic AFM layered systems, the existence of intrinsic AFM skyrmions within single magnetic layers, which offer potential advantages for spintronic device fabrication, has remained elusive. In this work, taking monolayer CrSi(Te,Se)$_{3}$ as a representative system, we demonstrate the emergence of intrinsic AFM skyrmions in two-dimensional Janus magnets. It is found that under moderate compressive strain, the interplay between considerable Dzyaloshinskii-Moriya interaction and the strain-induced AFM Heisenberg exchange interaction in monolayer CrSi(Te,Se)$_{3}$ would give rise to the emergence of intrinsic AFM skyrmions assembled from AFM spin spirals. Moreover, the application of an external magnetic field could trigger the emergence of AFM merons as well as a canted AFM state. Our findings propose a feasible approach for achieving intrinsic AFM skyrmions in realistic systems, which paves the way for developments in AFM topological spintronics devices.

Submitted 15 May, 2024; originally announced May 2024.

Comments: 10 pages, 6 figures

arXiv:2405.00185 [pdf]

Finite-sample adjustments for comparing clustered adaptive interventions using data from a clustered SMART

Authors: Wenchu Pan, Daniel Almirall, Amy M. Kilbourne, Andrew Quanbeck, Lu Wang

Abstract: Adaptive interventions, aka dynamic treatment regimens, are sequences of pre-specified decision rules that guide the provision of treatment for an individual given information about their baseline and evolving needs, including in response to prior intervention. Clustered adaptive interventions (cAIs) extend this idea by guiding the provision of intervention at the level of clusters (e.g., clinics), but with the goal of improving outcomes at the level of individuals within the cluster (e.g., clinicians or patients within clinics). A clustered, sequential multiple-assignment randomized trial (cSMART) is a multistage, multilevel randomized trial design used to construct high-quality cAIs. In a cSMART, clusters are randomized at multiple intervention decision points; at each decision point, the randomization probability can depend on response to prior data. A challenge in cluster-randomized trials, including cSMARTs, is the deleterious effect of small samples of clusters on statistical inference, particularly via estimation of standard errors. This manuscript develops finite-sample adjustment (FSA) methods for making improved statistical inference about the causal effects of cAIs in a cSMART. The paper develops FSA methods that (i) scale variance estimators using a degree-of-freedom adjustment, (ii) reference a t distribution (instead of a normal), and (iii) employ a "bias corrected" variance estimator. Method (iii) requires extensions that are unique to the analysis of cSMARTs. Extensive simulation experiments are used to test the performance of the methods. The methods are illustrated using the Adaptive School-based Implementation of CBT (ASIC) study, a cSMART designed to construct a cAI for improving the delivery of cognitive behavioral therapy (CBT) by school mental health professionals within high schools in Michigan.

Submitted 30 April, 2024; originally announced May 2024.

arXiv:2404.17354 [pdf, other]

Pole-skipping for massive fields and the Stueckelberg formalism

Authors: Wen-Bin Pan, Ya-Wen Sun, Yuan-Tai Wang

Abstract: Pole-skipping refers to the special phenomenon that the pole and the zero of a retarded two-point Green's function coincide at certain points in momentum space. We study the pole-skipping phenomenon in holographic Green's functions of boundary operators that are dual to massive $p$-form fields and the dRGT massive gravitational fields in the AdS black hole background. Pole-skipping points for these systems are computed using the near horizon method. The relation between the pole-skipping points of massive fields and their massless counterparts is revealed. In particular, as the field mass $m$ is varied from zero to non-zero, the pole-skipping phenomenon undergoes an abrupt change with doubled pole-skipping points found in the massive case. This arises from the breaking of gauge invariance due to the mass term and the consequent appearance of more degrees of freedom. We recover the gauge invariance using the Stueckelberg formalism by introducing auxiliary dynamical fields. The extra pole-skipping points are identified to be associated with the Stueckelberg fields. We also observe that, as the mass varies, some pole-skipping points of the wave number $q$ may move from a non-physical region with complex $q$ to a physical region with real $q$.

Submitted 26 April, 2024; originally announced April 2024.

Comments: 50 pages, 1 figure

arXiv:2404.14949 [pdf, other]

Multi-Modal Prompt Learning on Blind Image Quality Assessment

Authors: Wensheng Pan, Timin Gao, Yan Zhang, Runze Hu, Xiawu Zheng, Enwei Zhang, Yuting Gao, Yutao Liu, Yunhang Shen, Ke Li, Shengchuan Zhang, Liujuan Cao, Rongrong Ji

Abstract: Image Quality Assessment (IQA) models benefit significantly from semantic information, which allows them to treat different types of objects distinctly. Currently, leveraging semantic information to enhance IQA is a crucial research direction. Traditional methods, hindered by a lack of sufficiently annotated data, have employed the CLIP image-text pretraining model as their backbone to gain semantic awareness. However, the generalist nature of these pre-trained Vision-Language (VL) models often renders them suboptimal for IQA-specific tasks. Recent approaches have attempted to address this mismatch using prompt technology, but these solutions have shortcomings. Existing prompt-based VL models overly focus on incremental semantic information from text, neglecting the rich insights available from visual data analysis. This imbalance limits their performance improvements in IQA tasks. This paper introduces an innovative multi-modal prompt-based methodology for IQA. Our approach employs carefully crafted prompts that synergistically mine incremental semantic information from both visual and linguistic data. Specifically, in the visual branch, we introduce a multi-layer prompt structure to enhance the VL model's adaptability. In the text branch, we deploy a dual-prompt scheme that steers the model to recognize and differentiate between scene category and distortion type, thereby refining the model's capacity to assess image quality.
Our experimental findings underscore the effectiveness of our method over existing Blind Image Quality Assessment (BIQA) approaches. Notably, it demonstrates competitive performance across various datasets. Our method achieves Spearman Rank Correlation Coefficient (SRCC) values of 0.961 (surpassing 0.946 in CSIQ) and 0.941 (exceeding 0.930 in KADID), illustrating its robustness and accuracy in diverse contexts.

Submitted 18 May, 2024; v1 submitted 23 April, 2024; originally announced April 2024.

arXiv:2404.11624 [pdf, ps, other]

Token Space: A Category Theory Framework for AI Computations

Authors: Wuming Pan

Abstract: This paper introduces the Token Space framework, a novel mathematical construct designed to enhance the interpretability and effectiveness of deep learning models through the application of category theory. By establishing a categorical structure at the Token level, we provide a new lens through which AI computations can be understood, emphasizing the relationships between tokens, such as grouping, order, and parameter types. We explore the foundational methodologies of the Token Space, detailing its construction, the role of construction operators and initial categories, and its application in analyzing deep learning models, specifically focusing on attention mechanisms and Transformer architectures. The integration of category theory into AI research offers a unified framework to describe and analyze computational structures, enabling new research paths and development possibilities. Our investigation reveals that the Token Space framework not only facilitates a deeper theoretical understanding of deep learning models but also opens avenues for the design of more efficient, interpretable, and innovative models, illustrating the significant role of category theory in advancing computational models.

Submitted 11 April, 2024; originally announced April 2024.

Comments: 42 pages, 5 tables

MSC Class: I.2.6

arXiv:2404.02737 [pdf, other]

Entanglement structures from modified IR geometry

Authors: Xin-Xiang Ju, Teng-Zhou Lai, Bo-Hao Liu, Wen-Bin Pan, Ya-Wen Sun

Abstract: We investigate a new proposal connecting the geometry at various radial scales in asymptotic AdS spacetime with entanglement structure at corresponding real-space length scales of the boundary theory. With this proposal, the bulk IR geometry encodes the long-scale entanglement structure of the dual quantum system. We consider two distinct types of IR geometries, namely the spherical case and the hyperbolic case, which are intimately related to the physics of differential entropy and brane-world holography separately. We explore the corresponding change in the dual long-scale entanglement structures, utilizing the tools of the Ryu-Takayanagi formula, conditional mutual information, and partial entanglement entropy. The results indicate that modifying the IR geometry leads to a redistribution of entanglement at scales longer than a critical length determined by the location of the IR region, with the two modified IR geometries corresponding to two opposite ways of redistribution. Furthermore, we establish the maximum amount of entanglement that can be modified, which is proportional to the area of the IR region.

Submitted 3 April, 2024; originally announced April 2024.

Comments: 34 pages, 10 figures

arXiv:2403.15627 [pdf]

Nanoscale Imaging of Phonons and Reconfiguration in Topologically-Engineered, Self-Assembled Nanoparticle Lattice

Authors: Chang Qian, Ethan Stanifer, Zhan Ma, Binbin Luo, Chang Liu, Lehan Yao, Wenxiao Pan, Xiaoming Mao, Qian Chen

Abstract: Topologically-engineered mechanical frames are important model constructs for architecture, machine mechanisms, and metamaterials. Despite significant advances in macroscopically fashioned frames, realization and phonon imaging of nanoframes have remained challenging. Here we extend for the first time the principles of topologically-engineered mechanical frames to lattices self-assembled from nanoparticles. Liquid-phase transmission electron microscopy images the vibrations of nanoparticles in self-assembled Maxwell and hexagonal lattices at the nanometer resolution, measuring a series of otherwise inaccessible properties such as phonon spectra and nonlinear lattice deformation paths. These properties are experimentally modulated by ionic strength, captured by our discrete mechanical model considering the complexity of nanoscale interactions and thermal fluctuations. The experiment-theory integration bridges mechanical metamaterials and colloidal self-assembly, opening new opportunities to manufacture phononic devices with solution processibility, transformability, light weight, and emergent functions, at underexplored length, frequency, and energy scales.

Submitted 22 March, 2024; originally announced March 2024.

arXiv:2403.09339 [pdf, other]

Field test of mode-pairing quantum key distribution

Authors: Hao-Tao Zhu, Yizhi Huang, Wen-Xin Pan, Chao-Wu Zhou, Jianjun Tang, Hong He, Ming Cheng, Xiandu **, Mi Zou, Shibiao Tang, Xiongfeng Ma, Teng-Yun Chen, Jian-Wei Pan

Abstract: Quantum key distribution is a cornerstone of quantum technology, offering information-theoretical secure keys for remote parties. With many quantum communication networks established globally, the mode-pairing protocol stands out for its efficacy over inter-city distances using simple setups, emerging as a promising solution. In this study, we employ the mode-pairing scheme into existing inter-city fiber links, conducting field tests across distances ranging from tens to about a hundred kilometers. Our system achieves a key rate of $1.217$ kbit/s in a $195.85$ km symmetric link and $3.089$ kbit/s in a $127.92$ km asymmetric link without global phase locking. The results demonstrate that the mode-pairing protocol can achieve key rates comparable to those of a single quantum link between two trusted nodes on the Beijing-Shanghai backbone line, effectively reducing the need for half of the trusted nodes. These field tests confirm the mode-pairing scheme's adaptability, efficiency, and practicality, positioning it as a highly suitable protocol for quantum networks.

Submitted 14 March, 2024; originally announced March 2024.

Comments: 15 pages, 5 figures, 6 tables

arXiv:2403.09008 [pdf, other]

Fault Detection and Tolerant Control for Aero2 2DOF Two-rotor Helicopter

Authors: Khalid Kabir Dandago, Long Zhang, Wei Pan

Abstract: Stability and satisfactory performance are critical control requirements for Unmanned Aerial Vehicle (UAV) applications. While conventional control systems for UAVs aim to ensure flight stability and safe operation while accomplishing tasks, UAVs may experience various flight faults that can degrade performance or, in severe cases, lead to instability. Unsatisfactory performance or instability of a UAV poses risks to lives, properties, and the flying environment. Therefore, it's essential to design a system capable of detecting faults, pinpointing their location, assessing their severity, and using this information to mitigate them, enabling the vehicle to continue operating satisfactorily. Despite the importance of analysing fault performance to select optimal fault detection and tolerance strategies, limited research has been conducted, especially with real systems. This study examines the performance of a 2-degree-of-freedom (2DOF) bi-rotor helicopter's control system in the presence of various actuator faults. Results from different fault conditions demonstrate that faults degrade the performance of conventional control systems on UAVs and introduce vibrations into the system, particularly when faults cause asymmetry or imbalance. However, additional experiments reveal that effective fault diagnosis and accommodation methods can help maintain satisfactory system performance despite the presence of faults.

Submitted 13 March, 2024; originally announced March 2024.

Comments: 7 pages, 7 figures, Undergoing Review for International conference on Unmanned Aircraft System (Crete, Greece)

arXiv:2403.08941 [pdf, other]

Towards Model-Agnostic Posterior Approximation for Fast and Accurate Variational Autoencoders

Authors: Yaniv Yacoby, Weiwei Pan, Finale Doshi-Velez

Abstract: Inference for Variational Autoencoders (VAEs) consists of learning two models: (1) a generative model, which transforms a simple distribution over a latent space into the distribution over observed data, and (2) an inference model, which approximates the posterior of the latent codes given data. The two components are learned jointly via a lower bound to the generative model's log marginal likelihood. In early phases of joint training, the inference model poorly approximates the latent code posteriors. Recent work showed that this leads optimization to get stuck in local optima, negatively impacting the learned generative model. As such, recent work suggests ensuring a high-quality inference model via iterative training: maximizing the objective function relative to the inference model before every update to the generative model. Unfortunately, iterative training is inefficient, requiring heuristic criteria for reverting from iterative to joint training for speed. Here, we suggest an inference method that trains the generative and inference models independently. It approximates the posterior of the true model a priori; fixing this posterior approximation, we then maximize the lower bound relative to only the generative model. By conventional wisdom, this approach should rely on the true prior and likelihood of the true model to approximate its posterior (which are unknown). However, we show that we can compute a deterministic, model-agnostic posterior approximation (MAPA) of the true model's posterior.
We then use MAPA to develop a proof-of-concept inference method. We present preliminary results on low-dimensional synthetic data that (1) MAPA captures the trend of the true posterior, and (2) our MAPA-based inference performs better density estimation with less computation than baselines. Lastly, we present a roadmap for scaling the MAPA-based inference method to high-dimensional data.

Submitted 12 June, 2024; v1 submitted 13 March, 2024; originally announced March 2024.

Comments: Accepted at the Workshop at the 6th Symposium on Advances in Approximate Bayesian Inference (AABI) 2024

arXiv:2403.01977 [pdf, other]

TTA-Nav: Test-time Adaptive Reconstruction for Point-Goal Navigation under Visual Corruptions

Authors: Maytus Piriyajitakonkij, Mingfei Sun, Mengmi Zhang, Wei Pan

Abstract: Robot navigation under visual corruption presents a formidable challenge. To address this, we propose a Test-time Adaptation (TTA) method, named as TTA-Nav, for point-goal navigation under visual corruptions. Our "plug-and-play" method incorporates a top-down decoder to a pre-trained navigation model. Firstly, the pre-trained navigation model gets a corrupted image and extracts features. Secondly, the top-down decoder produces the reconstruction given the high-level features extracted by the pre-trained model. Then, it feeds the reconstruction of a corrupted image back to the pre-trained model. Finally, the pre-trained model does forward pass again to output action. Despite being trained solely on clean images, the top-down decoder can reconstruct cleaner images from corrupted ones without the need for gradient-based adaptation. The pre-trained navigation model with our top-down decoder significantly enhances navigation performance across almost all visual corruptions in our benchmarks. Our method improves the success rate of point-goal navigation from the state-of-the-art result of 46% to 94% on the most severe corruption. This suggests its potential for broader application in robotic visual navigation. Project page: https://sites.google.com/view/tta-nav

Submitted 14 March, 2024; v1 submitted 4 March, 2024; originally announced March 2024.

Comments: Submitted to IROS2024

arXiv:2402.18784 [pdf, other]

Brain-inspired and Self-based Artificial Intelligence

Authors: Yi Zeng, Feifei Zhao, Yuxuan Zhao, Dongcheng Zhao, Enmeng Lu, Qian Zhang, Yuwei Wang, Hui Feng, Zhuoya Zhao, Jihang Wang, Qingqun Kong, Yinqian Sun, Yang Li, Guobin Shen, Bing Han, Yiting Dong, Wenxuan Pan, Xiang He, Aorigele Bao, ** Wang

Abstract: The question "Can machines think?" and the Turing Test to assess whether machines could achieve human-level intelligence is one of the roots of AI. With the philosophical argument "I think, therefore I am", this paper challenges the idea of a "thinking machine" supported by current AIs since there is no sense of self in them. Current artificial intelligence is only seemingly intelligent information processing and does not truly understand or be subjectively aware of oneself and perceive the world with the self as human intelligence does. In this paper, we introduce a Brain-inspired and Self-based Artificial Intelligence (BriSe AI) paradigm. This BriSe AI paradigm is dedicated to coordinating various cognitive functions and learning strategies in a self-organized manner to build human-level AI models and robotic applications. Specifically, BriSe AI emphasizes the crucial role of the Self in shaping the future AI, rooted with a practical hierarchical Self framework, including Perception and Learning, Bodily Self, Autonomous Self, Social Self, and Conceptual Self. The hierarchical framework of the Self highlights self-based environment perception, self-bodily modeling, autonomous interaction with the environment, social interaction and collaboration with others, and even more abstract understanding of the Self.
Furthermore, the positive mutual promotion and support among multiple levels of Self, as well as between Self and learning, enhance the BriSe AI's conscious understanding of information and flexible adaptation to complex environments, serving as a driving force propelling BriSe AI towards real Artificial General Intelligence.

Submitted 28 February, 2024; originally announced February 2024.

arXiv:2402.17375 [pdf, other]

Impact of Computation in Integral Reinforcement Learning for Continuous-Time Control

Authors: Wenhan Cao, Wei Pan

Abstract: Integral reinforcement learning (IntRL) demands the precise computation of the utility function's integral at its policy evaluation (PEV) stage. This is achieved through quadrature rules, which are weighted sums of utility functions evaluated from state samples obtained in discrete time. Our research reveals a critical yet underexplored phenomenon: the choice of the computational method -- in this case, the quadrature rule -- can significantly impact control performance. This impact is traced back to the fact that computational errors introduced in the PEV stage can affect the policy iteration's convergence behavior, which in turn affects the learned controller. To elucidate how computation impacts control, we draw a parallel between IntRL's policy iteration and Newton's method applied to the Hamilton-Jacobi-Bellman equation. In this light, computational error in PEV manifests as an extra error term in each iteration of Newton's method, with its upper bound proportional to the computational error. Further, we demonstrate that when the utility function resides in a reproducing kernel Hilbert space (RKHS), the optimal quadrature is achievable by employing Bayesian quadrature with the RKHS-inducing kernel function. We prove that the local convergence rates for IntRL using the trapezoidal rule and Bayesian quadrature with a Matérn kernel are $O(N^{-2})$ and $O(N^{-b})$, respectively, where $N$ is the number of evenly-spaced samples and $b$ is the Matérn kernel's smoothness parameter. These theoretical findings are finally validated by two canonical control tasks.

Submitted 27 February, 2024; originally announced February 2024.

arXiv:2402.15259 [pdf, other]

Open Ad Hoc Teamwork with Cooperative Game Theory

Authors: Jianhong Wang, Yang Li, Yuan Zhang, Wei Pan, Samuel Kaski

Abstract: Ad hoc teamwork poses a challenging problem, requiring the design of an agent to collaborate with teammates without prior coordination or joint training. Open ad hoc teamwork (OAHT) further complicates this challenge by considering environments with a changing number of teammates, referred to as open teams. One promising solution in practice to this problem is leveraging the generalizability of graph neural networks to handle an unrestricted number of agents with various agent-types, named graph-based policy learning (GPL). However, its joint Q-value representation over a coordination graph lacks convincing explanations. In this paper, we establish a new theory to understand the representation of the joint Q-value for OAHT and its learning paradigm, through the lens of cooperative game theory. Building on our theory, we propose a novel algorithm named CIAO, based on GPL's framework, with additional provable implementation tricks that can facilitate learning. The demos of experimental results are available on https://sites.google.com/view/ciao2024, and the code of experiments is published on https://github.com/hsvgbkhgbv/CIAO.

Submitted 10 June, 2024; v1 submitted 23 February, 2024; originally announced February 2024.

Comments: Published at ICML 2024, 29 pages

arXiv:2402.12733 [pdf, other]

BMLP: Behavior-aware MLP for Heterogeneous Sequential Recommendation

Authors: Weixin Li, Yuhao Wu, Yang Liu, Weike Pan, Zhong Ming

Abstract: In real recommendation scenarios, users often have different types of behaviors, such as clicking and buying. Existing research methods show that it is possible to capture the heterogeneous interests of users through different types of behaviors. However, most multi-behavior approaches have limitations in learning the relationship between different behaviors. In this paper, we propose a novel multilayer perceptron (MLP)-based heterogeneous sequential recommendation method, namely behavior-aware multilayer perceptron (BMLP). Specifically, it has two main modules, including a heterogeneous interest perception (HIP) module, which models behaviors at multiple granularities through behavior types and transition relationships, and a purchase intent perception (PIP) module, which adaptively fuses subsequences of auxiliary behaviors to capture users' purchase intent. Compared with mainstream sequence models, MLP is competitive in terms of accuracy and has unique advantages in simplicity and efficiency. Extensive experiments show that BMLP achieves significant improvement over state-of-the-art algorithms on four public datasets. In addition, its pure MLP architecture leads to a linear time complexity.

Submitted 20 February, 2024; originally announced February 2024.

arXiv:2402.12416 [pdf, other]

Aligning Individual and Collective Objectives in Multi-Agent Cooperation

Authors: Yang Li, Wenhao Zhang, Jianhong Wang, Shao Zhang, Yali Du, Ying Wen, Wei Pan

Abstract: Among the research topics in multi-agent learning, mixed-motive cooperation is one of the most prominent challenges, primarily due to the mismatch between individual and collective goals. The cutting-edge research is focused on incorporating domain knowledge into rewards and introducing additional mechanisms to incentivize cooperation. However, these approaches often face shortcomings such as the effort on manual design and the absence of theoretical groundings. To close this gap, we model the mixed-motive game as a differentiable game for the ease of illuminating the learning dynamics towards cooperation. In more detail, we introduce a novel optimization method named Altruistic Gradient Adjustment (AgA) that employs gradient adjustments to progressively align individual and collective objectives. Furthermore, we theoretically prove that AgA effectively attracts gradients to stable fixed points of the collective objective while considering individual interests, and we validate these claims with empirical evidence. We evaluate the effectiveness of our algorithm AgA through benchmark environments for testing mixed-motive collaboration with small-scale agents such as the two-player public good game and the sequential social dilemma games, Cleanup and Harvest, as well as our self-developed large-scale environment in the game StarCraft II.

Submitted 22 May, 2024; v1 submitted 19 February, 2024; originally announced February 2024.

Comments: 19 pages

arXiv:2402.09575 [pdf, other]

Analyzing the Impact of Computation in Adaptive Dynamic Programming for Stochastic LQR Problem

Authors: Wenhan Cao, Alexandre Capone, Sandra Hirche, Wei Pan

Abstract: Adaptive dynamic programming (ADP) for stochastic linear quadratic regulation (LQR) demands the precise computation of stochastic integrals during policy iteration (PI). In a fully model-free problem setting, this computation can only be approximated by state samples collected at discrete time points using computational methods such as the canonical Euler-Maruyama method. Our research reveals a critical phenomenon: the sampling period can significantly impact control performance. This impact is due to the fact that computational errors introduced in each step of PI can significantly affect the algorithm's convergence behavior, which in turn influences the resulting control policy. We draw a parallel between PI and Newton's method applied to the Riccati equation to elucidate how the computation impacts control. In this light, the computational error in each PI step manifests itself as an extra error term in each step of Newton's method, with its upper bound proportional to the computational error. Furthermore, we demonstrate that the convergence rate for ADP in stochastic LQR problems using the Euler-Maruyama method is O(h), with h being the sampling period. A sensorimotor control task finally validates these theoretical findings.

Submitted 14 February, 2024; originally announced February 2024.

arXiv:2402.00005 [pdf, other]

doi 10.1007/s44214-023-00039-9

1002 km Twin-Field Quantum Key Distribution with Finite-Key Analysis

Authors: Yang Liu, Wei-Jun Zhang, Cong Jiang, Jiu-Peng Chen, Di Ma, Chi Zhang, Wen-Xin Pan, Hao Dong, Jia-Min Xiong, Cheng-Jun Zhang, Hao Li, Rui-Chun Wang, Chao-Yang Lu, Jun Wu, Teng-Yun Chen, Lixing You, Xiang-Bin Wang, Qiang Zhang, Jian-Wei Pan

Abstract: Quantum key distribution (QKD) holds the potential to establish secure keys over long distances. The distance of point-to-point QKD secure key distribution is primarily impeded by the transmission loss inherent to the channel. In the quest to realize a large-scale quantum network, increasing the QKD distance under current technology is of great research interest. Here we adopt the 3-intensity sending-or-not-sending twin-field QKD (TF-QKD) protocol with the actively-odd-parity-pairing method. The experiment demonstrates the feasibility of secure QKD over a 1002 km fibre channel considering the finite size effect. The secure key rate is $3.11\times10^{-12}$ per pulse at this distance. Furthermore, by optimizing parameters for shorter fiber distances, we conducted performance tests on key distribution for fiber lengths ranging from 202 km to 505 km. Notably, the secure key rate for the 202 km, the normal distance between major cities, reached 111.74 kbps.

Submitted 1 December, 2023; originally announced February 2024.

Comments: 18 pages, 3 figures

Journal ref: Quantum Front 2, 16 (2023)

arXiv:2401.15880 [pdf, other]

Deciphering regulatory architectures from synthetic single-cell expression patterns

Authors: Rosalind Wenshan Pan, Tom Roeschinger, Kian Faizi, Hernan Garcia, Rob Phillips

Abstract: For the vast majority of genes in sequenced genomes, there is limited understanding of how they are regulated. Without such knowledge, it is not possible to perform a quantitative theory-experiment dialogue on how such genes give rise to physiological and evolutionary adaptation. One category of high-throughput experiments used to understand the sequence-phenotype relationship of the transcriptome is massively parallel reporter assays (MPRAs). However, to improve the versatility and scalability of MPRA pipelines, we need a "theory of the experiment" to help us better understand the impact of various biological and experimental parameters on the interpretation of experimental data. To that end, in this paper we create tens of thousands of synthetic single-cell gene expression outputs using both equilibrium and out-of-equilibrium models. These models make it possible to imitate the summary statistics (information footprints and expression shift matrices) used to characterize the output of MPRAs and from this summary statistic to infer the underlying regulatory architecture. Specifically, we use a more refined implementation of the so-called thermodynamic models in which the binding energies of each sequence variant are derived from energy matrices. Our simulations reveal important effects of the parameters on MPRA data and we demonstrate our ability to optimize MPRA experimental designs with the goal of generating thermodynamic models of the transcriptome with base-pair specificity. Further, this approach makes it possible to carefully examine the mapping between mutations in binding sites and their corresponding expression profiles, a tool useful not only for better designing MPRAs, but also for exploring regulatory evolution.

Submitted 5 June, 2024; v1 submitted 28 January, 2024; originally announced January 2024.

arXiv:2401.15369 [pdf, other]

Privacy-Preserving Cross-Domain Sequential Recommendation

Authors: Zhaohao Lin, Weike Pan, Zhong Ming

Abstract: Cross-domain sequential recommendation is an important development direction of recommender systems. It combines the characteristics of sequential recommender systems and cross-domain recommender systems, which can capture the dynamic preferences of users and alleviate the problem of cold-start users. However, in recent years, people pay more and more attention to their privacy. They do not want other people to know what they just bought, what videos they just watched, and where they just came from. How to protect the users' privacy has become an urgent problem to be solved. In this paper, we propose a novel privacy-preserving cross-domain sequential recommender system (PriCDSR), which can provide users with recommendation services while preserving their privacy at the same time. Specifically, we define a new differential privacy on the data, taking into account both the ID information and the order information. Then, we design a random mechanism that satisfies this differential privacy and provide its theoretical proof. Our PriCDSR is a non-invasive method that can adopt any cross-domain sequential recommender system as a base model without any modification to it. To the best of our knowledge, our PriCDSR is the first work to investigate privacy issues in cross-domain sequential recommender systems. We conduct experiments on three domains, and the results demonstrate that our PriCDSR, despite introducing noise, still outperforms recommender systems that only use data from a single domain.

Submitted 27 January, 2024; originally announced January 2024.

arXiv:2401.15000 [pdf, other]

Density-matrix renormalization group algorithm for non-Hermitian systems

Authors: Peigeng Zhong, Wei Pan, Haiqing Lin, Xiaoqun Wang, Shijie Hu

Abstract: A biorthonormal-block density-matrix renormalization group algorithm is proposed to compute properties of non-Hermitian many-body systems, in which a structured low-rank approximation to a non-Hermitian reduced density matrix is implemented to fulfill the prerequisite for the biorthonormality of the renormalization transformation and to optimally construct a saved space as well. A redundancy assigned to the saved space of the reduced density matrix is exploited to reduce a condition number resulting from the left and right transformation matrices, thus ensuring the numerical stability of the renormalization procedure. The algorithm is successfully applied to an interacting fermionic Su-Schrieffer-Heeger model with both nonreciprocal hoppings and staggered complex chemical potential, exhibiting novel many-body phenomena in the ground-state phase diagram.

Submitted 26 January, 2024; originally announced January 2024.

Comments: 5+9 pages, 3+2 figures

arXiv:2401.14923 [pdf, other]

Reinforcement Learning Interventions on Boundedly Rational Human Agents in Frictionful Tasks

Authors: Eura Nofshin, Siddharth Swaroop, Weiwei Pan, Susan Murphy, Finale Doshi-Velez

Abstract: Many important behavior changes are frictionful; they require individuals to expend effort over a long period with little immediate gratification. Here, an artificial intelligence (AI) agent can provide personalized interventions to help individuals stick to their goals. In these settings, the AI agent must personalize rapidly (before the individual disengages) and interpretably, to help us understand the behavioral interventions. In this paper, we introduce Behavior Model Reinforcement Learning (BMRL), a framework in which an AI agent intervenes on the parameters of a Markov Decision Process (MDP) belonging to a boundedly rational human agent. Our formulation of the human decision-maker as a planning agent allows us to attribute undesirable human policies (ones that do not lead to the goal) to their maladapted MDP parameters, such as an extremely low discount factor. Furthermore, we propose a class of tractable human models that captures fundamental behaviors in frictionful tasks. Introducing a notion of MDP equivalence specific to BMRL, we theoretically and empirically show that AI planning with our human models can lead to helpful policies on a wide range of more complex, ground-truth humans.

Submitted 26 January, 2024; originally announced January 2024.

Comments: In AAMAS 2024

arXiv:2401.12596 [pdf, other]

UniHDA: A Unified and Versatile Framework for Multi-Modal Hybrid Domain Adaptation

Authors: Hengjia Li, Yang Liu, Yuqi Lin, Zhanwei Zhang, Yibo Zhao, Weihang Pan, Tu Zheng, Zheng Yang, Yuchun Jiang, Boxi Wu, Deng Cai

Abstract: Recently, generative domain adaptation has achieved remarkable progress, enabling us to adapt a pre-trained generator to a new target domain. However, existing methods simply adapt the generator to a single target domain and are limited to a single modality, either text-driven or image-driven. Moreover, they cannot maintain well consistency with the source domain, which impedes the inheritance of the diversity. In this paper, we propose UniHDA, a unified and versatile framework for generative hybrid domain adaptation with multi-modal references from multiple domains. We use CLIP encoder to project multi-modal references into a unified embedding space and then linearly interpolate the direction vectors from multiple target domains to achieve hybrid domain adaptation. To ensure consistency with the source domain, we propose a novel cross-domain spatial structure (CSS) loss that maintains detailed spatial structure information between source and target generator. Experiments show that the adapted generator can synthesise realistic images with various attribute compositions. Additionally, our framework is generator-agnostic and versatile to multiple generators, e.g., StyleGAN, EG3D, and Diffusion Models.

Submitted 15 March, 2024; v1 submitted 23 January, 2024; originally announced January 2024.

arXiv:2401.04971 [pdf, other]

A Survey on Cross-Domain Sequential Recommendation

Authors: Shu Chen, Zitao Xu, Weike Pan, Qiang Yang, Zhong Ming

Abstract: Cross-domain sequential recommendation (CDSR) shifts the modeling of user preferences from flat to stereoscopic by integrating and learning interaction information from multiple domains at different granularities (ranging from inter-sequence to intra-sequence and from single-domain to cross-domain). In this survey, we first define the CDSR problem using a four-dimensional tensor and then analyze its multi-type input representations under multidirectional dimensionality reductions. Following that, we provide a systematic overview from both macro and micro views. From a macro view, we abstract the multi-level fusion structures of various models across domains and discuss their bridges for fusion. From a micro view, focusing on the existing models, we first discuss the basic technologies and then explain the auxiliary learning technologies. Finally, we exhibit the available public datasets and the representative experimental results as well as provide some insights into future directions for research in CDSR.

Submitted 17 May, 2024; v1 submitted 10 January, 2024; originally announced January 2024.

Comments: Accepted to the IJCAI 2024 Survey Track

arXiv:2401.04217 [pdf, other]

Force Propagation in Active Cytoskeletal Networks

Authors: Shichen Liu, Rosalind Wenshan Pan, Heun ** Lee, Shahriar Shadkhoo, Fan Yang, Chunhe Li, Zijie Qu, Rob Phillips, Matt Thomson

Abstract: In biological systems, molecular-scale forces and motions are pivotal for enabling processes like motility, shape change, and replication. These forces and motions are organized, amplified, and transmitted across macroscopic scales by active materials such as the cytoskeleton, which drives micron-scale cellular movement and re-organization. Despite the integral role of active materials, understanding how molecular-scale interactions alter macroscopic structure and force propagation remains elusive. This knowledge gap presents challenges to the harnessing and regulation of such dynamics across diverse length scales. Here, we demonstrate how mediating the bundling of microtubules can shift active matter between a global force-transmitting phase and a local force-dissipating phase. A fivefold increase in microtubule effective length results in the transition from local to global phase with a hundredfold increase in velocity autocorrelation. Through theory and simulation, we identify signatures of a percolation-driven transition between the two phases. This provides evidence for how force propagation can be generated when local molecular interactions reach a sufficient length scale. We show that force propagation in the active matter system enables material transport. Consequently, we demonstrate that the global phase is capable of facilitating millimeter-scale human cell transport and manipulation, as well as powering the movement of aqueous droplets. These findings underscore the potential for designing active materials capable of force organization and transmission. Our results lay the foundation for further exploration into the organization and propagation of forces/stresses in biological systems, thereby paving the way for the engineering of active materials in synthetic biology and soft robotics.

Submitted 12 April, 2024; v1 submitted 8 January, 2024; originally announced January 2024.

Comments: 15 pages, 4 figures

arXiv:2401.03676 [pdf, other]

Assessing AI Detectors in Identifying AI-Generated Code: Implications for Education

Authors: Wei Hung Pan, Ming Jie Chok, Jonathan Leong Shan Wong, Yung Xin Shin, Yeong Shian Poon, Zhou Yang, Chun Yong Chong, David Lo, Mei Kuan Lim

Abstract: Educators are increasingly concerned about the usage of Large Language Models (LLMs) such as ChatGPT in programming education, particularly regarding the potential exploitation of imperfections in Artificial Intelligence Generated Content (AIGC) Detectors for academic misconduct. In this paper, we present an empirical study where the LLM is examined for its attempts to bypass detection by AIGC Detectors. This is achieved by generating code in response to a given question using different variants. We collected a dataset comprising 5,069 samples, with each sample consisting of a textual description of a coding problem and its corresponding human-written Python solution codes. These samples were obtained from various sources, including 80 from Quescol, 3,264 from Kaggle, and 1,725 from LeetCode. From the dataset, we created 13 sets of code problem variant prompts, which were used to instruct ChatGPT to generate the outputs. Subsequently, we assessed the performance of five AIGC detectors. Our results demonstrate that existing AIGC Detectors perform poorly in distinguishing between human-written code and AI-generated code.

Submitted 8 January, 2024; originally announced January 2024.

Comments: 11 pages, paper accepted at 46th International Conference on Software Engineering, Software Engineering Education and Training Track (ICSE-SEET 2024)

arXiv:2312.15248 [pdf, other]

Type-II Apollonian Model

Authors: Fei Ma, **zhi Ouyang, ** Wang, Haobin Shi, Wei Pan

Abstract: The family of planar graphs is a particularly important family and models many real-world networks. In this paper, we propose a principled framework based on the widely-known Apollonian packing process to generate new planar network, i.e., Type-II Apollonian network $\mathcal{A}_{t}$. The manipulation is different from that of the typical Apollonian network, and is proceeded in terms of the iterative addition of triangle instead of vertex. As a consequence, network $\mathcal{A}_{t}$ turns out to be hamiltonian and eulerian, however, the typical Apollonian network is not. Then, we in-depth study some fundamental structural properties on network $\mathcal{A}_{t}$, and verify that network $\mathcal{A}_{t}$ is sparse like most real-world networks, has scale-free feature and small-world property, and exhibits disassortative mixing structure. Next, we design an effective algorithm for solving the problem of how to enumerate spanning trees on network $\mathcal{A}_{t}$, and derive the asymptotic solution of the spanning tree entropy, which suggests that Type-II Apollonian network is more reliable to a random removal of edges than the typical Apollonian network. Additionally, we study trapping problem on network $\mathcal{A}_{t}$, and use average trapping time as metric to show that Type-II Apollonian network $\mathcal{A}_{t}$ has better structure for fast information diffusion than the typical Apollonian network.

Submitted 23 December, 2023; originally announced December 2023.

arXiv:2312.08631 [pdf, other]

Semi-supervised Semantic Segmentation Meets Masked Modeling: Fine-grained Locality Learning Matters in Consistency Regularization

Authors: Wentao Pan, Zhe Xu, Jiangpeng Yan, Zihan Wu, Raymond Kai-yu Tong, Xiu Li, Jianhua Yao

Abstract: Semi-supervised semantic segmentation aims to utilize limited labeled images and abundant unlabeled images to achieve label-efficient learning, wherein the weak-to-strong consistency regularization framework, popularized by FixMatch, is widely used as a benchmark scheme. Despite its effectiveness, we observe that such scheme struggles with satisfactory segmentation for the local regions. This can be because it originally stems from the image classification task and lacks specialized mechanisms to capture fine-grained local semantics that prioritizes in dense prediction. To address this issue, we propose a novel framework called MaskMatch, which enables fine-grained locality learning to achieve better dense segmentation. On top of the original teacher-student framework, we design a masked modeling proxy task that encourages the student model to predict the segmentation given the unmasked image patches (even with 30% only) and enforces the predictions to be consistent with pseudo-labels generated by the teacher model using the complete image. Such design is motivated by the intuition that if the predictions are more consistent given insufficient neighboring information, stronger fine-grained locality perception is achieved. Besides, recognizing the importance of reliable pseudo-labels in the above locality learning and the original consistency learning scheme, we design a multi-scale ensembling strategy that considers context at different levels of abstraction for pseudo-label generation. Extensive experiments on benchmark datasets demonstrate the superiority of our method against previous approaches and its plug-and-play flexibility.

Submitted 13 December, 2023; originally announced December 2023.

arXiv:2312.06987

A new lightweight additive homomorphic encryption algorithm

Authors: Wuqiong Pan, Hongliang Gu

Abstract: This article describes a lightweight additive homomorphic algorithm with the same encryption and decryption keys. Compared to standard additive homomorphic algorithms like Paillier, this algorithm reduces the computational cost of encryption and decryption from modular exponentiation to modular multiplication, and reduces the computational cost of ciphertext addition from modular multiplication to modular addition. This algorithm is based on a new mathematical problem: in two division operations, whether it is possible to infer the remainder or divisor based on the dividend when two remainders are related. Currently, it is not obvious how to break this problem, but further exploration is needed to determine if it is sufficiently difficult. In addition to this mathematical problem, we have also designed two interesting mathematical structures for decryption, which are used in the two algorithms mentioned in the main text. It is possible that the decryption structure of Algorithm 2 introduces new security vulnerabilities, but we have not investigated this issue thoroughly.

Submitted 1 April, 2024; v1 submitted 12 December, 2023; originally announced December 2023.

Comments: The algorithm proposed in this paper has a serious security problem. It can be attacked using an orthogonal lattice.

arXiv:2312.05643 [pdf, other]

NiSNN-A: Non-iterative Spiking Neural Networks with Attention with Application to Motor Imagery EEG Classification

Authors: Chuhan Zhang, Wei Pan, Cosimo Della Santina

Abstract: Motor imagery, an important category in electroencephalogram (EEG) research, often intersects with scenarios demanding low energy consumption, such as portable medical devices and isolated environment operations. Traditional deep learning algorithms, despite their effectiveness, are characterized by significant computational demands accompanied by high energy usage. As an alternative, spiking neural networks (SNNs), inspired by the biological functions of the brain, emerge as a promising energy-efficient solution. However, SNNs typically exhibit lower accuracy than their counterpart convolutional neural networks (CNNs). Although attention mechanisms successfully increase network accuracy by focusing on relevant features, their integration in the SNN framework remains an open question. In this work, we combine the SNN and the attention mechanisms for the EEG classification, aiming to improve precision and reduce energy consumption. To this end, we first propose a Non-iterative Leaky Integrate-and-Fire (LIF) neuron model, overcoming the gradient issues in the traditional SNNs using the Iterative LIF neurons. Then, we introduce the sequence-based attention mechanisms to refine the feature map. We evaluated the proposed Non-iterative SNN with Attention (NiSNN-A) model on OpenBMI, a large-scale motor imagery dataset. Experiment results demonstrate that 1) our model outperforms other SNN models by achieving higher accuracy, 2) our model increases energy efficiency compared to the counterpart CNN models (i.e., by 2.27 times) while maintaining comparable accuracy.

Submitted 9 December, 2023; originally announced December 2023.

arXiv:2312.01175 [pdf]

High Q and high gradient performance of the first medium-temperature baking 1.3 GHz cryomodule

Authors: Jiyuan Zhai, Weimin Pan, Feisi He, Rui Ge, Zhenghui Mi, Peng Sha, Song **, Ruixiong Han, Qunyao Wang, Haiying Lin, Guangwei Wang, Mei Li, Min**g Sang, Liangrui Sun, Rui Ye, Tongxian Zhao, Shaopeng Li, Keyu Zhu, Baiqi Liu, Xiaolong Wang, Xiangchen Yang, Xiaojuan Bian, Xiangzhen Zhang, Huizhou Ma, Xuwen Dai, et al. (14 additional authors not shown)

Abstract: World's first 1.3 GHz cryomodule containing eight 9-cell superconducting radio-frequency (RF) cavities treated by medium-temperature furnace baking (mid-T bake) was developed, assembled and tested at IHEP for the Dalian Advanced Light Source (DALS) and CEPC R&D. The 9-cell cavities in the cryomodule achieved an unprecedented highest average Q0 of 3.8E10 at 16 MV/m and 3.6E10 at 21 MV/m in the horizontal test. The cryomodule can operate stably up to a total CW RF voltage greater than 191 MV, with an average cavity CW accelerating gradient of more than 23 MV/m. The results significantly exceed the specifications of CEPC, DALS and the other high repetition rate free electron laser facilities (LCLS-II, LCLS-II-HE, SHINE, S3FEL). There is evidence that the mid-T bake cavity may not require fast cool-down or long processing time in the cryomodule. This paper reviews the cryomodule performance and discusses some important issues in cryomodule assembly and testing.

Submitted 2 December, 2023; originally announced December 2023.

Comments: 5 pages, 6 figures

arXiv:2311.12600 [pdf, other]

Weyl semimetal from non-inertial observers

Authors: Wen-Bin Pan, Ya-Wen Sun

Abstract: We show that a reference frame transformation could turn a topologically trivial Dirac fermion into a topologically nontrivial Weyl semimetal. This is elucidated by the transformation of the Dirac equation into the equation for Weyl semimetals through specific infinitesimal local Lorentz transformations of the orthonormal basis. This kind of transformation, interpreted as a change of reference frame, could induce an observational effect that an axial gauge field and/or a vector U(1) gauge field appears effectively, which are in fact inertial forces in the non-inertial frame. The precise local Lorentz transformations and the movement of observers needed to realize the two additional fields are provided respectively. This novel effect can be viewed as a generalization of the effect found in relativistic hydrodynamics that topologically trivial modes in an inertial frame could become topologically nontrivial observed by a special non-inertial observer.

Submitted 21 November, 2023; originally announced November 2023.

Comments: 19 pages, 2 figures

arXiv:2311.10265 [pdf, other]

On the dimension of limit sets on $\mathbb{P}(\mathbb{R}^3)$ via stationary measures: the theory and applications

Authors: Jialun Li, Wenyu Pan, Disheng Xu

Abstract: This paper investigates the (semi)group action of $\mathrm{SL}_3(\mathbb{R})$ on $\mathbb{P}(\mathbb{R}^3)$, a primary example of non-conformal, non-linear, and non-strictly contracting action. We study the Hausdorff dimension of a dynamically defined limit set in $\mathbb{P}(\mathbb{R}^3)$ and generalize the classical Patterson-Sullivan formula using the approach of stationary measures. The two main examples are Anosov representations in $\mathrm{SL}_3(\mathbb{R})$ and the Rauzy gasket. 1. For Anosov representations in $\mathrm{SL}_3(\mathbb{R})$, we establish a sharp lower bound for the dimension of their limit sets in $\mathbb{P}(\mathbb{R}^3)$. Coupled with the upper bound in Pozzetti-Sambarino-Wienhard, it shows that their Hausdorff dimensions equal the affinity exponents. The merit of our approach is that it works uniformly for all the components of irreducible Anosov representations in $\mathrm{SL}_3(\mathbb{R})$. As an application, it reveals a surprising dimension jump phenomenon in the Barbot component, which is a local generalization of Bowen's dimension rigidity result. 2. For the Rauzy gasket, we confirm a folklore conjecture about the Hausdorff dimension of the gasket and improve the numerical lower bound to $3/2$. These results originate from a dimension formula of stationary measures on $\mathbb{P}(\mathbb{R}^3)$. Let $ν$ be a probability measure on $\mathrm{SL}_3(\mathbb{R})$ whose support is finite and spans a Zariski dense subgroup. Let $μ$ be the associated stationary measure for the action on $\mathbb{P}(\mathbb{R}^3)$. Under the exponential separation condition on $ν$, we prove that the Hausdorff dimension of $μ$ equals its Lyapunov dimension, which extends Hochman-Solomyak and Bárány-Hochman-Rapaport to non-conformal and projective settings respectively.

Submitted 27 February, 2024; v1 submitted 16 November, 2023; originally announced November 2023.

Comments: Abstract and introduction are changed

arXiv:2311.10262 [pdf, other]

On the dimension of limit sets on $\mathbb{P}(\mathbb{R}^3)$ via stationary measures: variational principles and applications

Authors: Yuxiang Jiao, Jialun Li, Wenyu Pan, Disheng Xu

Abstract: In this article, we establish the variational principle of the affinity exponent of Borel Anosov representations. We also establish such a principle of the Rauzy gasket. In Li-Pan-Xu, they obtain a dimension formula of the stationary measures on $\mathbb{P}(\mathbb{R}^3)$. Combined with our result, it allows us to study the Hausdorff dimension of limit sets of Anosov representations in $\mathrm{SL}_3(\mathbb{R})$ and the Rauzy gasket. It yields the equality between the Hausdorff dimensions and the affinity exponents in both settings. In the appendix, we improve the numerical lower bound of the Hausdorff dimension of Rauzy gasket to $1.5$.

Submitted 13 December, 2023; v1 submitted 16 November, 2023; originally announced November 2023.

Comments: We add an appendix where we prove the Hausdorff dimension of Rauzy gasket is at least $1.5$

arXiv:2311.09008 [pdf, other]

End-to-end Task-oriented Dialogue: A Survey of Tasks, Methods, and Future Directions

Authors: Libo Qin, Wenbo Pan, Qiguang Chen, Lizi Liao, Zhou Yu, Yue Zhang, Wanxiang Che, Min Li

Abstract: End-to-end task-oriented dialogue (EToD) can directly generate responses in an end-to-end fashion without modular training, which attracts escalating popularity. The advancement of deep neural networks, especially the successful use of large pre-trained models, has further led to significant progress in EToD research in recent years. In this paper, we present a thorough review and provide a unified perspective to summarize existing approaches as well as recent trends to advance the development of EToD research. The contributions of this paper can be summarized: (1) \textbf{\textit{First survey}}: to our knowledge, we take the first step to present a thorough survey of this research field; (2) \textbf{\textit{New taxonomy}}: we first introduce a unified perspective for EToD, including (i) \textit{Modularly EToD} and (ii) \textit{Fully EToD}; (3) \textbf{\textit{New Frontiers}}: we discuss some potential frontier areas as well as the corresponding challenges, hoping to spur breakthrough research in EToD field; (4) \textbf{\textit{Abundant resources}}: we build a public website\footnote{We collect the related papers, baseline projects, and leaderboards for the community at \url{https://etods.net/}.}, where EToD researchers could directly access the recent progress. We hope this work can serve as a thorough reference for the EToD research community.

Submitted 15 November, 2023; originally announced November 2023.

Comments: Accepted at EMNLP2023

Showing 1–50 of 392 results for author: Pan, W