Search | arXiv e-print repository

arXiv:2407.02031 [pdf, other]

SwiftDiffusion: Efficient Diffusion Model Serving with Add-on Modules

Authors: Suyi Li, Lingyun Yang, Xiaoxiao Jiang, Hanfeng Lu, Zhipeng Di, Weiyi Lu, Jiawei Chen, Kan Liu, Yinghao Yu, Tao Lan, Guodong Yang, Lin Qu, Liping Zhang, Wei Wang

Abstract: This paper documents our characterization study and practices for serving text-to-image requests with stable diffusion models in production. We first comprehensively analyze inference request traces for commercial text-to-image applications. It commences with our observation that add-on modules, i.e., ControlNets and LoRAs, that augment the base stable diffusion models, are ubiquitous in generating images for commercial applications. Despite their efficacy, these add-on modules incur high loading overhead, prolong the serving latency, and swallow up expensive GPU resources. Driven by our characterization study, we present SwiftDiffusion, a system that efficiently generates high-quality images using stable diffusion models and add-on modules. To achieve this, SwiftDiffusion reconstructs the existing text-to-image serving workflow by identifying the opportunities for parallel computation and distributing ControlNet computations across multiple GPUs. Further, SwiftDiffusion thoroughly analyzes the dynamics of image generation and develops techniques to eliminate the overhead associated with LoRA loading and patching while preserving the image quality. Last, SwiftDiffusion proposes specialized optimizations in the backbone architecture of the stable diffusion models, which are also compatible with the efficient serving of add-on modules. Compared to state-of-the-art text-to-image serving systems, SwiftDiffusion reduces serving latency by up to 5x and improves serving throughput by up to 2x without compromising image quality.

Submitted 2 July, 2024; originally announced July 2024.

arXiv:2407.01885 [pdf, other]

Survey on Knowledge Distillation for Large Language Models: Methods, Evaluation, and Application

Authors: Chuanpeng Yang, Wang Lu, Yao Zhu, Yidong Wang, Qian Chen, Chenlong Gao, Bingjie Yan, Yiqiang Chen

Abstract: Large Language Models (LLMs) have showcased exceptional capabilities in various domains, attracting significant interest from both academia and industry. Despite their impressive performance, the substantial size and computational demands of LLMs pose considerable challenges for practical deployment, particularly in environments with limited resources. The endeavor to compress language models while maintaining their accuracy has become a focal point of research. Among the various methods, knowledge distillation has emerged as an effective technique to enhance inference speed without greatly compromising performance. This paper presents a thorough survey from three aspects: method, evaluation, and application, exploring knowledge distillation techniques tailored specifically for LLMs. Specifically, we divide the methods into white-box KD and black-box KD to better illustrate their differences. Furthermore, we also explored the evaluation tasks and distillation effects between different distillation methods, and proposed directions for future research. Through in-depth understanding of the latest advancements and practical applications, this survey provides valuable resources for researchers, paving the way for sustained progress in this field.

Submitted 1 July, 2024; originally announced July 2024.

Comments: 28 pages
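
Illustrative note (not part of the arXiv listing): the abstract above distinguishes white-box KD, where the student can see the teacher's internal outputs, from black-box KD, where it sees only generated text. As a minimal sketch of the white-box case, assuming the classic temperature-scaled logit-matching loss (an assumption chosen for illustration; the survey itself covers many variants), the function names `softmax` and `kd_loss` below are hypothetical:

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax over a list of logits at a given temperature."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def kd_loss(teacher_logits, student_logits, temperature=2.0):
    """KL(teacher || student) on temperature-softened distributions.

    The T^2 factor is the conventional rescaling that keeps gradient
    magnitudes comparable across temperatures.
    """
    p = softmax(teacher_logits, temperature)  # teacher distribution
    q = softmax(student_logits, temperature)  # student distribution
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
    return temperature ** 2 * kl


if __name__ == "__main__":
    teacher = [1.0, 2.0, 3.0]
    print(kd_loss(teacher, teacher))          # identical logits give zero loss
    print(kd_loss(teacher, [3.0, 2.0, 1.0]))  # mismatched logits give a positive loss
```

A black-box setup cannot compute this loss because the teacher's logits are unavailable; it would instead train the student on text sampled from the teacher.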

arXiv:2407.01455 [pdf, other]

TimeToM: Temporal Space is the Key to Unlocking the Door of Large Language Models' Theory-of-Mind

Authors: Guiyang Hou, Wenqi Zhang, Yongliang Shen, Linjuan Wu, Weiming Lu

Abstract: Theory of Mind (ToM), the cognitive ability to reason about mental states of ourselves and others, is the foundation of social interaction. Although ToM comes naturally to humans, it poses a significant challenge to even the most advanced Large Language Models (LLMs). Due to the complex logical chains in ToM reasoning, especially in higher-order ToM questions, simply utilizing reasoning methods like Chain of Thought (CoT) will not improve the ToM capabilities of LLMs. We present TimeToM, which constructs a temporal space and uses it as the foundation to improve the ToM capabilities of LLMs in multiple scenarios. Specifically, within the temporal space, we construct a Temporal Belief State Chain (TBSC) for each character and, inspired by the cognition perspective of the social world model, we divide TBSC into self-world beliefs and social world beliefs, aligning with first-order ToM (first-order beliefs) and higher-order ToM (higher-order beliefs) questions, respectively. Moreover, we design a novel tool-belief solver that, by considering belief communication between characters in temporal space, can transform a character's higher-order beliefs into another character's first-order beliefs during the belief communication period. Experimental results indicate that TimeToM can dramatically improve the reasoning performance of LLMs on ToM questions while taking a big step towards coherent and robust ToM reasoning.

Submitted 1 July, 2024; originally announced July 2024.

Comments: 16 pages, 6 figures, ACL 2024(findings)

arXiv:2407.00390 [pdf, other]

Advancing Process Verification for Large Language Models via Tree-Based Preference Learning

Authors: Mingqian He, Yongliang Shen, Wenqi Zhang, Zeqi Tan, Weiming Lu

Abstract: Large Language Models (LLMs) have demonstrated remarkable potential in handling complex reasoning tasks by generating step-by-step rationales. Some methods have proven effective in boosting accuracy by introducing extra verifiers to assess these paths. However, existing verifiers, typically trained on binary-labeled reasoning paths, fail to fully utilize the relative merits of intermediate steps, thereby limiting the effectiveness of the feedback provided. To overcome this limitation, we propose Tree-based Preference Learning Verifier (Tree-PLV), a novel approach that constructs reasoning trees via a best-first search algorithm and collects step-level paired data for preference training. Compared to traditional binary classification, step-level preferences more finely capture the nuances between reasoning steps, allowing for a more precise evaluation of the complete reasoning path. We empirically evaluate Tree-PLV across a range of arithmetic and commonsense reasoning tasks, where it significantly outperforms existing benchmarks. For instance, Tree-PLV achieved substantial performance gains over the Mistral-7B self-consistency baseline on GSM8K (67.55% to 82.79%), MATH (17.00% to 26.80%), CSQA (68.14% to 72.97%), and StrategyQA (82.86% to 83.25%). Additionally, our study explores the appropriate granularity for applying preference learning, revealing that step-level guidance provides feedback that better aligns with the evaluation of the reasoning process.

Submitted 29 June, 2024; originally announced July 2024.

arXiv:2406.19853 [pdf, other]

YuLan: An Open-source Large Language Model

Authors: Yutao Zhu, Kun Zhou, Kelong Mao, Wentong Chen, Yiding Sun, Zhipeng Chen, Qian Cao, Yihan Wu, Yushuo Chen, Feng Wang, Lei Zhang, Junyi Li, Xiaolei Wang, Lei Wang, Beichen Zhang, Zican Dong, Xiaoxue Cheng, Yuhan Chen, Xinyu Tang, Yupeng Hou, Qiangqiang Ren, Xincheng Pang, Shufang Xie, Wayne Xin Zhao, Zhicheng Dou, et al. (13 additional authors not shown)

Abstract: Large language models (LLMs) have become the foundation of many applications, leveraging their extensive capabilities in processing and understanding natural language. While many open-source LLMs have been released with technical reports, the lack of training details hinders further research and development. This paper presents the development of YuLan, a series of open-source LLMs with $12$ billion parameters. The base model of YuLan is pre-trained on approximately $1.7$T tokens derived from a diverse corpus, including massive English, Chinese, and multilingual texts. We design a three-stage pre-training method to enhance YuLan's overall capabilities. Subsequent phases of training incorporate instruction-tuning and human alignment, employing a substantial volume of high-quality synthesized data. To facilitate the learning of complex and long-tail knowledge, we devise a curriculum-learning framework across these stages, which helps LLMs learn knowledge in an easy-to-hard manner. YuLan's training was finished in January 2024, and it has achieved performance on par with state-of-the-art LLMs across various English and Chinese benchmarks. This paper outlines a comprehensive technical roadmap for developing LLMs from scratch. Our model and codes are available at https://github.com/RUC-GSAI/YuLan-Chat.

Submitted 28 June, 2024; originally announced June 2024.

arXiv:2406.18505 [pdf, other]

Mental Modeling of Reinforcement Learning Agents by Language Models

Authors: Wenhao Lu, Xufeng Zhao, Josua Spisak, Jae Hee Lee, Stefan Wermter

Abstract: Can emergent language models faithfully model the intelligence of decision-making agents? Though modern language models already exhibit some reasoning ability, and theoretically can potentially express any probable distribution over tokens, it remains underexplored how the world knowledge these pretrained models have memorized can be utilized to comprehend an agent's behaviour in the physical world. This study empirically examines, for the first time, how well large language models (LLMs) can build a mental model of agents, termed agent mental modelling, by reasoning about an agent's behaviour and its effect on states from agent interaction history. This research may unveil the potential of leveraging LLMs for elucidating RL agent behaviour, addressing a key challenge in eXplainable reinforcement learning (XRL). To this end, we propose specific evaluation metrics and test them on selected RL task datasets of varying complexity, reporting findings on agent mental model establishment. Our results disclose that LLMs are not yet capable of fully mental modelling agents through inference alone without further innovations. This work thus provides new insights into the capabilities and limitations of modern LLMs.

Submitted 26 June, 2024; originally announced June 2024.

Comments: https://lukaswill.github.io/

arXiv:2406.15222 [pdf]

Rapid and Accurate Diagnosis of Acute Aortic Syndrome using Non-contrast CT: A Large-scale, Retrospective, Multi-center and AI-based Study

Authors: Yujian Hu, Yilang Xiang, Yan-Jie Zhou, Yangyan He, Shifeng Yang, Xiaolong Du, Chunlan Den, Youyao Xu, Gaofeng Wang, Zhengyao Ding, Jingyong Huang, Wenjun Zhao, Xuejun Wu, Donglin Li, Qianqian Zhu, Zhenjiang Li, Chenyang Qiu, Ziheng Wu, Yunjun He, Chen Tian, Yihui Qiu, Zuodong Lin, Xiaolong Zhang, Yuan He, Zhenpeng Yuan, et al. (15 additional authors not shown)

Abstract: Chest pain symptoms are highly prevalent in emergency departments (EDs), where acute aortic syndrome (AAS) is a catastrophic cardiovascular emergency with a high fatality rate, especially when timely and accurate treatment is not administered. However, current triage practices in the ED can cause up to approximately half of patients with AAS to have an initially missed diagnosis or be misdiagnosed as having other acute chest pain conditions. Subsequently, these AAS patients will undergo clinically inaccurate or suboptimal differential diagnosis. Fortunately, even under these suboptimal protocols, nearly all these patients underwent non-contrast CT covering the aorta anatomy at the early stage of differential diagnosis. In this study, we developed an artificial intelligence model (DeepAAS) using non-contrast CT, which is highly accurate for identifying AAS and provides interpretable results to assist in clinical decision-making. Performance was assessed in two major phases: a multi-center retrospective study (n = 20,750) and an exploration in real-world emergency scenarios (n = 137,525). In the multi-center cohort, DeepAAS achieved a mean area under the receiver operating characteristic curve of 0.958 (95% CI 0.950-0.967). In the real-world cohort, DeepAAS detected 109 AAS patients with misguided initial suspicion, achieving 92.6% (95% CI 76.2%-97.5%) in mean sensitivity and 99.2% (95% CI 99.1%-99.3%) in mean specificity. Our AI model performed well on non-contrast CT at all applicable early stages of differential diagnosis workflows, effectively reduced the overall missed diagnosis and misdiagnosis rate from 48.8% to 4.8% and shortened the diagnosis time for patients with misguided initial suspicion from an average of 681.8 (74-11,820) mins to 68.5 (23-195) mins. DeepAAS could effectively fill the gap in the current clinical workflow without requiring additional tests.

Submitted 24 June, 2024; v1 submitted 13 June, 2024; originally announced June 2024.

Comments: under peer review

arXiv:2406.13448 [pdf, other]

Demonstration of High-Efficiency Microwave Heating Producing Record Highly Charged Xenon Ion Beams with Superconducting ECR Ion Sources

Authors: X. Wang, J. B. Li, V. Mironov, J. W. Guo, X. Z. Zhang, O. Tarvainen, Y. C. Feng, L. X. Li, J. D. Ma, Z. H. Zhang, W. Lu, S. Bogomolov, L. Sun, H. W. Zhao

Abstract: Intense highly charged ion beam production is essential for high-power heavy ion accelerators. A novel movable Vlasov launcher for superconducting high charge state Electron Cyclotron Resonance (ECR) ion source has been devised that can affect the microwave power effectiveness by a factor of about 4 in terms of highly charged ion beam production. This approach based on a dedicated microwave launching system instead of the traditional coupling scheme has led to new insight on microwave-plasma interaction. With this new understanding, the world record highly charged xenon ion beam currents have been enhanced by up to a factor of 2, which could directly and significantly enhance the performance of heavy ion accelerators and provide many new research opportunities in nuclear physics, atomic physics and other disciplines.

Submitted 25 June, 2024; v1 submitted 19 June, 2024; originally announced June 2024.

arXiv:2406.13198 [pdf, other]

Single-photon triggered quantum entanglement between two qubits or at least 2000 identical qubits

Authors: Wangjun Lu, Cuilu Zhai, Hong Tao, Yaju Song, Shiqing Tang, Lan Xu

Abstract: This paper studies the effect of single-photon light fields on quantum entanglement between two qubits and multiple identical qubits initially in a direct state. For two qubits, we first analyze the impact of the excited state's weight on single-photon-triggered entanglement, finding that excessive weight disrupts this process. We then explore how initial coherence affects entanglement, discovering that maximum initial coherence enables the single photon to achieve maximal entanglement. For multiple qubits, we similarly investigate the effects of the excited state's weight and initial coherence on entanglement control. In large qubit systems, we find that single photons cannot trigger entanglement when excited-state weights exceed ground-state weights or when all qubits are initially in the ground state. Interestingly, single photons can still trigger entanglement between any two qubits in systems with at least 2000 qubits, with the entanglement depending on initial state parameters rather than the number of qubits.

Submitted 19 June, 2024; originally announced June 2024.

Comments: 19 pages, 11 figures

arXiv:2406.10956 [pdf, other]

Robust Channel Learning for Large-Scale Radio Speaker Verification

Authors: Wenhao Yang, Jianguo Wei, Wenhuan Lu, Lei Li, Xugang Lu

Abstract: Recent research in speaker verification has increasingly focused on achieving robust and reliable recognition under challenging channel conditions and noisy environments. Identifying speakers in radio communications is particularly difficult due to inherent limitations such as constrained bandwidth and pervasive noise interference. To address this issue, we present a Channel Robust Speaker Learning (CRSL) framework that enhances the robustness of the current speaker verification pipeline, considering data source, data augmentation, and the efficiency of model transfer processes. Our framework introduces an augmentation module that mitigates bandwidth variations in radio speech datasets by manipulating the bandwidth of training inputs. It also addresses unknown noise by introducing noise within the manifold space. Additionally, we propose an efficient fine-tuning method that reduces the need for extensive additional training time and large amounts of data. Moreover, we develop a toolkit for assembling a large-scale radio speech corpus and establish a benchmark specifically tailored for radio scenario speaker verification studies. Experimental results demonstrate that our proposed methodology effectively enhances performance and mitigates degradation caused by radio transmission in speaker verification tasks. The code will be available on GitHub.

Submitted 16 June, 2024; originally announced June 2024.

Comments: 12 pages, 11 figures

arXiv:2406.10505 [pdf, other]

CroPrompt: Cross-task Interactive Prompting for Zero-shot Spoken Language Understanding

Authors: Libo Qin, Fuxuan Wei, Qiguang Chen, Jingxuan Zhou, Shijue Huang, Jiasheng Si, Wenpeng Lu, Wanxiang Che

Abstract: Slot filling and intent detection are two highly correlated tasks in spoken language understanding (SLU). Recent SLU research attempts to explore zero-shot prompting techniques in large language models to alleviate the data scarcity problem. Nevertheless, the existing prompting work ignores the cross-task interaction information for SLU, which leads to sub-optimal performance. To solve this problem, we present the pioneering work of Cross-task Interactive Prompting (CroPrompt) for SLU, which enables the model to interactively leverage the information exchange across the correlated tasks in SLU. Additionally, we further introduce a multi-task self-consistency mechanism to mitigate the error propagation caused by the intent information injection. We conduct extensive experiments on the standard SLU benchmark and the results reveal that CroPrompt consistently outperforms the existing prompting approaches. In addition, the multi-task self-consistency mechanism can effectively ease the error propagation issue, thereby enhancing the performance. We hope this work can inspire more research on cross-task prompting for SLU.

Submitted 15 June, 2024; originally announced June 2024.

arXiv:2406.10222 [pdf, other]

Ultra-low noise laser and optical frequency comb-based timing system for the Black Hole Explorer (BHEX) mission

Authors: Hannah Tomio, Guangning Yang, Holly F. Leopardi, Kenji Numata, Anthony W. Yu, Andrew Attar, Xiaozhen Xu, Wei Lu, Cheryl Gramling, T. K. Sridharan, Peter Kurczynski

Abstract: In this effort, we demonstrate the performance of a highly stable time reference for the proposed Black Hole Explorer (BHEX) mission, a space-based extension to the Event Horizon Telescope (EHT) Very Long Baseline Interferometry (VLBI) project. This precision timing system is based on the use of a space-qualified, ultra-low noise laser developed as part of the Laser Interferometer Space Antenna (LISA) mission as the timing reference, and an optical frequency comb to transfer the stability of this laser to the microwave regime for instrumentation use. We describe the implementation of this system and experimental setup to characterize the stability performance. We present the results of this experiment that demonstrate the performance of this system meets requirements for the BHEX mission.

Submitted 14 June, 2024; originally announced June 2024.

Comments: To be published in the proceedings of SPIE Astronomical Telescopes + Instrumentation 2024

arXiv:2406.09988 [pdf, other]

Details Make a Difference: Object State-Sensitive Neurorobotic Task Planning

Authors: Xiaowen Sun, Xufeng Zhao, Jae Hee Lee, Wenhao Lu, Matthias Kerzel, Stefan Wermter

Abstract: The state of an object reflects its current status or condition and is important for a robot's task planning and manipulation. However, detecting an object's state and generating a state-sensitive plan for robots is challenging. Recently, pre-trained Large Language Models (LLMs) and Vision-Language Models (VLMs) have shown impressive capabilities in generating plans. However, to the best of our knowledge, there is hardly any investigation on whether LLMs or VLMs can also generate object state-sensitive plans. To study this, we introduce an Object State-Sensitive Agent (OSSA), a task-planning agent empowered by pre-trained neural networks. We propose two methods for OSSA: (i) a modular model consisting of a pre-trained vision processing module (dense captioning model, DCM) and a natural language processing model (LLM), and (ii) a monolithic model consisting only of a VLM. To quantitatively evaluate the performances of the two methods, we use tabletop scenarios where the task is to clear the table. We contribute a multimodal benchmark dataset that takes object states into consideration. Our results show that both methods can be used for object state-sensitive tasks, but the monolithic approach outperforms the modular approach. The code for OSSA is available at https://github.com/Xiao-wen-Sun/OSSA

Submitted 14 June, 2024; originally announced June 2024.

arXiv:2406.09469 [pdf, other]

Conformance Testing of Relational DBMS Against SQL Specifications

Authors: Shuang Liu, Chenglin Tian, Jun Sun, Ruifeng Wang, Wei Lu, Yongxin Zhao, Yinxing Xue, Junjie Wang, Xiaoyong Du

Abstract: A Relational Database Management System (RDBMS) is a fundamental piece of software that supports a wide range of applications, making it critical to identify bugs within these systems. There has been active research on testing RDBMS, most of which employs crashes or metamorphic relations as the oracle. Although existing approaches can detect bugs in RDBMS, they are far from comprehensively evaluating the RDBMS's correctness (i.e., with respect to the semantics of SQL). In this work, we propose a method to test the semantic conformance of RDBMS, i.e., whether its behavior respects the intended semantics of SQL. Specifically, we have formally defined the semantics of SQL and implemented them in Prolog. Then, the Prolog implementation serves as the reference RDBMS, enabling differential testing on existing RDBMS. We applied our approach to four widely-used and thoroughly tested RDBMSs, i.e., MySQL, TiDB, SQLite, and DuckDB. In total, our approach uncovered 19 bugs and 11 inconsistencies, which are all related to violating the SQL specification or missing/unclear specification, thereby demonstrating the effectiveness and applicability of our approach.

Submitted 13 June, 2024; originally announced June 2024.
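
Illustrative note (not part of the arXiv listing): the abstract above describes differential testing, in which a reference implementation of SQL semantics acts as the oracle for a real engine. The sketch below shows the general idea only. It swaps the paper's Prolog reference for a tiny hand-written Python evaluator covering one query shape, and compares it against SQLite through the standard-library `sqlite3` module. The table `t`, the sample rows, and all function names are hypothetical.

```python
import sqlite3

# Hypothetical sample data: (id, name, v); one row has a NULL value.
ROWS = [(1, "a", 10), (2, "b", None), (3, "c", 30)]


def reference_select(rows, threshold):
    """Reference semantics for: SELECT id FROM t WHERE v > ? ORDER BY id.

    Under SQL three-valued logic, NULL > x is UNKNOWN, so such rows
    are excluded from the result.
    """
    return sorted(r[0] for r in rows if r[2] is not None and r[2] > threshold)


def sqlite_select(rows, threshold):
    """Run the same query against an in-memory SQLite database."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE t (id INTEGER, name TEXT, v INTEGER)")
        conn.executemany("INSERT INTO t VALUES (?, ?, ?)", rows)
        cur = conn.execute(
            "SELECT id FROM t WHERE v > ? ORDER BY id", (threshold,)
        )
        return [row[0] for row in cur.fetchall()]
    finally:
        conn.close()


def differential_test(rows, thresholds):
    """Return (threshold, expected, actual) for every disagreement."""
    mismatches = []
    for th in thresholds:
        expected = reference_select(rows, th)
        actual = sqlite_select(rows, th)
        if expected != actual:
            mismatches.append((th, expected, actual))
    return mismatches


if __name__ == "__main__":
    print(differential_test(ROWS, [0, 10, 20, 100]))
```

An empty mismatch list means the engine agreed with the reference on every generated query; any entry is a candidate bug or a point where the specification is unclear.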

arXiv:2406.08012 [pdf, other]

Interaction of an outflow with surrounding gaseous clouds as the origin of the late-time radio flares in TDEs

Authors: Jialun Zhuang, Rong-Feng Shen, Guobin Mou, Wenbin Lu

Abstract: A close encounter between a star and a supermassive black hole (SMBH) results in the tidal disruption of the star, known as a tidal disruption event (TDE). Recently, a few TDEs, e.g., ASASSN-15oi and AT2018hyz, have shown late-time (hundreds of days after their UV/optical peaks) radio flares with radio luminosities of $10^{38\sim39}$ erg/s. The super-Eddington fallback or accretion in a TDE may generate a mass outflow. Here we investigate a scenario in which the late-time radio flares come from the interaction of the outflow with the circum-nuclear gaseous clouds, in addition to the slow-evolving emission component due to the outflow-diffuse medium interaction. We calculate the associated radio temporal and spectral signatures and find that they reproduce well the observations. The outflows have the inferred velocity of $0.2\sim0.8$ c, the total mass of $10^{-3}\sim10^{-1}$ $\mathrm{M_{\odot}}$ and the ejection duration of a month to a year. The distances of the clouds to the SMBH are $0.1\sim1$ pc. This scenario has advantages in explaining the long delay, sharpness of the rise and the multiplicity of the late radio flares. Future observations may build up a much larger sample of late-time radio flares and enable their use as a probe of the TDE physics and the host circumnuclear environment.

Submitted 26 June, 2024; v1 submitted 12 June, 2024; originally announced June 2024.

Comments: 13 pages, 13 figures. Submitted to ApJ. A new version with some modifications. Comments are welcome

arXiv:2406.06594 [pdf, other]

Stock Movement Prediction with Multimodal Stable Fusion via Gated Cross-Attention Mechanism

Authors: Chang Zong, Jian Shao, Weiming Lu, Yueting Zhuang

Abstract: The accurate prediction of stock movements is crucial for investment strategies. Stock prices are subject to the influence of various forms of information, including financial indicators, sentiment analysis, news documents, and relational structures. Predominant analytical approaches, however, tend to address only unimodal or bimodal sources, neglecting the complexity of multimodal data. Further complicating the landscape are the issues of data sparsity and semantic conflicts between these modalities, which are frequently overlooked by current models, leading to unstable performance and limiting practical applicability. To address these shortcomings, this study introduces a novel architecture, named Multimodal Stable Fusion with Gated Cross-Attention (MSGCA), designed to robustly integrate multimodal input for stock movement prediction. The MSGCA framework consists of three integral components: (1) a trimodal encoding module, responsible for processing indicator sequences, dynamic documents, and a relational graph, and standardizing their feature representations; (2) a cross-feature fusion module, where primary and consistent features guide the multimodal fusion of the three modalities via a pair of gated cross-attention networks; and (3) a prediction module, which refines the fused features through temporal and dimensional reduction to execute precise movement forecasting. Empirical evaluations demonstrate that the MSGCA framework exceeds current leading methods, achieving performance gains of 8.1%, 6.1%, 21.7% and 31.6% on four multimodal datasets, respectively, attributed to its enhanced multimodal fusion stability.

Submitted 5 June, 2024; originally announced June 2024.

Comments: 29 pages, 10 figures

MSC Class: 68T07 ACM Class: I.2.6; J.4

arXiv:2406.06563 [pdf, other]

Skywork-MoE: A Deep Dive into Training Techniques for Mixture-of-Experts Language Models

Authors: Tianwen Wei, Bo Zhu, Liang Zhao, Cheng Cheng, Biye Li, Weiwei Lü, Peng Cheng, Jianhao Zhang, Xiaoyu Zhang, Liang Zeng, Xiaokun Wang, Yutuan Ma, Rui Hu, Shuicheng Yan, Han Fang, Yahui Zhou

Abstract: In this technical report, we introduce the training methodologies implemented in the development of Skywork-MoE, a high-performance mixture-of-experts (MoE) large language model (LLM) with 146 billion parameters and 16 experts. It is initialized from the pre-existing dense checkpoints of our Skywork-13B model. We explore the comparative effectiveness of upcycling versus training from scratch initializations. Our findings suggest that the choice between these two approaches should consider both the performance of the existing dense checkpoints and the MoE training budget. We highlight two innovative techniques: gating logit normalization, which improves expert diversification, and adaptive auxiliary loss coefficients, allowing for layer-specific adjustment of auxiliary loss coefficients. Our experimental results validate the effectiveness of these methods. Leveraging these techniques and insights, we trained our upcycled Skywork-MoE on a condensed subset of our SkyPile corpus. The evaluation results demonstrate that our model delivers strong performance across a wide range of benchmarks.

Submitted 2 June, 2024; originally announced June 2024.

arXiv:2406.06028 [pdf, other]

ReCon1M: A Large-scale Benchmark Dataset for Relation Comprehension in Remote Sensing Imagery

Authors: Xian Sun, Qiwei Yan, Chubo Deng, Chenglong Liu, Yi Jiang, Zhongyan Hou, Wanxuan Lu, Fanglong Yao, Xiaoyu Liu, Lingxiang Hao, Hongfeng Yu

Abstract: Scene Graph Generation (SGG) is a high-level visual understanding and reasoning task aimed at extracting entities (such as objects) and their interrelationships from images. Significant progress has been made in the study of SGG in natural images in recent years, but its exploration in the domain of remote sensing images remains very limited. The complex characteristics of remote sensing images necessitate higher time and manual interpretation costs for annotation compared to natural images. The lack of a large-scale public SGG benchmark is a major impediment to the advancement of SGG-related research in aerial imagery. In this paper, we introduce the first publicly available large-scale, million-level relation dataset in the field of remote sensing images, named ReCon1M. Specifically, our dataset is built upon Fair1M and comprises 21,392 images. It includes annotations for 859,751 object bounding boxes across 60 different categories, and 1,149,342 relation triplets across 64 categories based on these bounding boxes. We provide a detailed description of the dataset's characteristics and statistical information. We conducted two object detection tasks and three sub-tasks within SGG on this dataset, assessing the performance of mainstream methods on these tasks.

Submitted 10 June, 2024; originally announced June 2024.

arXiv:2406.03523 [pdf, other]

Objects May Be Closer Than They Appear: Significant Host Galaxy Dispersion Measures of Fast Radio Bursts in Zoom-in Simulations

Authors: Matthew E. Orr, Blakesley Burkhart, Wenbin Lu, Sam B. Ponnada, Cameron B. Hummels

Abstract: We investigate the contribution of host galaxies to the overall Dispersion Measures (DMs) for Fast Radio Bursts (FRBs) using the Feedback in Realistic Environments (FIRE-2) cosmological zoom-in simulation suite. We calculate DMs from every star particle in the simulated L* galaxies by ray-tracing through their multi-phase interstellar medium (ISM), summing the line-of-sight free thermal electron column for all gas elements within $\pm$20 kpc of the galactic mid-plane. At $z=0$, we find average (median) host-galaxy DMs of 74 (43) and 210 (94) pc cm$^{-3}$ for older ($\gtrsim$10 Myr) and younger ($\lesssim$10 Myr) stellar populations, respectively. Inclination raises the median DM measured for older populations ($\gtrsim$10 Myr) in the simulations by a factor of $\sim$2, but generally does not affect the younger stars deeply embedded in H II regions except in extreme edge-on cases (inclination $\gtrsim 85^\circ$). In kinematically disturbed snapshots ($z = 1$ in FIRE), the average (median) host-galaxy DMs are higher: 80 (107) and 266 (795) pc cm$^{-3}$ for older ($\gtrsim$10 Myr) and younger ($\lesssim$10 Myr) stellar populations, respectively. FIRE galaxies tend to have higher DM values than cosmological simulations such as IllustrisTNG. As a result, FRB host galaxies may be closer (lower redshift) than previously inferred. Furthermore, constraining host-galaxy DM distributions may help significantly constrain FRB progenitor models.

Submitted 5 June, 2024; originally announced June 2024.

Comments: 17 pages, 12 figures, submitted to ApJ Letters
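
For readers unfamiliar with the quantity summed in the abstract above, the standard definition of a dispersion measure can be written as a line-of-sight integral. This is a textbook sketch, not the authors' exact estimator; the symbols $n_e$, $l$, $d$, and the discrete index $i$ are notation assumed here for illustration.

```latex
% Dispersion measure along a sightline of length d (units: pc cm^{-3}),
% with n_e the free-electron number density and l the path length:
\mathrm{DM} \;=\; \int_{0}^{d} n_e(l)\,\mathrm{d}l
% Discrete analogue when ray-tracing through gas elements i,
% each contributing a path segment \Delta l_i:
\mathrm{DM} \;\approx\; \sum_{i} n_{e,i}\,\Delta l_i
```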

arXiv:2406.00692 [pdf]

Harvesting room-temperature plasticity in ceramics by mechanically seeded dislocations

Authors: Xufei Fang, Wenjun Lu, Jiawen Zhang, Christian Minnert, Junhua Hou, Sebastian Bruns, Ulrike Kunz, Atsutomo Nakamura, Karsten Durst, Jürgen Rödel

Abstract: The quest for room-temperature ductile ceramics has been repeatedly fueled by hopes for large-scale applications but so far has not been successful. Recent demonstrations of enhanced functional properties in ceramics through judicious dislocation imprint, however, have been sparking renewed interest in dislocation plasticity in brittle ceramics. Here, we propose a facile approach using room-temperature mechanically seeded mobile dislocations with a density of ~10^14/m^2 to significantly improve the room-temperature plasticity of ceramics with a large plastic strain beyond ~30%. The seeded mobile dislocations trigger profuse dislocation multiplication via cross slip and motion. Hence, they offer an avenue to suppress brittle fracture and harvest plasticity in ceramics without any additional high-temperature process. We employ both in situ nano-/micromechanical deformation and ex situ bulk deformation to bridge the length scales. This finding tackles the pressing bottleneck of dislocation engineering in ceramics for achieving ductile ceramics and harvesting both versatile mechanical and functional properties.

Submitted 2 June, 2024; originally announced June 2024.

arXiv:2406.00623 [pdf]

Room-temperature bulk plasticity and tunable dislocation densities in KTaO3

Authors: Xufei Fang, Jiawen Zhang, Alexander Frisch, Oliver Preuß, Chukwudalu Okafor, Martin Setvin, Wenjun Lu

Abstract: We report room-temperature bulk plasticity mediated by dislocations in single-crystal cubic KTaO3, contrasting the conventional knowledge that single-crystal KTaO3 is susceptible to brittle cleavage. A mechanics-based combinatorial experimental approach using cyclic Brinell indentation, scratching, and uniaxial bulk compression consistently demonstrates room-temperature dislocation plasticity in KTaO3 from the mesoscale to the macroscale. This approach also delivers tunable dislocation densities and plastic zone sizes. Scanning transmission electron microscopy analysis underpins the activated slip system to be <110>{1-10}. Given the growing significance of KTaO3 as an emerging electronic oxide and the increasing interest in dislocations for tuning physical properties of oxides, our findings are expected to trigger synergistic research interest in KTaO3 with dislocations.

Submitted 2 June, 2024; originally announced June 2024.

arXiv:2405.19596 [pdf, ps, other]

The weight hierarchies of three classes of linear codes

Authors: Wei Lu, Qingyao Wang, Xiaoqiang Wang, Dabin Zheng

Abstract: Studying the generalized Hamming weights of linear codes is a significant research area within coding theory, as it provides valuable structural information about the codes and plays a crucial role in determining their performance in various applications. However, determining the generalized Hamming weights of linear codes, particularly their weight hierarchy, is generally a challenging task. In this paper, we focus on investigating the generalized Hamming weights of three classes of linear codes over finite fields. These codes are constructed by different defining sets. By analysing the intersections between the defining sets and the duals of all $r$-dimensional subspaces, we get the inequalities on the sizes of these intersections. Then, by constructing subspaces that reach the upper bounds of these inequalities, we successfully determine the complete weight hierarchies of these codes.

Submitted 29 May, 2024; originally announced May 2024.
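
As background for the abstract above, the generalized Hamming weights and the weight hierarchy have a standard definition, sketched below. The notation ($C$, $D$, $k$, $\mathrm{Supp}$) is assumed here for illustration and is not taken from the listing.

```latex
% Let C be an [n, k] linear code over a finite field.
% Support of a subcode D of C: the coordinates where some codeword of D is nonzero.
\mathrm{Supp}(D) = \{\, i : 1 \le i \le n,\ c_i \neq 0 \text{ for some } c \in D \,\}
% The r-th generalized Hamming weight, for 1 \le r \le k:
d_r(C) = \min\{\, |\mathrm{Supp}(D)| : D \text{ is a subcode of } C,\ \dim D = r \,\}
% The weight hierarchy of C is the set
\{\, d_1(C), d_2(C), \ldots, d_k(C) \,\}
% where d_1(C) is the usual minimum distance.
```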

arXiv:2405.14307 [pdf, other]

AdaGMLP: AdaBoosting GNN-to-MLP Knowledge Distillation

Authors: Weigang Lu, Ziyu Guan, Wei Zhao, Yaming Yang

Abstract: Graph Neural Networks (GNNs) have revolutionized graph-based machine learning, but their heavy computational demands pose challenges for latency-sensitive edge devices in practical industrial applications. In response, a new wave of methods, collectively known as GNN-to-MLP Knowledge Distillation, has emerged. They aim to transfer GNN-learned knowledge to a more efficient MLP student, which offers faster, resource-efficient inference while maintaining competitive performance compared to GNNs. However, these methods face significant challenges in situations with insufficient training data and incomplete test data, limiting their applicability in real-world applications. To address these challenges, we propose AdaGMLP, an AdaBoosting GNN-to-MLP Knowledge Distillation framework. It leverages an ensemble of diverse MLP students trained on different subsets of labeled nodes, addressing the issue of insufficient training data. Additionally, it incorporates a Node Alignment technique for robust predictions on test data with missing or incomplete features. Our experiments on seven benchmark datasets with different settings demonstrate that AdaGMLP outperforms existing G2M methods, making it suitable for a wide range of latency-sensitive real-world applications. We have submitted our code to the GitHub repository (https://github.com/WeigangLu/AdaGMLP-KDD24).

Submitted 23 May, 2024; originally announced May 2024.

Comments: Accepted by KDD 2024

Journal ref: KDD 2024

arXiv:2405.11343 [pdf, other]

Sub-relativistic Outflow and Hours-Timescale Large-amplitude X-ray Dips during Super-Eddington Accretion onto a Low-mass Massive Black Hole in the Tidal Disruption Event AT2022lri

Authors: Yuhan Yao, Muryel Guolo, Francesco Tombesi, Ruancun Li, Suvi Gezari, Javier A. García, Lixin Dai, Ryan Chornock, Wenbin Lu, S. R. Kulkarni, Keith C. Gendreau, Dheeraj R. Pasham, S. Bradley Cenko, Erin Kara, Raffaella Margutti, Yukta Ajay, Thomas Wevers, Tom M. Kwan, Igor Andreoni, Joshua S. Bloom, Andrew J. Drake, Matthew J. Graham, Erica Hammerstein, Russ R. Laher, Natalie LeBaron, et al. (10 additional authors not shown)

Abstract: We present the tidal disruption event (TDE) AT2022lri, hosted in a nearby ($\approx\!144$ Mpc) quiescent galaxy with a low-mass massive black hole ($10^4\,M_\odot < M_{\rm BH} < 10^6\,M_\odot$). AT2022lri belongs to the TDE-H+He subtype. More than 1 Ms of X-ray data were collected with NICER, Swift, and XMM-Newton from 187 d to 672 d after peak. The X-ray luminosity gradually declined from $1.5\times 10^{44}\,{\rm erg\,s^{-1}}$ to $1.5\times 10^{43}\,{\rm erg\,s^{-1}}$ and remains much above the UV and optical luminosity, consistent with a super-Eddington accretion flow viewed face-on. Sporadic strong X-ray dips atop a long-term decline are observed, with variability timescale of $\approx\!0.5$ hr--1 d and amplitude of $\approx\!2$--8. When fitted with simple continuum models, the X-ray spectrum is dominated by a thermal disk component with inner temperature going from $\sim\! 146$ eV to $\sim\! 86$ eV. However, there are residual features that peak around 1 keV, which, in some cases, cannot be reproduced by a single broad emission line. We analyzed a subset of time-resolved spectra with two physically motivated models describing either a scenario where ionized absorbers contribute extra absorption and emission lines or where disk reflection plays an important role. Both models provide good and statistically comparable fits, show that the X-ray dips are correlated with drops in the inner disk temperature, and require the existence of sub-relativistic (0.1--0.3$c$) ionized outflows. We propose that the disk temperature fluctuation stems from episodic drops of the mass accretion rate triggered by magnetic instabilities and/or wobbling of the inner accretion disk along the black hole's spin axis.

Submitted 18 May, 2024; originally announced May 2024.

Comments: 35 pages, 20 figures, submitted

arXiv:2405.09681 [pdf]

Inactive Overhang in Silicon Anodes

Authors: Aidin I. OBrien, Stephen E. Trask, Devashish Salpekar, Seoung-Bum Son, Alison R. Dunlop, Gabriel M. Veith, Wenquan Lu, Brian J. Ingram, Daniel P. Abraham, Andrew N. Jansen, Marco-Tulio F. Rodrigues

Abstract: Li-ion batteries contain excess anode area to improve manufacturability and prevent Li plating. These overhang areas in graphite electrodes are active but experience decreased Li+ flux during cycling. Over time, the overhang and the anode portions directly opposite to the cathode can exchange Li+, driven by differences in local electrical potential across the electrode, which artificially inflates or decreases the measured cell capacity. Here, we show that lithiation of the overhang is less likely to happen in silicon anodes paired with layered oxide cathodes. The large voltage hysteresis of silicon creates a lower driving force for Li+ exchange as lithium ions transit into the overhang, rendering this exchange highly inefficient. For crystalline Si particles, Li+ storage at the overhang is prohibitive, because the low potential required for the initial lithiation can act as a thermodynamic barrier for this exchange. We use micro-Raman spectroscopy to demonstrate that crystalline Si particles at the overhang are never lithiated even after cell storage at 45 °C for four months. Since the anode overhang can affect the forecasting of cell life, cells using silicon anodes may require different methodologies for life estimation compared to those used for traditional graphite-based Li-ion batteries.

Submitted 15 May, 2024; originally announced May 2024.

arXiv:2405.09054 [pdf, other]

Dim Small Target Detection and Tracking: A Novel Method Based on Temporal Energy Selective Scaling and Trajectory Association

Authors: Weihua Gao, Wenlong Niu, Wenlong Lu, Pengcheng Wang, Zhaoyuan Qi, Xiaodong Peng, Zhen Yang

Abstract: The detection and tracking of small targets in passive optical remote sensing (PORS) has broad applications. However, most of the previously proposed methods seldom utilize the abundant temporal features formed by target motion, resulting in poor detection and tracking performance for low signal-to-clutter ratio (SCR) targets. In this article, we analyze the difficulty based on spatial features and the feasibility based on temporal features of realizing effective detection. According to this analysis, we use a multi-frame as a detection unit and propose a detection method based on temporal energy selective scaling (TESS). Specifically, we investigated the composition of intensity temporal profiles (ITPs) formed by pixels on a multi-frame detection unit. For the target-present pixel, the target passing through the pixel will bring a weak transient disturbance on the ITP and introduce a change in the statistical properties of ITP. We use a well-designed function to amplify the transient disturbance, suppress the background and noise components, and output the trajectory of the target on the multi-frame detection unit. Subsequently, to solve the contradiction between the detection rate and the false alarm rate brought by the traditional threshold segmentation, we associate the temporal and spatial features of the output trajectory and propose a trajectory extraction method based on the 3D Hough transform. Finally, we model the trajectory of the target and propose a trajectory-based multi-target tracking method. Compared with the various state-of-the-art detection and tracking methods, experiments in multiple scenarios prove the superiority of our proposed methods.

Submitted 14 May, 2024; originally announced May 2024.

arXiv:2405.09048 [pdf]

Beam Shaping Based on Axisymmetric Aspheric Mirrors

Authors: Zhihao Chen, Xiaonan Ning, Jiucheng Chen, Jianfei Hua, Wei Lu

Abstract: Flat-top beam, known for its ability to generate a consistently even irradiation area, holds vast utility in many fields of scientific and industrial applications. In this paper, a reflective laser beam shaping method based on two axisymmetric aspheric mirrors (AAMs), a polarizing beam splitter (PBS) and two quarter wave plates (QWPs) is proposed to transform Gaussian beam into flat-top beam. Compared to alternative beam shaping methods, the method using AAMs demonstrates distinct advantages on notably high energy efficiency and unique capability to generate parallel beams. Thanks to its relative simplicities of design, manufacture and tunability, AAMs-shaping further enhances its appeal in applied research scenarios.

Submitted 14 May, 2024; originally announced May 2024.

Comments: 7 pages, 9 figures

arXiv:2405.07687 [pdf, other]

Highly Efficient Observation Process based on FFT Filtering for Robot Swarm Collaborative Navigation in Unknown Environments

Authors: Chenxi Li, Weining Lu, Zhihao Ma, Litong Meng, Bin Liang

Abstract: Collaborative path planning for robot swarms in complex, unknown environments without external positioning is a challenging problem. This requires robots to find safe directions based on real-time environmental observations, and to efficiently transfer and fuse these observations within the swarm. This study presents a filtering method based on Fast Fourier Transform (FFT) to address these two issues. We treat sensors' environmental observations as a digital sampling process. Then, we design two different types of filters for safe direction extraction, as well as for the compression and reconstruction of environmental data. The reconstructed data is mapped to a probabilistic domain, achieving efficient fusion of swarm observations and planning decisions. The computation time is only on the order of microseconds, and the data transmitted in communication systems is at the bit level. The performance of our algorithm in sensor data processing was validated in real-world experiments, and the effectiveness in swarm path optimization was demonstrated through extensive simulations.

Submitted 13 May, 2024; originally announced May 2024.

Comments: 8 pages, 8 figures, 1 table
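
The compression-and-reconstruction idea mentioned in the abstract above can be illustrated with a generic frequency-domain low-pass sketch. This is not the authors' filter design: the function names `dft`, `compress`, and `reconstruct`, the choice to keep only the lowest-frequency coefficients, and the use of a naive O(n^2) transform in place of an FFT are all assumptions made to keep the example dependency-free.

```python
import cmath
import math


def dft(samples):
    """Naive discrete Fourier transform (O(n^2)); an FFT gives the same values faster."""
    n = len(samples)
    return [
        sum(samples[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def compress(samples, keep):
    """Hypothetical compressor: keep only coefficients for frequencies 0..keep-1."""
    if not 1 <= keep <= len(samples) // 2:
        raise ValueError("keep must be between 1 and len(samples) // 2")
    return dft(samples)[:keep]


def reconstruct(coeffs, n):
    """Rebuild a real signal of length n from its low-frequency coefficients.

    For real input the coefficient at n - k is the conjugate of the one at k,
    so each retained k >= 1 contributes twice its real part.
    """
    signal = []
    for t in range(n):
        value = coeffs[0].real
        for k in range(1, len(coeffs)):
            value += 2.0 * (coeffs[k] * cmath.exp(2j * math.pi * k * t / n)).real
        signal.append(value / n)
    return signal


if __name__ == "__main__":
    n = 16
    # A made-up smooth "range scan": constant offset plus one slow oscillation.
    scan = [2.0 + math.cos(2 * math.pi * t / n) for t in range(n)]
    rebuilt = reconstruct(compress(scan, 2), n)
    print(max(abs(a - b) for a, b in zip(scan, rebuilt)))
```

A scan dominated by low frequencies survives this truncation almost unchanged, while sharp features are smoothed out; how many coefficients to retain is a trade-off this sketch leaves to the caller.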

arXiv:2405.07519 [pdf, ps, other]

Stability equivalence for stochastic differential equations, stochastic differential delay equations and their corresponding Euler-Maruyama methods in $G$-framework

Authors: Wen Lu

Abstract: In this paper, we investigate the stability equivalence problem for stochastic differential delay equations, the auxiliary stochastic differential equations and their corresponding Euler-Maruyama (EM) methods under $G$-framework. More precisely, for $p\geq 2$, we prove the equivalence of practical exponential stability in $p$-th moment sense among stochastic differential delay equations driven by $G$-Brownian motion ($G$-SDDEs), the auxiliary stochastic differential equations driven by $G$-Brownian motion ($G$-SDEs), and their corresponding Euler-Maruyama methods, provided the delay or the step size is small enough. Thus, we can carry out careful simulations to examine the practical exponential stability of the underlying $G$-SDDE or $G$-SDE under some reasonable assumptions.

Submitted 13 May, 2024; originally announced May 2024.
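
To make the Euler-Maruyama method named in the abstract above concrete, here is a minimal sketch for a classical SDE driven by ordinary Brownian motion. It does not model $G$-Brownian motion or delays, which are the paper's actual setting; the linear test equation dX = aX dt + bX dW, the helper names, and the Monte Carlo moment estimate are all assumptions chosen for illustration.

```python
import math
import random


def euler_maruyama(x0, drift, diffusion, step, n_steps, rng):
    """Simulate one path of dX = drift(X) dt + diffusion(X) dW with a fixed step size."""
    x = x0
    path = [x]
    for _ in range(n_steps):
        # Brownian increment over one step: normal with mean 0 and variance `step`.
        dw = rng.gauss(0.0, math.sqrt(step))
        x = x + drift(x) * step + diffusion(x) * dw
        path.append(x)
    return path


def second_moment(x0, a, b, step, n_steps, n_paths, seed=0):
    """Monte Carlo estimate of E[X_T^2] for the linear test SDE dX = a X dt + b X dW."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_paths):
        x_end = euler_maruyama(
            x0, lambda x: a * x, lambda x: b * x, step, n_steps, rng
        )[-1]
        total += x_end * x_end
    return total / n_paths


if __name__ == "__main__":
    # With 2a + b^2 < 0 the exact second moment decays exponentially in time,
    # so the estimate should shrink as the horizon step * n_steps grows.
    for n_steps in (50, 100, 200):
        print(n_steps, second_moment(1.0, -2.0, 0.5, 0.01, n_steps, 2000))
```

For this test equation the exact second moment is x0^2 exp((2a + b^2) T), so a simulation like this is one way to check empirically whether the discretization preserves the decay at a given step size.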

arXiv:2405.04673 [pdf, ps, other]

Singularity Structures of Linear Inviscid Damping in a Channel

Authors: Wenjie Lu

Abstract: This paper studies singularity structures of the linear inviscid damping of two-dimensional Euler equations in a finite periodic channel. We introduce a recursive definition of singularity structures which characterize the singularities of the spectrum density function from different sources: the free part and the boundary part of the Green function. As an application, we demonstrate that the stream function exhibits smoothness away from the channel's boundary, yet it presents singularities in close proximity to the boundary. The singularities arise due to the interaction of boundary and interior singularities of the spectrum density function. We also show that the behavior of the initial data and background flow have an impact on the regularity of different components of the stream function.

Submitted 7 May, 2024; originally announced May 2024.

arXiv:2405.01908 [pdf, other]

A Full Adagrad algorithm with O(Nd) operations

Authors: Antoine Godichon-Baggioni, Wei Lu, Bruno Portier

Abstract: A novel approach is given to overcome the computational challenges of the full-matrix Adaptive Gradient algorithm (Full AdaGrad) in stochastic optimization. By developing a recursive method that estimates the inverse of the square root of the covariance of the gradient, alongside a streaming variant for parameter updates, the study offers efficient and practical algorithms for large-scale applications. This innovative strategy significantly reduces the complexity and resource demands typically associated with full-matrix methods, enabling more effective optimization processes. Moreover, the convergence rates of the proposed estimators and their asymptotic efficiency are given. Their effectiveness is demonstrated through numerical studies.

Submitted 3 May, 2024; originally announced May 2024.

arXiv:2404.19330 [pdf, other]

G2LTraj: A Global-to-Local Generation Approach for Trajectory Prediction

Authors: Zhanwei Zhang, Zishuo Hua, Minghao Chen, Wei Lu, Binbin Lin, Deng Cai, Wenxiao Wang

Abstract: Predicting future trajectories of traffic agents accurately holds substantial importance in various applications such as autonomous driving. Previous methods commonly infer all future steps of an agent either recursively or simultaneously. However, the recursive strategy suffers from the accumulated error, while the simultaneous strategy overlooks the constraints among future steps, resulting in kinematically infeasible predictions. To address these issues, in this paper, we propose G2LTraj, a plug-and-play global-to-local generation approach for trajectory prediction. Specifically, we generate a series of global key steps that uniformly cover the entire future time range. Subsequently, the local intermediate steps between the adjacent key steps are recursively filled in. In this way, we prevent the accumulated error from propagating beyond the adjacent key steps. Moreover, to boost the kinematical feasibility, we not only introduce the spatial constraints among key steps but also strengthen the temporal constraints among the intermediate steps. Finally, to ensure the optimal granularity of key steps, we design a selectable granularity strategy that caters to each predicted trajectory. Our G2LTraj significantly improves the performance of seven existing trajectory predictors across the ETH, UCY and nuScenes datasets. Experimental results demonstrate its effectiveness. Code will be available at https://github.com/Zhanwei-Z/G2LTraj.

Submitted 30 April, 2024; originally announced April 2024.

Comments: Accepted by IJCAI 2024

arXiv:2404.15624 [pdf, other]

A new framework of high-order unfitted finite element methods using ALE maps for moving-domain problems

Authors: Wenhao Lu, Chuwen Ma, Weiying Zheng

Abstract: As a sequel to our previous work [C. Ma, Q. Zhang and W. Zheng, SIAM J. Numer. Anal., 60 (2022)], [C. Ma and W. Zheng, J. Comput. Phys. 469 (2022)], this paper presents a generic framework of arbitrary Lagrangian-Eulerian unfitted finite element (ALE-UFE) methods for partial differential equations (PDEs) on time-varying domains. The ALE-UFE method has a great potential in developing high-order unfitted finite element methods. The usefulness of the method is demonstrated by a variety of moving-domain problems, including a linear problem with explicit velocity of the boundary (or interface), a PDE-domain coupled problem, and a problem whose domain has a topological change. Numerical experiments show that optimal convergence is achieved by both third- and fourth-order methods on domains with smooth boundaries, but deteriorates to second order when the domain has topological changes.

Submitted 23 April, 2024; originally announced April 2024.

arXiv:2404.15462 [pdf]

Environmental permittivity-asymmetric BIC metasurfaces with electrical reconfigurability

Authors: Haiyang Hu, Wenzheng Lu, Rodrigo Berte, Stefan A Maier, Andreas Tittl

Abstract: In the rapidly evolving field of nanophotonics, achieving precise spectral and temporal light manipulation at the nanoscale remains a critical challenge. While photonic bound states in the continuum (BICs) have emerged as a powerful means of controlling light, their common reliance on geometrical symmetry breaking for obtaining tailored resonances makes them highly susceptible to fabrication imperfections and fundamentally limits their maximum resonance quality factor. Here, we introduce the concept of environmental symmetry breaking by embedding identical resonators into a surrounding medium with carefully placed regions of contrasting refractive indexes, activating permittivity-driven quasi-BIC resonances without any alterations of the underlying resonator geometry and unlocking an additional degree of freedom for light manipulation through actively tuning the surrounding refractive index contrast. We demonstrate this concept by integrating polyaniline (PANI), an electro-optically active polymer, to achieve electrically reconfigurable qBICs. This integration not only demonstrates rapid switching speeds, and exceptional durability but also significantly boosts the system's optical response to environmental perturbations. Our strategy significantly expands the capabilities of resonant light manipulation through permittivity modulation, opening avenues for on-chip optical devices, advanced sensing, and beyond.

Submitted 23 April, 2024; originally announced April 2024.

Comments: 35 pages, 4 figures, and 11 supporting figures

arXiv:2404.13985 [pdf, other]

Information Re-Organization Improves Reasoning in Large Language Models

Authors: Xiaoxia Cheng, Zeqi Tan, Wei Xue, Weiming Lu

Abstract: Improving the reasoning capabilities of large language models (LLMs) has attracted considerable interest. Recent approaches primarily focus on improving the reasoning process to yield a more precise final answer. However, in scenarios involving contextually aware reasoning, these methods neglect the importance of first identifying logical relationships from the context before proceeding with the reasoning. This oversight could lead to a superficial understanding and interaction with the context, potentially undermining the quality and reliability of the reasoning outcomes. In this paper, we propose an information re-organization (InfoRE) method before proceeding with the reasoning to enhance the reasoning ability of LLMs. Our re-organization method involves initially extracting logical relationships from the contextual content, such as documents or paragraphs, and subsequently pruning redundant content to minimize noise. Then, we utilize the re-organized information in the reasoning process. This enables LLMs to deeply understand the contextual content by clearly perceiving these logical relationships, while also ensuring high-quality responses by eliminating potential noise. To demonstrate the effectiveness of our approach in improving the reasoning ability, we conduct experiments using Llama2-70B, GPT-3.5, and GPT-4 on various contextually aware multi-hop reasoning tasks. Using only a zero-shot setting, our method achieves an average absolute improvement of 4% across all tasks, highlighting its potential to improve the reasoning performance of LLMs. Our source code is available at https://github.com/hustcxx/InfoRE.

Submitted 24 May, 2024; v1 submitted 22 April, 2024; originally announced April 2024.

Comments: 15 pages, 4 figures

arXiv:2404.13885 [pdf, other]

Surveying Attitudinal Alignment Between Large Language Models Vs. Humans Towards 17 Sustainable Development Goals

Authors: Qingyang Wu, Ying Xu, Tingsong Xiao, Yunze Xiao, Yitong Li, Tianyang Wang, Yichi Zhang, Shanghai Zhong, Yuwei Zhang, Wei Lu, Yifan Yang

Abstract: Large Language Models (LLMs) have emerged as potent tools for advancing the United Nations' Sustainable Development Goals (SDGs). However, the attitudinal disparities between LLMs and humans towards these goals can pose significant challenges. This study conducts a comprehensive review and analysis of the existing literature on the attitudes of LLMs towards the 17 SDGs, emphasizing the comparison between their attitudes and support for each goal and those of humans. We examine the potential disparities, primarily focusing on aspects such as understanding and emotions, cultural and regional differences, task objective variations, and factors considered in the decision-making process. These disparities arise from the underrepresentation and imbalance in LLM training data, historical biases, quality issues, lack of contextual understanding, and skewed ethical values reflected. The study also investigates the risks and harms that may arise from neglecting the attitudes of LLMs towards the SDGs, including the exacerbation of social inequalities, racial discrimination, environmental destruction, and resource wastage. To address these challenges, we propose strategies and recommendations to guide and regulate the application of LLMs, ensuring their alignment with the principles and goals of the SDGs, and therefore creating a more just, inclusive, and sustainable future.

Submitted 22 April, 2024; originally announced April 2024.

arXiv:2404.13298 [pdf, other]

MARec: Metadata Alignment for cold-start Recommendation

Authors: Julien Monteil, Volodymyr Vaskovych, Wentao Lu, Anirban Majumder, Anton van den Hengel

Abstract: For many recommender systems the primary data source is a historical record of user clicks. The associated click matrix is often very sparse, however, as the number of users x products can be far larger than the number of clicks, and such sparsity is accentuated in cold-start settings. The sparsity of the click matrix is the reason matrix factorization and autoencoder techniques remain highly competitive across collaborative filtering datasets. In this work, we propose a simple approach to address cold-start recommendations by leveraging content metadata, Metadata Alignment for cold-start Recommendation. We show that this approach can readily augment existing matrix factorization and autoencoder approaches, enabling a smooth transition to top performing algorithms in warmer set-ups. Our experimental results indicate three separate contributions: first, we show that our proposed framework largely beats SOTA results on 4 cold-start datasets with different sparsity and scale characteristics, with gains ranging from +8.4% to +53.8% on reported ranking metrics; second, we provide an ablation study on the utility of semantic features, and prove the additional gain obtained by leveraging such features ranges between +46.8% and +105.5%; and third, our approach is by construction highly competitive in warm set-ups, and we propose a closed-form solution outperformed by SOTA results by only 0.8% on average.

Submitted 20 April, 2024; originally announced April 2024.

arXiv:2404.12597 [pdf, other]

The phase diagram of kernel interpolation in large dimensions

Authors: Haobo Zhang, Weihao Lu, Qian Lin

Abstract: The generalization ability of kernel interpolation in large dimensions (i.e., $n \asymp d^γ$ for some $γ>0$) might be one of the most interesting problems in the recent renaissance of kernel regression, since it may help us understand the 'benign overfitting phenomenon' reported in the neural networks literature. Focusing on the inner product kernel on the sphere, we fully characterized the exact order of both the variance and bias of large-dimensional kernel interpolation under various source conditions $s\geq 0$. Consequently, we obtained the $(s,γ)$-phase diagram of large-dimensional kernel interpolation, i.e., we determined the regions in $(s,γ)$-plane where the kernel interpolation is minimax optimal, sub-optimal and inconsistent.

Submitted 18 April, 2024; originally announced April 2024.

Comments: 18 pages, 1 figure

arXiv:2404.12014 [pdf, other]

Enhance Robustness of Language Models Against Variation Attack through Graph Integration

Authors: Zi Xiong, Lizhi Qing, Yangyang Kang, Jiawei Liu, Hongsong Li, Changlong Sun, Xiaozhong Liu, Wei Lu

Abstract: The widespread use of pre-trained language models (PLMs) in natural language processing (NLP) has greatly improved performance outcomes. However, these models' vulnerability to adversarial attacks (e.g., camouflaged hints from drug dealers), particularly in the Chinese language with its rich character diversity/variation and complex structures, hatches vital apprehension. In this study, we propose a novel method, CHinese vAriatioN Graph Enhancement (CHANGE), to increase the robustness of PLMs against character variation attacks in Chinese content. CHANGE presents a novel approach for incorporating a Chinese character variation graph into the PLMs. Through designing different supplementary tasks utilizing the graph structure, CHANGE essentially enhances PLMs' interpretation of adversarially manipulated text. Experiments conducted in a multitude of NLP tasks show that CHANGE outperforms current language models in combating against adversarial attacks and serves as a valuable contribution to robust language model research. These findings contribute to the groundwork on robust language models and highlight the substantial potential of graph-guided pre-training strategies for real-world applications.

Submitted 18 April, 2024; originally announced April 2024.

Comments: 12 pages, 4 figures, accepted by COLING 2024

arXiv:2404.10599 [pdf, other]

Towards free-response paradigm: a theory on decision-making in spiking neural networks

Authors: Zhichao Zhu, Yang Qi, Wenlian Lu, Zhigang Wang, Lu Cao, Jianfeng Feng

Abstract: The energy-efficient and brain-like information processing abilities of Spiking Neural Networks (SNNs) have attracted considerable attention, establishing them as a crucial element of brain-inspired computing. One prevalent challenge encountered by SNNs is the trade-off between inference speed and accuracy, which requires sufficient time to achieve the desired level of performance. Drawing inspiration from animal behavior experiments that demonstrate a connection between decision-making reaction times, task complexity, and confidence levels, this study seeks to apply these insights to SNNs. The focus is on understanding how SNNs make inferences, with a particular emphasis on untangling the interplay between signal and noise in decision-making processes. The proposed theoretical framework introduces a new optimization objective for SNN training, highlighting the importance of not only the accuracy of decisions but also the development of predictive confidence through learning from past experiences. Experimental results demonstrate that SNNs trained according to this framework exhibit improved confidence expression, leading to better decision-making outcomes. In addition, a strategy is introduced for efficient decision-making during inference, which allows SNNs to complete tasks more quickly and can use stopping times as indicators of decision confidence. By integrating neuroscience insights with neuromorphic computing, this study opens up new possibilities to explore the capabilities of SNNs and advance their application in complex decision-making scenarios.

Submitted 16 April, 2024; originally announced April 2024.

Comments: 27 pages, 6 figures, 3 tables

arXiv:2404.07342 [pdf, ps, other]

The global Gan-Gross-Prasad conjecture for Fourier-Jacobi periods on unitary groups

Authors: Paul Boisseau, Weixiao Lu, Hang Xue

Abstract: We prove the Gan-Gross-Prasad conjecture for Fourier-Jacobi periods on unitary groups and an Ichino-Ikeda type refinement. Our strategy is based on the comparison of relative trace formulae formulated by Liu. We develop the full coarse spectral and geometric expansions of the relative trace formulae, and compute relevant spectral terms via zeta integrals and truncated periods. We compare all geometric terms and characterize the local geometric comparison in terms of spectral data.

Submitted 10 April, 2024; originally announced April 2024.

arXiv:2404.07108 [pdf, other]

From Model-centered to Human-Centered: Revision Distance as a Metric for Text Evaluation in LLMs-based Applications

Authors: Yongqiang Ma, Lizhi Qing, Jiawei Liu, Yangyang Kang, Yue Zhang, Wei Lu, Xiaozhong Liu, Qikai Cheng

Abstract: Evaluating large language models (LLMs) is fundamental, particularly in the context of practical applications. Conventional evaluation methods, typically designed primarily for LLM development, yield numerical scores that ignore the user experience. Therefore, our study shifts the focus from model-centered to human-centered evaluation in the context of AI-powered writing assistance applications. Our proposed metric, termed "Revision Distance," utilizes LLMs to suggest revision edits that mimic the human writing process. It is determined by counting the revision edits generated by LLMs. Benefiting from the generated revision edit details, our metric can provide a self-explained text evaluation result in a human-understandable manner beyond the context-independent score. Our results show that for the easy-writing task, "Revision Distance" is consistent with established metrics (ROUGE, Bert-score, and GPT-score), but offers more insightful, detailed feedback and better distinguishes between texts. Moreover, in the context of challenging academic writing tasks, our metric still delivers reliable evaluations where other metrics tend to struggle. Furthermore, our metric also holds significant potential for scenarios lacking reference texts.

Submitted 10 April, 2024; v1 submitted 10 April, 2024; originally announced April 2024.

Comments: 9 pages, 2 figures, under review

arXiv:2404.07097 [pdf, other]

Fast Encoder-Based 3D from Casual Videos via Point Track Processing

Authors: Yoni Kasten, Wuyue Lu, Haggai Maron

Abstract: This paper addresses the long-standing challenge of reconstructing 3D structures from videos with dynamic content. Current approaches to this problem were not designed to operate on casual videos recorded by standard cameras or require a long optimization time. Aiming to significantly improve the efficiency of previous approaches, we present TracksTo4D, a learning-based approach that enables inferring 3D structure and camera positions from dynamic content originating from casual videos using a single efficient feed-forward pass. To achieve this, we propose operating directly over 2D point tracks as input and designing an architecture tailored for processing 2D point tracks. Our proposed architecture is designed with two key principles in mind: (1) it takes into account the inherent symmetries present in the input point tracks data, and (2) it assumes that the movement patterns can be effectively represented using a low-rank approximation. TracksTo4D is trained in an unsupervised way on a dataset of casual videos utilizing only the 2D point tracks extracted from the videos, without any 3D supervision. Our experiments show that TracksTo4D can reconstruct a temporal point cloud and camera positions of the underlying video with accuracy comparable to state-of-the-art methods, while drastically reducing runtime by up to 95%. We further show that TracksTo4D generalizes well to unseen videos of unseen semantic categories at inference time.

Submitted 26 June, 2024; v1 submitted 10 April, 2024; originally announced April 2024.

arXiv:2404.05880 [pdf, other]

Eraser: Jailbreaking Defense in Large Language Models via Unlearning Harmful Knowledge

Authors: Weikai Lu, Ziqian Zeng, Jianwei Wang, Zhengdong Lu, Zelin Chen, Huiping Zhuang, Cen Chen

Abstract: Jailbreaking attacks can enable Large Language Models (LLMs) to bypass the safeguard and generate harmful content. Existing jailbreaking defense methods have failed to address the fundamental issue that harmful knowledge resides within the model, leading to potential jailbreak risks for LLMs. In this paper, we propose a novel defense method called Eraser, which mainly includes three goals: unlearning harmful knowledge, retaining general knowledge, and maintaining safety alignment. The intuition is that if an LLM forgets the specific knowledge required to answer a harmful question, it will no longer have the ability to answer harmful questions. The training of Eraser does not actually require the model's own harmful knowledge, and it can benefit from unlearning general answers related to harmful queries, which means it does not need assistance from the red team. The experimental results show that Eraser can significantly reduce the jailbreaking success rate for various attacks without compromising the general capabilities of the model. Our codes are available at https://github.com/ZeroNLP/Eraser.

Submitted 3 July, 2024; v1 submitted 8 April, 2024; originally announced April 2024.

arXiv:2404.03688 [pdf, other]

Beam test of a baseline vertex detector prototype for CEPC

Authors: Shuqi Li, Tianya Wu, Xinhui Huang, Jia Zhou, Ziyue Yan, Wei Wang, Hao Zeng, Yiming Hu, Xiaoxu Zhang, Zhijun Liang, Wei Wei, Ying Zhang, Xiaomin Wei, Lei Zhang, Ming Qi, Jun Hu, **yu Fu, Hongyu Zhang, Gang Li, Linghui Wu, Mingyi Dong, Xiaoting Li, Raimon Casanova, Liang Zhang, Jianing Dong , et al. (5 additional authors not shown)

Abstract: The Circular Electron Positron Collider (CEPC) has been proposed to enable more thorough and precise measurements of the properties of Higgs, W, and Z bosons, as well as to search for new physics. In response to the stringent performance requirements of the vertex detector for the CEPC, a baseline vertex detector prototype was tested and characterized for the first time using a 6 GeV electron beam at DESY II Test Beam Line 21. The baseline vertex detector prototype is designed with a cylindrical barrel structure that contains six double-sided detector modules (ladders). Each side of the ladder includes TaichuPix-3 sensors based on Monolithic Active Pixel Sensor (MAPS) technology, a flexible printed circuit, and a carbon fiber support structure. Additionally, the readout electronics and the Data Acquisition system were also examined during this beam test. The performance of the prototype was evaluated using an electron beam that passed through six ladders in a perpendicular direction. The offline data analysis indicates a spatial resolution of about 5 um, with detection efficiency exceeding 99 % and an impact parameter resolution of about 5.1 um. These promising results from this baseline vertex detector prototype mark a significant step toward realizing the optimal vertex detector for the CEPC.

Submitted 1 April, 2024; originally announced April 2024.

arXiv:2404.03608 [pdf, other]

Sailor: Open Language Models for South-East Asia

Authors: Longxu Dou, Qian Liu, Guangtao Zeng, Jia Guo, Jiahui Zhou, Wei Lu, Min Lin

Abstract: We present Sailor, a family of open language models ranging from 0.5B to 7B parameters, tailored for South-East Asian (SEA) languages. These models are continually pre-trained from Qwen1.5, a great language model for multilingual use cases. From Qwen1.5, Sailor models accept 200B to 400B tokens, primarily covering the languages of English, Chinese, Vietnamese, Thai, Indonesian, Malay, and Lao. The training leverages several techniques, including BPE dropout for improving the model robustness, aggressive data cleaning and deduplication, and small proxy models to optimize data mixture. Experimental results on four typical tasks indicate that Sailor models demonstrate strong performance across different benchmarks, including commonsense reasoning, question answering, reading comprehension and examination. Embracing the open-source spirit, we share our insights through this report to spark a wider interest in developing large language models for multilingual use cases.

Submitted 4 April, 2024; originally announced April 2024.

Comments: Code is available at https://github.com/sail-sg/sailor-llm

arXiv:2404.02291 [pdf, other]

Towards a New Configurable and Practical Remote Automotive Security Testing Platform

Authors: Sekar Kulandaivel, Wenjuan Lu, Brandon Barry, Jorge Guajardo

Abstract: In the automotive security sector, the absence of a testing platform that is configurable, practical, and user-friendly presents considerable challenges. These difficulties are compounded by the intricate design of vehicle systems, the rapid evolution of attack vectors, and the absence of standardized testing methodologies. We propose a next-generation testing platform that addresses several challenges in vehicle cybersecurity testing and research domains. In this paper, we detail how the Vehicle Security Engineering Cloud (VSEC) Test platform enables easier access to test beds for efficient vehicle cybersecurity testing and advanced (e.g., penetration, fuzz) testing and how we extend such test beds to benefit automotive security research. We highlight methodology on how to use this platform for a variety of users and use cases with real implemented examples.

Submitted 2 April, 2024; originally announced April 2024.

Comments: 7 pages, 2 figures

arXiv:2404.02018 [pdf, other]

Large Language Models for Orchestrating Bimanual Robots

Authors: Kun Chu, Xufeng Zhao, Cornelius Weber, Mengdi Li, Wenhao Lu, Stefan Wermter

Abstract: Although there has been rapid progress in endowing robots with the ability to solve complex manipulation tasks, generating control policies for bimanual robots to solve tasks involving two hands is still challenging because of the difficulties in effective temporal and spatial coordination. With emergent abilities in terms of step-by-step reasoning and in-context learning, Large Language Models (LLMs) have taken control of a variety of robotic tasks. However, the nature of language communication via a single sequence of discrete symbols makes LLM-based coordination in continuous space a particular challenge for bimanual tasks. To tackle this challenge for the first time by an LLM, we present LAnguage-model-based Bimanual ORchestration (LABOR), an agent utilizing an LLM to analyze task configurations and devise coordination control policies for addressing long-horizon bimanual tasks. In the simulated environment, the LABOR agent is evaluated through several everyday tasks on the NICOL humanoid robot. Reported success rates indicate that overall coordination efficiency is close to optimal performance, while the analysis of failure causes, classified into spatial and temporal coordination and skill selection, shows that these vary over tasks. The project website can be found at http://labor-agent.github.io

Submitted 2 April, 2024; originally announced April 2024.

Comments: The project website can be found at http://labor-agent.github.io

arXiv:2404.01072 [pdf]

How biomedical papers accumulated their clinical citations: A large-scale retrospective analysis based on PubMed

Authors: Xin Li, Xuli Tang, Wei Lu

Abstract: This paper explored the temporal characteristics of clinical citations of biomedical papers, including how long it takes to receive its first clinical citation (the initial stage) and how long it takes to receive two or more clinical citations after its first clinical citation (the build-up stage). Over 23 million biomedical papers in PubMed between 1940 and 2013 and their clinical citations are used as the research data. We divide these biomedical papers into three groups and four categories from clinical citation level and translational science perspectives. We compare the temporal characteristics of biomedical papers of different groups or categories. From the perspective of clinical citation level, the results show that highly clinically cited papers had obvious advantages of receiving clinical citations over medium and lowly clinically cited papers in both the initial and build-up stages. Meanwhile, as the number of clinical citations increased in the build-up stage, the difference in the length of time to receive the corresponding number of clinical citations among the three groups of biomedical papers significantly increased. From the perspective of translational science, the results reveal that biomedical papers closer to clinical science more easily receive clinical citations than papers closer to basic science in both the initial and build-up stages. Moreover, we found that highly clinically cited papers had the desperate advantage of receiving clinical citations over even the clinical guidelines or clinical trials. The robustness analysis of the two aspects demonstrates the reliability of our results.

Submitted 1 April, 2024; originally announced April 2024.

arXiv:2404.00385 [pdf, other]

Constrained Layout Generation with Factor Graphs

Authors: Mohammed Haroon Dupty, Yanfei Dong, Sicong Leng, Guoji Fu, Yong Liang Goh, Wei Lu, Wee Sun Lee

Abstract: This paper addresses the challenge of object-centric layout generation under spatial constraints, seen in multiple domains including floorplan design process. The design process typically involves specifying a set of spatial constraints that include object attributes like size and inter-object relations such as relative positioning. Existing works, which typically represent objects as single nodes, lack the granularity to accurately model complex interactions between objects. For instance, often only certain parts of an object, like a room's right wall, interact with adjacent objects. To address this gap, we introduce a factor graph based approach with four latent variable nodes for each room, and a factor node for each constraint. The factor nodes represent dependencies among the variables to which they are connected, effectively capturing constraints that are potentially of a higher order. We then develop message-passing on the bipartite graph, forming a factor graph neural network that is trained to produce a floorplan that aligns with the desired requirements. Our approach is simple and generates layouts faithful to the user requirements, demonstrated by a large improvement in IOU scores over existing methods. Additionally, our approach, being inferential and accurate, is well-suited to the practical human-in-the-loop design process where specifications evolve iteratively, offering a practical and powerful tool for AI-guided design.

Submitted 30 March, 2024; originally announced April 2024.

Comments: To be published at IEEE/CVF CVPR 2024

Showing 1–50 of 1,258 results for author: Lu, W