
Showing 1–50 of 291 results for author: Xiao, W

Searching in archive cs.
  1. arXiv:2407.01245  [pdf, other]

    cs.AI cs.CY

    SINKT: A Structure-Aware Inductive Knowledge Tracing Model with Large Language Model

    Authors: Lingyue Fu, Hao Guan, Kounianhua Du, Jianghao Lin, Wei Xia, Weinan Zhang, Ruiming Tang, Yasheng Wang, Yong Yu

    Abstract: Knowledge Tracing (KT) aims to determine whether students will respond correctly to the next question, which is a crucial task in intelligent tutoring systems (ITS). In educational KT scenarios, transductive ID-based methods often face severe data sparsity and cold start problems, where interactions between individual students and questions are sparse, and new questions and concepts consistently a…

    Submitted 1 July, 2024; originally announced July 2024.

  2. arXiv:2406.19646  [pdf, other]

    cs.RO

    Time-optimal Flight in Cluttered Environments via Safe Reinforcement Learning

    Authors: Wei Xiao, Zhaohan Feng, Ziyu Zhou, Jian Sun, Gang Wang, Jie Chen

    Abstract: This paper addresses the problem of guiding a quadrotor through a predefined sequence of waypoints in cluttered environments, aiming to minimize the flight time while avoiding collisions. Previous approaches either suffer from prolonged computational time caused by solving complex non-convex optimization problems or are limited by the inherent smoothness of polynomial trajectory representations, t…

    Submitted 28 June, 2024; originally announced June 2024.

    Comments: 7 pages, 3 figures

  3. arXiv:2406.16935  [pdf, other]

    eess.SP cs.AI

    Benchmarking Out-of-Distribution Generalization Capabilities of DNN-based Encoding Models for the Ventral Visual Cortex

    Authors: Spandan Madan, Will Xiao, Mingran Cao, Hanspeter Pfister, Margaret Livingstone, Gabriel Kreiman

    Abstract: We characterized the generalization capabilities of DNN-based encoding models when predicting neuronal responses from the visual cortex. We collected \textit{MacaqueITBench}, a large-scale dataset of neural population responses from the macaque inferior temporal (IT) cortex to over $300,000$ images, comprising $8,233$ unique natural images presented to seven monkeys over $109$ sessions. Using \tex…

    Submitted 16 June, 2024; originally announced June 2024.

  4. arXiv:2406.14024  [pdf, other]

    cs.CL

    LLM Critics Help Catch Bugs in Mathematics: Towards a Better Mathematical Verifier with Natural Language Feedback

    Authors: Bofei Gao, Zefan Cai, Runxin Xu, Peiyi Wang, Ce Zheng, Runji Lin, Keming Lu, Junyang Lin, Chang Zhou, Wen Xiao, Junjie Hu, Tianyu Liu, Baobao Chang

    Abstract: Mathematical verifier achieves success in mathematical reasoning tasks by validating the correctness of solutions. However, existing verifiers are trained with binary classification labels, which are not informative enough for the model to accurately assess the solutions. To mitigate the aforementioned insufficiency of binary labels, we introduce step-wise natural language feedbacks as rationale la…

    Submitted 30 June, 2024; v1 submitted 20 June, 2024; originally announced June 2024.

    Comments: 9 pages

  5. arXiv:2406.13025  [pdf, other]

    cs.LG cs.RO eess.SY

    ABNet: Attention BarrierNet for Safe and Scalable Robot Learning

    Authors: Wei Xiao, Tsun-Hsuan Wang, Daniela Rus

    Abstract: Safe learning is central to AI-enabled robots where a single failure may lead to catastrophic results. Barrier-based method is one of the dominant approaches for safe robot learning. However, this method is not scalable, hard to train, and tends to generate unstable signals under noisy inputs that are challenging to be deployed for robots. To address these challenges, we propose a novel Attentio…

    Submitted 18 June, 2024; originally announced June 2024.

    Comments: 18 pages

  6. arXiv:2406.12463  [pdf, other]

    cs.CV eess.IV

    LFMamba: Light Field Image Super-Resolution with State Space Model

    Authors: Wang Xia, Yao Lu, Shunzhou Wang, Ziqi Wang, Peiqi Xia, Tianfei Zhou

    Abstract: Recent years have witnessed significant advancements in light field image super-resolution (LFSR) owing to the progress of modern neural networks. However, these methods often face challenges in capturing long-range dependencies (CNN-based) or encounter quadratic computational complexities (Transformer-based), which limit their performance. Recently, the State Space Model (SSM) with selective scan…

    Submitted 18 June, 2024; originally announced June 2024.

  7. arXiv:2406.11162  [pdf, other]

    cs.CL

    How Good are LLMs at Relation Extraction under Low-Resource Scenario? Comprehensive Evaluation

    Authors: Dawulie Jinensibieke, Mieradilijiang Maimaiti, Wentao Xiao, Yuanhang Zheng, Xiaobo Wang

    Abstract: Relation Extraction (RE) serves as a crucial technology for transforming unstructured text into structured information, especially within the framework of Knowledge Graph development. Its importance is emphasized by its essential role in various downstream tasks. Besides the conventional RE methods which are based on neural networks and pre-trained language models, large language models (LLMs) are…

    Submitted 25 June, 2024; v1 submitted 16 June, 2024; originally announced June 2024.

  8. arXiv:2406.10943  [pdf, other]

    cs.CV

    Rectified Iterative Disparity for Stereo Matching

    Authors: Weiqing Xiao

    Abstract: Both uncertainty-assisted and iteration-based methods have achieved great success in stereo matching. However, existing uncertainty estimation methods take a single image and the corresponding disparity as input, which imposes higher demands on the estimation network. In this paper, we propose Cost volume-based disparity Uncertainty Estimation (UEC). Based on the rich similarity information in the…

    Submitted 16 June, 2024; originally announced June 2024.

  9. arXiv:2406.08858  [pdf, other]

    cs.RO cs.CV cs.LG eess.SY

    OmniH2O: Universal and Dexterous Human-to-Humanoid Whole-Body Teleoperation and Learning

    Authors: Tairan He, Zhengyi Luo, Xialin He, Wenli Xiao, Chong Zhang, Weinan Zhang, Kris Kitani, Changliu Liu, Guanya Shi

    Abstract: We present OmniH2O (Omni Human-to-Humanoid), a learning-based system for whole-body humanoid teleoperation and autonomy. Using kinematic pose as a universal control interface, OmniH2O enables various ways for a human to control a full-sized humanoid with dexterous hands, including using real-time teleoperation through VR headset, verbal instruction, and RGB camera. OmniH2O also enables full autono…

    Submitted 13 June, 2024; originally announced June 2024.

    Comments: Project page: https://omni.human2humanoid.com/

  10. arXiv:2406.08839  [pdf, other]

    cs.CV

    NeRF Director: Revisiting View Selection in Neural Volume Rendering

    Authors: Wenhui Xiao, Rodrigo Santa Cruz, David Ahmedt-Aristizabal, Olivier Salvado, Clinton Fookes, Leo Lebrat

    Abstract: Neural Rendering representations have significantly contributed to the field of 3D computer vision. Given their potential, considerable efforts have been invested to improve their performance. Nonetheless, the essential question of selecting training views is yet to be thoroughly investigated. This key aspect plays a vital role in achieving high-quality results and aligns with the well-known tenet…

    Submitted 13 June, 2024; originally announced June 2024.

    Comments: CVPR2024

  11. arXiv:2406.06953  [pdf, other]

    cs.CV

    Stepwise Regression and Pre-trained Edge for Robust Stereo Matching

    Authors: Weiqing Xiao, Wei Zhao

    Abstract: Due to the difficulty in obtaining real samples and ground truth, the generalization performance and the fine-tuned performance are critical for the feasibility of stereo matching methods in real-world applications. However, the presence of substantial disparity distributions and density variations across different datasets presents significant challenges for the generalization and fine-tuning of…

    Submitted 16 June, 2024; v1 submitted 11 June, 2024; originally announced June 2024.

  12. arXiv:2406.06005  [pdf, other]

    cs.RO cs.GR eess.SY

    WoCoCo: Learning Whole-Body Humanoid Control with Sequential Contacts

    Authors: Chong Zhang, Wenli Xiao, Tairan He, Guanya Shi

    Abstract: Humanoid activities involving sequential contacts are crucial for complex robotic interactions and operations in the real world and are traditionally solved by model-based motion planning, which is time-consuming and often relies on simplified dynamics models. Although model-free reinforcement learning (RL) has become a powerful tool for versatile and robust whole-body humanoid control, it still r…

    Submitted 10 June, 2024; originally announced June 2024.

    Comments: Website and Videos: https://lecar-lab.github.io/wococo/

  13. arXiv:2406.04594  [pdf, other]

    cs.DC cs.AI cs.LG

    Boosting Large-scale Parallel Training Efficiency with C4: A Communication-Driven Approach

    Authors: Jianbo Dong, Bin Luo, Jun Zhang, Pengcheng Zhang, Fei Feng, Yikai Zhu, Ang Liu, Zian Chen, Yi Shi, Hairong Jiao, Gang Lu, Yu Guan, Ennan Zhai, Wencong Xiao, Hanyu Zhao, Man Yuan, Siran Yang, Xiang Li, Jiamang Wang, Rui Men, Jianwei Zhang, Huang Zhong, Dennis Cai, Yuan Xie, Binzhang Fu

    Abstract: The emergence of Large Language Models (LLMs) has necessitated the adoption of parallel training techniques, involving the deployment of thousands of GPUs to train a single model. Unfortunately, we have found that the efficiency of current parallel training is often suboptimal, largely due to the following two main issues. Firstly, hardware failures are inevitable, leading to interruptions in the…

    Submitted 6 June, 2024; originally announced June 2024.

  14. arXiv:2406.03243  [pdf, other]

    cs.AR cs.DC cs.LG

    Llumnix: Dynamic Scheduling for Large Language Model Serving

    Authors: Biao Sun, Ziming Huang, Hanyu Zhao, Wencong Xiao, Xinyi Zhang, Yong Li, Wei Lin

    Abstract: Inference serving for large language models (LLMs) is the key to unleashing their potential in people's daily lives. However, efficient LLM serving remains challenging today because the requests are inherently heterogeneous and unpredictable in terms of resource and latency requirements, as a result of the diverse applications and the dynamic execution nature of LLMs. Existing systems are fundamen…

    Submitted 5 June, 2024; originally announced June 2024.

    Comments: To appear at OSDI '24; open-source repo will be available in June 2024

  15. arXiv:2406.02069  [pdf, other]

    cs.CL cs.AI

    PyramidKV: Dynamic KV Cache Compression based on Pyramidal Information Funneling

    Authors: Zefan Cai, Yichi Zhang, Bofei Gao, Yuliang Liu, Tianyu Liu, Keming Lu, Wayne Xiong, Yue Dong, Baobao Chang, Junjie Hu, Wen Xiao

    Abstract: In this study, we investigate whether attention-based information flow inside large language models (LLMs) is aggregated through noticeable patterns for long context processing. Our observations reveal that LLMs aggregate information through Pyramidal Information Funneling where attention is scattering widely in lower layers, progressively consolidating within specific contexts, and ultimately foc…

    Submitted 16 June, 2024; v1 submitted 4 June, 2024; originally announced June 2024.

  16. arXiv:2406.00439  [pdf, other]

    cs.RO cs.CV

    Learning Manipulation by Predicting Interaction

    Authors: Jia Zeng, Qingwen Bu, Bangjun Wang, Wenke Xia, Li Chen, Hao Dong, Haoming Song, Dong Wang, Di Hu, Ping Luo, Heming Cui, Bin Zhao, Xuelong Li, Yu Qiao, Hongyang Li

    Abstract: Representation learning approaches for robotic manipulation have boomed in recent years. Due to the scarcity of in-domain robot data, prevailing methodologies tend to leverage large-scale human video datasets to extract generalizable features for visuomotor policy learning. Despite the progress achieved, prior endeavors disregard the interactive dynamics that capture behavior patterns and physical…

    Submitted 1 June, 2024; originally announced June 2024.

    Comments: Accepted to RSS 2024. Project page: https://github.com/OpenDriveLab/MPI

  17. arXiv:2405.19586  [pdf, other]

    cs.CV cs.LG cs.RO

    SAM-E: Leveraging Visual Foundation Model with Sequence Imitation for Embodied Manipulation

    Authors: Junjie Zhang, Chenjia Bai, Haoran He, Wenke Xia, Zhigang Wang, Bin Zhao, Xiu Li, Xuelong Li

    Abstract: Acquiring a multi-task imitation policy in 3D manipulation poses challenges in terms of scene understanding and action prediction. Current methods employ both 3D representation and multi-view 2D representation to predict the poses of the robot's end-effector. However, they still require a considerable amount of high-quality robot trajectories, and suffer from limited generalization in unseen tasks…

    Submitted 29 May, 2024; originally announced May 2024.

    Comments: ICML 2024. Project page: https://sam-embodied.github.io

  18. arXiv:2405.19487  [pdf, other]

    cs.CL

    A Full-duplex Speech Dialogue Scheme Based On Large Language Models

    Authors: Peng Wang, Songshuo Lu, Yaohua Tang, Sijie Yan, Yuanjun Xiong, Wei Xia

    Abstract: We present a generative dialogue system capable of operating in a full-duplex manner, allowing for seamless interaction. It is based on a large language model (LLM) carefully aligned to be aware of a perception module, a motor function module, and the concept of a simple finite state machine (called neural FSM) with two states. The perception and motor function modules operate simultaneously, allo…

    Submitted 29 May, 2024; originally announced May 2024.

  19. arXiv:2405.17627  [pdf, other]

    cs.LG

    Salutary Labeling with Zero Human Annotation

    Authors: Wenxiao Xiao, Hongfu Liu

    Abstract: Active learning strategically selects informative unlabeled data points and queries their ground truth labels for model training. The prevailing assumption underlying this machine learning paradigm is that acquiring these ground truth labels will optimally enhance model performance. However, this assumption may not always hold true or maximize learning capacity, particularly considering the costly…

    Submitted 27 May, 2024; originally announced May 2024.

  20. arXiv:2405.16765  [pdf, ps, other]

    cs.LG eess.SP

    Study of Robust Direction Finding Based on Joint Sparse Representation

    Authors: Y. Li, W. Xiao, L. Zhao, Z. Huang, Q. Li, L. Li, R. C. de Lamare

    Abstract: Standard Direction of Arrival (DOA) estimation methods are typically derived based on the Gaussian noise assumption, making them highly sensitive to outliers. Therefore, in the presence of impulsive noise, the performance of these methods may significantly deteriorate. In this paper, we model impulsive noise as Gaussian noise mixed with sparse outliers. By exploiting their statistical differences,…

    Submitted 26 May, 2024; originally announced May 2024.

    Comments: 6 pages, 4 figures

  21. arXiv:2405.15202  [pdf, other]

    cs.CL cs.CR

    Cross-Task Defense: Instruction-Tuning LLMs for Content Safety

    Authors: Yu Fu, Wen Xiao, Jia Chen, Jiachen Li, Evangelos Papalexakis, Aichi Chien, Yue Dong

    Abstract: Recent studies reveal that Large Language Models (LLMs) face challenges in balancing safety with utility, particularly when processing long texts for NLP tasks like summarization and translation. Despite defenses against malicious short questions, the ability of LLMs to safely handle dangerous long content, such as manuals teaching illicit activities, remains unclear. Our work aims to develop robu…

    Submitted 24 May, 2024; originally announced May 2024.

    Comments: accepted to NAACL2024 TrustNLP workshop

  22. arXiv:2405.12442  [pdf, other]

    cs.IR cs.AI

    Learning Structure and Knowledge Aware Representation with Large Language Models for Concept Recommendation

    Authors: Qingyao Li, Wei Xia, Kounianhua Du, Qiji Zhang, Weinan Zhang, Ruiming Tang, Yong Yu

    Abstract: Concept recommendation aims to suggest the next concept for learners to study based on their knowledge states and the human knowledge system. While knowledge states can be predicted using knowledge tracing models, previous approaches have not effectively integrated the human knowledge system into the process of designing these educational models. In the era of rapidly evolving Large Language Model…

    Submitted 20 May, 2024; originally announced May 2024.

    Comments: 11 pages, 8 figures

  23. arXiv:2405.11024  [pdf, other]

    cs.LG cs.AI

    GraSS: Combining Graph Neural Networks with Expert Knowledge for SAT Solver Selection

    Authors: Zhanguang Zhang, Didier Chetelat, Joseph Cotnareanu, Amur Ghose, Wenyi Xiao, Hui-Ling Zhen, Yingxue Zhang, Jianye Hao, Mark Coates, Mingxuan Yuan

    Abstract: Boolean satisfiability (SAT) problems are routinely solved by SAT solvers in real-life applications, yet solving time can vary drastically between solvers for the same instance. This has motivated research into machine learning models that can predict, for a given SAT instance, which solver to select among several options. Existing SAT solver selection methods all rely on some hand-picked instance…

    Submitted 17 May, 2024; originally announced May 2024.

    Comments: Accepted by KDD 2024

  24. arXiv:2405.04434  [pdf, other]

    cs.CL cs.AI

    DeepSeek-V2: A Strong, Economical, and Efficient Mixture-of-Experts Language Model

    Authors: DeepSeek-AI, Aixin Liu, Bei Feng, Bin Wang, Bingxuan Wang, Bo Liu, Chenggang Zhao, Chengqi Dengr, Chong Ruan, Damai Dai, Daya Guo, Dejian Yang, Deli Chen, Dongjie Ji, Erhang Li, Fangyun Lin, Fuli Luo, Guangbo Hao, Guanting Chen, Guowei Li, H. Zhang, Hanwei Xu, Hao Yang, Haowei Zhang, Honghui Ding, et al. (132 additional authors not shown)

    Abstract: We present DeepSeek-V2, a strong Mixture-of-Experts (MoE) language model characterized by economical training and efficient inference. It comprises 236B total parameters, of which 21B are activated for each token, and supports a context length of 128K tokens. DeepSeek-V2 adopts innovative architectures including Multi-head Latent Attention (MLA) and DeepSeekMoE. MLA guarantees efficient inference…

    Submitted 19 June, 2024; v1 submitted 7 May, 2024; originally announced May 2024.

  25. arXiv:2405.02355  [pdf, other]

    cs.SE cs.AI

    CodeGRAG: Extracting Composed Syntax Graphs for Retrieval Augmented Cross-Lingual Code Generation

    Authors: Kounianhua Du, Renting Rui, Huacan Chai, Lingyue Fu, Wei Xia, Yasheng Wang, Ruiming Tang, Yong Yu, Weinan Zhang

    Abstract: Utilizing large language models to generate codes has shown promising meaning in software development revolution. Despite the intelligence shown by the general large language models, their specificity in code generation can still be improved due to the syntactic gap and mismatched vocabulary existing among natural language and different programming languages. In addition, programming languages are…

    Submitted 2 May, 2024; originally announced May 2024.

  26. arXiv:2405.02180  [pdf, other]

    cs.LG eess.SY

    A Flow-Based Model for Conditional and Probabilistic Electricity Consumption Profile Generation and Prediction

    Authors: Weijie Xia, Chenguang Wang, Peter Palensky, Pedro P. Vergara

    Abstract: Residential Load Profile (RLP) generation and prediction are critical for the operation and planning of distribution networks, especially as diverse low-carbon technologies (e.g., photovoltaic and electric vehicles) are increasingly adopted. This paper introduces a novel flow-based generative model, termed Full Convolutional Profile Flow (FCPFlow), which is uniquely designed for both conditional a…

    Submitted 9 May, 2024; v1 submitted 3 May, 2024; originally announced May 2024.

  27. arXiv:2404.16147  [pdf, other]

    cs.RO

    Chat2Scenario: Scenario Extraction From Dataset Through Utilization of Large Language Model

    Authors: Yongqi Zhao, Wenbo Xiao, Tomislav Mihalj, Jia Hu, Arno Eichberger

    Abstract: The advent of Large Language Models (LLM) provides new insights to validate Automated Driving Systems (ADS). In the herein-introduced work, a novel approach to extracting scenarios from naturalistic driving datasets is presented. A framework called Chat2Scenario is proposed leveraging the advanced Natural Language Processing (NLP) capabilities of LLM to understand and identify different driving sc…

    Submitted 26 April, 2024; v1 submitted 24 April, 2024; originally announced April 2024.

    Comments: IEEE Intelligent Vehicles Symposium (IV 2024)

  28. arXiv:2404.14233  [pdf, other]

    cs.CV cs.AI cs.CL cs.LG

    Detecting and Mitigating Hallucination in Large Vision Language Models via Fine-Grained AI Feedback

    Authors: Wenyi Xiao, Ziwei Huang, Leilei Gan, Wanggui He, Haoyuan Li, Zhelun Yu, Hao Jiang, Fei Wu, Linchao Zhu

    Abstract: The rapidly developing Large Vision Language Models (LVLMs) have shown notable capabilities on a range of multi-modal tasks, but still face the hallucination phenomena where the generated texts do not align with the given contexts, significantly restricting the usages of LVLMs. Most previous work detects and mitigates hallucination at the coarse-grained level or requires expensive annotation (e.g.…

    Submitted 22 April, 2024; originally announced April 2024.

  29. arXiv:2404.13804  [pdf, other]

    cs.DC cs.LG cs.NI eess.SY

    Adaptive Heterogeneous Client Sampling for Federated Learning over Wireless Networks

    Authors: Bing Luo, Wenli Xiao, Shiqiang Wang, Jianwei Huang, Leandros Tassiulas

    Abstract: Federated learning (FL) algorithms usually sample a fraction of clients in each round (partial participation) when the number of participants is large and the server's communication bandwidth is limited. Recent works on the convergence analysis of FL have focused on unbiased client sampling, e.g., sampling uniformly at random, which suffers from slow wall-clock time for convergence due to high deg…

    Submitted 21 April, 2024; originally announced April 2024.

    Comments: Published in IEEE Transactions on Mobile Computing (TMC). arXiv admin note: substantial text overlap with arXiv:2112.11256

  30. arXiv:2404.13033  [pdf, other]

    cs.CL

    Sample Design Engineering: An Empirical Study of What Makes Good Downstream Fine-Tuning Samples for LLMs

    Authors: Biyang Guo, He Wang, Wenyilin Xiao, Hong Chen, Zhuxin Lee, Songqiao Han, Hailiang Huang

    Abstract: In the burgeoning field of Large Language Models (LLMs) like ChatGPT and LLaMA, Prompt Engineering (PE) is renowned for boosting zero-shot or in-context learning (ICL) through prompt modifications. Yet, the realm of the sample design for downstream fine-tuning, crucial for task-specific LLM adaptation, is largely unexplored. This paper introduces Sample Design Engineering (SDE), a methodical appro…

    Submitted 19 April, 2024; originally announced April 2024.

    Comments: 23 pages, 12 figures, 14 tables

  31. arXiv:2404.12728  [pdf, other]

    cs.CL

    Relevant or Random: Can LLMs Truly Perform Analogical Reasoning?

    Authors: Chengwei Qin, Wenhan Xia, Tan Wang, Fangkai Jiao, Yuchen Hu, Bosheng Ding, Ruirui Chen, Shafiq Joty

    Abstract: Analogical reasoning is a unique ability of humans to address unfamiliar challenges by transferring strategies from relevant past experiences. One key finding in psychology is that compared with irrelevant past experiences, recalling relevant ones can help humans better handle new tasks. Coincidentally, the NLP community has also recently found that self-generating relevant examples in the context…

    Submitted 23 June, 2024; v1 submitted 19 April, 2024; originally announced April 2024.

  32. arXiv:2404.07202  [pdf, other]

    cs.CV cs.AI cs.CL

    UMBRAE: Unified Multimodal Decoding of Brain Signals

    Authors: Weihao Xia, Raoul de Charette, Cengiz Öztireli, Jing-Hao Xue

    Abstract: We address prevailing challenges of the brain-powered research, departing from the observation that the literature hardly recover accurate spatial information and require subject-specific models. To address these challenges, we propose UMBRAE, a unified multimodal decoding of brain signals. First, to extract instance-level conceptual and spatial details from neural signals, we introduce an efficie…

    Submitted 10 April, 2024; originally announced April 2024.

    Comments: Project Page: https://weihaox.github.io/UMBRAE

  33. arXiv:2404.02507  [pdf, other]

    cs.CL

    Lifelong Event Detection with Embedding Space Separation and Compaction

    Authors: Chengwei Qin, Ruirui Chen, Ruochen Zhao, Wenhan Xia, Shafiq Joty

    Abstract: To mitigate forgetting, existing lifelong event detection methods typically maintain a memory module and replay the stored memory data during the learning of a new task. However, the simple combination of memory data and new-task samples can still result in substantial forgetting of previously acquired knowledge, which may occur due to the potential overlap between the feature distribution of new…

    Submitted 3 April, 2024; originally announced April 2024.

    Comments: NAACL 2024 main conference

  34. arXiv:2403.12959  [pdf, other]

    cs.CV cs.AI cs.GR cs.LG cs.RO

    WHAC: World-grounded Humans and Cameras

    Authors: Wanqi Yin, Zhongang Cai, Ruisi Wang, Fanzhou Wang, Chen Wei, Haiyi Mei, Weiye Xiao, Zhitao Yang, Qingping Sun, Atsushi Yamashita, Ziwei Liu, Lei Yang

    Abstract: Estimating human and camera trajectories with accurate scale in the world coordinate system from a monocular video is a highly desirable yet challenging and ill-posed problem. In this study, we aim to recover expressive parametric human models (i.e., SMPL-X) and corresponding camera poses jointly, by leveraging the synergy between three critical players: the world, the human, and the camera. Our a…

    Submitted 19 March, 2024; originally announced March 2024.

    Comments: Homepage: https://wqyin.github.io/projects/WHAC/

  35. arXiv:2403.11671  [pdf, other]

    cs.AR cs.AI cs.CE cs.LG cs.SE

    HDLdebugger: Streamlining HDL debugging with Large Language Models

    Authors: Xufeng Yao, Haoyang Li, Tsz Ho Chan, Wenyi Xiao, Mingxuan Yuan, Yu Huang, Lei Chen, Bei Yu

    Abstract: In the domain of chip design, Hardware Description Languages (HDLs) play a pivotal role. However, due to the complex syntax of HDLs and the limited availability of online resources, debugging HDL codes remains a difficult and time-intensive task, even for seasoned engineers. Consequently, there is a pressing need to develop automated HDL code debugging models, which can alleviate the burden on har…

    Submitted 18 March, 2024; originally announced March 2024.

    Comments: 13 pages, 5 figures

  36. arXiv:2403.04436  [pdf, other]

    cs.RO cs.AI cs.LG eess.SY

    Learning Human-to-Humanoid Real-Time Whole-Body Teleoperation

    Authors: Tairan He, Zhengyi Luo, Wenli Xiao, Chong Zhang, Kris Kitani, Changliu Liu, Guanya Shi

    Abstract: We present Human to Humanoid (H2O), a reinforcement learning (RL) based framework that enables real-time whole-body teleoperation of a full-sized humanoid robot with only an RGB camera. To create a large-scale retargeted motion dataset of human movements for humanoid robots, we propose a scalable "sim-to-data" process to filter and pick feasible motions using a privileged motion imitator. Afterwar…

    Submitted 7 March, 2024; originally announced March 2024.

    Comments: Project website: https://human2humanoid.com/

  37. arXiv:2403.03669  [pdf, other]

    stat.ML cs.LG

    Spectral Algorithms on Manifolds through Diffusion

    Authors: Weichun Xia, Lei Shi

    Abstract: The existing research on spectral algorithms, applied within a Reproducing Kernel Hilbert Space (RKHS), has primarily focused on general kernel functions, often neglecting the inherent structure of the input feature space. Our paper introduces a new perspective, asserting that input data are situated within a low-dimensional manifold embedded in a higher-dimensional Euclidean space. We study the c…

    Submitted 7 March, 2024; v1 submitted 6 March, 2024; originally announced March 2024.

  38. arXiv:2403.03517  [pdf, other]

    cs.AI

    IB-Net: Initial Branch Network for Variable Decision in Boolean Satisfiability

    Authors: Tsz Ho Chan, Wenyi Xiao, Junhua Huang, Huiling Zhen, Guangji Tian, Mingxuan Yuan

    Abstract: Boolean Satisfiability problems are vital components in Electronic Design Automation, particularly within the Logic Equivalence Checking process. Currently, SAT solvers are employed for these problems and neural network is tried as assistance to solvers. However, as SAT problems in the LEC context are distinctive due to their predominantly unsatisfiability nature and a substantial proportion of UN…

    Submitted 6 March, 2024; originally announced March 2024.

    Comments: 7 pages, 12 figures

  39. arXiv:2403.02990  [pdf, other]

    cs.CL cs.AI

    Data Augmentation using LLMs: Data Perspectives, Learning Paradigms and Challenges

    Authors: Bosheng Ding, Chengwei Qin, Ruochen Zhao, Tianze Luo, Xinze Li, Guizhen Chen, Wenhan Xia, Junjie Hu, Anh Tuan Luu, Shafiq Joty

    Abstract: In the rapidly evolving field of large language models (LLMs), data augmentation (DA) has emerged as a pivotal technique for enhancing model performance by diversifying training examples without the need for additional data collection. This survey explores the transformative impact of LLMs on DA, particularly addressing the unique challenges and opportunities they present in the context of natural…

    Submitted 27 June, 2024; v1 submitted 5 March, 2024; originally announced March 2024.

  40. arXiv:2402.11294  [pdf, other]

    cs.IT eess.SP

    Power Optimization for Integrated Active and Passive Sensing in DFRC Systems

    Authors: Xingliang Lou, Wenchao Xia, Kai-Kit Wong, Haitao Zhao, Tony Q. S. Quek, Hongbo Zhu

    Abstract: Most existing works on dual-function radar-communication (DFRC) systems mainly focus on active sensing, but ignore passive sensing. To leverage multi-static sensing capability, we explore integrated active and passive sensing (IAPS) in DFRC systems to remedy sensing performance. The multi-antenna base station (BS) is responsible for communication and active sensing by transmitting signals to user…

    Submitted 17 February, 2024; originally announced February 2024.

  41. arXiv:2401.17583  [pdf, other]

    cs.RO cs.AI cs.CV cs.LG eess.SY

    Agile But Safe: Learning Collision-Free High-Speed Legged Locomotion

    Authors: Tairan He, Chong Zhang, Wenli Xiao, Guanqi He, Changliu Liu, Guanya Shi

    Abstract: Legged robots navigating cluttered environments must be jointly agile for efficient task execution and safe to avoid collisions with obstacles or humans. Existing studies either develop conservative controllers (< 1.0 m/s) to ensure safety, or focus on agility without considering potentially fatal collisions. This paper introduces Agile But Safe (ABS), a learning-based control framework that enabl…

    Submitted 21 May, 2024; v1 submitted 30 January, 2024; originally announced January 2024.

    Comments: Published at RSS 2024, Project website: https://agile-but-safe.github.io/

  42. arXiv:2401.15287  [pdf, other]

    cs.CV cs.DM math.NA

    Applications of Tao General Difference in Discrete Domain

    Authors: Linmi Tao, Ruiyang Liu, Donglai Tao, Wu Xia, Feilong Ma, Yu Cheng, Jingmao Cui

    Abstract: Numerical difference computation is one of the cores and indispensable in the modern digital era. Tao general difference (TGD) is a novel theory and approach to difference computation for discrete sequences and arrays in multidimensional space. Built on the solid theoretical foundation of the general difference in a finite interval, the TGD operators demonstrate exceptional signal processing capab…

    Submitted 26 January, 2024; originally announced January 2024.

    Comments: This paper is the application part of the paper "Tao General Differential and Difference: Theory and Application". The theory part of the paper is renamed as "A Theory of General Difference in Continuous and Discrete Domain", which is Arxived in arXiv:2305.08098v2

  43. arXiv:2401.09705  [pdf, other]

    cs.RO eess.SY

    Learning Hybrid Policies for MPC with Application to Drone Flight in Unknown Dynamic Environments

    Authors: Zhaohan Feng, Jie Chen, Wei Xiao, Jian Sun, Bin Xin, Gang Wang

    Abstract: In recent years, drones have found increased applications in a wide array of real-world tasks. Model predictive control (MPC) has emerged as a practical method for drone flight control, owing to its robustness against modeling errors/uncertainties and external disturbances. However, MPC's sensitivity to manually tuned parameters can lead to rapid performance degradation when faced with unknown env…

    Submitted 25 January, 2024; v1 submitted 17 January, 2024; originally announced January 2024.

    Comments: To be published in Unmanned Systems

  44. arXiv:2401.08664  [pdf, other]

    cs.AI cs.CL

    Adapting Large Language Models for Education: Foundational Capabilities, Potentials, and Challenges

    Authors: Qingyao Li, Lingyue Fu, Weiming Zhang, Xianyu Chen, Jingwei Yu, Wei Xia, Weinan Zhang, Ruiming Tang, Yong Yu

    Abstract: Online education platforms, leveraging the internet to distribute education resources, seek to provide convenient education but often fall short in real-time communication with students. They often struggle to address the diverse obstacles students encounter throughout their learning journey. Solving the problems encountered by students poses a significant challenge for traditional deep learning m…

    Submitted 26 April, 2024; v1 submitted 27 December, 2023; originally announced January 2024.

    Comments: 31 pages, 5 figures, 1 table

  45. arXiv:2401.04151  [pdf, other]

    cs.LG cs.CL

    Chain of LoRA: Efficient Fine-tuning of Language Models via Residual Learning

    Authors: Wenhan Xia, Chengwei Qin, Elad Hazan

    Abstract: Fine-tuning is the primary methodology for tailoring pre-trained large language models to specific tasks. As the model's scale and the diversity of tasks expand, parameter-efficient fine-tuning methods are of paramount importance. One of the most widely used family of methods is low-rank adaptation (LoRA) and its variants. LoRA encodes weight update as the product of two low-rank matrices. Despite…

    Submitted 8 January, 2024; originally announced January 2024.

    Comments: Work in progress

  46. arXiv:2401.03506  [pdf, other]

    eess.AS cs.LG cs.SD

    DiarizationLM: Speaker Diarization Post-Processing with Large Language Models

    Authors: Quan Wang, Yiling Huang, Guanlong Zhao, Evan Clark, Wei Xia, Hank Liao

    Abstract: In this paper, we introduce DiarizationLM, a framework to leverage large language models (LLM) to post-process the outputs from a speaker diarization system. Various goals can be achieved with the proposed framework, such as improving the readability of the diarized transcript, or reducing the word diarization error rate (WDER). In this framework, the outputs of the automatic speech recognition (A…

    Submitted 26 June, 2024; v1 submitted 7 January, 2024; originally announced January 2024.

  47. arXiv:2401.03401  [pdf]

    cs.CL

    Empirical Study of Large Language Models as Automated Essay Scoring Tools in English Composition__Taking TOEFL Independent Writing Task for Example

    Authors: Wei Xia, Shaoguang Mao, Chanjing Zheng

    Abstract: Large language models have demonstrated exceptional capabilities in tasks involving natural language generation, reasoning, and comprehension. This study aims to construct prompts and comments grounded in the diverse scoring criteria delineated within the official TOEFL guide. The primary objective is to assess the capabilities and constraints of ChatGPT, a prominent representative of large langua…

    Submitted 7 January, 2024; originally announced January 2024.

  48. arXiv:2401.02669  [pdf, other]

    cs.DC cs.AR

    Infinite-LLM: Efficient LLM Service for Long Context with DistAttention and Distributed KVCache

    Authors: Bin Lin, Tao Peng, Chen Zhang, Minmin Sun, Lanbo Li, Hanyu Zhao, Wencong Xiao, Qi Xu, Xiafei Qiu, Shen Li, Zhigang Ji, Yong Li, Wei Lin

    Abstract: The rapid proliferation of Large Language Models (LLMs) has been a driving force in the growth of cloud-based LLM services, which are now integral to advancing AI applications. However, the dynamic auto-regressive nature of LLM service, along with the need to support exceptionally long context lengths, demands the flexible allocation and release of substantial resources. This presents considerable…

    Submitted 5 January, 2024; originally announced January 2024.

  49. arXiv:2401.01721  [pdf, other]

    cs.IT eess.SP

    Limited Feedback on Measurements: Sharing a Codebook or a Generative Model?

    Authors: Nurettin Turan, Benedikt Fesl, Michael Joham, Zhengxiang Ma, Anthony C. K. Soong, Baoling Sheen, Weimin Xiao, Wolfgang Utschick

    Abstract: Discrete Fourier transform (DFT) codebook-based solutions are well-established for limited feedback schemes in frequency division duplex (FDD) systems. In recent years, data-aided solutions have been shown to achieve higher performance, enabled by the adaptivity of the feedback scheme to the propagation environment of the base station (BS) cell. In particular, a versatile limited feedback scheme u…

    Submitted 3 January, 2024; originally announced January 2024.

  50. arXiv:2312.17055  [pdf, other]

    cs.CL

    Improving In-context Learning via Bidirectional Alignment

    Authors: Chengwei Qin, Wenhan Xia, Fangkai Jiao, Chen Chen, Yuchen Hu, Bosheng Ding, Shafiq Joty

    Abstract: Large language models (LLMs) have shown impressive few-shot generalization on many tasks via in-context learning (ICL). Despite their success in showing such emergent abilities, the scale and complexity of larger models also lead to unprecedentedly high computational demands and deployment challenges. In reaction, researchers explore transferring the powerful capabilities of larger models to more…

    Submitted 24 June, 2024; v1 submitted 28 December, 2023; originally announced December 2023.