Skip to main content

Showing 1–50 of 153 results for author: Xiong, W

Searching in archive cs. Search in all archives.
.
  1. arXiv:2406.16079  [pdf, other

    cs.CL cs.AI

    EERPD: Leveraging Emotion and Emotion Regulation for Improving Personality Detection

    Authors: Zheng Li, Dawei Zhu, Qilong Ma, Weimin Xiong, Sujian Li

    Abstract: Personality is a fundamental construct in psychology, reflecting an individual's behavior, thinking, and emotional patterns. Previous researches have made some progress in personality detection, primarily by utilizing the whole text to predict personality. However, these studies generally tend to overlook psychological knowledge: they rarely apply the well-established correlations between emotion… ▽ More

    Submitted 23 June, 2024; originally announced June 2024.

  2. arXiv:2406.12845  [pdf, other

    cs.LG cs.CL

    Interpretable Preferences via Multi-Objective Reward Modeling and Mixture-of-Experts

    Authors: Haoxiang Wang, Wei Xiong, Tengyang Xie, Han Zhao, Tong Zhang

    Abstract: Reinforcement learning from human feedback (RLHF) has emerged as the primary method for aligning large language models (LLMs) with human preferences. The RLHF process typically starts by training a reward model (RM) using human preference data. Conventional RMs are trained on pairwise responses to the same user request, with relative ratings indicating which response humans prefer. The trained RM… ▽ More

    Submitted 18 June, 2024; originally announced June 2024.

    Comments: Technical report v1. Code and model are released at https://github.com/RLHFlow/RLHF-Reward-Modeling/

  3. arXiv:2406.11176  [pdf, other

    cs.CL cs.AI

    Watch Every Step! LLM Agent Learning via Iterative Step-Level Process Refinement

    Authors: Weimin Xiong, Yifan Song, Xiutian Zhao, Wenhao Wu, Xun Wang, Ke Wang, Cheng Li, Wei Peng, Sujian Li

    Abstract: Large language model agents have exhibited exceptional performance across a range of complex interactive tasks. Recent approaches have utilized tuning with expert trajectories to enhance agent performance, yet they primarily concentrate on outcome rewards, which may lead to errors or suboptimal actions due to the absence of process supervision signals. In this paper, we introduce the Iterative ste… ▽ More

    Submitted 16 June, 2024; originally announced June 2024.

  4. arXiv:2406.02069  [pdf, other

    cs.CL cs.AI

    PyramidKV: Dynamic KV Cache Compression based on Pyramidal Information Funneling

    Authors: Zefan Cai., Yichi Zhang, Bofei Gao, Yuliang Liu, Tianyu Liu, Keming Lu, Wayne Xiong, Yue Dong, Baobao Chang, Junjie Hu, Wen Xiao

    Abstract: In this study, we investigate whether attention-based information flow inside large language models (LLMs) is aggregated through noticeable patterns for long context processing. Our observations reveal that LLMs aggregate information through Pyramidal Information Funneling where attention is scattering widely in lower layers, progressively consolidating within specific contexts, and ultimately foc… ▽ More

    Submitted 16 June, 2024; v1 submitted 4 June, 2024; originally announced June 2024.

  5. arXiv:2405.17051  [pdf, other

    cs.LG cs.AI

    BeamVQ: Aligning Space-Time Forecasting Model via Self-training on Physics-aware Metrics

    Authors: Hao Wu, Xingjian Shi, Ziyue Huang, Penghao Zhao, Wei Xiong, **bao Xue, Yangyu Tao, Xiaomeng Huang, Weiyan Wang

    Abstract: Data-driven deep learning has emerged as the new paradigm to model complex physical space-time systems. These data-driven methods learn patterns by optimizing statistical metrics and tend to overlook the adherence to physical laws, unlike traditional model-driven numerical methods. Thus, they often generate predictions that are not physically realistic. On the other hand, by sampling a large amoun… ▽ More

    Submitted 27 May, 2024; originally announced May 2024.

  6. arXiv:2405.10610  [pdf, other

    cs.CV

    Driving Referring Video Object Segmentation with Vision-Language Pre-trained Models

    Authors: Zikun Zhou, Wentao Xiong, Li Zhou, Xin Li, Zhenyu He, Yaowei Wang

    Abstract: The crux of Referring Video Object Segmentation (RVOS) lies in modeling dense text-video relations to associate abstract linguistic concepts with dynamic visual contents at pixel-level. Current RVOS methods typically use vision and language models pre-trained independently as backbones. As images and texts are mapped to uncoupled feature spaces, they face the arduous task of learning Vision-Langua… ▽ More

    Submitted 17 May, 2024; originally announced May 2024.

  7. arXiv:2405.07863  [pdf, other

    cs.LG cs.AI cs.CL stat.ML

    RLHF Workflow: From Reward Modeling to Online RLHF

    Authors: Hanze Dong, Wei Xiong, Bo Pang, Haoxiang Wang, Han Zhao, Yingbo Zhou, Nan Jiang, Doyen Sahoo, Caiming Xiong, Tong Zhang

    Abstract: We present the workflow of Online Iterative Reinforcement Learning from Human Feedback (RLHF) in this technical report, which is widely reported to outperform its offline counterpart by a large margin in the recent large language model (LLM) literature. However, existing open-source RLHF projects are still largely confined to the offline learning setting. In this technical report, we aim to fill i… ▽ More

    Submitted 12 June, 2024; v1 submitted 13 May, 2024; originally announced May 2024.

  8. arXiv:2405.05193  [pdf, other

    cs.CR

    Systematic Use of Random Self-Reducibility against Physical Attacks

    Authors: Ferhat Erata, TingHung Chiu, Anthony Etim, Srilalith Nampally, Tejas Raju, Rajashree Ramu, Ruzica Piskac, Timos Antonopoulos, Wenjie Xiong, Jakub Szefer

    Abstract: This work presents a novel, black-box software-based countermeasure against physical attacks including power side-channel and fault-injection attacks. The approach uses the concept of random self-reducibility and self-correctness to add randomness and redundancy in the execution for protection. Our approach is at the operation level, is not algorithm-specific, and thus, can be applied for protecti… ▽ More

    Submitted 8 May, 2024; originally announced May 2024.

  9. arXiv:2405.05164  [pdf, other

    cs.CV

    ProbRadarM3F: mmWave Radar based Human Skeletal Pose Estimation with Probability Map Guided Multi-Format Feature Fusion

    Authors: Bing Zhu, Zixin He, Weiyi Xiong, Guanhua Ding, Jianan Liu, Tao Huang, Wei Chen, Wei Xiang

    Abstract: Millimeter wave (mmWave) radar is a non-intrusive privacy and relatively convenient and inexpensive device, which has been demonstrated to be applicable in place of RGB cameras in human indoor pose estimation tasks. However, mmWave radar relies on the collection of reflected signals from the target, and the radar signals containing information is difficult to be fully applied. This has been a long… ▽ More

    Submitted 28 June, 2024; v1 submitted 8 May, 2024; originally announced May 2024.

  10. arXiv:2405.01525  [pdf, other

    cs.CL cs.AI

    FLAME: Factuality-Aware Alignment for Large Language Models

    Authors: Sheng-Chieh Lin, Luyu Gao, Barlas Oguz, Wenhan Xiong, Jimmy Lin, Wen-tau Yih, Xilun Chen

    Abstract: Alignment is a standard procedure to fine-tune pre-trained large language models (LLMs) to follow natural language instructions and serve as helpful AI assistants. We have observed, however, that the conventional alignment process fails to enhance the factual accuracy of LLMs, and often leads to the generation of more false facts (i.e. hallucination). In this paper, we study how to make the LLM al… ▽ More

    Submitted 2 May, 2024; originally announced May 2024.

  11. arXiv:2404.18922  [pdf, other

    cs.LG cs.AI cs.CL stat.ML

    DPO Meets PPO: Reinforced Token Optimization for RLHF

    Authors: Han Zhong, Guhao Feng, Wei Xiong, Li Zhao, Di He, Jiang Bian, Liwei Wang

    Abstract: In the classical Reinforcement Learning from Human Feedback (RLHF) framework, Proximal Policy Optimization (PPO) is employed to learn from sparse, sentence-level rewards -- a challenging scenario in traditional deep reinforcement learning. Despite the great successes of PPO in the alignment of state-of-the-art closed-source large language models (LLMs), its open-source implementation is still larg… ▽ More

    Submitted 29 April, 2024; originally announced April 2024.

  12. arXiv:2404.08801  [pdf, other

    cs.LG cs.CL

    Megalodon: Efficient LLM Pretraining and Inference with Unlimited Context Length

    Authors: Xuezhe Ma, Xiaomeng Yang, Wenhan Xiong, Beidi Chen, Lili Yu, Hao Zhang, Jonathan May, Luke Zettlemoyer, Omer Levy, Chunting Zhou

    Abstract: The quadratic complexity and weak length extrapolation of Transformers limits their ability to scale to long sequences, and while sub-quadratic solutions like linear attention and state space models exist, they empirically underperform Transformers in pretraining efficiency and downstream task accuracy. We introduce Megalodon, a neural architecture for efficient sequence modeling with unlimited co… ▽ More

    Submitted 16 April, 2024; v1 submitted 12 April, 2024; originally announced April 2024.

    Comments: 9 pages, 6 figures and 8 tables

  13. arXiv:2404.05717  [pdf, other

    cs.CV cs.AI

    SwapAnything: Enabling Arbitrary Object Swap** in Personalized Visual Editing

    Authors: **g Gu, Yilin Wang, Nanxuan Zhao, Wei Xiong, Qing Liu, Zhifei Zhang, He Zhang, Jianming Zhang, HyunJoon Jung, Xin Eric Wang

    Abstract: Effective editing of personal content holds a pivotal role in enabling individuals to express their creativity, weaving captivating narratives within their visual stories, and elevate the overall quality and impact of their visual content. Therefore, in this work, we introduce SwapAnything, a novel framework that can swap any objects in an image with personalized concepts given by the reference, w… ▽ More

    Submitted 6 May, 2024; v1 submitted 8 April, 2024; originally announced April 2024.

    Comments: 18 pages, 16 figures, 3 tables

  14. arXiv:2403.11401  [pdf, other

    cs.CV cs.AI

    Scene-LLM: Extending Language Model for 3D Visual Understanding and Reasoning

    Authors: Rao Fu, **gyu Liu, Xilun Chen, Yixin Nie, Wenhan Xiong

    Abstract: This paper introduces Scene-LLM, a 3D-visual-language model that enhances embodied agents' abilities in interactive 3D indoor environments by integrating the reasoning strengths of Large Language Models (LLMs). Scene-LLM adopts a hybrid 3D visual feature representation, that incorporates dense spatial information and supports scene state updates. The model employs a projection layer to efficiently… ▽ More

    Submitted 22 March, 2024; v1 submitted 17 March, 2024; originally announced March 2024.

  15. arXiv:2403.10701  [pdf, other

    cs.CV

    IMPRINT: Generative Object Compositing by Learning Identity-Preserving Representation

    Authors: Yizhi Song, Zhifei Zhang, Zhe Lin, Scott Cohen, Brian Price, Jianming Zhang, Soo Ye Kim, He Zhang, Wei Xiong, Daniel Aliaga

    Abstract: Generative object compositing emerges as a promising new avenue for compositional image editing. However, the requirement of object identity preservation poses a significant challenge, limiting practical usage of most existing methods. In response, this paper introduces IMPRINT, a novel diffusion-based generative model trained with a two-stage learning framework that decouples learning of identity… ▽ More

    Submitted 15 March, 2024; originally announced March 2024.

  16. arXiv:2403.08730  [pdf, other

    cs.CL cs.CV

    Strengthening Multimodal Large Language Model with Bootstrapped Preference Optimization

    Authors: Renjie Pi, Tianyang Han, Wei Xiong, Jipeng Zhang, Runtao Liu, Rui Pan, Tong Zhang

    Abstract: Multimodal Large Language Models (MLLMs) excel in generating responses based on visual inputs. However, they often suffer from a bias towards generating responses similar to their pretraining corpus, overshadowing the importance of visual information. We treat this bias as a "preference" for pretraining statistics, which hinders the model's grounding in visual input. To mitigate this issue, we pro… ▽ More

    Submitted 3 April, 2024; v1 submitted 13 March, 2024; originally announced March 2024.

  17. arXiv:2402.18571  [pdf, other

    cs.LG cs.AI cs.CL stat.ML

    Arithmetic Control of LLMs for Diverse User Preferences: Directional Preference Alignment with Multi-Objective Rewards

    Authors: Haoxiang Wang, Yong Lin, Wei Xiong, Rui Yang, Shizhe Diao, Shuang Qiu, Han Zhao, Tong Zhang

    Abstract: Fine-grained control over large language models (LLMs) remains a significant challenge, hindering their adaptability to diverse user needs. While Reinforcement Learning from Human Feedback (RLHF) shows promise in aligning LLMs, its reliance on scalar rewards often limits its ability to capture diverse user preferences in real-world applications. To address this limitation, we introduce the Directi… ▽ More

    Submitted 6 March, 2024; v1 submitted 28 February, 2024; originally announced February 2024.

    Comments: The code and model are released at https://github.com/Haoxiang-Wang/directional-preference-alignment

  18. arXiv:2402.17525  [pdf, other

    cs.CV

    Diffusion Model-Based Image Editing: A Survey

    Authors: Yi Huang, Jiancheng Huang, Yifan Liu, Mingfu Yan, Jiaxi Lv, Jianzhuang Liu, Wei Xiong, He Zhang, Shifeng Chen, Liangliang Cao

    Abstract: Denoising diffusion models have emerged as a powerful tool for various image generation and editing tasks, facilitating the synthesis of visual content in an unconditional or input-conditional manner. The core idea behind them is learning to reverse the process of gradually adding noise to images, allowing them to generate high-quality samples from a complex distribution. In this survey, we provid… ▽ More

    Submitted 16 March, 2024; v1 submitted 27 February, 2024; originally announced February 2024.

  19. arXiv:2402.07314  [pdf, other

    cs.LG stat.ML

    Online Iterative Reinforcement Learning from Human Feedback with General Preference Model

    Authors: Chenlu Ye, Wei Xiong, Yuheng Zhang, Nan Jiang, Tong Zhang

    Abstract: We study Reinforcement Learning from Human Feedback (RLHF) under a general preference oracle. In particular, we do not assume that there exists a reward function and the preference signal is drawn from the Bradley-Terry model as most of the prior works do. We consider a standard mathematical formulation, the reverse-KL regularized minimax game between two LLMs for RLHF under general preference ora… ▽ More

    Submitted 25 April, 2024; v1 submitted 11 February, 2024; originally announced February 2024.

    Comments: RLHF, Preference Learning, Alignment for LLMs

  20. arXiv:2401.02831  [pdf, other

    cs.CV eess.IV

    Two-stage Progressive Residual Dense Attention Network for Image Denoising

    Authors: Wencong Wu, An Ge, Guannan Lv, Yuelong Xia, Yungang Zhang, Wen Xiong

    Abstract: Deep convolutional neural networks (CNNs) for image denoising can effectively exploit rich hierarchical features and have achieved great success. However, many deep CNN-based denoising models equally utilize the hierarchical features of noisy images without paying attention to the more important and useful features, leading to relatively low performance. To address the issue, we design a new Two-s… ▽ More

    Submitted 5 January, 2024; originally announced January 2024.

  21. arXiv:2312.15124  [pdf, other

    quant-ph cs.ET cs.LG stat.ML

    On fundamental aspects of quantum extreme learning machines

    Authors: Weijie Xiong, Giorgio Facelli, Mehrad Sahebi, Owen Agnel, Thiparat Chotibut, Supanut Thanasilp, Zoë Holmes

    Abstract: Quantum Extreme Learning Machines (QELMs) have emerged as a promising framework for quantum machine learning. Their appeal lies in the rich feature map induced by the dynamics of a quantum substrate - the quantum reservoir - and the efficient post-measurement training via linear regression. Here we study the expressivity of QELMs by decomposing the prediction of QELMs into a Fourier series. We sho… ▽ More

    Submitted 22 December, 2023; originally announced December 2023.

    Comments: 16+17 pages, 8+2 figures

  22. arXiv:2312.11456  [pdf, other

    cs.LG cs.AI stat.ML

    Iterative Preference Learning from Human Feedback: Bridging Theory and Practice for RLHF under KL-Constraint

    Authors: Wei Xiong, Hanze Dong, Chenlu Ye, Ziqi Wang, Han Zhong, Heng Ji, Nan Jiang, Tong Zhang

    Abstract: This paper studies the alignment process of generative models with Reinforcement Learning from Human Feedback (RLHF). We first identify the primary challenges of existing popular methods like offline PPO and offline DPO as lacking in strategical exploration of the environment. Then, to understand the mathematical principle of RLHF, we consider a standard mathematical formulation, the reverse-KL re… ▽ More

    Submitted 1 May, 2024; v1 submitted 18 December, 2023; originally announced December 2023.

    Comments: 53 pages; theoretical study and algorithmic design of iterative RLHF and DPO

  23. arXiv:2312.10387  [pdf, other

    cs.HC

    Measuring the Sense of Presence and Learning Efficacy in Immersive Virtual Assembly Training

    Authors: Weichao Lin, Liang Chen, Wei Xiong, Kang Ran, Anlan Fan

    Abstract: With the rapid progress in virtual reality (VR) technology, the scope of VR applications has greatly expanded across various domains. However, the superiority of VR training over traditional methods and its impact on learning efficacy are still uncertain. To investigate whether VR training is more effective than traditional methods, we designed virtual training systems for mechanical assembly on b… ▽ More

    Submitted 16 December, 2023; originally announced December 2023.

    Comments: 12 pages, 8 figures

  24. arXiv:2312.08403  [pdf, other

    cs.AI

    Earthfarseer: Versatile Spatio-Temporal Dynamical Systems Modeling in One Model

    Authors: Hao Wu, Yuxuan Liang, Wei Xiong, Zhengyang Zhou, Wei Huang, Shilong Wang, Kun Wang

    Abstract: Efficiently modeling spatio-temporal (ST) physical processes and observations presents a challenging problem for the deep learning community. Many recent studies have concentrated on meticulously reconciling various advantages, leading to designed models that are neither simple nor practical. To address this issue, this paper presents a systematic study on existing shortcomings faced by off-the-sh… ▽ More

    Submitted 3 June, 2024; v1 submitted 13 December, 2023; originally announced December 2023.

    ACM Class: I.2.3

  25. arXiv:2312.06886  [pdf, other

    cs.CV

    Relightful Harmonization: Lighting-aware Portrait Background Replacement

    Authors: Mengwei Ren, Wei Xiong, Jae Shin Yoon, Zhixin Shu, Jianming Zhang, HyunJoon Jung, Guido Gerig, He Zhang

    Abstract: Portrait harmonization aims to composite a subject into a new background, adjusting its lighting and color to ensure harmony with the background scene. Existing harmonization techniques often only focus on adjusting the global color and brightness of the foreground and ignore crucial illumination cues from the background such as apparent lighting direction, leading to unrealistic compositions. We… ▽ More

    Submitted 7 April, 2024; v1 submitted 11 December, 2023; originally announced December 2023.

    Comments: CVPR 2024 camera ready

  26. arXiv:2312.00553  [pdf

    cs.HC eess.SP

    A Spatio-Temporal Graph Convolutional Network for Gesture Recognition from High-Density Electromyography

    Authors: Wenjuan Zhong, Yuyang Zhang, Peiwen Fu, Wenxuan Xiong, Mingming Zhang

    Abstract: Accurate hand gesture prediction is crucial for effective upper-limb prosthetic limbs control. As the high flexibility and multiple degrees of freedom exhibited by human hands, there has been a growing interest in integrating deep networks with high-density surface electromyography (HD-sEMG) grids to enhance gesture recognition capabilities. However, many existing methods fall short in fully explo… ▽ More

    Submitted 1 December, 2023; originally announced December 2023.

  27. arXiv:2311.14316  [pdf, other

    eess.SP cs.AI

    Windformer:Bi-Directional Long-Distance Spatio-Temporal Network For Wind Speed Prediction

    Authors: Xuewei Li, Zewen Shang, Zhiqiang Liu, Jian Yu, Wei Xiong, Mei Yu

    Abstract: Wind speed prediction is critical to the management of wind power generation. Due to the large range of wind speed fluctuations and wake effect, there may also be strong correlations between long-distance wind turbines. This difficult-to-extract feature has become a bottleneck for improving accuracy. History and future time information includes the trend of airflow changes, whether this dynamic in… ▽ More

    Submitted 24 November, 2023; originally announced November 2023.

  28. arXiv:2311.09193  [pdf, other

    cs.CL cs.AI cs.CV

    The Role of Chain-of-Thought in Complex Vision-Language Reasoning Task

    Authors: Yifan Wu, Pengchuan Zhang, Wenhan Xiong, Barlas Oguz, James C. Gee, Yixin Nie

    Abstract: The study explores the effectiveness of the Chain-of-Thought approach, known for its proficiency in language tasks by breaking them down into sub-tasks and intermediate steps, in improving vision-language tasks that demand sophisticated perception and reasoning. We present the "Description then Decision" strategy, which is inspired by how humans process signals. This strategy significantly improve… ▽ More

    Submitted 15 November, 2023; originally announced November 2023.

  29. arXiv:2311.07395  [pdf

    cs.RO cs.AI

    Predicting Continuous Locomotion Modes via Multidimensional Feature Learning from sEMG

    Authors: Peiwen Fu, Wenjuan Zhong, Yuyang Zhang, Wenxuan Xiong, Yuzhou Lin, Yanlong Tai, Lin Meng, Mingming Zhang

    Abstract: Walking-assistive devices require adaptive control methods to ensure smooth transitions between various modes of locomotion. For this purpose, detecting human locomotion modes (e.g., level walking or stair ascent) in advance is crucial for improving the intelligence and transparency of such robotic systems. This study proposes Deep-STF, a unified end-to-end deep learning model designed for integra… ▽ More

    Submitted 13 November, 2023; originally announced November 2023.

    Comments: 10 pages,7 figures

  30. arXiv:2310.09436  [pdf, other

    cs.CL cs.AI cs.LG cs.NE

    Sub-network Discovery and Soft-masking for Continual Learning of Mixed Tasks

    Authors: Zixuan Ke, Bing Liu, Wenhan Xiong, Asli Celikyilmaz, Haoran Li

    Abstract: Continual learning (CL) has two main objectives: preventing catastrophic forgetting (CF) and encouraging knowledge transfer (KT). The existing literature mainly focused on overcoming CF. Some work has also been done on KT when the tasks are similar. To our knowledge, only one method has been proposed to learn a sequence of mixed tasks. However, these techniques still suffer from CF and/or limited… ▽ More

    Submitted 13 October, 2023; originally announced October 2023.

    Comments: https://github.com/ZixuanKe/PyContinual

    Journal ref: EMNLP 2023 (findings)

  31. arXiv:2310.06547  [pdf, other

    cs.CL cs.AI

    Rationale-Enhanced Language Models are Better Continual Relation Learners

    Authors: Weimin Xiong, Yifan Song, Peiyi Wang, Sujian Li

    Abstract: Continual relation extraction (CRE) aims to solve the problem of catastrophic forgetting when learning a sequence of newly emerging relations. Recent CRE studies have found that catastrophic forgetting arises from the model's lack of robustness against future analogous relations. To address the issue, we introduce rationale, i.e., the explanations of relation classification results generated by la… ▽ More

    Submitted 10 October, 2023; originally announced October 2023.

    Comments: Accepted at EMNLP 2023

  32. arXiv:2310.06362  [pdf, other

    cs.CL

    InfoCL: Alleviating Catastrophic Forgetting in Continual Text Classification from An Information Theoretic Perspective

    Authors: Yifan Song, Peiyi Wang, Weimin Xiong, Dawei Zhu, Tianyu Liu, Zhifang Sui, Sujian Li

    Abstract: Continual learning (CL) aims to constantly learn new knowledge over time while avoiding catastrophic forgetting on old tasks. We focus on continual text classification under the class-incremental setting. Recent CL studies have identified the severe performance decrease on analogous classes as a key factor for catastrophic forgetting. In this paper, through an in-depth exploration of the represent… ▽ More

    Submitted 10 October, 2023; originally announced October 2023.

    Comments: Findings of EMNLP 2023. An improved version of arXiv:2305.07289

  33. arXiv:2310.05727  [pdf, other

    cs.CL cs.AI cs.LG cs.SE

    The Program Testing Ability of Large Language Models for Code

    Authors: Weimin Xiong, Yiwen Guo, Hao Chen

    Abstract: Recent development of large language models (LLMs) for code like CodeX and CodeT5+ demonstrates tremendous promise in achieving code intelligence. Their ability of synthesizing code that completes a program for performing a pre-defined task has been intensively tested and verified on benchmark datasets including HumanEval and MBPP. Yet, evaluation of these LLMs from more perspectives (than just pr… ▽ More

    Submitted 9 October, 2023; originally announced October 2023.

  34. arXiv:2309.16039  [pdf, other

    cs.CL

    Effective Long-Context Scaling of Foundation Models

    Authors: Wenhan Xiong, **gyu Liu, Igor Molybog, Hejia Zhang, Prajjwal Bhargava, Rui Hou, Louis Martin, Rashi Rungta, Karthik Abinav Sankararaman, Barlas Oguz, Madian Khabsa, Han Fang, Yashar Mehdad, Sharan Narang, Kshitiz Malik, Angela Fan, Shruti Bhosale, Sergey Edunov, Mike Lewis, Sinong Wang, Hao Ma

    Abstract: We present a series of long-context LLMs that support effective context windows of up to 32,768 tokens. Our model series are built through continual pretraining from Llama 2 with longer training sequences and on a dataset where long texts are upsampled. We perform extensive evaluation on language modeling, synthetic context probing tasks, and a wide range of research benchmarks. On research benchm… ▽ More

    Submitted 13 November, 2023; v1 submitted 27 September, 2023; originally announced September 2023.

  35. arXiv:2309.11258  [pdf, other

    cs.GR cs.CV

    TwinTex: Geometry-aware Texture Generation for Abstracted 3D Architectural Models

    Authors: Weidan Xiong, Hongqian Zhang, Botao Peng, Ziyu Hu, Yongli Wu, Jianwei Guo, Hui Huang

    Abstract: Coarse architectural models are often generated at scales ranging from individual buildings to scenes for downstream applications such as Digital Twin City, Metaverse, LODs, etc. Such piece-wise planar models can be abstracted as twins from 3D dense reconstructions. However, these models typically lack realistic texture relative to the real building or scene, making them unsuitable for vivid displ… ▽ More

    Submitted 20 September, 2023; originally announced September 2023.

    Comments: Accepted to SIGGRAPH ASIA 2023

  36. arXiv:2309.06256  [pdf, other

    cs.LG

    Mitigating the Alignment Tax of RLHF

    Authors: Yong Lin, Hangyu Lin, Wei Xiong, Shizhe Diao, Jianmeng Liu, Jipeng Zhang, Rui Pan, Haoxiang Wang, Wenbin Hu, Hanning Zhang, Hanze Dong, Renjie Pi, Han Zhao, Nan Jiang, Heng Ji, Yuan Yao, Tong Zhang

    Abstract: LLMs acquire a wide range of abilities during pre-training, but aligning LLMs under Reinforcement Learning with Human Feedback (RLHF) can lead to forgetting, which is also known as the alignment tax. To empirically verify this hypothesis, we conducted experiments with existing RLHF algorithms using OpenLLaMA-3B, which revealed a pronounced alignment tax in NLP tasks. On the other hand, despite var… ▽ More

    Submitted 5 February, 2024; v1 submitted 12 September, 2023; originally announced September 2023.

    Comments: 28 Pages

  37. arXiv:2308.16137  [pdf, other

    cs.CL cs.AI

    LM-Infinite: Zero-Shot Extreme Length Generalization for Large Language Models

    Authors: Chi Han, Qifan Wang, Hao Peng, Wenhan Xiong, Yu Chen, Heng Ji, Sinong Wang

    Abstract: Today's large language models (LLMs) typically train on short text segments (e.g., <4K tokens) due to the quadratic complexity of their Transformer architectures. As a result, their performance suffers drastically on inputs longer than those encountered during training, substantially limiting their applications in real-world tasks involving long contexts such as encoding scientific articles, code… ▽ More

    Submitted 24 June, 2024; v1 submitted 30 August, 2023; originally announced August 2023.

    Comments: NAACL 2024 Outstanding paper, 9 pages, 6 figures

  38. arXiv:2308.12950  [pdf, other

    cs.CL

    Code Llama: Open Foundation Models for Code

    Authors: Baptiste Rozière, Jonas Gehring, Fabian Gloeckle, Sten Sootla, Itai Gat, Xiaoqing Ellen Tan, Yossi Adi, **gyu Liu, Romain Sauvestre, Tal Remez, Jérémy Rapin, Artyom Kozhevnikov, Ivan Evtimov, Joanna Bitton, Manish Bhatt, Cristian Canton Ferrer, Aaron Grattafiori, Wenhan Xiong, Alexandre Défossez, Jade Copet, Faisal Azhar, Hugo Touvron, Louis Martin, Nicolas Usunier, Thomas Scialom , et al. (1 additional authors not shown)

    Abstract: We release Code Llama, a family of large language models for code based on Llama 2 providing state-of-the-art performance among open models, infilling capabilities, support for large input contexts, and zero-shot instruction following ability for programming tasks. We provide multiple flavors to cover a wide range of applications: foundation models (Code Llama), Python specializations (Code Llama… ▽ More

    Submitted 31 January, 2024; v1 submitted 24 August, 2023; originally announced August 2023.

  39. arXiv:2308.03152  [pdf, other

    physics.ao-ph cs.AI cs.LG

    AI-GOMS: Large AI-Driven Global Ocean Modeling System

    Authors: Wei Xiong, Yanfei Xiang, Hao Wu, Shuyi Zhou, Yuze Sun, Muyuan Ma, Xiaomeng Huang

    Abstract: Ocean modeling is a powerful tool for simulating the physical, chemical, and biological processes of the ocean, which is the foundation for marine science research and operational oceanography. Modern numerical ocean modeling mainly consists of governing equations and numerical algorithms. Nonlinear instability, computational expense, low reusability efficiency and high coupling costs have gradual… ▽ More

    Submitted 10 August, 2023; v1 submitted 6 August, 2023; originally announced August 2023.

  40. arXiv:2307.13209  [pdf

    cs.RO cs.AI

    Gait Cycle-Inspired Learning Strategy for Continuous Prediction of Knee Joint Trajectory from sEMG

    Authors: Xueming Fu, Hao Zheng, Luyan Liu, Wenjuan Zhong, Haowen Liu, Wenxuan Xiong, Yuyang Zhang, Yifeng Chen, Dong Wei, Mingjie Dong, Yefeng Zheng, Mingming Zhang

    Abstract: Predicting lower limb motion intent is vital for controlling exoskeleton robots and prosthetic limbs. Surface electromyography (sEMG) attracts increasing attention in recent years as it enables ahead-of-time prediction of motion intentions before actual movement. However, the estimation performance of human joint trajectory remains a challenging problem due to the inter- and intra-subject variatio… ▽ More

    Submitted 24 July, 2023; originally announced July 2023.

  41. arXiv:2307.11795  [pdf, other

    eess.AS cs.AI cs.CL cs.LG

    Prompting Large Language Models with Speech Recognition Abilities

    Authors: Yassir Fathullah, Chunyang Wu, Egor Lakomkin, Junteng Jia, Yuan Shangguan, Ke Li, **xi Guo, Wenhan Xiong, Jay Mahadeokar, Ozlem Kalinli, Christian Fuegen, Mike Seltzer

    Abstract: Large language models have proven themselves highly flexible, able to solve a wide range of generative tasks, such as abstractive summarization and open-ended question answering. In this paper we extend the capabilities of LLMs by directly attaching a small audio encoder allowing it to perform speech recognition. By directly prepending a sequence of audial embeddings to the text token embeddings,… ▽ More

    Submitted 21 July, 2023; originally announced July 2023.

  42. arXiv:2307.10784  [pdf, other

    cs.CV

    SMURF: Spatial Multi-Representation Fusion for 3D Object Detection with 4D Imaging Radar

    Authors: Jianan Liu, Qiuchi Zhao, Weiyi Xiong, Tao Huang, Qing-Long Han, Bing Zhu

    Abstract: The 4D Millimeter wave (mmWave) radar is a promising technology for vehicle sensing due to its cost-effectiveness and operability in adverse weather conditions. However, the adoption of this technology has been hindered by sparsity and noise issues in radar point cloud data. This paper introduces spatial multi-representation fusion (SMURF), a novel approach to 3D object detection using a single 4D… ▽ More

    Submitted 5 October, 2023; v1 submitted 20 July, 2023; originally announced July 2023.

    Comments: Accepted by IEEE Transactions on Intelligent Vehicles

  43. LXL: LiDAR Excluded Lean 3D Object Detection with 4D Imaging Radar and Camera Fusion

    Authors: Weiyi Xiong, Jianan Liu, Tao Huang, Qing-Long Han, Yuxuan Xia, Bing Zhu

    Abstract: As an emerging technology and a relatively affordable device, the 4D imaging radar has already been confirmed effective in performing 3D object detection in autonomous driving. Nevertheless, the sparsity and noisiness of 4D radar point clouds hinder further performance improvement, and in-depth studies about its fusion with other modalities are lacking. On the other hand, as a new image view trans… ▽ More

    Submitted 3 October, 2023; v1 submitted 2 July, 2023; originally announced July 2023.

    Comments: Accepted by IEEE Transactions on Intelligent Vehicles

  44. arXiv:2306.12420  [pdf, other

    cs.CL cs.AI

    LMFlow: An Extensible Toolkit for Finetuning and Inference of Large Foundation Models

    Authors: Shizhe Diao, Rui Pan, Hanze Dong, Ka Shun Shum, Jipeng Zhang, Wei Xiong, Tong Zhang

    Abstract: Foundation models have demonstrated a great ability to achieve general human-level intelligence far beyond traditional approaches. As the technique keeps attracting attention from the AI community, an increasing number of foundation models are becoming publicly accessible. However, a significant shortcoming of most of these models lies in their performance in specialized-domain and task-specific a… ▽ More

    Submitted 5 May, 2024; v1 submitted 21 June, 2023; originally announced June 2023.

    Comments: Published in NAACL 2024 Demo Track

  45. arXiv:2306.08364  [pdf, other

    stat.ML cs.IT cs.LG

    Provably Efficient Offline Reinforcement Learning with Perturbed Data Sources

    Authors: Chengshuai Shi, Wei Xiong, Cong Shen, **g Yang

    Abstract: Existing theoretical studies on offline reinforcement learning (RL) mostly consider a dataset sampled directly from the target task. In practice, however, data often come from several heterogeneous but related sources. Motivated by this gap, this work aims at rigorously understanding offline RL with multiple datasets that are collected from randomly perturbed versions of the target task instead of… ▽ More

    Submitted 14 June, 2023; originally announced June 2023.

    Comments: ICML 2023

  46. arXiv:2306.06624  [pdf, other

    cs.CL

    RestGPT: Connecting Large Language Models with Real-World RESTful APIs

    Authors: Yifan Song, Weimin Xiong, Dawei Zhu, Wenhao Wu, Han Qian, Mingbo Song, Hailiang Huang, Cheng Li, Ke Wang, Rong Yao, Ye Tian, Sujian Li

    Abstract: Tool-augmented large language models (LLMs) have achieved remarkable progress in tackling a broad range of tasks. However, existing methods are mainly restricted to specifically designed tools and fail to fulfill complex instructions, having great limitations when confronted with real-world scenarios. In this paper, we explore a more realistic scenario by connecting LLMs with RESTful APIs, which a… ▽ More

    Submitted 26 August, 2023; v1 submitted 11 June, 2023; originally announced June 2023.

    Comments: Add RestBench to evaluate RestGPT

  47. arXiv:2306.03235  [pdf, other

    cs.LG cs.CR

    Information Flow Control in Machine Learning through Modular Model Architecture

    Authors: Trishita Tiwari, Suchin Gururangan, Chuan Guo, Weizhe Hua, Sanjay Kariyappa, Udit Gupta, Wenjie Xiong, Kiwan Maeng, Hsien-Hsin S. Lee, G. Edward Suh

    Abstract: In today's machine learning (ML) models, any part of the training data can affect its output. This lack of control for information flow from training data to model output is a major obstacle in training models on sensitive data when access control only allows individual users to access a subset of data. To enable secure machine learning for access controlled data, we propose the notion of informat… ▽ More

    Submitted 5 June, 2023; originally announced June 2023.

  48. arXiv:2306.03067  [pdf, other

    cs.CL cs.AI

    Interactive Editing for Text Summarization

    Authors: Yujia Xie, Xun Wang, Si-Qing Chen, Wayne Xiong, Pengcheng He

    Abstract: Summarizing lengthy documents is a common and essential task in our daily lives. Although recent advancements in neural summarization models can assist in crafting general-purpose summaries, human writers often have specific requirements that call for a more customized approach. To address this need, we introduce REVISE (Refinement and Editing via Iterative Summarization Enhancement), an innovativ… ▽ More

    Submitted 5 June, 2023; originally announced June 2023.

  49. arXiv:2305.18286  [pdf, other

    cs.CV cs.AI

    Photoswap: Personalized Subject Swap** in Images

    Authors: **g Gu, Yilin Wang, Nanxuan Zhao, Tsu-Jui Fu, Wei Xiong, Qing Liu, Zhifei Zhang, He Zhang, Jianming Zhang, HyunJoon Jung, Xin Eric Wang

    Abstract: In an era where images and visual content dominate our digital landscape, the ability to manipulate and personalize these images has become a necessity. Envision seamlessly substituting a tabby cat lounging on a sunlit window sill in a photograph with your own playful puppy, all while preserving the original charm and composition of the image. We present Photoswap, a novel approach that enables th… ▽ More

    Submitted 29 May, 2023; originally announced May 2023.

    Comments: 14 pages

  50. arXiv:2305.18258  [pdf, other

    cs.LG cs.AI cs.GT math.OC stat.ML

    Maximize to Explore: One Objective Function Fusing Estimation, Planning, and Exploration

    Authors: Zhihan Liu, Miao Lu, Wei Xiong, Han Zhong, Hao Hu, Shenao Zhang, Sirui Zheng, Zhuoran Yang, Zhaoran Wang

    Abstract: In online reinforcement learning (online RL), balancing exploration and exploitation is crucial for finding an optimal policy in a sample-efficient way. To achieve this, existing sample-efficient online RL algorithms typically consist of three components: estimation, planning, and exploration. However, in order to cope with general function approximators, most of them involve impractical algorithm… ▽ More

    Submitted 25 October, 2023; v1 submitted 29 May, 2023; originally announced May 2023.