Showing 1–7 of 7 results for author: Zu, C

  1. arXiv:2402.11550  [pdf, other]

    cs.CL cs.AI

    LongAgent: Scaling Language Models to 128k Context through Multi-Agent Collaboration

    Authors: Jun Zhao, Can Zu, Hao Xu, Yi Lu, Wei He, Yiwen Ding, Tao Gui, Qi Zhang, Xuanjing Huang

    Abstract: Large language models (LLMs) have demonstrated impressive performance in understanding language and executing complex reasoning tasks. However, LLMs with long context windows have been notorious for their expensive training costs and high inference latency. Even the most advanced models such as GPT-4 and Claude2 often make mistakes when processing inputs of over 100k tokens, a phenomenon also kn…

    Submitted 13 March, 2024; v1 submitted 18 February, 2024; originally announced February 2024.

  2. arXiv:2402.11525  [pdf, other]

    cs.CL cs.LG

    Advancing Translation Preference Modeling with RLHF: A Step Towards Cost-Effective Solution

    Authors: Nuo Xu, Jun Zhao, Can Zu, Sixian Li, Lu Chen, Zhihao Zhang, Rui Zheng, Shihan Dou, Wenjuan Qin, Tao Gui, Qi Zhang, Xuanjing Huang

    Abstract: Faithfulness, expressiveness, and elegance is the constant pursuit in machine translation. However, traditional metrics like BLEU do not strictly align with human preference of translation quality. In this paper, we explore leveraging reinforcement learning with human feedback (RLHF) to improve translation quality. It is non-trivial to collect a large high-quality dataset of huma…

    Submitted 27 February, 2024; v1 submitted 18 February, 2024; originally announced February 2024.

  3. arXiv:2304.08085  [pdf, other]

    cs.CL cs.AI

    InstructUIE: Multi-task Instruction Tuning for Unified Information Extraction

    Authors: Xiao Wang, Weikang Zhou, Can Zu, Han Xia, Tianze Chen, Yuansen Zhang, Rui Zheng, Junjie Ye, Qi Zhang, Tao Gui, Jihua Kang, Jingsheng Yang, Siyuan Li, Chunsai Du

    Abstract: Large language models have unlocked strong multi-task capabilities from reading instructive prompts. However, recent studies have shown that existing large models still have difficulty with information extraction tasks. For example, gpt-3.5-turbo achieved an F1 score of 18.22 on the Ontonotes dataset, which is significantly lower than the state-of-the-art performance. In this paper, we propose Ins…

    Submitted 17 April, 2023; originally announced April 2023.

  4. arXiv:2303.10420  [pdf]

    cs.CL

    A Comprehensive Capability Analysis of GPT-3 and GPT-3.5 Series Models

    Authors: Junjie Ye, Xuanting Chen, Nuo Xu, Can Zu, Zekai Shao, Shichun Liu, Yuhan Cui, Zeyang Zhou, Chao Gong, Yang Shen, Jie Zhou, Siming Chen, Tao Gui, Qi Zhang, Xuanjing Huang

    Abstract: GPT series models, such as GPT-3, CodeX, InstructGPT, ChatGPT, and so on, have gained considerable attention due to their exceptional natural language processing capabilities. However, despite the abundance of research on the difference in capabilities between GPT series models and fine-tuned models, there has been limited attention given to the evolution of GPT series models' capabilities over ti…

    Submitted 23 December, 2023; v1 submitted 18 March, 2023; originally announced March 2023.

  5. arXiv:2303.00293  [pdf]

    cs.CL

    How Robust is GPT-3.5 to Predecessors? A Comprehensive Study on Language Understanding Tasks

    Authors: Xuanting Chen, Junjie Ye, Can Zu, Nuo Xu, Rui Zheng, Minlong Peng, Jie Zhou, Tao Gui, Qi Zhang, Xuanjing Huang

    Abstract: The GPT-3.5 models have demonstrated impressive performance in various Natural Language Processing (NLP) tasks, showcasing their strong understanding and reasoning capabilities. However, their robustness and abilities to handle various complexities of the open world have yet to be explored, which is especially crucial in assessing the stability of models and is a key aspect of trustworthy AI. In t…

    Submitted 1 March, 2023; originally announced March 2023.

    MSC Class: 68-06 ACM Class: I.2

  6. ASMFS: Adaptive-Similarity-based Multi-modality Feature Selection for Classification of Alzheimer's Disease

    Authors: Yuang Shi, Chen Zu, Mei Hong, Luping Zhou, Lei Wang, Xi Wu, Jiliu Zhou, Daoqiang Zhang, Yan Wang

    Abstract: With the increasing amounts of high-dimensional heterogeneous data to be processed, multi-modality feature selection has become an important research direction in medical image analysis. Traditional methods usually depict the data structure using fixed and predefined similarity matrix for each modality separately, without considering the potential relationship structure across different modalities…

    Submitted 16 October, 2020; originally announced October 2020.

    Comments: 27 pages, 10 figures

  7. arXiv:2006.06278   

    eess.IV cs.CV

    DSU-net: Dense SegU-net for automatic head-and-neck tumor segmentation in MR images

    Authors: Pin Tang, Chen Zu, Mei Hong, Rui Yan, Xingchen Peng, Jianghong Xiao, Xi Wu, Jiliu Zhou, Luping Zhou, Yan Wang

    Abstract: Precise and accurate segmentation of the most common head-and-neck tumor, nasopharyngeal carcinoma (NPC), in MRI sheds light on treatment and regulatory decisions making. However, the large variations in the lesion size and shape of NPC, boundary ambiguity, as well as the limited available annotated samples conspire NPC segmentation in MRI towards a challenging task. In this paper, we propose a De…

    Submitted 19 December, 2020; v1 submitted 11 June, 2020; originally announced June 2020.

    Comments: This research needs to be advanced in the future