Showing 1–3 of 3 results for author: Luong, T Q

  1. arXiv:2401.08967  [pdf, other]

    cs.CL

    ReFT: Reasoning with Reinforced Fine-Tuning

    Authors: Trung Quoc Luong, Xinbo Zhang, Zhanming Jie, Peng Sun, Xiaoran Jin, Hang Li

    Abstract: One way to enhance the reasoning capability of Large Language Models (LLMs) is to conduct Supervised Fine-Tuning (SFT) using Chain-of-Thought (CoT) annotations. This approach does not show sufficiently strong generalization ability, however, because the training only relies on the given CoT data. In math problem-solving, for example, there is usually only one annotated reasoning path for each ques…

    Submitted 27 June, 2024; v1 submitted 16 January, 2024; originally announced January 2024.

    Comments: ACL 2024 main conference; adjust with reviewer comments; 13 pages

  2. arXiv:2309.11054  [pdf, other]

    cs.CL cs.AI cs.LG cs.PL

    Design of Chain-of-Thought in Math Problem Solving

    Authors: Zhanming Jie, Trung Quoc Luong, Xinbo Zhang, Xiaoran Jin, Hang Li

    Abstract: Chain-of-Thought (CoT) plays a crucial role in reasoning for math problem solving. We conduct a comprehensive examination of methods for designing CoT, comparing conventional natural language CoT with various program CoTs, including the self-describing program, the comment-describing program, and the non-describing program. Furthermore, we investigate the impact of programming language on program…

    Submitted 30 September, 2023; v1 submitted 20 September, 2023; originally announced September 2023.

    Comments: 15 pages

  3. arXiv:2305.10448  [pdf, other]

    cs.CL cs.AI

    Sequence-to-Sequence Pre-training with Unified Modality Masking for Visual Document Understanding

    Authors: Shuwei Feng, Tianyang Zhan, Zhanming Jie, Trung Quoc Luong, Xiaoran Jin

    Abstract: This paper presents GenDoc, a general sequence-to-sequence document understanding model pre-trained with unified masking across three modalities: text, image, and layout. The proposed model utilizes an encoder-decoder architecture, which allows for increased adaptability to a wide range of downstream tasks with diverse output formats, in contrast to the encoder-only models commonly employed in doc…

    Submitted 16 May, 2023; originally announced May 2023.