Showing 1–4 of 4 results for author: Deik, D G X

Searching in archive cs.
  1. arXiv:2405.16510  [pdf, other]

    cs.AI cs.CL cs.LG

    Meta-Task Planning for Language Agents

    Authors: Cong Zhang, Derrick Goh Xin Deik, Dexun Li, Hao Zhang, Yong Liu

    Abstract: The rapid advancement of neural language models has sparked a new surge of intelligent agent research. Unlike traditional agents, large language model-based agents (LLM agents) have emerged as a promising paradigm for achieving artificial general intelligence (AGI) due to their superior reasoning and generalization capabilities. Effective planning is crucial for the success of LLM agents in real-w…

    Submitted 30 May, 2024; v1 submitted 26 May, 2024; originally announced May 2024.

  2. arXiv:2404.15103  [pdf, other]

    cs.CL

    Multi-view Content-aware Indexing for Long Document Retrieval

    Authors: Kuicai Dong, Derrick Goh Xin Deik, Yi Quan Lee, Hao Zhang, Xiangyang Li, Cong Zhang, Yong Liu

    Abstract: Long document question answering (DocQA) aims to answer questions from long documents over 10k words. They usually contain content structures such as sections, sub-sections, and paragraph demarcations. However, the indexing methods of long documents remain under-explored, while existing systems generally employ fixed-length chunking. As they do not consider content structures, the resultant chunks…

    Submitted 23 April, 2024; originally announced April 2024.

  3. arXiv:2402.09764  [pdf, other]

    cs.AI

    Aligning Crowd Feedback via Distributional Preference Reward Modeling

    Authors: Dexun Li, Cong Zhang, Kuicai Dong, Derrick Goh Xin Deik, Ruiming Tang, Yong Liu

    Abstract: Deep Reinforcement Learning is widely used for aligning Large Language Models (LLM) with human preference. However, the conventional reward modelling is predominantly dependent on human annotations provided by a select cohort of individuals. Such dependence may unintentionally result in skewed models that reflect the inclinations of these annotators, thereby failing to adequately represent the wid…

    Submitted 30 May, 2024; v1 submitted 15 February, 2024; originally announced February 2024.

  4. arXiv:2310.13669  [pdf, other]

    cs.LG cs.AI cs.CL cs.PL

    Automatic Unit Test Data Generation and Actor-Critic Reinforcement Learning for Code Synthesis

    Authors: Philip John Gorinski, Matthieu Zimmer, Gerasimos Lampouras, Derrick Goh Xin Deik, Ignacio Iacobacci

    Abstract: The advent of large pre-trained language models in the domain of Code Synthesis has shown remarkable performance on various benchmarks, treating the problem of Code Generation in a fashion similar to Natural Language Generation, trained with a Language Modelling (LM) objective. In addition, the property of programming language code being precisely evaluable with respect to its semantics -- through…

    Submitted 20 October, 2023; originally announced October 2023.

    Comments: 9 pages + 4 pages appendix; 4 Figures, 4 Tables, 1 Algorithm; Accepted to Findings of EMNLP 2023