
Showing 1–6 of 6 results for author: Do, P N

  1. arXiv:2403.15882  [pdf, other]

    cs.CL

    VLUE: A New Benchmark and Multi-task Knowledge Transfer Learning for Vietnamese Natural Language Understanding

    Authors: Phong Nguyen-Thuan Do, Son Quoc Tran, Phu Gia Hoang, Kiet Van Nguyen, Ngan Luu-Thuy Nguyen

    Abstract: The success of Natural Language Understanding (NLU) benchmarks in various languages, such as GLUE for English, CLUE for Chinese, KLUE for Korean, and IndoNLU for Indonesian, has facilitated the evaluation of new NLU models across a wide range of tasks. To establish a standardized set of benchmarks for Vietnamese NLU, we introduce the first Vietnamese Language Understanding Evaluation (VLUE) benchm…

    Submitted 23 March, 2024; originally announced March 2024.

    Comments: Accepted at NAACL 2024 (Findings)

  2. arXiv:2309.05103  [pdf, other]

    cs.CL cs.AI

    AGent: A Novel Pipeline for Automatically Creating Unanswerable Questions

    Authors: Son Quoc Tran, Gia-Huy Do, Phong Nguyen-Thuan Do, Matt Kretchmar, Xinya Du

    Abstract: The development of large high-quality datasets and high-performing models has led to significant advancements in the domain of Extractive Question Answering (EQA). This progress has sparked considerable interest in exploring unanswerable questions within the EQA domain. Training EQA models with unanswerable questions helps them avoid extracting misleading or incorrect answers for queries that lac…

    Submitted 10 September, 2023; originally announced September 2023.

    Comments: 16 pages, 10 tables, 3 figures

  3. arXiv:2303.13355  [pdf, other]

    cs.CL cs.AI

    Revealing Weaknesses of Vietnamese Language Models Through Unanswerable Questions in Machine Reading Comprehension

    Authors: Son Quoc Tran, Phong Nguyen-Thuan Do, Kiet Van Nguyen, Ngan Luu-Thuy Nguyen

    Abstract: Although the curse of multilinguality significantly restricts the language abilities of multilingual models in monolingual settings, researchers now still have to rely on multilingual models to develop state-of-the-art systems in Vietnamese Machine Reading Comprehension. This difficulty in researching is because of the limited number of high-quality works in developing Vietnamese language models.…

    Submitted 16 March, 2023; originally announced March 2023.

    Comments: Accepted at The 2023 EACL Student Research Workshop

  4. arXiv:2302.00094  [pdf, other]

    cs.AI

    The Impacts of Unanswerable Questions on the Robustness of Machine Reading Comprehension Models

    Authors: Son Quoc Tran, Phong Nguyen-Thuan Do, Uyen Le, Matt Kretchmar

    Abstract: Pretrained language models have achieved super-human performances on many Machine Reading Comprehension (MRC) benchmarks. Nevertheless, their relative inability to defend against adversarial attacks has spurred skepticism about their natural language understanding. In this paper, we ask whether training with unanswerable questions in SQuAD 2.0 can help improve the robustness of MRC models against…

    Submitted 31 January, 2023; originally announced February 2023.

    Comments: Accepted at the 17th Conference of the European Chapter of the Association for Computational Linguistics (EACL 2023)

  5. arXiv:2204.07002  [pdf, other]

    cs.CL

    XLMRQA: Open-Domain Question Answering on Vietnamese Wikipedia-based Textual Knowledge Source

    Authors: Kiet Van Nguyen, Phong Nguyen-Thuan Do, Nhat Duy Nguyen, Tin Van Huynh, Anh Gia-Tuan Nguyen, Ngan Luu-Thuy Nguyen

    Abstract: Question answering (QA) is a natural language understanding task within the fields of information retrieval and information extraction that has attracted much attention from the computational linguistics and artificial intelligence research community in recent years because of the strong development of machine reading comprehension-based models. A reader-based QA system is a high-level search engi…

    Submitted 13 August, 2022; v1 submitted 14 April, 2022; originally announced April 2022.

    Comments: Accepted by ACIIDS 2022

  6. arXiv:2105.09043  [pdf, other]

    cs.CL

    Sentence Extraction-Based Machine Reading Comprehension for Vietnamese

    Authors: Phong Nguyen-Thuan Do, Nhat Duy Nguyen, Tin Van Huynh, Kiet Van Nguyen, Anh Gia-Tuan Nguyen, Ngan Luu-Thuy Nguyen

    Abstract: The development of natural language processing (NLP) in general and machine reading comprehension in particular has attracted great attention from the research community. In recent years, a few large datasets for machine reading comprehension tasks in Vietnamese have appeared, such as UIT-ViQuAD and UIT-ViNewsQA. However, the datasets are not diverse in answers to serve the research. In t…

    Submitted 11 June, 2021; v1 submitted 19 May, 2021; originally announced May 2021.

    Comments: Accepted by KSEM 2021 (International Conference on Knowledge Science, Engineering and Management)