Showing 1–5 of 5 results for author: Vu, H M

  1. arXiv:2401.03790  [pdf, other]

    cs.LG cs.CR cs.PL cs.SE

    Inferring Properties of Graph Neural Networks

    Authors: Dat Nguyen, Hieu M. Vu, Cong-Thanh Le, Bach Le, David Lo, ThanhVu Nguyen, Corina Pasareanu

    Abstract: We propose GNNInfer, the first automatic property inference technique for GNNs. To tackle the challenge of varying input structures in GNNs, GNNInfer first identifies a set of representative influential structures that contribute significantly towards the prediction of a GNN. Using these structures, GNNInfer converts each pair of an influential structure and the GNN to their equivalent FNN and the…

    Submitted 2 March, 2024; v1 submitted 8 January, 2024; originally announced January 2024.

    Comments: 20 pages main paper, 10 pages for appendix

  2. arXiv:2209.12561  [pdf, other]

    cs.IR cs.CV cs.LG

    Improving Document Image Understanding with Reinforcement Finetuning

    Authors: Bao-Sinh Nguyen, Dung Tien Le, Hieu M. Vu, Tuan Anh D. Nguyen, Minh-Tien Nguyen, Hung Le

    Abstract: Successful Artificial Intelligence systems often require numerous labeled data to extract information from document images. In this paper, we investigate the problem of improving the performance of Artificial Intelligence systems in understanding document images, especially in cases where training data is limited. We address the problem by proposing a novel finetuning method using reinforcement le…

    Submitted 26 September, 2022; originally announced September 2022.

    Comments: Accepted to ICONIP 2022

  3. arXiv:2205.13434  [pdf, other]

    cs.CL cs.AI

    Jointly Learning Span Extraction and Sequence Labeling for Information Extraction from Business Documents

    Authors: Nguyen Hong Son, Hieu M. Vu, Tuan-Anh D. Nguyen, Minh-Tien Nguyen

    Abstract: This paper introduces a new information extraction model for business documents. Unlike prior studies, which rely only on span extraction or sequence labeling, the model takes advantage of both span extraction and sequence labeling. The combination allows the model to deal with long documents with sparse information (the small amount of extracted information). The model is trai…

    Submitted 26 May, 2022; originally announced May 2022.

    Comments: Accepted to IJCNN 2022

  4. arXiv:2106.00978  [pdf, other]

    cs.AI

    A Span Extraction Approach for Information Extraction on Visually-Rich Documents

    Authors: Tuan-Anh D. Nguyen, Hieu M. Vu, Nguyen Hong Son, Minh-Tien Nguyen

    Abstract: Information extraction (IE) for visually-rich documents (VRDs) has achieved SOTA performance recently thanks to the adaptation of Transformer-based language models, which shows the great potential of pre-training methods. In this paper, we present a new approach to improve the capability of language model pre-training on VRDs. Firstly, we introduce a new query-based IE model that employs span extr…

    Submitted 6 July, 2021; v1 submitted 2 June, 2021; originally announced June 2021.

    Comments: Accepted to Document Images and Language Workshop at ICDAR 2021

  5. arXiv:2010.05322  [pdf, other]

    cs.CV

    Revising FUNSD dataset for key-value detection in document images

    Authors: Hieu M. Vu, Diep Thi-Ngoc Nguyen

    Abstract: FUNSD is one of the limited publicly available datasets for information extraction from document images. The information in the FUNSD dataset is defined by text areas of four categories ("key", "value", "header", "other", and "background") and connectivity between areas as key-value relations. Inspecting FUNSD, we found several inconsistencies in labeling, which impeded its applicability to the key…

    Submitted 11 October, 2020; originally announced October 2020.