
Showing 1–6 of 6 results for author: Son, N H

  1. arXiv:2205.13434  [pdf, other]

    cs.CL cs.AI

    Jointly Learning Span Extraction and Sequence Labeling for Information Extraction from Business Documents

    Authors: Nguyen Hong Son, Hieu M. Vu, Tuan-Anh D. Nguyen, Minh-Tien Nguyen

    Abstract: This paper introduces a new information extraction model for business documents. Different from prior studies, which rely only on span extraction or sequence labeling, the model takes advantage of both span extraction and sequence labeling. The combination allows the model to deal with long documents with sparse information (the small amount of extracted information). The model is trai…

    Submitted 26 May, 2022; originally announced May 2022.

    Comments: Accepted to IJCNN 2022

  2. arXiv:2204.03958  [pdf, other]

    cs.CL cs.AI

    Enhance Incomplete Utterance Restoration by Joint Learning Token Extraction and Text Generation

    Authors: Shumpei Inoue, Tsungwei Liu, Nguyen Hong Son, Minh-Tien Nguyen

    Abstract: This paper introduces a model for incomplete utterance restoration (IUR) called JET (Joint learning token Extraction and Text generation). Different from prior studies that only work on extraction or abstraction datasets, we design a simple but effective model, working for both scenarios of IUR. Our design simulates the nature of IUR, where omitted tokens from the contex…

    Submitted 28 July, 2022; v1 submitted 8 April, 2022; originally announced April 2022.

    Comments: This paper was accepted by 2022 Annual Conference of the North American Chapter of the Association for Computational Linguistics (NAACL 2022). It includes 10 pages, 2 figures

  3. arXiv:2106.00978  [pdf, other]

    cs.AI

    A Span Extraction Approach for Information Extraction on Visually-Rich Documents

    Authors: Tuan-Anh D. Nguyen, Hieu M. Vu, Nguyen Hong Son, Minh-Tien Nguyen

    Abstract: Information extraction (IE) for visually-rich documents (VRDs) has achieved SOTA performance recently thanks to the adaptation of Transformer-based language models, which shows the great potential of pre-training methods. In this paper, we present a new approach to improve the capability of language model pre-training on VRDs. Firstly, we introduce a new query-based IE model that employs span extr…

    Submitted 6 July, 2021; v1 submitted 2 June, 2021; originally announced June 2021.

    Comments: Accepted to Document Images and Language Workshop at ICDAR 2021

  4. arXiv:2004.02465  [pdf, ps, other]

    cs.DM math.CO

    Independent sets of closure operations

    Authors: Nguyen Hoang Son

    Abstract: In this paper independent sets of closure operations are introduced. We characterize minimal keys and antikeys of closure operations in terms of independent sets. We establish an expression on the connection between minimal keys and antikeys of closure operations based on independent sets. We construct two combinatorial algorithms for finding all minimal keys and all antikeys of a given closure op…

    Submitted 6 April, 2020; originally announced April 2020.

    MSC Class: [2010]:68R99; 68P15

  5. arXiv:2003.03064  [pdf, other]

    cs.IR cs.CL cs.LG

    Transfer Learning for Information Extraction with Limited Data

    Authors: Minh-Tien Nguyen, Viet-Anh Phan, Le Thai Linh, Nguyen Hong Son, Le Tien Dung, Miku Hirano, Hajime Hotta

    Abstract: This paper presents a practical approach to fine-grained information extraction. Through plenty of experiences of authors in practically applying information extraction to business process automation, there can be found a couple of fundamental technical challenges: (i) the availability of labeled data is usually limited and (ii) highly detailed classification is required. The main idea of our prop…

    Submitted 8 June, 2020; v1 submitted 6 March, 2020; originally announced March 2020.

    Comments: 14 pages, 5 figures, PACLING conference

  6. The method of detecting online password attacks based on high-level protocol analysis and clustering techniques

    Authors: Nguyen Hong Son, Ha Thanh Dung

    Abstract: Although there have been many solutions applied, the safety challenges related to the password security mechanism are not reduced. The reason for this is that while the means and tools to support password attacks are becoming more and more abundant, the number of transaction systems through the Internet is increasing, and new services systems appear. For example, IoT also uses password-based authe…

    Submitted 4 December, 2019; originally announced December 2019.

    Comments: 13 pages, 7 figures

    Journal ref: International Journal of Computer Networks & Communications (IJCNC) Vol.11, No.6, November 2019