Search | arXiv e-print repository

A LayoutLMv3-Based Model for Enhanced Relation Extraction in Visually-Rich Documents

Authors: Wiam Adnan, Joel Tang, Yassine Bel Khayat Zouggari, Seif Edinne Laatiri, Laurent Lam, Fabien Caspani

Abstract: Document Understanding is an evolving field in Natural Language Processing (NLP). In particular, visual and spatial features are essential in addition to the raw text itself and hence, several multimodal models were developed in the field of Visual Document Understanding (VDU). However, while research is mainly focused on Key Information Extraction (KIE), Relation Extraction (RE) between identified entities is still under-studied. For instance, RE is crucial to regroup entities or obtain a comprehensive hierarchy of data in a document. In this paper, we present a model that, initialized from LayoutLMv3, can match or outperform the current state-of-the-art results in RE applied to Visually-Rich Documents (VRD) on FUNSD and CORD datasets, without any specific pre-training and with fewer parameters. We also report an extensive ablation study performed on FUNSD, highlighting the great impact of certain features and modelization choices on the performances.

Submitted 16 April, 2024; originally announced April 2024.

Comments: Accepted at the International Conference on Document Analysis and Recognition (ICDAR 2024)

arXiv:2305.13934

doi: 10.1038/s41598-023-38964-3

Perception, performance, and detectability of conversational artificial intelligence across 32 university courses

Authors: Hazem Ibrahim, Fengyuan Liu, Rohail Asim, Balaraju Battu, Sidahmed Benabderrahmane, Bashar Alhafni, Wifag Adnan, Tuka Alhanai, Bedoor AlShebli, Riyadh Baghdadi, Jocelyn J. Bélanger, Elena Beretta, Kemal Celik, Moumena Chaqfeh, Mohammed F. Daqaq, Zaynab El Bernoussi, Daryl Fougnie, Borja Garcia de Soto, Alberto Gandolfi, Andras Gyorgy, Nizar Habash, J. Andrew Harris, Aaron Kaufman, Lefteris Kirousis, Korhan Kocak, et al. (14 additional authors not shown)

Abstract: The emergence of large language models has led to the development of powerful tools such as ChatGPT that can produce text indistinguishable from human-generated work. With the increasing accessibility of such technology, students across the globe may utilize it to help with their school work -- a possibility that has sparked discussions on the integrity of student evaluations in the age of artificial intelligence (AI). To date, it is unclear how such tools perform compared to students on university-level courses. Further, students' perspectives regarding the use of such tools, and educators' perspectives on treating their use as plagiarism, remain unknown. Here, we compare the performance of ChatGPT against students on 32 university-level courses. We also assess the degree to which its use can be detected by two classifiers designed specifically for this purpose. Additionally, we conduct a survey across five countries, as well as a more in-depth survey at the authors' institution, to discern students' and educators' perceptions of ChatGPT's use. We find that ChatGPT's performance is comparable, if not superior, to that of students in many courses. Moreover, current AI-text classifiers cannot reliably detect ChatGPT's use in school work, due to their propensity to classify human-written answers as AI-generated, as well as the ease with which AI-generated text can be edited to evade detection. Finally, we find an emerging consensus among students to use the tool, and among educators to treat this as plagiarism. Our findings offer insights that could guide policy discussions addressing the integration of AI into educational frameworks.

Submitted 7 May, 2023; originally announced May 2023.

Comments: 17 pages, 4 figures

arXiv:1702.02537

Soft Biometrics: Gender Recognition from Unconstrained Face Images using Local Feature Descriptor

Authors: Olasimbo Ayodeji Arigbabu, Sharifah Mumtazah Syed Ahmad, Wan Azizun Wan Adnan, Salman Yussof, Saif Mahmood

Abstract: Gender recognition from unconstrained face images is a challenging task due to the high degree of misalignment, pose, expression, and illumination variation. In previous works, the recognition of gender from unconstrained face images is approached by utilizing image alignment, exploiting multiple samples per individual to improve the learning ability of the classifier, or learning gender based on prior knowledge about pose and demographic distributions of the dataset. However, image alignment increases the complexity and time of computation, while the use of multiple samples or having prior knowledge about data distribution is unrealistic in practical applications. This paper presents an approach for gender recognition from unconstrained face images. Our technique exploits the robustness of local feature descriptor to photometric variations to extract the shape description of the 2D face image using a single sample image per individual. The results obtained from experiments on Labeled Faces in the Wild (LFW) dataset describe the effectiveness of the proposed method. The essence of this study is to investigate the most suitable functions and parameter settings for recognizing gender from unconstrained face images.

Submitted 8 February, 2017; originally announced February 2017.

Journal ref: Journal of Information and Communication Technology (JICT), 2015

Showing 1–3 of 3 results for author: Adnan, W