FUNSD: A Dataset for Form Understanding in Noisy Scanned Documents
Authors:
Guillaume Jaume,
Hazim Kemal Ekenel,
Jean-Philippe Thiran
Abstract:
We present a new dataset for form understanding in noisy scanned documents (FUNSD) that aims at extracting and structuring the textual content of forms. The dataset comprises 199 real, fully annotated, scanned forms. The documents are noisy and vary widely in appearance, making form understanding (FoUn) a challenging task. The proposed dataset can be used for various tasks, including text detection, optical character recognition, spatial layout analysis, and entity labeling/linking. To the best of our knowledge, this is the first publicly available dataset with comprehensive annotations to address the FoUn task. We also present a set of baselines and introduce metrics to evaluate performance on the FUNSD dataset, which can be downloaded at https://guillaumejaume.github.io/FUNSD/.
Submitted 29 October, 2019; v1 submitted 27 May, 2019;
originally announced May 2019.
Strengths and Weaknesses of Deep Learning Models for Face Recognition Against Image Degradations
Authors:
Klemen Grm,
Vitomir Štruc,
Anais Artiges,
Matthieu Caron,
Hazim Kemal Ekenel
Abstract:
Approaches based on deep convolutional neural networks (CNNs) are the state of the art in various computer vision tasks, including face recognition. Considerable research effort is currently being directed towards further improving deep CNNs by focusing on more powerful model architectures and better learning techniques. However, studies systematically exploring the strengths and weaknesses of existing deep models for face recognition are still relatively scarce in the literature. In this paper, we try to fill this gap and study the effects of different covariates on the verification performance of four recent deep CNN models using the Labeled Faces in the Wild (LFW) dataset. Specifically, we investigate the influence of covariates related to image quality (blur, JPEG compression, occlusion, noise, image brightness, contrast, missing pixels) and to model characteristics (CNN architecture, color information, descriptor computation), and analyze their impact on the face verification performance of AlexNet, VGG-Face, GoogLeNet, and SqueezeNet. Based on comprehensive and rigorous experimentation, we identify the strengths and weaknesses of the deep learning models, and present key areas for potential future research. Our results indicate that high levels of noise, blur, missing pixels, and brightness have a detrimental effect on the verification performance of all models, whereas the impact of contrast changes and compression artifacts is limited. We also find that the descriptor computation strategy and color information do not have a significant influence on performance.
Submitted 4 October, 2017;
originally announced October 2017.