Embedding Aggregation for Forensic Facial Comparison

Authors: Rafael Oliveira Ribeiro, João C. R. Neves, Arnout C. C. Ruifrok, Flavio de Barros Vidal

Abstract: In forensic facial comparison, questioned-source images are usually captured in uncontrolled environments, with non-uniform lighting, and from non-cooperative subjects. The poor quality of such material usually compromises its value as evidence in legal matters. On the other hand, in forensic casework, multiple images of the person of interest are usually available. In this paper, we propose to aggregate deep neural network embeddings from various images of the same person to improve performance in facial verification. We observe significant performance improvements, especially for very low-quality images. Further improvements are obtained by aggregating embeddings of more images and by applying quality-weighted aggregation. We demonstrate the benefits of this approach in forensic evaluation settings with the development and validation of score-based likelihood ratio systems and report improvements in Cllr of up to 95% (from 0.249 to 0.012) for CCTV images and of up to 96% (from 0.083 to 0.003) for social media images.

Submitted 29 April, 2023; originally announced May 2023.

Comments: 13 pages, 8 figures, submitted to Forensic Science International

ACM Class: I.4; I.5

arXiv:2207.00748 [pdf, other]

doi 10.1007/s10032-022-00406-7

Sequence-aware multimodal page classification of Brazilian legal documents

Authors: Pedro H. Luz de Araujo, Ana Paula G. S. de Almeida, Fabricio A. Braz, Nilton C. da Silva, Flavio de Barros Vidal, Teofilo E. de Campos

Abstract: The Brazilian Supreme Court receives tens of thousands of cases each semester. Court employees spend thousands of hours to execute the initial analysis and classification of those cases -- which takes effort away from posterior, more complex stages of the case management workflow. In this paper, we explore multimodal classification of documents from Brazil's Supreme Court. We train and evaluate our methods on a novel multimodal dataset of 6,510 lawsuits (339,478 pages) with manual annotation assigning each page to one of six classes. Each lawsuit is an ordered sequence of pages, which are stored both as an image and as a corresponding text extracted through optical character recognition. We first train two unimodal classifiers: a ResNet pre-trained on ImageNet is fine-tuned on the images, and a convolutional network with filters of multiple kernel sizes is trained from scratch on document texts. We use them as extractors of visual and textual features, which are then combined through our proposed Fusion Module. Our Fusion Module can handle missing textual or visual input by using learned embeddings for missing data. Moreover, we experiment with bi-directional Long Short-Term Memory (biLSTM) networks and linear-chain conditional random fields to model the sequential nature of the pages. The multimodal approaches outperform both textual and visual classifiers, especially when leveraging the sequential nature of the pages.

Submitted 15 July, 2022; v1 submitted 2 July, 2022; originally announced July 2022.

Comments: 11 pages, 6 figures. This preprint, which was originally written on 8 April 2021, has not undergone peer review or any post-submission improvements or corrections. The Version of Record of this article is published in the International Journal on Document Analysis and Recognition, and is available online at https://doi.org/10.1007/s10032-022-00406-7 and https://rdcu.be/cRvvV

Journal ref: International Journal on Document Analysis and Recognition, 2022

arXiv:2109.13885 [pdf, other]

Turning old models fashion again: Recycling classical CNN networks using the Lattice Transformation

Authors: Ana Paula G. S. de Almeida, Flavio de Barros Vidal

Abstract: In the early 1990s, the first signs of life of the CNN era were given: LeCun et al. proposed a CNN model trained by the backpropagation algorithm to classify low-resolution images of handwritten digits. Undoubtedly, it was a breakthrough in the field of computer vision. But with the rise of other classification methods, it fell out of fashion. That was until 2012, when Krizhevsky et al. revived the interest in CNNs by exhibiting considerably higher image classification accuracy on the ImageNet challenge. Since then, the complexity of the architectures has been increasing exponentially and many structures are rapidly becoming obsolete. Using multistream networks as a base and the feature infusion precept, we explore the proposed LCNN cross-fusion strategy to use the backbones of former state-of-the-art networks on image classification in order to discover if the technique is able to put these designs back in the game. In this paper, we show that we can obtain an increase in accuracy of up to 63.21% on the NORB dataset when compared with the original structure. However, no technique is definitive. While our goal is to try to reuse previous state-of-the-art architectures with few modifications, we also expose the disadvantages of our explored strategy.

Submitted 28 September, 2021; originally announced September 2021.

Comments: 21 pages, 13 figures

MSC Class: 65D19; 68T07 ACM Class: I.2; I.4

arXiv:2103.03717 [pdf, other]

NemaNet: A convolutional neural network model for identification of nematodes soybean crop in brazil

Authors: Andre da Silva Abade, Lucas Faria Porto, Paulo Afonso Ferreira, Flavio de Barros Vidal

Abstract: Phytoparasitic nematodes (or phytonematodes) are causing severe damage to crops and generating large-scale economic losses worldwide. In soybean crops, annual losses are estimated at 10.6% of world production. Besides, identifying these species through microscopic analysis by an expert with taxonomy knowledge is often laborious, time-consuming, and susceptible to failure. In this perspective, robust and automatic approaches capable of providing correct diagnoses are necessary for identifying phytonematodes, classifying species, and supporting all control and prevention measures. This work presents a new public data set called NemaDataset containing 3,063 microscopic images from five nematode species with the most significant damage relevance for the soybean crop. Additionally, we propose a new Convolutional Neural Network (CNN) model defined as NemaNet and a comparative assessment with thirteen popular models of CNNs, all of them representing the state of the art in classification and recognition. In from-scratch training, the NemaNet model reached a general average accuracy of 96.99%, while the best evaluation fold reached 98.03%. In training with transfer learning, the average accuracy reached 98.88%, and the best evaluation fold reached 99.34%. NemaNet achieved an overall accuracy improvement of over 6.83% and 4.1%, for from-scratch and transfer learning training, respectively, when compared to other popular models.

Submitted 5 March, 2021; originally announced March 2021.

Comments: 21 pages, 13 figures

MSC Class: 65D19 ACM Class: I.4; I.2.10

arXiv:2009.04365 [pdf, other]

Plant Diseases recognition on images using Convolutional Neural Networks: A Systematic Review

Authors: Andre S. Abade, Paulo Afonso Ferreira, Flavio de Barros Vidal

Abstract: Plant diseases are considered one of the main factors influencing food production; to minimize losses in production, it is essential that crop diseases be detected and recognized quickly. The recent expansion of deep learning methods has found its application in plant disease detection, offering a robust tool with highly accurate results. In this context, this work presents a systematic review of the literature that aims to identify the state of the art of the use of convolutional neural networks (CNNs) in the process of identification and classification of plant diseases, delimiting trends, and indicating gaps. In this sense, we present 121 papers selected from the last ten years with different approaches to treat aspects related to disease detection, characteristics of the data set, and the crops and pathogens investigated. From the results of the systematic review, it is possible to understand the innovative trends regarding the use of CNNs in the identification of plant diseases and to identify the gaps that need the attention of the research community.

Submitted 9 September, 2020; originally announced September 2020.

Comments: 47 pages, 11 figures

MSC Class: 65D19 ACM Class: I.4.9

arXiv:2008.00157 [pdf, other]

doi 10.1049/el.2019.2631

L-CNN: A Lattice cross-fusion strategy for multistream convolutional neural networks

Authors: Ana Paula G. S. de Almeida, Flavio de Barros Vidal

Abstract: This paper proposes a fusion strategy for multistream convolutional networks, the Lattice Cross Fusion. This approach crosses signals from convolution layers performing mathematical operation-based fusions right before pooling layers. Results on a purposely worsened CIFAR-10, a popular image classification data set, with a modified AlexNet-LCNN version show that this novel method outperforms by 46% the baseline single stream network, with faster convergence, stability, and robustness.

Submitted 31 July, 2020; originally announced August 2020.

Comments: 5 pages, 3 figures

MSC Class: 68T07 ACM Class: I.2.10

Journal ref: Electronics Letters, vol. 55, no. 22, pp. 1180-1182, 2019

arXiv:1908.02353 [pdf, other]

Estimating sex and age for forensic applications using machine learning based on facial measurements from frontal cephalometric landmarks

Authors: Lucas F. Porto, Laise N. Correia Lima, Ademir Franco, Donald M. Pianto, Carlos Eduardo Machado Palhares, Flavio de Barros Vidal

Abstract: Facial analysis permits many investigations, some of the most important of which are craniofacial identification, facial recognition, and age and sex estimation. In forensics, photo-anthropometry describes the study of facial growth and allows the identification of patterns in facial skull development by using a group of cephalometric landmarks to estimate anthropological information. In several areas, automation of manual procedures has achieved advantages over, and measurement confidence similar to, a forensic expert. This manuscript presents an approach using photo-anthropometric indexes, generated from frontal-face cephalometric landmarks, to create an artificial neural network classifier that allows the estimation of anthropological information, in this specific case age and sex. The work is focused on four tasks: i) sex estimation over ages from 5 to 22 years old, evaluating the interference of age on sex estimation; ii) age estimation from photo-anthropometric indexes for four age intervals (1 year, 2 years, 4 years and 5 years); iii) age group estimation for thresholds of over 14 and over 18 years old; and iv) the provision of a new data set, available for academic purposes only, with a large and complete set of facial photo-anthropometric points marked and checked by forensic experts, measured from over 18,000 faces of individuals from Brazil over the last 4 years.
The proposed classifier obtained significant results, using this new data set, for the sex estimation of individuals over 14 years old, achieving accuracy values greater than 0.85 by the F_1 measure. For age estimation, the accuracy is 0.72 with an age interval of 5 years. For the age group estimation, the measures of accuracy are greater than 0.93 and 0.83 for thresholds of 14 and 18 years, respectively.

Submitted 6 August, 2019; originally announced August 2019.

Comments: 17 pages, 17 figures

arXiv:1904.10816 [pdf, other]

Automatic cephalometric landmarks detection on frontal faces: an approach based on supervised learning techniques

Authors: Lucas Faria Porto, Laise Nascimento Correia Lima, Marta Flores, Andrea Valsecchi, Oscar Ibanez, Carlos Eduardo Machado Palhares, Flavio de Barros Vidal

Abstract: Facial landmarks are employed in many research areas, such as facial recognition, craniofacial identification, and age and sex estimation, among the most important. In the forensic field, the focus is on the analysis of a particular set of facial landmarks, defined as cephalometric landmarks. Previous works demonstrated that the descriptive adequacy of these anatomical references for an indirect application (photo-anthropometric description) increased the marking precision of these points, contributing to a greater reliability of these analyses. However, most of them are performed manually and all of them are subject to the inherent subjectivity of the expert examiners. In this sense, the purpose of this work is the development and validation of automatic techniques to detect cephalometric landmarks from digital images of frontal faces in the forensic field. The presented approach uses a combination of computer vision and image processing techniques within a supervised learning procedure. The proposed methodology obtains similar precision to a group of human manual cephalometric reference markers and proves more accurate than other state-of-the-art facial landmark detection frameworks. It achieves a normalized mean distance (in pixels) error of 0.014, similar to the mean inter-expert dispersion (0.009) and clearly better than other automatic approaches also analyzed in this work (0.026 and 0.101).

Submitted 24 April, 2019; originally announced April 2019.

Comments: 24 pages, 6 figures

MSC Class: 68T45

Showing 1–8 of 8 results for author: Vidal, F d B