Search | arXiv e-print repository

Embedding Aggregation for Forensic Facial Comparison

Authors: Rafael Oliveira Ribeiro, João C. R. Neves, Arnout C. C. Ruifrok, Flavio de Barros Vidal

Abstract: In forensic facial comparison, questioned-source images are usually captured in uncontrolled environments, with non-uniform lighting, and from non-cooperative subjects. The poor quality of such material usually compromises their value as evidence in legal matters. On the other hand, in forensic casework, multiple images of the person of interest are usually available. In this paper, we propose to aggregate deep neural network embeddings from various images of the same person to improve performance in facial verification. We observe significant performance improvements, especially for very low-quality images. Further improvements are obtained by aggregating embeddings of more images and by applying quality-weighted aggregation. We demonstrate the benefits of this approach in forensic evaluation settings with the development and validation of score-based likelihood ratio systems and report improvements in Cllr of up to 95% (from 0.249 to 0.012) for CCTV images and of up to 96% (from 0.083 to 0.003) for social media images.

Submitted 29 April, 2023; originally announced May 2023.

Comments: 13 pages, 8 figures, submitted to Forensic Science International

ACM Class: I.4; I.5
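
The quality-weighted aggregation mentioned in the abstract above could look roughly like the following. This is a minimal sketch, not the authors' implementation: the function names, the weighted-mean rule, the re-normalization step, and the example quality scores are all assumptions made for illustration.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vec]


def aggregate_embeddings(embeddings, qualities=None):
    """Combine several embeddings of one person into a single template.

    Assumption: a quality-weighted mean of unit-length embeddings,
    re-normalized afterwards. With qualities=None every image counts equally.
    """
    if not embeddings:
        raise ValueError("need at least one embedding")
    if qualities is None:
        qualities = [1.0] * len(embeddings)
    if len(qualities) != len(embeddings):
        raise ValueError("one quality score per embedding is required")
    total = sum(qualities)
    if total <= 0:
        raise ValueError("quality scores must sum to a positive value")

    dim = len(embeddings[0])
    aggregated = [0.0] * dim
    for emb, quality in zip(embeddings, qualities):
        unit = l2_normalize(emb)
        for i in range(dim):
            aggregated[i] += (quality / total) * unit[i]
    return l2_normalize(aggregated)


def cosine_similarity(a, b):
    """Comparison score between two embeddings (higher means more similar)."""
    a, b = l2_normalize(a), l2_normalize(b)
    return sum(x * y for x, y in zip(a, b))


# Hypothetical 2-D embeddings; real face embeddings have hundreds of dimensions.
template = aggregate_embeddings([[1.0, 0.0], [0.0, 1.0]], qualities=[3.0, 1.0])
score = cosine_similarity(template, [1.0, 0.0])
```

In this sketch the higher-quality image pulls the aggregated template toward its own embedding, so a comparison against a reference close to that image scores higher than it would with equal weights.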

arXiv:2304.13081 [pdf, other]

Organizational Governance of Emerging Technologies: AI Adoption in Healthcare

Authors: Jee Young Kim, William Boag, Freya Gulamali, Alifia Hasan, Henry David Jeffry Hogg, Mark Lifson, Deirdre Mulligan, Manesh Patel, Inioluwa Deborah Raji, Ajai Sehgal, Keo Shaw, Danny Tobey, Alexandra Valladares, David Vidal, Suresh Balu, Mark Sendak

Abstract: Private and public sector structures and norms refine how emerging technology is used in practice. In healthcare, despite a proliferation of AI adoption, the organizational governance surrounding its use and integration is often poorly understood. In this research, the Health AI Partnership (HAIP) aims to better define the requirements for adequate organizational governance of AI systems in healthcare settings and to support health system leaders in making more informed decisions around AI adoption. To work towards this understanding, we first identify how the standards for AI adoption in healthcare may be designed to be used easily and efficiently. Then, we map out the precise decision points involved in the practical institutional adoption of AI technology within specific health systems. Practically, we achieve this through a multi-organizational collaboration with leaders from major health systems across the United States and key informants from related fields. Working with the consultancy IDEO.org, we were able to conduct usability-testing sessions with healthcare and AI ethics professionals. Usability analysis revealed a prototype structured around mock key decision points that align with how organizational leaders approach technology adoption. Concurrently, we conducted semi-structured interviews with 89 professionals in healthcare and other relevant fields. Using a modified grounded theory approach, we were able to identify 8 key decision points and comprehensive procedures throughout the AI adoption lifecycle.
This is one of the most detailed qualitative analyses to date of the current governance structures and processes involved in AI adoption by health systems in the United States. We hope these findings can inform future efforts to build capabilities to promote the safe, effective, and responsible adoption of emerging technologies in healthcare.

Submitted 10 May, 2023; v1 submitted 25 April, 2023; originally announced April 2023.

arXiv:2211.08190 [pdf]

Reconocimiento de Objetos a partir de Nube de Puntos en un Vehículo Aéreo no Tripulado

Authors: Agustina Marion de Freitas Vidal, Anthony Rodriguez, Richard Suarez, André Kelbouscas, Ricardo Grando

Abstract: Currently, research in robotics, artificial intelligence and drones is advancing exponentially and is directly or indirectly related to various areas of the economy, from agriculture to industry. In this context, this project covers these topics and seeks to provide a framework capable of helping to develop new future researchers. For this, we use an aerial vehicle that works autonomously and is capable of mapping the scenario and providing useful information to the end user. This is achieved through communication between a simple programming language (Scratch) and one of the most important and efficient robot operating systems today (ROS). This is how we managed to develop a tool capable of generating a 3D map and detecting objects using the camera attached to the drone. Although this tool can be used in advanced fields of industry, it is also an important advance for the research sector. The implementation of this tool in intermediate-level institutions is intended to provide the ability to carry out high-level projects from a simple programming language.

Submitted 23 October, 2022; originally announced November 2022.

Comments: In Spanish. Article accepted at FEBITEC 2022

arXiv:2208.02760 [pdf, other]

OCFR 2022: Competition on Occluded Face Recognition From Synthetically Generated Structure-Aware Occlusions

Authors: Pedro C. Neto, Fadi Boutros, Joao Ribeiro Pinto, Naser Damer, Ana F. Sequeira, Jaime S. Cardoso, Messaoud Bengherabi, Abderaouf Bousnat, Sana Boucheta, Nesrine Hebbadj, Mustafa Ekrem Erakın, Uğur Demir, Hazım Kemal Ekenel, Pedro Beber de Queiroz Vidal, David Menotti

Abstract: This work summarizes the IJCB Occluded Face Recognition Competition 2022 (IJCB-OCFR-2022) embraced by the 2022 International Joint Conference on Biometrics (IJCB 2022). OCFR-2022 attracted a total of 3 participating teams, from academia. Eventually, six valid submissions were received and then evaluated by the organizers. The competition was held to address the challenge of face recognition in the presence of severe face occlusions. The participants were free to use any training data and the testing data was built by the organisers by synthetically occluding parts of the face images using a well-known dataset. The submitted solutions presented innovations and performed very competitively with the considered baseline. A major output of this competition is a challenging, realistic, diverse, and publicly available occluded face recognition benchmark with well defined evaluation protocols.

Submitted 15 August, 2022; v1 submitted 4 August, 2022; originally announced August 2022.

Comments: Accepted at International Joint Conference on Biometrics 2022

arXiv:2207.00748 [pdf, other]

doi 10.1007/s10032-022-00406-7

Sequence-aware multimodal page classification of Brazilian legal documents

Authors: Pedro H. Luz de Araujo, Ana Paula G. S. de Almeida, Fabricio A. Braz, Nilton C. da Silva, Flavio de Barros Vidal, Teofilo E. de Campos

Abstract: The Brazilian Supreme Court receives tens of thousands of cases each semester. Court employees spend thousands of hours to execute the initial analysis and classification of those cases -- which takes effort away from posterior, more complex stages of the case management workflow. In this paper, we explore multimodal classification of documents from Brazil's Supreme Court. We train and evaluate our methods on a novel multimodal dataset of 6,510 lawsuits (339,478 pages) with manual annotation assigning each page to one of six classes. Each lawsuit is an ordered sequence of pages, which are stored both as an image and as a corresponding text extracted through optical character recognition. We first train two unimodal classifiers: a ResNet pre-trained on ImageNet is fine-tuned on the images, and a convolutional network with filters of multiple kernel sizes is trained from scratch on document texts. We use them as extractors of visual and textual features, which are then combined through our proposed Fusion Module. Our Fusion Module can handle missing textual or visual input by using learned embeddings for missing data. Moreover, we experiment with bi-directional Long Short-Term Memory (biLSTM) networks and linear-chain conditional random fields to model the sequential nature of the pages. The multimodal approaches outperform both textual and visual classifiers, especially when leveraging the sequential nature of the pages.

Submitted 15 July, 2022; v1 submitted 2 July, 2022; originally announced July 2022.

Comments: 11 pages, 6 figures. This preprint, which was originally written on 8 April 2021, has not undergone peer review or any post-submission improvements or corrections. The Version of Record of this article is published in the International Journal on Document Analysis and Recognition, and is available online at https://doi.org/10.1007/s10032-022-00406-7 and https://rdcu.be/cRvvV

Journal ref: International Journal on Document Analysis and Recognition.2022

arXiv:2109.13885 [pdf, other]

Turning old models fashion again: Recycling classical CNN networks using the Lattice Transformation

Authors: Ana Paula G. S. de Almeida, Flavio de Barros Vidal

Abstract: In the early 1990s, the first signs of life of the CNN era were given: LeCun et al. proposed a CNN model trained by the backpropagation algorithm to classify low-resolution images of handwritten digits. Undoubtedly, it was a breakthrough in the field of computer vision. But with the rise of other classification methods, it fell out of fashion. That was until 2012, when Krizhevsky et al. revived the interest in CNNs by exhibiting considerably higher image classification accuracy on the ImageNet challenge. Since then, the complexity of the architectures has been increasing exponentially and many structures are rapidly becoming obsolete. Using multistream networks as a base and the feature infusion precept, we explore the proposed LCNN cross-fusion strategy to use the backbones of former state-of-the-art networks on image classification in order to discover if the technique is able to put these designs back in the game. In this paper, we show that we can obtain an increase in accuracy of up to 63.21% on the NORB dataset when comparing with the original structure. However, no technique is definitive. While our goal is to try to reuse previous state-of-the-art architectures with few modifications, we also expose the disadvantages of our explored strategy.

Submitted 28 September, 2021; originally announced September 2021.

Comments: 21 pages, 13 figures

MSC Class: 65D19; 68T07 ACM Class: I.2; I.4

arXiv:2103.03717 [pdf, other]

NemaNet: A convolutional neural network model for identification of nematodes soybean crop in brazil

Authors: Andre da Silva Abade, Lucas Faria Porto, Paulo Afonso Ferreira, Flavio de Barros Vidal

Abstract: Phytoparasitic nematodes (or phytonematodes) are causing severe damage to crops and generating large-scale economic losses worldwide. In soybean crops, annual losses are estimated at 10.6% of world production. Besides, identifying these species through microscopic analysis by an expert with taxonomy knowledge is often laborious, time-consuming, and susceptible to failure. In this perspective, robust and automatic approaches are necessary for identifying phytonematodes capable of providing correct diagnoses for the classification of species and subsidizing the taking of all control and prevention measures. This work presents a new public data set called NemaDataset containing 3,063 microscopic images from five nematode species with the most significant damage relevance for the soybean crop. Additionally, we propose a new Convolutional Neural Network (CNN) model defined as NemaNet and a comparative assessment with thirteen popular models of CNNs, all of them representing the state of the art in classification and recognition. In the general average calculated for each model with from-scratch training, the NemaNet model reached 96.99% accuracy, while the best evaluation fold reached 98.03%. In training with transfer learning, the average accuracy reached 98.88%. The best evaluation fold reached 99.34%, and the model achieved an overall accuracy improvement of over 6.83% and 4.1%, for from-scratch and transfer learning training, respectively, when compared to other popular models.

Submitted 5 March, 2021; originally announced March 2021.

Comments: 21 pages, 13 figures

MSC Class: 65D19 ACM Class: I.4; I.2.10

arXiv:2009.04365 [pdf, other]

Plant Diseases recognition on images using Convolutional Neural Networks: A Systematic Review

Authors: Andre S. Abade, Paulo Afonso Ferreira, Flavio de Barros Vidal

Abstract: Plant diseases are considered one of the main factors influencing food production, and to minimize losses in production it is essential that crop diseases are detected and recognized quickly. The recent expansion of deep learning methods has found its application in plant disease detection, offering a robust tool with highly accurate results. In this context, this work presents a systematic review of the literature that aims to identify the state of the art of the use of convolutional neural networks (CNN) in the process of identification and classification of plant diseases, delimiting trends, and indicating gaps. In this sense, we present 121 papers selected in the last ten years with different approaches to treat aspects related to disease detection, characteristics of the data set, the crops and pathogens investigated. From the results of the systematic review, it is possible to understand the innovative trends regarding the use of CNNs in the identification of plant diseases and to identify the gaps that need the attention of the research community.

Submitted 9 September, 2020; originally announced September 2020.

Comments: 47 pages, 11 figures

MSC Class: 65D19 ACM Class: I.4.9

arXiv:2008.00157 [pdf, other]

doi 10.1049/el.2019.2631

L-CNN: A Lattice cross-fusion strategy for multistream convolutional neural networks

Authors: Ana Paula G. S. de Almeida, Flavio de Barros Vidal

Abstract: This paper proposes a fusion strategy for multistream convolutional networks, the Lattice Cross Fusion. This approach crosses signals from convolution layers performing mathematical operation-based fusions right before pooling layers. Results on a purposely worsened CIFAR-10, a popular image classification data set, with a modified AlexNet-LCNN version show that this novel method outperforms by 46% the baseline single stream network, with faster convergence, stability, and robustness.

Submitted 31 July, 2020; originally announced August 2020.

Comments: 5 pages, 3 figures

MSC Class: 68T07 ACM Class: I.2.10

Journal ref: Electronics Letters, vol. 55, no. 22, pp. 1180-1182, 2019

arXiv:2007.11053 [pdf, ps, other]

Curriculum Vitae Recommendation Based on Text Mining

Authors: Honorio Apaza Alanoca, Americo A. Rubin de Celis Vidal, Josimar Edinson Chire Saire

Abstract: In recent years, developments in diverse areas of computer science and the internet have produced new alternatives for decision making in the selection of personnel for state and private companies. To optimize this selection process, recommendation systems are the most suitable for working with explicit information related to the likes and dislikes of employers or end users, since this information allows lists of recommendations to be generated based on collaboration or similarity of content. Therefore, this research builds on these characteristics as contained in a database of curricula and job offers from the Peruvian context, which highlights the experience, knowledge and skills of each candidate, described in textual terms or words. This research focuses on the problem of how to take advantage of the growth of unstructured information about job offers and curricula vitae on different websites for CV recommendation. To this end, we use techniques from Text Mining and Natural Language Processing. As a relevant technique for the present study, we emphasize Term Frequency-Inverse Document Frequency (TF-IDF), which allows identifying the most relevant CVs in relation to a website's job offer through the average TF-IDF values. The weighted value can then be used as a qualification value of the relevant curriculum vitae for the recommendation.

Submitted 21 July, 2020; originally announced July 2020.
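
The TF-IDF ranking described in the abstract above can be sketched as follows. This is an illustrative reading of "average TF-IDF values", not the authors' pipeline: the whitespace tokenizer, the smoothed IDF formula, the function names, and the sample texts are all assumptions.

```python
import math
from collections import Counter


def tokenize(text):
    """Very simple tokenizer (assumption): lowercase and split on whitespace."""
    return text.lower().split()


def tfidf_vectors(docs):
    """Return one {term: tf-idf weight} dict per document."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)

    # Document frequency: in how many documents each term appears.
    doc_freq = Counter()
    for tokens in tokenized:
        doc_freq.update(set(tokens))

    # Smoothed inverse document frequency (one common variant).
    idf = {t: math.log((1 + n_docs) / (1 + df)) + 1.0 for t, df in doc_freq.items()}

    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens)
        vectors.append({t: (c / total) * idf[t] for t, c in counts.items()})
    return vectors


def rank_cvs(job_offer, cvs):
    """Score each CV by the average TF-IDF weight of the job offer's terms."""
    vectors = tfidf_vectors(cvs)
    terms = set(tokenize(job_offer))
    scores = []
    for index, vec in enumerate(vectors):
        score = sum(vec.get(t, 0.0) for t in terms) / len(terms) if terms else 0.0
        scores.append((index, score))
    # Highest average weight first; ties keep their original order.
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


# Hypothetical CV texts and job offer.
cvs = [
    "python developer with machine learning experience",
    "accountant with tax experience",
    "java developer",
]
ranking = rank_cvs("python machine learning", cvs)
```

Here `ranking` is a list of `(cv_index, score)` pairs, and only the first CV contains any of the job offer's terms, so it is ranked first while the other two score zero.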

arXiv:1908.02353 [pdf, other]

Estimating sex and age for forensic applications using machine learning based on facial measurements from frontal cephalometric landmarks

Authors: Lucas F. Porto, Laise N. Correia Lima, Ademir Franco, Donald M. Pianto, Carlos Eduardo Machado Palhares, Flavio de Barros Vidal

Abstract: Facial analysis permits many investigations, some of the most important of which are craniofacial identification, facial recognition, and age and sex estimation. In forensics, photo-anthropometry describes the study of facial growth and allows the identification of patterns in facial skull development by using a group of cephalometric landmarks to estimate anthropological information. In several areas, automation of manual procedures has achieved advantages over, and similar measurement confidence to, a forensic expert. This manuscript presents an approach using photo-anthropometric indexes, generated from frontal faces cephalometric landmarks, to create an artificial neural network classifier that allows the estimation of anthropological information, in this specific case age and sex. The work is focused on four tasks: i) sex estimation over ages from 5 to 22 years old, evaluating the interference of age on sex estimation; ii) age estimation from photo-anthropometric indexes for four age intervals (1 year, 2 years, 4 years and 5 years); iii) age group estimation for thresholds of over 14 and over 18 years old; and iv) the provision of a new data set, available for academic purposes only, with a large and complete set of facial photo-anthropometric points marked and checked by forensic experts, measured from over 18,000 faces of individuals from Brazil over the last 4 years.
The proposed classifier obtained significant results, using this new data set, for the sex estimation of individuals over 14 years old, achieving accuracy values greater than 0.85 by the F_1 measure. For age estimation, the accuracy results are 0.72 for the measure with an age interval of 5 years. For the age group estimation, the measures of accuracy are greater than 0.93 and 0.83 for thresholds of 14 and 18 years, respectively.

Submitted 6 August, 2019; originally announced August 2019.

Comments: 17 pages, 17 figures

arXiv:1904.10816 [pdf, other]

Automatic cephalometric landmarks detection on frontal faces: an approach based on supervised learning techniques

Authors: Lucas Faria Porto, Laise Nascimento Correia Lima, Marta Flores, Andrea Valsecchi, Oscar Ibanez, Carlos Eduardo Machado Palhares, Flavio de Barros Vidal

Abstract: Facial landmarks are employed in many research areas, with facial recognition, craniofacial identification, and age and sex estimation among the most important. In the forensic field, the focus is on the analysis of a particular set of facial landmarks, defined as cephalometric landmarks. Previous works demonstrated that the descriptive adequacy of these anatomical references for an indirect application (photo-anthropometric description) increased the marking precision of these points, contributing to a greater reliability of these analyses. However, most of them are performed manually and all of them are subject to the subjectivity inherent to expert examiners. In this sense, the purpose of this work is the development and validation of automatic techniques to detect cephalometric landmarks from digital images of frontal faces in the forensic field. The presented approach uses a combination of computer vision and image processing techniques within a supervised learning procedure. The proposed methodology obtains similar precision to a group of human manual cephalometric reference markers and proves more accurate than other state-of-the-art facial landmark detection frameworks. It achieves a normalized mean distance (in pixels) error of 0.014, similar to the mean inter-expert dispersion (0.009) and clearly better than the other automatic approaches also analyzed in this work (0.026 and 0.101).

Submitted 24 April, 2019; originally announced April 2019.

Comments: 24 pages, 6 figures

MSC Class: 68T45

Showing 1–12 of 12 results for author: Vidal, D