-
Sequence-aware multimodal page classification of Brazilian legal documents
Authors:
Pedro H. Luz de Araujo,
Ana Paula G. S. de Almeida,
Fabricio A. Braz,
Nilton C. da Silva,
Flavio de Barros Vidal,
Teofilo E. de Campos
Abstract:
The Brazilian Supreme Court receives tens of thousands of cases each semester. Court employees spend thousands of hours on the initial analysis and classification of those cases -- which takes effort away from later, more complex stages of the case management workflow. In this paper, we explore multimodal classification of documents from Brazil's Supreme Court. We train and evaluate our methods on a novel multimodal dataset of 6,510 lawsuits (339,478 pages) with manual annotation assigning each page to one of six classes. Each lawsuit is an ordered sequence of pages, which are stored both as an image and as the corresponding text extracted through optical character recognition. We first train two unimodal classifiers: a ResNet pre-trained on ImageNet is fine-tuned on the images, and a convolutional network with filters of multiple kernel sizes is trained from scratch on document texts. We use them as extractors of visual and textual features, which are then combined through our proposed Fusion Module. Our Fusion Module can handle missing textual or visual input by using learned embeddings for missing data. Moreover, we experiment with bi-directional Long Short-Term Memory (biLSTM) networks and linear-chain conditional random fields to model the sequential nature of the pages. The multimodal approaches outperform both textual and visual classifiers, especially when leveraging the sequential nature of the pages.
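The idea of substituting an embedding for a missing modality can be illustrated with a minimal sketch. It is not the authors' Fusion Module: the abstract does not say how features are combined, so concatenation is an assumption here, and the class name `FusionSketch`, its dimensions, and the fixed placeholder values (which would be learned parameters in a trained model) are all hypothetical.

```python
from typing import List, Optional

Vector = List[float]


class FusionSketch:
    """Hypothetical fusion step: concatenate visual and textual features,
    substituting a placeholder embedding when one modality is missing."""

    def __init__(self, visual_dim: int, text_dim: int) -> None:
        self.visual_dim = visual_dim
        self.text_dim = text_dim
        # Assumption: constants stand in for embeddings that a real model
        # would learn during training.
        self.missing_visual: Vector = [0.1] * visual_dim
        self.missing_text: Vector = [-0.1] * text_dim

    def fuse(self, visual: Optional[Vector], text: Optional[Vector]) -> Vector:
        if visual is None and text is None:
            raise ValueError("at least one modality is required")
        # Swap in the placeholder embedding for whichever input is absent.
        v = self.missing_visual if visual is None else visual
        t = self.missing_text if text is None else text
        if len(v) != self.visual_dim or len(t) != self.text_dim:
            raise ValueError("feature vector has an unexpected dimension")
        # Assumption: fusion by concatenation.
        return list(v) + list(t)


fusion = FusionSketch(visual_dim=2, text_dim=3)
page_without_text = fusion.fuse([1.0, 2.0], None)
```

Because the fused vector always has the same length, a downstream classifier or sequence model can consume pages with or without OCR text through the same interface.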
Submitted 15 July, 2022; v1 submitted 2 July, 2022;
originally announced July 2022.
-
Turning old models fashion again: Recycling classical CNN networks using the Lattice Transformation
Authors:
Ana Paula G. S. de Almeida,
Flavio de Barros Vidal
Abstract:
In the early 1990s, the first signs of life of the CNN era were given: LeCun et al. proposed a CNN model trained by the backpropagation algorithm to classify low-resolution images of handwritten digits. Undoubtedly, it was a breakthrough in the field of computer vision. But with the rise of other classification methods, it fell out of fashion. That was until 2012, when Krizhevsky et al. revived the interest in CNNs by exhibiting considerably higher image classification accuracy on the ImageNet challenge. Since then, the complexity of the architectures has been increasing exponentially and many structures are rapidly becoming obsolete. Using multistream networks as a base and the feature infusion precept, we explore the proposed LCNN cross-fusion strategy to use the backbones of former state-of-the-art networks on image classification in order to discover if the technique is able to put these designs back in the game. In this paper, we show that we can obtain an increase in accuracy of up to 63.21% on the NORB dataset when compared with the original structure. However, no technique is definitive. While our goal is to try to reuse previous state-of-the-art architectures with few modifications, we also expose the disadvantages of our explored strategy.
Submitted 28 September, 2021;
originally announced September 2021.
-
L-CNN: A Lattice cross-fusion strategy for multistream convolutional neural networks
Authors:
Ana Paula G. S. de Almeida,
Flavio de Barros Vidal
Abstract:
This paper proposes a fusion strategy for multistream convolutional networks, the Lattice Cross Fusion. This approach crosses signals from convolution layers, performing mathematical operation-based fusions right before pooling layers. Results on a purposely worsened CIFAR-10, a popular image classification data set, with a modified AlexNet-LCNN version show that this novel method outperforms the baseline single-stream network by 46%, with faster convergence, stability, and robustness.
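The general pattern of fusing two streams with an element-wise operation just before pooling can be sketched as follows. This is only an illustration under stated assumptions, not the paper's L-CNN: the abstract does not specify the wiring or the operations used, so the function names (`cross_fuse`, `max_pool`, `fuse_then_pool`), the one-dimensional feature lists, the default sum operation, and the pooling window are all hypothetical.

```python
from typing import Callable, List, Tuple

Features = List[float]
Op = Callable[[float, float], float]


def max_pool(x: Features, window: int = 2) -> Features:
    """Non-overlapping 1D max pooling; trailing leftovers are dropped."""
    return [max(x[i:i + window]) for i in range(0, len(x) - window + 1, window)]


def cross_fuse(a: Features, b: Features, op: Op) -> Tuple[Features, Features]:
    """Each stream receives op(own, other), applied element-wise."""
    if len(a) != len(b):
        raise ValueError("streams must have the same length")
    fused_a = [op(x, y) for x, y in zip(a, b)]
    fused_b = [op(y, x) for x, y in zip(a, b)]
    return fused_a, fused_b


def fuse_then_pool(a: Features, b: Features,
                   op: Op = lambda x, y: x + y,
                   window: int = 2) -> Tuple[Features, Features]:
    """Assumed ordering: cross the streams first, then pool each one."""
    fused_a, fused_b = cross_fuse(a, b, op)
    return max_pool(fused_a, window), max_pool(fused_b, window)


stream_a = [1.0, 3.0, 2.0, 0.0]  # stand-in for one stream's convolution output
stream_b = [0.5, 0.5, 4.0, 1.0]  # stand-in for the other stream's output
pooled_a, pooled_b = fuse_then_pool(stream_a, stream_b)
```

With a commutative operation such as the sum, both streams carry the same fused signal after the crossing; a non-commutative operation such as subtraction keeps them distinct.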
Submitted 31 July, 2020;
originally announced August 2020.
-
Document classification using a Bi-LSTM to unclog Brazil's supreme court
Authors:
Fabricio Ataides Braz,
Nilton Correia da Silva,
Teofilo Emidio de Campos,
Felipe Borges S. Chaves,
Marcelo H. S. Ferreira,
Pedro Henrique Inazawa,
Victor H. D. Coelho,
Bernardo Pablo Sukiennik,
Ana Paula Goncalves Soares de Almeida,
Flavio Barros Vidal,
Davi Alves Bezerra,
Davi B. Gusmao,
Gabriel G. Ziegler,
Ricardo V. C. Fernandes,
Roberta Zumblick,
Fabiano Hartmann Peixoto
Abstract:
The Brazilian court system is currently the most clogged-up judiciary system in the world. Thousands of lawsuit cases reach the supreme court every day. These cases need to be analyzed in order to be associated with relevant tags and allocated to the right team. Most of the cases reach the court as raster scanned documents with widely variable levels of quality. One of the first steps in the analysis is to classify these documents. In this paper, we present a Bidirectional Long Short-Term Memory network (Bi-LSTM) to classify these legal documents.
Submitted 27 November, 2018;
originally announced November 2018.