Search | arXiv e-print repository

arXiv:2401.06790 [pdf, other]

Using Zero-shot Prompting in the Automatic Creation and Expansion of Topic Taxonomies for Tagging Retail Banking Transactions

Authors: Daniel de S. Moraes, Pedro T. C. Santos, Polyana B. da Costa, Matheus A. S. Pinto, Ivan de J. P. Pinto, Álvaro M. G. da Veiga, Sergio Colcher, Antonio J. G. Busson, Rafael H. Rocha, Rennan Gaio, Rafael Miceli, Gabriela Tourinho, Marcos Rabaioli, Leandro Santos, Fellipe Marques, David Favaro

Abstract: This work presents an unsupervised method for automatically constructing and expanding topic taxonomies using instruction-based fine-tuned LLMs (Large Language Models). We apply topic modeling and keyword extraction techniques to create initial topic taxonomies and LLMs to post-process the resulting terms and create a hierarchy. To expand an existing taxonomy with new terms, we use zero-shot prompting to find out where to add new nodes, which, to our knowledge, is the first work to present such an approach to taxonomy tasks. We use the resulting taxonomies to assign tags that characterize merchants from a retail bank dataset. To evaluate our work, we asked 12 volunteers to answer a two-part form in which we first assessed the quality of the taxonomies created and then the tags assigned to merchants based on that taxonomy. The evaluation revealed a coherence rate exceeding 90% for the chosen taxonomies. The taxonomies' expansion with LLMs also showed exciting results for parent node prediction, with an F1-score above 70% in our taxonomies.

Submitted 11 February, 2024; v1 submitted 7 January, 2024; originally announced January 2024.

arXiv:2312.07730 [pdf, other]

doi 10.5753/bwaif.2023.229322

Hierarchical Classification of Financial Transactions Through Context-Fusion of Transformer-based Embeddings and Taxonomy-aware Attention Layer

Authors: Antonio J. G. Busson, Rafael Rocha, Rennan Gaio, Rafael Miceli, Ivan Pereira, Daniel de S. Moraes, Sérgio Colcher, Alvaro Veiga, Bruno Rizzi, Francisco Evangelista, Leandro Santos, Fellipe Marques, Marcos Rabaioli, Diego Feldberg, Debora Mattos, João Pasqua, Diogo Dias

Abstract: This work proposes the Two-headed DragoNet, a Transformer-based model for hierarchical multi-label classification of financial transactions. Our model is based on a stack of Transformer encoder layers that generate contextual embeddings from two short textual descriptors (merchant name and business activity), followed by a Context Fusion layer and two output heads that classify transactions according to a hierarchical two-level taxonomy (macro and micro categories). Finally, our proposed Taxonomy-aware Attention Layer corrects predictions that break categorical hierarchy rules defined in the given taxonomy. Our proposal outperforms classical machine learning methods in experiments of macro-category classification by achieving an F1-score of 93% on a card dataset and 95% on a current account dataset.

Submitted 12 December, 2023; originally announced December 2023.

arXiv:2312.07721 [pdf, other]

doi 10.5753/webmedia_estendido.2023.234354

Saturn Platform: Foundation Model Operations and Generative AI for Financial Services

Authors: Antonio J. G. Busson, Rennan Gaio, Rafael H. Rocha, Francisco Evangelista, Bruno Rizzi, Luan Carvalho, Rafael Miceli, Marcos Rabaioli, David Favaro

Abstract: Saturn is an innovative platform that assists Foundation Model (FM) building and its integration with IT operations (Ops). It is custom-made to meet the requirements of data scientists, enabling them to effectively create and implement FMs while enhancing collaboration within their technical domain. By offering a wide range of tools and features, Saturn streamlines and automates different stages of FM development, making it an invaluable asset for data science teams. This white paper introduces prospective applications of generative AI models derived from FMs in the financial sector.

Submitted 12 December, 2023; originally announced December 2023.

arXiv:2010.11732 [pdf, other]

doi 10.1145/3428658.3430967

A Cluster-Matching-Based Method for Video Face Recognition

Authors: Paulo R C Mendes, Antonio J G Busson, Sérgio Colcher, Daniel Schwabe, Álan L V Guedes, Carlos Laufer

Abstract: Face recognition systems are present in many modern solutions and thousands of applications in our daily lives. However, current solutions are not easily scalable, especially when it comes to the addition of new targeted people. We propose a cluster-matching-based approach for face recognition in video. In our approach, we use unsupervised learning to cluster the faces present in both the dataset and targeted videos selected for face recognition. Moreover, we design a cluster matching heuristic to associate clusters in both sets that is also capable of identifying when a face belongs to a non-registered person. Our method has achieved a recall of 99.435% and a precision of 99.131% in the task of video face recognition. Besides performing face recognition, it can also be used to determine the video segments where each person is present.

Submitted 19 October, 2020; originally announced October 2020.

Comments: 13 pages

arXiv:2010.05760 [pdf, other]

Video Quality Enhancement Using Deep Learning-Based Prediction Models for Quantized DCT Coefficients in MPEG I-frames

Authors: Antonio J G Busson, Paulo R C Mendes, Daniel de S Moraes, Álvaro M da Veiga, Álan L V Guedes, Sérgio Colcher

Abstract: Recent works have successfully applied some types of Convolutional Neural Networks (CNNs) to reduce the noticeable distortion resulting from the lossy JPEG/MPEG compression technique. Most of them are built upon processing in the spatial domain. In this work, we propose an MPEG video decoder that is purely based on the frequency-to-frequency domain: it reads the quantized DCT coefficients received from a low-quality I-frames bitstream and, using a deep learning-based model, predicts the missing coefficients in order to recompose the same frames with enhanced quality. In experiments with a video dataset, our best model was able to improve frames whose quantized DCT coefficients correspond to a Quality Factor (QF) of 10 into enhanced-quality frames with a QF close to 20.

Submitted 9 October, 2020; originally announced October 2020.

arXiv:2010.04676 [pdf, other]

A Clustering-Based Method for Automatic Educational Video Recommendation Using Deep Face-Features of Lecturers

Authors: Paulo R. C. Mendes, Eduardo S. Vieira, Álan L. V. Guedes, Antonio J. G. Busson, Sérgio Colcher

Abstract: Discovering and accessing specific content within educational video bases is a challenging task, mainly because of the abundance of video content and its diversity. Recommender systems are often used to enhance the ability to find and select content. However, recommendation mechanisms, especially those based on textual information, exhibit some limitations, such as being prone to errors from manually created keywords or imprecise speech recognition. This paper presents a method for generating educational video recommendations using deep face-features of lecturers without identifying them. More precisely, we use an unsupervised face clustering mechanism to create relations among the videos based on the lecturer's presence. Then, for a selected educational video taken as a reference, we recommend the ones where the presence of the same lecturers is detected. Moreover, we rank these recommended videos based on the amount of time the referenced lecturers were present. For this task, we achieved a mAP value of 99.165%.

Submitted 9 October, 2020; originally announced October 2020.

arXiv:2005.03626 [pdf, other]

Seismic Shot Gather Noise Localization Using a Multi-Scale Feature-Fusion-Based Neural Network

Authors: Antonio José G. Busson, Sérgio Colcher, Ruy Luiz Milidiú, Bruno Pereira Dias, André Bulcão

Abstract: Deep learning-based models, such as convolutional neural networks, have advanced various segments of computer vision. However, this technology is rarely applied to the seismic shot-gather noise localization problem. This letter presents an investigation on the effectiveness of a multi-scale feature-fusion-based network for seismic shot-gather noise localization. Herein, we describe the following: (1) the construction of a real-world dataset of seismic noise localization based on 6,500 seismograms; (2) a multi-scale feature-fusion-based detector that uses the MobileNet combined with the Feature Pyramid Net as the backbone; and (3) the Single Shot multi-box detector for box classification/regression. Additionally, we propose the use of the Focal Loss function that improves the detector's prediction accuracy. The proposed detector achieves an [email protected] of 78.67% in our empirical evaluation.

Submitted 7 May, 2020; originally announced May 2020.

arXiv:1912.01148 [pdf, other]

A Deep Convolutional Network for Seismic Shot-Gather Image Quality Classification

Authors: Eduardo Betine Bucker, Antonio José Grandson Busson, Ruy Luiz Milidiú, Sérgio Colcher, Bruno Pereira Dias, André Bulcão

Abstract: Deep Learning-based models, such as Convolutional Neural Networks, have led to significant advancements in several areas of computing applications. Seismogram quality assurance is a relevant Geophysics task, since in the early stages of seismic processing, we are required to identify and fix noisy sail lines. In this work, we introduce a real-world seismogram quality classification dataset based on 6,613 examples, manually labeled by human experts as good, bad or ugly, according to their noise intensity. This dataset is used to train a CNN classifier for seismic shot-gather quality prediction. In our empirical evaluation, we observe an F1-score of 93.56% in the test set.

Submitted 2 December, 2019; originally announced December 2019.

arXiv:1911.03974 [pdf, other]

A Multimodal CNN-based Tool to Censure Inappropriate Video Scenes

Authors: Pedro V. A. de Freitas, Paulo R. C. Mendes, Gabriel N. P. dos Santos, Antonio José G. Busson, Álan Livio Guedes, Sérgio Colcher, Ruy Luiz Milidiú

Abstract: Due to the extensive use of video-sharing platforms and services for their storage, the amount of such media on the internet has become massive. This volume of data makes it difficult to control the kind of content that may be present in such video files. One of the main concerns regarding video content is whether it has an inappropriate subject matter, such as nudity, violence, or other potentially disturbing content. More than telling if a video is either appropriate or inappropriate, it is also important to identify which parts of it contain such content, for preserving parts that would be discarded in a simple broad analysis. In this work, we present a multimodal (using audio and image features) architecture based on Convolutional Neural Networks (CNNs) for detecting inappropriate scenes in video files. In the task of classifying video files, our model achieved F1-scores of 98.95% and 98.94% for the appropriate and inappropriate classes, respectively. We also present a censoring tool that automatically censors inappropriate segments of a video file.

Submitted 10 November, 2019; originally announced November 2019.

Showing 1–9 of 9 results for author: Busson, A J G