-
Multi-level Product Category Prediction through Text Classification
Authors:
Wesley Ferreira Maia,
Angelo Carmignani,
Gabriel Bortoli,
Lucas Maretti,
David Luz,
Daniel Camilo Fuentes Guzman,
Marcos Jardel Henriques,
Francisco Louzada Neto
Abstract:
This article investigates applying advanced machine learning models, specifically LSTM and BERT, for text classification to predict multiple categories in the retail sector. The study demonstrates how applying data augmentation techniques and the focal loss function can significantly enhance accuracy in classifying products into multiple categories using a robust Brazilian retail dataset. The LSTM model, enriched with Brazilian word embedding, and BERT, known for its effectiveness in understanding complex contexts, were adapted and optimized for this specific task. The results showed that the BERT model, with an F1 Macro Score of up to $99\%$ for segments, $96\%$ for categories and subcategories and $93\%$ for name products, outperformed LSTM in more detailed categories. However, LSTM also achieved high performance, especially after applying data augmentation and focal loss techniques. These results underscore the effectiveness of NLP techniques in retail and highlight the importance of the careful selection of modelling and preprocessing strategies. This work contributes significantly to the field of NLP in retail, providing valuable insights for future research and practical applications.
Submitted 3 March, 2024;
originally announced March 2024.
-
Feature Selection Approach with Missing Values Conducted for Statistical Learning: A Case Study of Entrepreneurship Survival Dataset
Authors:
Diego Nascimento,
Anderson Ara,
Francisco Louzada Neto
Abstract:
In this article, we investigate the features which enhanced discriminate the survival in the micro and small business (MSE) using the approach of data mining with feature selection. According to the complexity of the data set, we proposed a comparison of three data imputation methods such as mean imputation (MI), k-nearest neighbor (KNN) and expectation maximization (EM) using mutually the selection of variables technique, whereby t-test, then through the data mining process using logistic regression classification methods, naive Bayes algorithm, linear discriminant analysis and support vector machine hence comparing their respective performances. The experimental results will be spread in developing a model to predict the MSE survival, providing a better understanding in the topic once it is a significant part of the Brazilian' GPA and macroeconomy.
Submitted 2 October, 2018;
originally announced October 2018.
-
Bayesian model averaging: A systematic review and conceptual classification
Authors:
Tiago M. Fragoso,
Francisco Louzada Neto
Abstract:
Bayesian Model Averaging (BMA) is an application of Bayesian inference to the problems of model selection, combined estimation and prediction that produces a straightforward model choice criteria and less risky predictions. However, the application of BMA is not always straightforward, leading to diverse assumptions and situational choices on its different aspects. Despite the widespread application of BMA in the literature, there were not many accounts of these differences and trends besides a few landmark revisions in the late 1990s and early 2000s, therefore not taking into account any advancements made in the last 15 years. In this work, we present an account of these developments through a careful content analysis of 587 articles in BMA published between 1996 and 2014. We also develop a conceptual classification scheme to better describe this vast literature, understand its trends and future directions and provide guidance for the researcher interested in both the application and development of the methodology. The results of the classification scheme and content review are then used to discuss the present and future of the BMA literature.
Submitted 29 September, 2015;
originally announced September 2015.
-
Cluster Model For Reactions Induced By Weakly Bound And/Or Exotic Halo Nuclei With Medium-Mass Targets
Authors:
C. Beck,
N. Rowley,
P. Papka,
S. Courtin,
M. Rousseau,
F. A. Souza,
N. Carlin,
F. Liguori Neto,
M. M. De Moura,
M. G. Del Santo,
A. A. I. Suade,
M. G. Munhoz,
E. M. Szanto,
A. Szanto De Toledo,
N. Keeley,
A. Diaz-Torres,
K. Hagino
Abstract:
An experimental overview of reactions induced by the stable, but weakly-bound nuclei 6Li, 7Li and 9Be, and by the exotic, halo nuclei 6He, 8He, 8B, and 11Be on medium-mass targets, such as 58Ni, 59Co or 64Zn, is presented. Existing data on elastic scattering, total reaction cross sections, fusion processes, breakup and transfer channels are discussed in the framework of a CDCC approach taking into account the breakup degree of freedom.
Submitted 9 September, 2010;
originally announced September 2010.