-
Skin cancer diagnosis using NIR spectroscopy data of skin lesions in vivo using machine learning algorithms
Authors:
Flavio P. Loss,
Pedro H. da Cunha,
Matheus B. Rocha,
Madson Poltronieri Zanoni,
Leandro M. de Lima,
Isadora Tavares Nascimento,
Isabella Rezende,
Tania R. P. Canuto,
Luciana de Paula Vieira,
Renan Rossoni,
Maria C. S. Santos,
Patricia Lyra Frasson,
Wanderson Romão,
Paulo R. Filgueiras,
Renato A. Krohling
Abstract:
Skin lesions are classified as benign or malignant. Among malignant lesions, melanoma is a very aggressive cancer and the major cause of death, so early diagnosis of skin cancer is highly desirable. In the last few years, there has been growing interest in computer-aided diagnosis (CAD) using mostly image and clinical data of the lesion. These sources of information are limited by their inability to provide information on the molecular structure of the lesion. NIR spectroscopy may provide an alternative source of information for automated CAD of skin lesions. The techniques and classification algorithms most commonly used in spectroscopy are Principal Component Analysis (PCA), Partial Least Squares - Discriminant Analysis (PLS-DA), and Support Vector Machines (SVM). Nonetheless, there is growing interest in applying modern machine and deep learning (MDL) techniques to spectroscopy. One of the main limitations in applying MDL to spectroscopy is the lack of public datasets. Since, as far as we know, there is no public dataset of NIR spectral data for skin lesions, a new dataset named NIR-SC-UFES has been collected, annotated and analyzed, generating the gold standard for classification of NIR spectral data for skin cancer. Next, the machine learning algorithms XGBoost, CatBoost, LightGBM, and 1D-convolutional neural network (1D-CNN) were investigated to classify cancer and non-cancer skin lesions. Experimental results indicate that the best performance was obtained by LightGBM with standard normal variate (SNV) pre-processing and feature extraction, yielding 0.839 for balanced accuracy, 0.851 for recall, 0.852 for precision, and 0.850 for F-score. The results represent first steps in CAD of skin lesions, aiming at the automated triage of patients with skin lesions in vivo using NIR spectral data.
Submitted 2 January, 2024;
originally announced January 2024.
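The abstract above names standard normal variate (SNV) pre-processing and the balanced accuracy metric without defining them. The following is a minimal sketch of both, not the authors' pipeline. The function names `snv` and `balanced_accuracy` are hypothetical, and the use of the sample standard deviation in SNV is an assumption.

```python
from statistics import mean, stdev


def snv(spectrum):
    """Standard normal variate: centre one spectrum on its own mean
    and scale it by its own standard deviation."""
    mu = mean(spectrum)
    sigma = stdev(spectrum)  # sample standard deviation (assumption)
    if sigma == 0:
        raise ValueError("SNV is undefined for a constant spectrum")
    return [(x - mu) / sigma for x in spectrum]


def balanced_accuracy(y_true, y_pred):
    """Mean of the per-class recalls, so each class counts equally
    regardless of how many samples it has."""
    recalls = []
    for c in sorted(set(y_true)):
        idx = [i for i, y in enumerate(y_true) if y == c]
        hits = sum(1 for i in idx if y_pred[i] == c)
        recalls.append(hits / len(idx))
    return mean(recalls)


# Toy values for illustration only; these are not NIR-SC-UFES data.
print(snv([1.0, 2.0, 3.0]))                              # [-1.0, 0.0, 1.0]
print(balanced_accuracy([1, 1, 1, 0], [1, 1, 0, 0]))     # (2/3 + 1) / 2
```

SNV is applied per spectrum rather than per wavelength, which is why it needs no statistics from the training set.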
-
Exploring Advances in Transformers and CNN for Skin Lesion Diagnosis on Small Datasets
Authors:
Leandro M. de Lima,
Renato A. Krohling
Abstract:
Skin cancer is one of the most common types of cancer in the world. Different computer-aided diagnosis systems have been proposed to tackle skin lesion diagnosis, most of them based on deep convolutional neural networks. However, recent advances in computer vision, notably Transformer-based networks, have achieved state-of-the-art results in many tasks. We explore and evaluate advances in computer vision architectures, training methods and multimodal feature fusion for the skin lesion diagnosis task. Experiments show that PiT ($0.800 \pm 0.006$), CoaT ($0.780 \pm 0.024$) and ViT ($0.771 \pm 0.018$) backbone models with MetaBlock fusion achieved state-of-the-art results for the balanced accuracy metric on the PAD-UFES-20 dataset.
Submitted 30 May, 2022;
originally announced May 2022.
-
A visualization tool for data analysis on higher education dropout: a case study at UFES
Authors:
Pedro P. Ladeira,
Leandro M. de Lima,
Renato A. Krohling
Abstract:
Through the analysis of cultural, socioeconomic and academic performance aspects, it is possible to map the profile of students and their motivations to drop out. This article aims to create a computational tool for data visualization that allows drawing the profile of students to support managers of educational institutions in defining dropout avoidance policies. We present a method to process data collected by higher education institutions over the years, analyze them to understand dropout, and provide that information to the university and the general public. Eight questions were proposed to clarify dropout at the Federal University of Espírito Santo, Brazil. The questions were answered through a dashboard that helps to understand the causes of dropout. It is expected that this tool can be used by other educational institutions to draw student profiles, contributing to a possible resolution of the problem.
Submitted 29 January, 2022;
originally announced January 2022.
-
LegalNLP -- Natural Language Processing methods for the Brazilian Legal Language
Authors:
Felipe Maia Polo,
Gabriel Caiaffa Floriano Mendonça,
Kauê Capellato J. Parreira,
Lucka Gianvechio,
Peterson Cordeiro,
Jonathan Batista Ferreira,
Leticia Maria Paz de Lima,
Antônio Carlos do Amaral Maia,
Renato Vicente
Abstract:
We present and make available pre-trained language models (Phraser, Word2Vec, Doc2Vec, FastText, and BERT) for the Brazilian legal language, a Python package with functions to facilitate their use, and a set of demonstrations/tutorials containing some applications involving them. Given that our material is built upon legal texts from several Brazilian courts, this initiative is extremely helpful for the Brazilian legal field, which lacks other open and specific tools and language models. Our main objective is to catalyze the use of natural language processing tools for legal text analysis by Brazilian industry, government, and academia, providing the necessary tools and accessible material.
Submitted 5 October, 2021;
originally announced October 2021.
-
Discovering an Aid Policy to Minimize Student Evasion Using Offline Reinforcement Learning
Authors:
Leandro M. de Lima,
Renato A. Krohling
Abstract:
High dropout rates in tertiary education expose a lack of efficiency that causes frustration of expectations and financial waste. Predicting students at risk is not enough to avoid student dropout. Usually, an appropriate aid action must be discovered and applied at the proper time for each student. To tackle this sequential decision-making problem, we propose a decision support method for the selection of aid actions for students, using offline reinforcement learning to help decision-makers effectively avoid student dropout. Additionally, a discretization of the student state space applying two different clustering methods is evaluated. Our experiments using logged data of real students show, through off-policy evaluation, that the method should achieve roughly 1.0 to 1.5 times as much cumulative reward as the logged policy. So, it is feasible to help decision-makers apply appropriate aid actions and, possibly, reduce student dropout.
Submitted 20 April, 2021;
originally announced April 2021.
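The abstract above compares a learned policy with the logged policy through off-policy evaluation, but does not say which estimator was used. As an illustration only, the sketch below uses ordinary (trajectory-wise) importance sampling, one common estimator, and is not the authors' method. The names `is_estimate`, `logged_value`, `always_aid`, and the toy trajectories are hypothetical.

```python
def is_estimate(trajectories, target_prob, gamma=1.0):
    """Ordinary importance sampling estimate of the target policy's value.

    Each trajectory is a list of (state, action, reward, behavior_prob),
    where behavior_prob is the probability the logged policy assigned
    to the logged action.
    """
    total = 0.0
    for traj in trajectories:
        weight = 1.0   # product of per-step probability ratios
        ret = 0.0      # discounted return of the logged trajectory
        for t, (state, action, reward, behavior_prob) in enumerate(traj):
            weight *= target_prob(state, action) / behavior_prob
            ret += (gamma ** t) * reward
        total += weight * ret
    return total / len(trajectories)


def logged_value(trajectories, gamma=1.0):
    """Average discounted return actually observed under the logged policy."""
    returns = [
        sum((gamma ** t) * step[2] for t, step in enumerate(traj))
        for traj in trajectories
    ]
    return sum(returns) / len(returns)


# Toy logged data (made up): one state cluster, two actions, uniform logging.
logs = [
    [("cluster_0", "aid", 1.0, 0.5)],
    [("cluster_0", "none", 0.0, 0.5)],
]


def always_aid(state, action):
    """Hypothetical deterministic target policy: always choose 'aid'."""
    return 1.0 if action == "aid" else 0.0


ratio = is_estimate(logs, always_aid) / logged_value(logs)
print(ratio)  # 2.0 on this toy data
```

A ratio of this kind is what a statement such as "1.0 to 1.5 times as much cumulative reward as the logged policy" refers to. Ordinary importance sampling is unbiased when the logged probabilities are correct, but its variance grows quickly with trajectory length.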