Search | arXiv e-print repository

Classifying Dry Eye Disease Patients from Healthy Controls Using Machine Learning and Metabolomics Data

Authors: Sajad Amouei Sheshkal, Morten Gundersen, Michael Alexander Riegler, Øygunn Aass Utheim, Kjell Gunnar Gundersen, Hugo Lewi Hammer

Abstract: Dry eye disease is a common disorder of the ocular surface, leading patients to seek eye care. Clinical signs and symptoms are currently used to diagnose dry eye disease. Metabolomics, a method for analyzing biological systems, has been found helpful in identifying distinct metabolites in patients and in detecting metabolic profiles that may indicate dry eye disease at early stages. In this study, we explored using machine learning and metabolomics information to identify which cataract patients suffered from dry eye disease. As there is no one-size-fits-all machine learning model for metabolomics data, choosing the most suitable model can significantly affect the quality of predictions and subsequent metabolomics analyses. To address this challenge, we conducted a comparative analysis of nine machine learning models on three metabolomics data sets from cataract patients with and without dry eye disease. The models were evaluated and optimized using nested k-fold cross-validation. To assess the performance of these models, we selected a set of evaluation metrics tailored to the data sets' challenges. The logistic regression model performed best overall, achieving the highest area under the curve score of 0.8378, a balanced accuracy of 0.735, a Matthews correlation coefficient of 0.5147, an F1-score of 0.8513, and a specificity of 0.5667. After logistic regression, the XGBoost and Random Forest models also demonstrated good performance.

Submitted 20 June, 2024; originally announced June 2024.
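For readers unfamiliar with the metrics named in the abstract above, the following is a minimal sketch of how balanced accuracy, the Matthews correlation coefficient, the F1-score, and specificity are computed from a binary confusion matrix. The function name `binary_metrics` and the example counts are hypothetical and chosen only for illustration; they are not taken from the paper and do not reproduce its reported values.

```python
import math


def binary_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Compute common binary-classification metrics from confusion-matrix counts.

    Hypothetical helper for illustration; not the authors' code.
    """
    # Sensitivity (recall): fraction of positives correctly identified.
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    # Specificity: fraction of negatives correctly identified.
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    # Precision: fraction of predicted positives that are truly positive.
    precision = tp / (tp + fp) if (tp + fp) else 0.0

    # Balanced accuracy is the mean of sensitivity and specificity,
    # which makes it less misleading than plain accuracy on imbalanced classes.
    balanced_accuracy = (sensitivity + specificity) / 2

    # F1 is the harmonic mean of precision and sensitivity.
    pr_sum = precision + sensitivity
    f1 = 2 * precision * sensitivity / pr_sum if pr_sum else 0.0

    # Matthews correlation coefficient uses all four cells of the matrix.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0

    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "balanced_accuracy": balanced_accuracy,
        "f1": f1,
        "mcc": mcc,
    }


# Made-up counts: 50 positive cases and 30 negative cases.
metrics = binary_metrics(tp=45, fp=10, tn=20, fn=5)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

With an imbalanced split like the made-up one here, a high F1-score can coexist with a modest specificity, which is one reason to report several metrics side by side.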

arXiv:2107.00471 [pdf, other]

doi 10.1371/journal.pone.0267976

SinGAN-Seg: Synthetic training data generation for medical image segmentation

Authors: Vajira Thambawita, Pegah Salehi, Sajad Amouei Sheshkal, Steven A. Hicks, Hugo L. Hammer, Sravanthi Parasa, Thomas de Lange, Pål Halvorsen, Michael A. Riegler

Abstract: Analyzing medical data to find abnormalities is a time-consuming and costly task, particularly for rare abnormalities, requiring tremendous efforts from medical experts. Artificial intelligence has become a popular tool for the automatic processing of medical data, acting as a supportive tool for doctors. However, the machine learning models used to build these tools are highly dependent on the data used to train them. Large amounts of data can be difficult to obtain in medicine due to privacy, expensive and time-consuming annotations, and a general lack of data samples for infrequent lesions. Here, we present a novel synthetic data generation pipeline, called SinGAN-Seg, to produce synthetic medical images with corresponding masks using a single training image. Our method differs from traditional GANs because our model needs only a single image and the corresponding ground truth to train. Our method produces alternative artificial segmentation datasets with ground truth masks when real datasets cannot be shared. The pipeline is evaluated using qualitative and quantitative comparisons between real and synthetic data to show that the style transfer technique used in our pipeline significantly improves the quality of the generated data and that our method is better than other state-of-the-art GANs at preparing synthetic images when the size of the training dataset is limited.
By training UNet++ using both real data and the synthetic data generated from the SinGAN-Seg pipeline, we show that models trained with synthetic data perform very close to those trained on real data when the datasets have a considerable amount of data. In contrast, synthetic data generated from the SinGAN-Seg pipeline can improve the performance of segmentation models when training datasets do not have a considerable amount of data. The code is available on GitHub.

Submitted 25 April, 2022; v1 submitted 29 June, 2021; originally announced July 2021.

arXiv:2008.09448 [pdf]

An Improved Person Re-identification Method by light-weight convolutional neural network

Authors: Sajad Amouei Sheshkal, Kazim Fouladi-Ghaleh, Hossein Aghababa

Abstract: Person Re-identification is defined as the process of recognizing a person observed by non-overlapping cameras at different places. In the last decade, the rise in the applications and importance of Person Re-identification for surveillance systems popularized this subject in different areas of computer vision. Person Re-identification faces challenges such as low resolution, varying poses, illumination, background clutter, and occlusion, which can affect the result of the recognition process. The present paper aims to improve Person Re-identification using transfer learning and a verification loss function within the framework of a Siamese network. The Siamese network receives image pairs as inputs and extracts their features via a pre-trained model. EfficientNet was employed to obtain discriminative features and reduce the demand for data. The advantages of verification loss were used in the network learning. Experiments showed that the proposed model performs better than state-of-the-art methods on the CUHK01 dataset. For example, the rank-5 accuracy is 95.2% (+5.7) on the CUHK01 dataset. It also achieved an acceptable rank-1 accuracy. Because the pre-trained model has a small number of parameters, learning is faster and requires less hardware and data.

Submitted 21 August, 2020; originally announced August 2020.

Showing 1–3 of 3 results for author: Sheshkal, S A