Naïve Bayes and Random Forest for Crop Yield Prediction
Authors:
Abbas Maazallahi,
Sreehari Thota,
Naga Prasad Kondaboina,
Vineetha Muktineni,
Deepthi Annem,
Abhi Stephen Rokkam,
Mohammad Hossein Amini,
Mohammad Amir Salari,
Payam Norouzzadeh,
Eli Snir,
Bahareh Rahmani
Abstract:
This study analyzes crop yield prediction in India from 1997 to 2020, focusing on various crops and key environmental factors. It aims to predict agricultural yields using machine learning techniques including Linear Regression, Decision Tree, KNN, Naïve Bayes, K-Means Clustering, and Random Forest. The models, particularly Naïve Bayes and Random Forest, demonstrate high effectiveness, as shown through data visualizations. The research concludes that integrating these analytical methods significantly enhances the accuracy and reliability of crop yield predictions, offering vital contributions to agricultural data science.
Submitted 23 April, 2024;
originally announced April 2024.
Cell Identity Codes: Understanding Cell Identity from Gene Expression Profiles using Deep Neural Networks
Authors:
Farzad Abdolhosseini,
Behrooz Azarkhalili,
Abbas Maazallahi,
Aryan Kamal,
Seyed Abolfazl Motahari,
Ali Sharifi-Zarchi,
Hamidreza Chitsaz
Abstract:
Understanding cell identity is an important task in many biomedical areas. Expression patterns of specific marker genes have been used to characterize a limited number of cell types, but exclusive markers are not available for many cell types. A second approach is to use machine learning to discriminate cell types based on whole gene expression profiles (GEPs). The accuracies of simple classification algorithms such as linear discriminators or support vector machines are limited due to the complexity of biological systems. We used deep neural networks to analyze 1040 GEPs from 16 different human tissues and cell types. After comparing different architectures, we identified a specific structure of deep autoencoders that can encode a GEP into a vector of 30 numeric values, which we call the cell identity code (CIC). The original GEP can be reproduced from the CIC with an accuracy comparable to technical replicates of the same experiment. Although we use an unsupervised approach to train the autoencoder, we show that different values of the CIC are connected to different biological aspects of the cell, such as different pathways or biological processes. This network can use the CIC to reproduce the GEPs of cell types it has never seen during training. It is also robust to some noise in the measurement of the GEP. Furthermore, we introduce the classifier autoencoder, an architecture that can accurately identify cell type based on the GEP or the CIC.
Submitted 13 June, 2018;
originally announced June 2018.
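The encode-then-reconstruct idea in the Cell Identity Codes abstract can be illustrated with a minimal sketch. This is not the authors' architecture: the abstract describes deep autoencoders, while the code below assumes a single tanh encoding layer and a linear decoder, trained with plain gradient descent in NumPy. Only the 30-value code length comes from the abstract. The function names, hyperparameters, and the synthetic stand-in data (200 samples, 100 "genes") are all assumptions made for illustration.

```python
import numpy as np

CODE_DIM = 30  # length of the code vector, as stated in the abstract


def encode(X, params):
    """Map expression profiles (n_samples, n_genes) to codes (n_samples, code_dim)."""
    return np.tanh(X @ params["W_enc"] + params["b_enc"])


def decode(codes, params):
    """Map codes back to reconstructed expression profiles."""
    return codes @ params["W_dec"] + params["b_dec"]


def train_autoencoder(X, code_dim=CODE_DIM, epochs=300, lr=0.05, seed=0):
    """Fit a one-hidden-layer autoencoder by full-batch gradient descent on MSE.

    Hypothetical helper: a shallow simplification, not the paper's deep model.
    """
    rng = np.random.default_rng(seed)
    n_samples, n_genes = X.shape
    params = {
        "W_enc": rng.normal(0.0, 1.0 / np.sqrt(n_genes), (n_genes, code_dim)),
        "b_enc": np.zeros(code_dim),
        "W_dec": rng.normal(0.0, 1.0 / np.sqrt(code_dim), (code_dim, n_genes)),
        "b_dec": np.zeros(n_genes),
    }
    losses = []
    for _ in range(epochs):
        codes = encode(X, params)
        err = decode(codes, params) - X
        losses.append(float(np.mean(err ** 2)))
        # Backpropagate the mean squared reconstruction error.
        g_out = 2.0 * err / err.size
        g_pre = (g_out @ params["W_dec"].T) * (1.0 - codes ** 2)  # tanh derivative
        grads = {
            "W_dec": codes.T @ g_out,
            "b_dec": g_out.sum(axis=0),
            "W_enc": X.T @ g_pre,
            "b_enc": g_pre.sum(axis=0),
        }
        for name in params:
            params[name] -= lr * grads[name]
    return params, losses


# Synthetic placeholder data, NOT real gene expression profiles.
rng = np.random.default_rng(1)
X = rng.normal(size=(200, 5)) @ rng.normal(size=(5, 100))
X += 0.1 * rng.normal(size=X.shape)
X = (X - X.mean(axis=0)) / X.std(axis=0)  # standardize each column

params, losses = train_autoencoder(X)
codes = encode(X, params)            # one 30-value code per sample
reconstruction = decode(codes, params)
```

Here `codes` plays the role of the CIC and `reconstruction` the reproduced GEP. A classifier on top of `codes`, as in the classifier autoencoder the abstract mentions, would be a separate component not shown in this sketch.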