-
A BERT based Ensemble Approach for Sentiment Classification of Customer Reviews and its Application to Nudge Marketing in e-Commerce
Authors:
Sayan Putatunda,
Anwesha Bhowmik,
Girish Thiruvenkadam,
Rahul Ghosh
Abstract:
According to the literature, product reviews are an important source of information for customers to support their buying decisions. Product reviews improve customer trust and loyalty. Reviews help customers understand what other customers think about a particular product and help drive purchase decisions. Therefore, it is important for an e-commerce platform to understand the sentiments in customer reviews in order to understand its products and services; doing so also allows it to potentially create positive consumer interaction as well as long-lasting relationships. Reviews also provide innovative ways for an e-commerce company to market its products. One such approach is nudge marketing, a subtle way for an e-commerce company to help its customers make better decisions without hesitation.
Submitted 16 November, 2023;
originally announced November 2023.
-
DriveML: An R Package for Driverless Machine Learning
Authors:
Sayan Putatunda,
Dayananda Ubrangala,
Kiran Rama,
Ravi Kondapalli
Abstract:
In recent years, the concept of automated machine learning has become very popular. Automated Machine Learning (AutoML) mainly refers to automated methods for model selection and hyper-parameter optimization of various algorithms such as random forests, gradient boosting, neural networks, etc. In this paper, we introduce a new package, DriveML, for automated machine learning. DriveML helps in implementing some of the pillars of an automated machine learning pipeline, such as automated data preparation, feature engineering, model building and model explanation, by running a function instead of writing lengthy R code. The DriveML package is available on CRAN. We compare the DriveML package with other relevant packages on CRAN/GitHub and find that DriveML performs the best across different parameters. We also provide an illustration by applying the DriveML package with its default configuration to a real-world dataset. Overall, the main benefits of DriveML are development time savings, fewer developer errors, optimal tuning of machine learning models and reproducibility.
Submitted 6 August, 2021; v1 submitted 1 May, 2020;
originally announced May 2020.
-
A Modified Bayesian Optimization based Hyper-Parameter Tuning Approach for Extreme Gradient Boosting
Authors:
Sayan Putatunda,
Kiran Rama
Abstract:
It has already been reported in the literature that the performance of a machine learning algorithm is greatly impacted by proper hyper-parameter optimization. One way to perform hyper-parameter optimization is manual search, but that is time consuming. Some of the common approaches for performing hyper-parameter optimization are Grid search, Random search, and Bayesian optimization using Hyperopt. In this paper, we propose a new approach for hyper-parameter optimization, Randomized-Hyperopt, and then tune the hyper-parameters of XGBoost, i.e. the Extreme Gradient Boosting algorithm, on ten datasets by applying Random search, Randomized-Hyperopt, Hyperopt and Grid search. The performance of each of these four techniques was compared by taking both the prediction accuracy and the execution time into consideration. We find that Randomized-Hyperopt performs better than the other three conventional methods for hyper-parameter optimization of XGBoost.
Submitted 10 April, 2020;
originally announced April 2020.
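As a rough illustration of two of the baseline strategies named in the abstract above, the following minimal sketch contrasts Grid search and Random search on a toy objective. It is not the authors' Randomized-Hyperopt method and does not call XGBoost or Hyperopt: `toy_score`, the parameter names `learning_rate` and `max_depth`, and the value ranges are all hypothetical stand-ins for a real cross-validated model score.

```python
import itertools
import random


def toy_score(learning_rate, max_depth):
    """Hypothetical stand-in for a cross-validated score; peaks at (0.1, 6)."""
    return 1.0 - (learning_rate - 0.1) ** 2 - 0.01 * (max_depth - 6) ** 2


def grid_search(grid):
    """Evaluate every combination in the grid; return (best_params, best_score)."""
    names = list(grid)
    best = None
    for values in itertools.product(*(grid[name] for name in names)):
        params = dict(zip(names, values))
        score = toy_score(**params)
        if best is None or score > best[1]:
            best = (params, score)
    return best


def random_search(space, n_trials, seed=0):
    """Sample n_trials random points from the ranges; return (best_params, best_score)."""
    rng = random.Random(seed)  # fixed seed so repeated runs give the same samples
    best = None
    for _ in range(n_trials):
        params = {
            "learning_rate": rng.uniform(*space["learning_rate"]),
            "max_depth": rng.randint(*space["max_depth"]),
        }
        score = toy_score(**params)
        if best is None or score > best[1]:
            best = (params, score)
    return best


# Assumed search space, chosen only for illustration.
grid = {"learning_rate": [0.01, 0.1, 0.3], "max_depth": [3, 6, 9]}
space = {"learning_rate": (0.01, 0.3), "max_depth": (3, 9)}

grid_best = grid_search(grid)                 # 9 evaluations, one per grid point
random_best = random_search(space, n_trials=9)  # same budget, random points
```

Both searches use the same budget of nine evaluations here. Grid search only ever visits the listed grid points, while Random search can land anywhere in the ranges, which is one reason the two can differ in both accuracy and execution time on real models.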
-
A Hybrid Deep Learning Approach for Diagnosis of the Erythemato-Squamous Disease
Authors:
Sayan Putatunda
Abstract:
The diagnosis of the Erythemato-squamous disease (ESD) is accepted as a difficult problem in dermatology. ESD is a form of skin disease. It generally causes redness of the skin and may also cause loss of skin. It is generally due to genetic or environmental factors. ESD comprises six classes of skin conditions, namely pityriasis rubra pilaris, lichen planus, chronic dermatitis, psoriasis, seborrheic dermatitis and pityriasis rosea. The automated diagnosis of ESD can help doctors and dermatologists reduce their effort and make faster treatment decisions. The literature is replete with works that used conventional machine learning methods for the diagnosis of ESD. However, there are few instances of the application of deep learning to the diagnosis of ESD. In this paper, we propose a novel hybrid deep learning approach, Derm2Vec, for the diagnosis of ESD. Derm2Vec is a hybrid deep learning model that consists of both Autoencoders and Deep Neural Networks. We also apply a conventional Deep Neural Network (DNN) for the classification of ESD. We apply both Derm2Vec and DNN, along with other traditional machine learning methods, to a real-world dermatology dataset. The Derm2Vec method is found to be the best performer (when taking the prediction accuracy into account), followed by DNN and Extreme Gradient Boosting. The mean CV scores of Derm2Vec, DNN and Extreme Gradient Boosting are 96.92 percent, 96.65 percent and 95.80 percent respectively.
Submitted 24 May, 2020; v1 submitted 17 September, 2019;
originally announced September 2019.
-
PropTech for Proactive Pricing of Houses in Classified Advertisements in the Indian Real Estate Market
Authors:
Sayan Putatunda
Abstract:
Property Technology (PropTech) is the next big thing that is going to disrupt the real estate market. Nowadays, we see applications of Machine Learning (ML) and Artificial Intelligence (AI) in almost all domains, but for a long time the real estate industry was quite slow in adopting data science and machine learning for problem solving and improving its processes. However, things are changing quite fast, as we see a lot of adoption of AI and ML in the US and European real estate markets. The Indian real estate market, however, still has a lot of catching up to do. This paper proposes a machine learning approach for solving the house price prediction problem in classified advertisements. This study focuses on the Indian real estate market. We apply advanced machine learning algorithms such as Random forest, Gradient boosting and Artificial neural networks to a real-world dataset and compare the performance of these methods. We find that the Random forest method is the best performer in terms of prediction accuracy.
Submitted 27 March, 2019;
originally announced April 2019.
-
Care2Vec: A Deep learning approach for the classification of self-care problems in physically disabled children
Authors:
Sayan Putatunda
Abstract:
Accurate classification of self-care problems in children who suffer from physical and motor affliction is an important problem in the healthcare industry. This is a difficult and time-consuming process, and it needs the expertise of occupational therapists. In recent years, healthcare professionals have opened up to the idea of using expert systems and artificial intelligence in the diagnosis and classification of self-care problems. In this study, we propose a new deep learning based approach named Care2Vec for solving these kinds of problems and use a real-world self-care activities dataset that is based on a conceptual framework designed by the World Health Organization (WHO). Care2Vec is a mix of unsupervised and supervised learning where we use Autoencoders and Deep neural networks as a two-step modeling process. We found that Care2Vec has a better prediction accuracy than some of the traditional methods reported in the literature for solving the self-care classification problem, viz. Decision trees and Artificial neural networks.
Submitted 23 December, 2019; v1 submitted 3 December, 2018;
originally announced December 2018.