Search | arXiv e-print repository

SNOiC: Soft Labeling and Noisy Mixup based Open Intent Classification Model

Authors: Aditi Kanwar, Aditi Seetha, Satyendra Singh Chouhan, Rajdeep Niyogi

Abstract: This paper presents a Soft Labeling and Noisy Mixup-based open intent classification model (SNOiC). Most of the previous works have used threshold-based methods to identify open intents, which are prone to overfitting and may produce biased predictions. Additionally, the need for more available data for an open intent class presents another limitation for these existing models. SNOiC combines Soft… ▽ More This paper presents a Soft Labeling and Noisy Mixup-based open intent classification model (SNOiC). Most of the previous works have used threshold-based methods to identify open intents, which are prone to overfitting and may produce biased predictions. Additionally, the need for more available data for an open intent class presents another limitation for these existing models. SNOiC combines Soft Labeling and Noisy Mixup strategies to reduce the biasing and generate pseudo-data for open intent class. The experimental results on four benchmark datasets show that the SNOiC model achieves a minimum and maximum performance of 68.72\% and 94.71\%, respectively, in identifying open intents. Moreover, compared to state-of-the-art models, the SNOiC model improves the performance of identifying open intents by 0.93\% (minimum) and 12.76\% (maximum). The model's efficacy is further established by analyzing various parameters used in the proposed model. An ablation study is also conducted, which involves creating three model variants to validate the effectiveness of the SNOiC model. △ Less

Submitted 11 October, 2023; originally announced October 2023.

Comments: 9 Pages, 6 figures

MSC Class: ACM-class: F.2.2; I.2.7

arXiv:2203.05173 [pdf, other]

TextConvoNet:A Convolutional Neural Network based Architecture for Text Classification

Authors: Sanskar Soni, Satyendra Singh Chouhan, Santosh Singh Rathore

Abstract: In recent years, deep learning-based models have significantly improved the Natural Language Processing (NLP) tasks. Specifically, the Convolutional Neural Network (CNN), initially used for computer vision, has shown remarkable performance for text data in various NLP problems. Most of the existing CNN-based models use 1-dimensional convolving filters n-gram detectors), where each filter specialis… ▽ More In recent years, deep learning-based models have significantly improved the Natural Language Processing (NLP) tasks. Specifically, the Convolutional Neural Network (CNN), initially used for computer vision, has shown remarkable performance for text data in various NLP problems. Most of the existing CNN-based models use 1-dimensional convolving filters n-gram detectors), where each filter specialises in extracting n-grams features of a particular input word embedding. The input word embeddings, also called sentence matrix, is treated as a matrix where each row is a word vector. Thus, it allows the model to apply one-dimensional convolution and only extract n-gram based features from a sentence matrix. These features can be termed as intra-sentence n-gram features. To the extent of our knowledge, all the existing CNN models are based on the aforementioned concept. In this paper, we present a CNN-based architecture TextConvoNet that not only extracts the intra-sentence n-gram features but also captures the inter-sentence n-gram features in input text data. It uses an alternative approach for input matrix representation and applies a two-dimensional multi-scale convolutional operation on the input. To evaluate the performance of TextConvoNet, we perform an experimental study on five text classification datasets. The results are evaluated by using various performance metrics. The experimental results show that the presented TextConvoNet outperforms state-of-the-art machine learning and deep learning models for text classification purposes. △ Less

Submitted 10 March, 2022; originally announced March 2022.

arXiv:2111.15629 [pdf, other]

DiPD: Disruptive event Prediction Dataset from Twitter

Authors: Sanskar Soni, Dev Mehta, Vinush Vishwanath, Aditi Seetha, Satyendra Singh Chouhan

Abstract: Riots and protests, if gone out of control, can cause havoc in a country. We have seen examples of this, such as the BLM movement, climate strikes, CAA Movement, and many more, which caused disruption to a large extent. Our motive behind creating this dataset was to use it to develop machine learning systems that can give its users insight into the trending events going on and alert them about the… ▽ More Riots and protests, if gone out of control, can cause havoc in a country. We have seen examples of this, such as the BLM movement, climate strikes, CAA Movement, and many more, which caused disruption to a large extent. Our motive behind creating this dataset was to use it to develop machine learning systems that can give its users insight into the trending events going on and alert them about the events that could lead to disruption in the nation. If any event starts going out of control, it can be handled and mitigated by monitoring it before the matter escalates. This dataset collects tweets of past or ongoing events known to have caused disruption and labels these tweets as 1. We also collect tweets that are considered non-eventful and label them as 0 so that they can also be used to train a classification system. The dataset contains 94855 records of unique events and 168706 records of unique non-events, thus giving the total dataset 263561 records. We extract multiple features from the tweets, such as the user's follower count and the user's location, to understand the impact and reach of the tweets. This dataset might be useful in various event related machine learning problems such as event classification, event recognition, and so on. △ Less

Submitted 25 November, 2021; originally announced November 2021.

arXiv:2105.13448 [pdf, other]

Open-world Machine Learning: Applications, Challenges, and Opportunities

Authors: Jitendra Parmar, Satyendra Singh Chouhan, Vaskar Raychoudhury, Santosh Singh Rathore

Abstract: Traditional machine learning mainly supervised learning, follows the assumptions of closed-world learning, i.e., for each testing class, a training class is available. However, such machine learning models fail to identify the classes which were not available during training time. These classes can be referred to as unseen classes. Whereas open-world machine learning (OWML) deals with unseen class… ▽ More Traditional machine learning mainly supervised learning, follows the assumptions of closed-world learning, i.e., for each testing class, a training class is available. However, such machine learning models fail to identify the classes which were not available during training time. These classes can be referred to as unseen classes. Whereas open-world machine learning (OWML) deals with unseen classes. In this paper, first, we present an overview of OWML with importance to the real-world context. Next, different dimensions of open-world machine learning are explored and discussed. The area of OWML gained the attention of the research community in the last decade only. We have searched through different online digital libraries and scrutinized the work done in the last decade. This paper presents a systematic review of various techniques for OWML. It also presents the research gaps, challenges, and future directions in open-world machine learning. This paper will help researchers understand the comprehensive developments of OWML and the likelihood of extending the research in suitable areas. It will also help to select applicable methodologies and datasets to explore this further. △ Less

Submitted 20 February, 2022; v1 submitted 27 May, 2021; originally announced May 2021.

Showing 1–4 of 4 results for author: Chouhan, S S