Search | arXiv e-print repository

Geometry Based Machining Feature Retrieval with Inductive Transfer Learning

Authors: N S Kamal, Barathi Ganesh HB, Sajith Variyar VV, Sowmya V, Soman KP

Abstract: Manufacturing industries have widely adopted the reuse of machine parts as a method to reduce costs and as a sustainable manufacturing practice. Identifying reusable features in the design of the parts and finding similar features in the database is an important part of this process. In this project, with the help of fully convolutional geometric features, we are able to extract and learn the high level semantic features from CAD models with inductive transfer learning. The extracted features are then compared with those of other CAD models from the database using the Frobenius norm, and identical features are retrieved. Later we passed the extracted features to a deep convolutional neural network with a spatial pyramid pooling layer, and the performance of the feature retrieval increased significantly. It was evident from the results that the model could effectively capture the geometrical elements from machining features.

Submitted 15 November, 2021; v1 submitted 26 August, 2021; originally announced August 2021.

Comments: Submitted to 9th International Conference on Frontiers of Intelligent Computing: Theory and Applications (FICTA 2021)

MSC Class: 68T07 ACM Class: I.4; I.2; I.5

arXiv:2004.04812 [pdf, other]

Deep Learning based Frameworks for Handling Imbalance in DGA, Email, and URL Data Analysis

Authors: Simran K, Prathiksha Balakrishna, Vinayakumar Ravi, Soman KP

Abstract: Deep learning is a state of the art method for many applications. The main issue is that most real-time data is highly imbalanced in nature. To avoid bias in training, a cost-sensitive approach can be used. In this paper, we propose cost-sensitive deep learning based frameworks, and the performance of the frameworks is evaluated on three different Cyber Security use cases: Domain Generation Algorithm (DGA), Electronic mail (Email), and Uniform Resource Locator (URL). Various experiments were performed using cost-insensitive as well as cost-sensitive methods, and the parameters for both of these methods are set based on hyperparameter tuning. In all experiments, the cost-sensitive deep learning methods performed better than the cost-insensitive approaches. This is mainly because the cost-sensitive approach gives importance during training to the classes that have very few samples, which helps to learn all the classes more efficiently.

Submitted 17 October, 2020; v1 submitted 30 March, 2020; originally announced April 2020.

Comments: 12 pages

arXiv:2004.00503 [pdf, other]

Deep Learning Approach for Enhanced Cyber Threat Indicators in Twitter Stream

Authors: Simran K, Prathiksha Balakrishna, Vinayakumar R, Soman KP

Abstract: In recent days, the amount of Cyber Security text data shared via social media resources, mainly Twitter, has increased. An accurate analysis of this data can help to develop a cyber threat situational awareness framework for a cyber threat. This work proposes a deep learning based approach for tweet data analysis. To convert the tweets into numerical representations, various text representations are employed. These features are fed into a deep learning architecture for optimal feature extraction as well as classification. Various hyperparameter tuning approaches are used for identifying the optimal text representation method as well as optimal network parameters and network structures for deep learning models. For comparative analysis, the classical text representation method with a classical machine learning algorithm is employed. From the detailed analysis of experiments, we found that the deep learning architecture with advanced text representation methods performed better than the classical text representation and classical machine learning algorithms. The primary reason for this is that the advanced text representation methods have the capability to learn sequential properties which exist among the textual data, and deep learning architectures learn the optimal features while decreasing the feature size.

Submitted 30 March, 2020; originally announced April 2020.

Comments: 11 pages

arXiv:2004.00502 [pdf, other]

Deep Learning Approach for Intelligent Named Entity Recognition of Cyber Security

Authors: Simran K, Sriram S, Vinayakumar R, Soman KP

Abstract: In recent years, the amount of Cyber Security data generated in the form of unstructured text, for example, social media resources, blogs, and articles, has increased exceptionally. Named Entity Recognition (NER) is an initial step towards converting this unstructured data into structured data which can be used by many applications. The existing methods for NER on Cyber Security data are based on rules and linguistic characteristics. A Deep Learning (DL) based approach embedded with Conditional Random Fields (CRFs) is proposed in this paper. Several DL architectures are evaluated to find the optimal architecture. The combination of Bidirectional Gated Recurrent Unit (Bi-GRU), Convolutional Neural Network (CNN), and CRF performed better compared to various other DL frameworks on a publicly available benchmark dataset. This may be because the bidirectional structures preserve the features related to the future and previous words in a sequence.

Submitted 30 March, 2020; originally announced April 2020.

Comments: 10 pages

arXiv:1910.03188 [pdf, other]

Dynamic Mode Decomposition based feature for Image Classification

Authors: Rahul-Vigneswaran K, Sachin-Kumar S, Neethu Mohan, Soman KP

Abstract: Although machine learning has produced groundbreaking results, it demands an enormous amount of data to do so. Even though data production is at an all-time high, almost all of the data is unlabelled, making it unsuitable for training the algorithms. This paper proposes a novel method of extracting features using Dynamic Mode Decomposition (DMD). The experiment is performed using data samples from ImageNet. The learning is done using SVM-linear, SVM-RBF, and the Random Kitchen Sink (RKS) approach. The results show that DMD features with RKS give competitive results.

Submitted 7 October, 2019; originally announced October 2019.

Comments: Selected for Spotlight presentation at TENCON 2019

arXiv:1904.10434 [pdf, other]

Data-driven Computing in Elasticity via Chebyshev Approximation

Authors: Rahul-Vigneswaran K, Neethu Mohan, Soman KP

Abstract: This paper proposes a data-driven approach for computing elasticity by means of a non-parametric regression approach rather than an optimization approach. The Chebyshev approximation is utilized to tackle the non-linearity of the material data sets in elasticity. Additional effort has also been taken to compare the results with several other state-of-the-art methodologies.

Submitted 23 April, 2019; originally announced April 2019.

Comments: 6 pages, Accepted for ICCS 2019

arXiv:1904.03491 [pdf, ps, other]

A Compendium on Network and Host based Intrusion Detection Systems

Authors: Rahul-Vigneswaran K, Prabaharan Poornachandran, Soman KP

Abstract: The techniques of deep learning have become the state of the art methodology for executing complicated tasks from various domains of computer vision, natural language processing, and several other areas. Due to its rapid development and promising benchmarks in those fields, researchers started experimenting with this technique, especially in intrusion detection related tasks. Deep learning is a subset and a natural extension of classical machine learning and an evolved model of neural networks. This paper discusses the methodologies related to leading edge deep learning and neural network models as applied to the area of Intrusion Detection Systems.

Submitted 6 April, 2019; originally announced April 2019.

Comments: 8 pages, Accepted for ICDSMLA 2019

arXiv:1901.04281 [pdf]

doi 10.13140/RG.2.2.21876.81283

RNNSecureNet: Recurrent neural networks for Cyber security use-cases

Authors: Mohammed Harun Babu R, Vinayakumar R, Soman KP

Abstract: The recurrent neural network (RNN) is an effective neural network for solving very complex supervised and unsupervised tasks. There has been significant improvement from RNNs in fields such as natural language processing, speech processing, computer vision and multiple other domains. This paper deals with the application of RNNs to different use cases: Incident Detection, Fraud Detection, and Android Malware Classification. The best performing neural network architecture is chosen by conducting a chain of experiments for different network parameters and structures. The network is run up to 1000 epochs with the learning rate set in the range of 0.01 to 0.5. The RNN performed very well when compared to classical machine learning algorithms. This is mainly possible because RNNs implicitly extract the underlying features and also identify the characteristics of the data. This helps to achieve better accuracy.

Submitted 5 January, 2019; originally announced January 2019.

Comments: 12 pages. arXiv admin note: text overlap with arXiv:1812.03519

arXiv:1901.03141 [pdf]

Emotion Detection using Data Driven Models

Authors: Naveenkumar K S, Vinayakumar R, Soman KP

Abstract: Text is the major method used for communication nowadays, and a large amount of text is created every day. In this paper, text data is used for the classification of emotions. Emotions are the expression of a person's feelings and have a high influence on decision-making tasks. Publicly available datasets are collected and combined based on the three emotions considered here: positive, negative and neutral. In this paper we propose the text representation methods TFIDF and Keras embedding, which are given to classical machine learning algorithms, of which Logistic Regression gives the highest accuracy of about 75.6%. The data is then passed to a deep learning algorithm, the CNN, which gives the state of the art accuracy of about 45.25%. For research purposes, the collected datasets are released.

Submitted 10 January, 2019; originally announced January 2019.

Comments: 11 pages

arXiv:1901.01051 [pdf, other]

An Insight into the Dynamics and State Space Modelling of a 3-D Quadrotor

Authors: Rahul Vigneswaran K, Soman KP

Abstract: Drones have gained popularity in a wide range of fields, from aerial photography and aerial mapping to the investigation of electric power lines. Every drone that we know today carries out some kind of control algorithm at the low level in order to manoeuvre itself around. For the quadrotor to either control itself autonomously or to develop a high-level user interface for us to control it, we need to understand the basic mathematics behind how it functions. This paper aims to explain the mathematical modelling of the dynamics of a 3 Dimensional quadrotor. Although it may seem like a trivial task, it plays a vital role in how we control the drone. Additional effort has also been taken to explain the transformations from the drone's frame of reference to the inertial frame of reference.

Submitted 4 January, 2019; originally announced January 2019.

Comments: 16 pages, 6 figures

arXiv:1901.00297 [pdf]

A Deep Learning Approach for Similar Languages, Varieties and Dialects

Authors: Vidya Prasad K, Akarsh S, Vinayakumar R, Soman KP

Abstract: Deep learning mechanisms are the prevailing approaches in recent days for various tasks in natural language processing, speech recognition, image processing and many others. To leverage this, we use deep learning based mechanisms, specifically Bidirectional Long Short-Term Memory (B-LSTM) for the task of dialect identification in Arabic and German broadcast speech and Long Short-Term Memory (LSTM) for discriminating between similar languages. Two unique B-LSTM models are created using the Large-vocabulary Continuous Speech Recognition (LVCSR) based lexical features and a fixed length of 400 per utterance bottleneck features generated by the i-vector framework. These models were evaluated on the VarDial 2017 datasets for the Arabic and German dialect identification tasks, with dialects of Egyptian, Gulf, Levantine, North African, and MSA for Arabic and Basel, Bern, Lucerne, and Zurich for German. The task of Discriminating between Similar Languages, such as Bosnian, Croatian and Serbian, was also evaluated. The B-LSTM model showed an accuracy of 0.246 on lexical features and an accuracy of 0.577 on bottleneck features of the i-vector framework.

Submitted 2 January, 2019; originally announced January 2019.

Comments: 17 pages

arXiv:1812.06292 [pdf]

A short review on Applications of Deep learning for Cyber security

Authors: Mohammed Harun Babu R, Vinayakumar R, Soman KP

Abstract: Deep learning is an advanced model of traditional machine learning. It has the capability to extract optimal feature representations from raw input samples. It has been applied to various use cases in cyber security such as intrusion detection, malware classification, Android malware detection, spam and phishing detection and binary analysis. This paper surveys the works related to deep learning based solutions for various cyber security use cases. Keywords: Deep learning, intrusion detection, malware detection, Android malware detection, spam & phishing detection, traffic analysis, binary analysis.

Submitted 29 January, 2019; v1 submitted 15 December, 2018; originally announced December 2018.

Comments: 15 pages

arXiv:1812.03519 [pdf]

Deep-Net: Deep Neural Network for Cyber Security Use Cases

Authors: Vinayakumar R, Barathi Ganesh HB, Prabaharan Poornachandran, Anand Kumar M, Soman KP

Abstract: Deep neural networks (DNNs) have proven to be a powerful approach this year by solving long-standing Artificial intelligence (AI) supervised and unsupervised tasks that exist in natural language processing, speech processing, computer vision and others. In this paper, we attempt to apply DNNs to three different cyber security use cases: Android malware classification, incident detection and fraud detection. The data set of each use case contains real known benign and malicious activity samples. The efficient network architecture for the DNN is chosen by conducting various trials of experiments for network parameters and network structures. The experiments with the chosen efficient configurations of DNNs are run up to 1000 epochs with the learning rate set in the range [0.01-0.5]. The DNN performed well in comparison to the classical machine learning algorithms in all experiments on the cyber security use cases. This is because DNNs implicitly extract and build better features and identify the characteristics of the data that lead to better accuracy. The best accuracies obtained by DNN and XGBoost are 0.940 and 0.741 on Android malware classification, 1.00 and 0.997 on incident detection, and 0.972 and 0.916 on fraud detection, respectively.

Submitted 9 December, 2018; originally announced December 2018.

MSC Class: 68T50

arXiv:1810.04144 [pdf, other]

A Brief Survey on Autonomous Vehicle Possible Attacks, Exploits and Vulnerabilities

Authors: Amara Dinesh Kumar, Koti Naga Renu Chebrolu, Vinayakumar R, Soman KP

Abstract: Advanced driver assistance systems are advancing at a rapid pace and all major companies have started investing in developing autonomous vehicles. But their security and reliability are still uncertain and debatable. Imagine that a vehicle is compromised by attackers and consider what they can then do. An attacker can control the brakes, acceleration and even steering, which can lead to catastrophic consequences. This paper gives a brief overview of most of the possible attacks on autonomous vehicle software and hardware and their potential implications.

Submitted 3 October, 2018; originally announced October 2018.

Comments: 5 Pages,1 Figure

arXiv:1810.03977 [pdf, other]

DeepImageSpam: Deep Learning based Image Spam Detection

Authors: Amara Dinesh Kumar, Vinayakumar R, Soman KP

Abstract: Hackers and spammers are employing innovative and novel techniques to deceive novice and even knowledgeable internet users. Image spam is one such technique, where the spammer changes some portion of the image such that it is indistinguishable from the original image, fooling the users. This paper proposes a deep learning based approach for image spam detection using convolutional neural networks. It uses a dataset with 810 natural images and 928 spam images for classification, achieving an accuracy of 91.7% and outperforming the existing image processing and machine learning techniques.

Submitted 3 October, 2018; originally announced October 2018.

Comments: 4 pages

arXiv:1809.04461 [pdf]

DeepProteomics: Protein family classification using Shallow and Deep Networks

Authors: Anu Vazhayil, Vinayakumar R, Soman KP

Abstract: Knowledge of the function of proteins is necessary as it gives a clear picture of biological processes. Nevertheless, many protein sequences are found and added to the databases but lack functional annotation. Laboratory experiments take a considerable amount of time to annotate the sequences. This gives rise to the need to use computational techniques to classify proteins based on their functions. In our work, we have collected data from Swiss-Prot containing 40433 proteins, which are grouped into 30 families. We pass it to recurrent neural network (RNN), long short term memory (LSTM) and gated recurrent unit (GRU) models and compare them by applying trigrams with a deep neural network and a shallow neural network on the same dataset. Through this approach, we could achieve a maximum of around 78% accuracy for the classification of protein families.

Submitted 11 September, 2018; originally announced September 2018.

arXiv:1710.08396 [pdf, ps, other]

Deep Health Care Text Classification

Authors: Vinayakumar R, Barathi Ganesh HB, Anand Kumar M, Soman KP

Abstract: Health related social media mining is a valuable apparatus for the early recognition of diverse antagonistic medicinal conditions. Mostly, the existing methods are based on machine learning with knowledge-based learning. This working note presents Recurrent neural network (RNN) and Long short-term memory (LSTM) based embedding for automatic health text classification in social media mining. For each task, two systems are built that classify the tweet at the tweet level. RNN and LSTM are used for extracting features, and a non-linear activation function at the last layer helps to distinguish the tweets of different categories. The experiments are conducted on the 2nd Social Media Mining for Health Applications Shared Task at AMIA 2017. The experiment results are considerable; however, the proposed method is appropriate for health text classification. This is primarily because it doesn't rely on any feature engineering mechanisms.

Submitted 23 October, 2017; originally announced October 2017.

Comments: 4 pages

MSC Class: 68T50

arXiv:1708.06068 [pdf, other]

Vector Space Model as Cognitive Space for Text Classification

Authors: Barathi Ganesh HB, Anand Kumar M, Soman KP

Abstract: In this era of digitization, knowing the user's sociolect aspects has become an essential feature for building user specific recommendation systems. These sociolect aspects can be found by mining the user's language shared in the form of text in social media and reviews. This paper describes the experiment that was performed in the PAN Author Profiling 2017 shared task. The objective of the task is to find the sociolect aspects of the users from their tweets. The sociolect aspects considered in this experiment are the user's gender and native language information. Here, the user's tweets written in a language different from their native language are represented as a Document - Term Matrix with document frequency as the constraint. Classification is then done using the Support Vector Machine, taking gender and native language as target classes. This experiment attains an average accuracy of 73.42% in gender prediction and 76.26% in the native language identification task.

Submitted 20 August, 2017; originally announced August 2017.

Comments: 6 pages, 6 figures, 3 tables

MSC Class: 68T50

arXiv:1407.2237 [pdf]

doi 10.1109/ICCAE.2010.5452072

An Algorithm for Alignment-free Sequence Comparison using Logical Match

Authors: Sanil Shanker KP, Elizabeth Sherly, Jim Austin

Abstract: This paper proposes an algorithm for alignment-free sequence comparison using Logical Match. Here, we compute the score using fuzzy membership values which are generated automatically from the number of matches and mismatches. We demonstrate the method with both artificial and real data. The results show the uniqueness of the proposed method by analyzing DNA sequences taken from the NCBI databank with a novel computational time.

Submitted 6 July, 2014; originally announced July 2014.

Comments: Computer and Automation Engineering (ICCAE), 2010 The 2nd International Conference on

arXiv:1407.2206 [pdf]

doi 10.1109/ICNIT.2010.5508469

Sequential Data Mining using Correlation Matrix Memory

Authors: Sanil Shanker KP, Aaron Turner, Elizabeth Sherly, Jim Austin

Abstract: This paper proposes a method for sequential data mining using correlation matrix memory. Here, we use the concept of the Logical Match to mine the indices of the sequential pattern. We demonstrate the uniqueness of the method with both artificial and real data taken from the NCBI databank.

Submitted 6 July, 2014; originally announced July 2014.

Comments: Networking and Information Technology (ICNIT), 2010 International Conference on

Showing 1–20 of 20 results for author: KP, S