Search | arXiv e-print repository

arXiv:2004.03431 [pdf]

Harmony-Search and Otsu based System for Coronavirus Disease (COVID-19) Detection using Lung CT Scan Images

Authors: V. Ra**ikanth, Nilanjan Dey, Alex Noel Joseph Raj, Aboul Ella Hassanien, K. C. Santosh, N. Sri Madhava Raja

Abstract: Pneumonia is one of the foremost lung diseases and untreated pneumonia will lead to serious threats for all age groups. The proposed work aims to extract and evaluate the Coronavirus disease (COVID-19) caused pneumonia infection in lung using CT scans. We propose an image-assisted system to extract COVID-19 infected sections from lung CT scans (coronal view). It includes following steps: (i) Thres… ▽ More Pneumonia is one of the foremost lung diseases and untreated pneumonia will lead to serious threats for all age groups. The proposed work aims to extract and evaluate the Coronavirus disease (COVID-19) caused pneumonia infection in lung using CT scans. We propose an image-assisted system to extract COVID-19 infected sections from lung CT scans (coronal view). It includes following steps: (i) Threshold filter to extract the lung region by eliminating possible artifacts; (ii) Image enhancement using Harmony-Search-Optimization and Otsu thresholding; (iii) Image segmentation to extract infected region(s); and (iv) Region-of-interest (ROI) extraction (features) from binary image to compute level of severity. The features that are extracted from ROI are then employed to identify the pixel ratio between the lung and infection sections to identify infection level of severity. The primary objective of the tool is to assist the pulmonologist not only to detect but also to help plan treatment process. As a consequence, for mass screening processing, it will help prevent diagnostic burden. △ Less

Submitted 6 April, 2020; originally announced April 2020.

Comments: 13 pages

arXiv:1801.01627 [pdf, other]

Deep learning for word-level handwritten Indic script identification

Authors: Soumya Ukil, Swarnendu Ghosh, Sk Md Obaidullah, K. C. Santosh, Kaushik Roy, Nibaran Das

Abstract: We propose a novel method that uses convolutional neural networks (CNNs) for feature extraction. Not just limited to conventional spatial domain representation, we use multilevel 2D discrete Haar wavelet transform, where image representations are scaled to a variety of different sizes. These are then used to train different CNNs to select features. To be precise, we use 10 different CNNs that sele… ▽ More We propose a novel method that uses convolutional neural networks (CNNs) for feature extraction. Not just limited to conventional spatial domain representation, we use multilevel 2D discrete Haar wavelet transform, where image representations are scaled to a variety of different sizes. These are then used to train different CNNs to select features. To be precise, we use 10 different CNNs that select a set of 10240 features, i.e. 1024/CNN. With this, 11 different handwritten scripts are identified, where 1K words per script are used. In our test, we have achieved the maximum script identification rate of 94.73% using multi-layer perceptron (MLP). Our results outperform the state-of-the-art techniques. △ Less

Submitted 4 January, 2018; originally announced January 2018.

Comments: 11 pages, 6 figures , 2 tables

arXiv:1704.03404 [pdf]

ENWalk: Learning Network Features for Spam Detection in Twitter

Authors: K C Santosh, Suman Kalyan Maity, Arjun Mukherjee

Abstract: Social medias are increasing their influence with the vast public information leading to their active use for marketing by the companies and organizations. Such marketing promotions are difficult to identify unlike the traditional medias like TV and newspaper. So, it is very much important to identify the promoters in the social media. Although, there are active ongoing researches, existing approa… ▽ More Social medias are increasing their influence with the vast public information leading to their active use for marketing by the companies and organizations. Such marketing promotions are difficult to identify unlike the traditional medias like TV and newspaper. So, it is very much important to identify the promoters in the social media. Although, there are active ongoing researches, existing approaches are far from solving the problem. To identify such imposters, it is very much important to understand their strategies of social circle creation and dynamics of content posting. Are there any specific spammer types? How successful are each types? We analyze these questions in the light of social relationships in Twitter. Our analyses discover two types of spammers and their relationships with the dynamics of content posts. Our results discover novel dynamics of spamming which are intuitive and arguable. We propose ENWalk, a framework to detect the spammers by learning the feature representations of the users in the social media. We learn the feature representations using the random walks biased on the spam dynamics. Experimental results on large-scale twitter network and the corresponding tweets show the effectiveness of our approach that outperforms the existing approaches △ Less

Submitted 11 April, 2017; originally announced April 2017.

arXiv:1304.1930 [pdf, other]

Client-Driven Content Extraction Associated with Table

Authors: K. C. Santosh, Abdel Belaïd

Abstract: The goal of the project is to extract content within table in document images based on learnt patterns. Real-world users i.e., clients first provide a set of key fields within the table which they think are important. These are first used to represent the graph where nodes are labelled with semantics including other features and edges are attributed with relations. Attributed relational graph (ARG… ▽ More The goal of the project is to extract content within table in document images based on learnt patterns. Real-world users i.e., clients first provide a set of key fields within the table which they think are important. These are first used to represent the graph where nodes are labelled with semantics including other features and edges are attributed with relations. Attributed relational graph (ARG) is then employed to mine similar graphs from a document image. Each mined graph will represent an item within the table, and hence a set of such graphs will compose a table. We have validated the concept by using a real-world industrial problem. △ Less

Submitted 6 April, 2013; originally announced April 2013.

Journal ref: Machine Vision Applications (2013)

arXiv:1304.0421 [pdf, other]

Stroke-Based Cursive Character Recognition

Authors: K. C. Santosh, E. Iwata

Abstract: Human eye can see and read what is written or displayed either in natural handwriting or in printed format. The same work in case the machine does is called handwriting recognition. Handwriting recognition can be broken down into two categories: off-line and on-line. ... Human eye can see and read what is written or displayed either in natural handwriting or in printed format. The same work in case the machine does is called handwriting recognition. Handwriting recognition can be broken down into two categories: off-line and on-line. ... △ Less

Submitted 1 April, 2013; originally announced April 2013.

Journal ref: Advances in Character Recognition INTECH (Ed.) (2012) 175-192

arXiv:1303.4614 [pdf, other]

Handwritten and Printed Text Separation in Real Document

Authors: Abdel Belaïd, K. C. Santosh, Vincent Poulain D'Andecy

Abstract: The aim of the paper is to separate handwritten and printed text from a real document embedded with noise, graphics including annotations. Relying on run-length smoothing algorithm (RLSA), the extracted pseudo-lines and pseudo-words are used as basic blocks for classification. To handle this, a multi-class support vector machine (SVM) with Gaussian kernel performs a first labelling of each pseudo-… ▽ More The aim of the paper is to separate handwritten and printed text from a real document embedded with noise, graphics including annotations. Relying on run-length smoothing algorithm (RLSA), the extracted pseudo-lines and pseudo-words are used as basic blocks for classification. To handle this, a multi-class support vector machine (SVM) with Gaussian kernel performs a first labelling of each pseudo-word including the study of local neighbourhood. It then propagates the context between neighbours so that we can correct possible labelling errors. Considering running time complexity issue, we propose linear complexity methods where we use k-NN with constraint. When using a kd-tree, it is almost linearly proportional to the number of pseudo-words. The performance of our system is close to 90%, even when very small learning dataset where samples are basically composed of complex administrative documents. △ Less

Submitted 19 March, 2013; originally announced March 2013.

Comments: Machine Vision Applications (2013)

arXiv:1303.3087 [pdf]

Statistical Texture Features based Handwritten and Printed Text Classification in South Indian Documents

Authors: Mallikarjun Hangarge, K. C. Santosh, Srikanth Doddamani, Rajmohan Pardeshi

Abstract: In this paper, we use statistical texture features for handwritten and printed text classification. We primarily aim for word level classification in south Indian scripts. Words are first extracted from the scanned document. For each extracted word, statistical texture features are computed such as mean, standard deviation, smoothness, moment, uniformity, entropy and local range including local en… ▽ More In this paper, we use statistical texture features for handwritten and printed text classification. We primarily aim for word level classification in south Indian scripts. Words are first extracted from the scanned document. For each extracted word, statistical texture features are computed such as mean, standard deviation, smoothness, moment, uniformity, entropy and local range including local entropy. These feature vectors are then used to classify words via k-NN classifier. We have validated the approach over several different datasets. Scripts like Kannada, Telugu, Malayalam and Hindi i.e., Devanagari are primarily employed where an average classification rate of 99.26% is achieved. In addition, to provide an extensibility of the approach, we address Roman script by using publicly available dataset and interesting results are reported. △ Less

Submitted 13 March, 2013; originally announced March 2013.

Comments: Appeared in ICECIT-2102

Report number: Volume 1,Number 32

Showing 1–7 of 7 results for author: Santosh, K C