-
CompTLL-UNet: Compressed Domain Text-Line Localization in Challenging Handwritten Documents using Deep Feature Learning from JPEG Coefficients
Authors:
Bulla Rajesh,
Sk Mahafuz Zaman,
Mohammed Javed,
P. Nagabhushan
Abstract:
Automatic localization of text-lines in handwritten documents is still an open and challenging research problem. Writing issues such as uneven spacing between lines, oscillating and touching text, and the presence of skew become much more challenging when complex handwritten document images are segmented directly in their compressed representation. Compressed documents are conventionally processed through decompression; in this paper, we instead propose learning deep features directly from the JPEG compressed coefficients, without full decompression, to accomplish text-line localization in the JPEG compressed domain. A modified U-Net architecture, the Compressed Text-Line Localization Network (CompTLL-UNet), is designed for this purpose. The model is trained and tested on JPEG compressed versions of benchmark datasets including ICDAR2017 (cBAD) and ICDAR2019 (cBAD), and reports state-of-the-art performance with reduced storage and computational costs in the JPEG compressed domain.
Submitted 11 August, 2023;
originally announced August 2023.
-
DWT-CompCNN: Deep Image Classification Network for High Throughput JPEG 2000 Compressed Documents
Authors:
Tejasvee Bisen,
Mohammed Javed,
Shashank Kirtania,
P. Nagabhushan
Abstract:
For any digital application involving document images, such as retrieval, classification of the document images is an essential stage. Conventionally, the full (uncompressed) document images form the input dataset, which is a problem because of the large volume needed to store them. It would therefore be novel if the same classification task could be accomplished directly on the compressed representation of the documents (with some partial decompression), making the whole process computationally more efficient. In this research work, a novel deep learning model, DWT-CompCNN, is proposed for classifying documents compressed with the High Throughput JPEG 2000 (HTJ2K) algorithm. The proposed DWT-CompCNN comprises five convolutional layers with filter sizes of 16, 32, 64, 128, and 256 in successive layers to improve learning from the wavelet coefficients extracted from the compressed images. Experiments are performed on two benchmark datasets, Tobacco-3482 and RVL-CDIP, which demonstrate that the proposed model is time and space efficient and also achieves better classification accuracy in the compressed domain.
Submitted 15 July, 2023; v1 submitted 2 June, 2023;
originally announced June 2023.
-
T2CI-GAN: Text to Compressed Image generation using Generative Adversarial Network
Authors:
Bulla Rajesh,
Nandakishore Dusa,
Mohammed Javed,
Shiv Ram Dubey,
P. Nagabhushan
Abstract:
The problem of generating textual descriptions for visual data has gained research attention in recent years. In contrast, the problem of generating visual data from textual descriptions is still very challenging because it requires combining Natural Language Processing (NLP) and Computer Vision techniques. Existing methods use Generative Adversarial Networks (GANs) and generate uncompressed images from textual descriptions. However, in practice, most visual data are processed and transmitted in compressed representation. Hence, the proposed work attempts to generate the visual data directly in compressed form using Deep Convolutional GANs (DCGANs) to achieve storage and computational efficiency. We propose two GAN models for compressed image generation from text. The first model is trained directly with JPEG compressed DCT images (compressed domain) to generate compressed images from text descriptions. The second model is trained with RGB images (pixel domain) to generate the JPEG compressed DCT representation from text descriptions. The proposed models are tested on the open-source benchmark dataset Oxford-102 Flower images using both RGB and JPEG compressed versions, and achieve state-of-the-art performance in the JPEG compressed domain. The code will be publicly released on GitHub after acceptance of the paper.
Submitted 1 October, 2022;
originally announced October 2022.
-
Pinball-OCSVM for early-stage COVID-19 diagnosis with limited posteroanterior chest X-ray images
Authors:
Sanjay Kumar Sonbhadra,
Sonali Agarwal,
P. Nagabhushan
Abstract:
Infection with coronavirus disease 2019 (COVID-19) starts in the upper respiratory tract and, as the virus grows, can progress to the lungs and develop into pneumonia. The conventional method of COVID-19 diagnosis is reverse transcription polymerase chain reaction (RT-PCR), which is less sensitive during the early stages, especially if the patient is asymptomatic, which may in turn lead to more severe pneumonia. In this context, several deep learning models have been proposed to identify pulmonary infections using publicly available chest X-ray (CXR) image datasets for early diagnosis, better treatment and quick cure. In these datasets, the smaller number of COVID-19 positive samples compared to other classes (normal, pneumonia and tuberculosis) makes unbiased learning challenging for deep learning models. All deep learning models adopted class balancing techniques to solve this issue, which should however be avoided in any medical diagnosis process. Moreover, deep learning models are data hungry and need massive computational resources. Therefore, for quicker diagnosis, this research proposes a novel pinball loss function based one-class support vector machine (PB-OCSVM) that can work with limited COVID-19 positive CXR samples, with the objectives of maximizing learning efficiency and minimizing false predictions. The performance of the proposed model is compared with conventional OCSVM and existing deep learning models, and the experimental results show that the proposed model outperforms state-of-the-art methods. To validate the robustness of the proposed model, experiments are also performed with noisy CXR images and UCI benchmark datasets.
Submitted 5 June, 2021; v1 submitted 15 October, 2020;
originally announced October 2020.
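The Pinball-OCSVM abstract above names a pinball loss but does not state its form. As a minimal sketch, assuming the textbook quantile (pinball) loss with a parameter `tau` in [0, 1], the loss can be written as below. The function names and the choice of this particular form are illustrative assumptions; the PB-OCSVM formulation in the paper may scale or parameterise the loss differently.

```python
def pinball_loss(u: float, tau: float) -> float:
    """Textbook pinball (quantile) loss: tau*u for u >= 0, (tau-1)*u otherwise.

    Assumption: this is the standard quantile-regression form, not
    necessarily the exact loss used inside the paper's PB-OCSVM.
    """
    if not 0.0 <= tau <= 1.0:
        raise ValueError("tau must lie in [0, 1]")
    return tau * u if u >= 0 else (tau - 1.0) * u


def mean_pinball_loss(residuals, tau):
    """Average pinball loss over a non-empty sequence of residuals."""
    residuals = list(residuals)
    if not residuals:
        raise ValueError("residuals must be non-empty")
    return sum(pinball_loss(u, tau) for u in residuals) / len(residuals)
```

Unlike the hinge loss, this loss stays non-zero on both sides of zero, so residuals of either sign are penalised, with the asymmetry controlled by `tau`.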
-
Depth-wise layering of 3d images using dense depth maps: a threshold based approach
Authors:
Seyedsaeid Mirkamali,
P. Nagabhushan
Abstract:
Image segmentation has long been a basic problem in computer vision. Depth-wise Layering is a kind of segmentation that slices an image in a depth-wise sequence, unlike conventional image segmentation, which deals with surface-wise decomposition. The proposed Depth-wise Layering technique uses a single depth image of a static scene to slice it into multiple layers. The technique employs a thresholding approach to segment the rows of the dense depth map into smaller partitions, called Line-Segments in this paper. It then uses a line-segment labelling method to identify the number of objects and layers of the scene independently. The final stage links the objects of the scene to their respective object-layers. We evaluate the efficiency of the proposed technique by applying it to many images along with their dense depth maps. The experiments have shown promising layering results.
Submitted 5 October, 2020;
originally announced October 2020.
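The row-wise thresholding step described in the depth-wise layering abstract above can be illustrated with a small sketch. It assumes that a new Line-Segment begins wherever the depth jump between neighbouring pixels exceeds a fixed threshold; the function name, the threshold rule and the sample values are hypothetical, since the abstract does not specify them.

```python
def split_row_into_segments(depth_row, threshold):
    """Split one row of a dense depth map into runs of similar depth.

    A new segment starts whenever the absolute depth jump between
    neighbouring pixels exceeds `threshold` (an assumed rule).
    Returns (start, end) index pairs with `end` exclusive.
    """
    if not depth_row:
        return []
    segments = []
    start = 0
    for i in range(1, len(depth_row)):
        if abs(depth_row[i] - depth_row[i - 1]) > threshold:
            segments.append((start, i))  # close the current segment
            start = i                    # open the next one
    segments.append((start, len(depth_row)))
    return segments


# Illustrative row: three depth plateaus separated by large jumps.
row = [10, 10, 11, 40, 41, 41, 90]
segments = split_row_into_segments(row, threshold=5)
```

Labelling segments across rows and linking objects to layers, the later stages named in the abstract, are not covered by this sketch.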
-
Automatic Page Segmentation Without Decompressing the Run-Length Compressed Text Documents
Authors:
Mohammed Javed,
P. Nagabhushan
Abstract:
Page segmentation is considered a crucial stage in the automatic analysis of documents with complex layouts. It has traditionally been carried out on uncompressed documents, although most documents in real life exist in compressed form to make storage and transfer efficient. However, carrying out page segmentation directly in compressed documents, without going through the stage of decompression, is a challenging goal. This research paper sets out to demonstrate the possibility of carrying out page segmentation directly in the run-length data of a CCITT Group-3 compressed text document, which could be single- or multi-columned and might even have some text regions in inverted text color mode. Therefore, before segmenting the text document into columns, each column into paragraphs, each paragraph into text lines, each line into words, and, finally, each word into characters, the text document needs to be pre-processed. The pre-processing stage identifies the normal and inverted text regions, and the inverted text regions are toggled to the normal mode. Next, to initiate column separation, a new strategy of incremental assimilation of white space runs in the vertical direction, together with the auto-estimation of certain related parameters, is proposed. A procedure to realize column segmentation employing these extracted parameters has been devised. What follows is first a two-level horizontal row separation process, which segments every column into paragraphs and, in turn, into text-lines, and then a two-level vertical column separation process, which completes the separation into words and characters.
Submitted 2 July, 2020;
originally announced July 2020.
-
Unleashing the power of disruptive and emerging technologies amid COVID-19: A detailed review
Authors:
Sonali Agarwal,
Narinder Singh Punn,
Sanjay Kumar Sonbhadra,
M. Tanveer,
P. Nagabhushan,
K K Soundra Pandian,
Praveer Saxena
Abstract:
The unprecedented outbreak of the novel coronavirus (COVID-19), during early December 2019 in Wuhan, China, has quickly evolved into a global pandemic, became a matter of grave concern, and placed government agencies worldwide in a precarious position. The scarcity of resources and lack of experiences to endure the COVID-19 pandemic, combined with the fear of future consequences has established the need for adoption of emerging and future technologies to address the upcoming challenges. Since the last five months, the amount of pandemic impact has reached its pinnacle that is altering everyone's life; and humans are now bound to adopt safe ways to survive under the risk of being affected. Technological advances are now accelerating faster than ever before to stay ahead of the consequences and acquire new capabilities to build a safer world. Thus, there is a rising need to unfold the power of emerging, future and disruptive technologies to explore all possible ways to fight against COVID-19. In this review article, we attempt to study all emerging, future, and disruptive technologies that can be utilized to mitigate the impact of COVID-19. Building on background insights, detailed technological specific use cases to fight against COVID-19 have been discussed in terms of their strengths, weaknesses, opportunities, and threats (SWOT). As concluding remarks, we highlight prioritized research areas and upcoming opportunities to blur the lines between the physical, digital, and biological domain-specific challenges and also illuminate collaborative research directions for moving towards a post-COVID-19 world.
Submitted 19 April, 2021; v1 submitted 23 May, 2020;
originally announced May 2020.
-
Target specific mining of COVID-19 scholarly articles using one-class approach
Authors:
Sanjay Kumar Sonbhadra,
Sonali Agarwal,
P. Nagabhushan
Abstract:
In recent years, several research articles have been published in the field of corona-virus caused diseases like severe acute respiratory syndrome (SARS), middle east respiratory syndrome (MERS) and COVID-19. In the presence of numerous research articles, extracting best-suited articles is time-consuming and manually impractical. The objective of this paper is to extract the activity and trends of corona-virus related research articles using machine learning approaches. The COVID-19 open research dataset (CORD-19) is used for experiments, whereas several target-tasks along with explanations are defined for classification, based on domain knowledge. Clustering techniques are used to create the different clusters of available articles, and later the task assignment is performed using parallel one-class support vector machines (OCSVMs). Experiments with original and reduced features validate the performance of the approach. It is evident that the k-means clustering algorithm, followed by parallel OCSVMs, outperforms other methods for both original and reduced feature space.
Submitted 1 August, 2020; v1 submitted 24 April, 2020;
originally announced April 2020.
-
Word and character segmentation directly in run-length compressed handwritten document images
Authors:
Amarnath R,
P. Nagabhushan,
Mohammed Javed
Abstract:
It has been demonstrated in the literature that performing text-line segmentation directly in run-length compressed handwritten document images significantly reduces computational time and memory space. In this paper, we investigate word and character segmentation directly on run-length compressed document images. First, the spreads of the characters are extracted from the foreground runs of the compressed data, and connected components are subsequently established. The spacing between connected components is larger between adjacent words than within a word. With this knowledge, a threshold is empirically chosen for inter-word separation. Every connected component within a word is further analysed for character segmentation. Here, the min-cut graph concept is used to separate touching characters. Over-segmentation and under-segmentation issues are addressed by insertion and deletion operations respectively. The approach has been developed particularly for compressed handwritten English document images. However, the model has also been tested on non-English document images.
Submitted 18 August, 2019;
originally announced September 2019.
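The inter-word separation idea in the abstract above (gaps between words are wider than gaps within a word) can be sketched as follows. This is only an illustration: the component extents, the function name and the threshold value are hypothetical, and the paper chooses its threshold empirically from the compressed data.

```python
def group_components_into_words(components, gap_threshold):
    """Group connected components on one text line into words.

    `components` is a list of (x_start, x_end) column extents. A
    horizontal gap wider than `gap_threshold` is treated as an
    inter-word space; smaller gaps keep components in the same word.
    """
    words = []
    word_end = None  # right-most column covered by the current word
    for x_start, x_end in sorted(components):
        if words and x_start - word_end <= gap_threshold:
            words[-1].append((x_start, x_end))
            word_end = max(word_end, x_end)
        else:
            words.append([(x_start, x_end)])
            word_end = x_end
    return words


# Illustrative extents: two tight pairs separated by one wide gap.
comps = [(25, 30), (0, 4), (6, 10), (32, 36)]
words = group_components_into_words(comps, gap_threshold=5)
```

Character segmentation of touching characters, which the abstract handles with a min-cut graph, is outside the scope of this sketch.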
-
Appearance invariant Entry-Exit matching using visual soft biometric traits
Authors:
Vinay Kumar V,
P Nagabhushan
Abstract:
The problem of appearance-invariant subject recognition for Entry-Exit surveillance applications is addressed. This paper proposes a novel Semantic Entry-Exit matching model that uses ancillary information about subjects, such as height, build, complexion and clothing color, to endorse the exit of every subject who had entered a private area. The proposed method is robust to variations in clothing. Each describing attribute is given equal weight while computing the matching score, and hence the proposed model achieves high rank-k accuracy on benchmark datasets. Although the soft biometric traits used in combination cannot achieve high rank-1 accuracy, they help to narrow down the search before matching with reliable biometric traits such as gait and face, whose learning and matching are costlier than those of the visual soft biometrics.
Submitted 26 August, 2019;
originally announced September 2019.
-
Monitoring of people entering and exiting private areas using Computer Vision
Authors:
Vinay Kumar V,
P Nagabhushan
Abstract:
Entry-Exit surveillance is a novel research problem that addresses security concerns when people attain absolute privacy in camera-forbidden areas such as toilets and changing rooms, which are basic amenities in public places such as shopping malls, airports, and bus and rail stations. Since individuals cannot be monitored inside these camera-forbidden areas, the objective is to monitor them from outside to analyze the time they spend inside and any suspicious transformations in their appearance. In this paper, firstly, a pseudo-annotated dataset of a laboratory observation of people entering and exiting the camera-forbidden area is presented; it is captured using two cameras, in contrast to the state-of-the-art single-camera EnEx dataset. By convention, the proposed dataset is named EnEx2. Next, a spatial-transition-based event detection method to determine the entry or exit of individuals is presented, with standard results obtained by evaluating the proposed model on the proposed dataset and on publicly available standard video surveillance datasets that are hypothesized as Entry-Exit surveillance scenarios. The proposed dataset is expected to enkindle active research in the Entry-Exit surveillance domain.
Submitted 28 August, 2019; v1 submitted 2 August, 2019;
originally announced August 2019.
-
Automatic Text Line Segmentation Directly in JPEG Compressed Document Images
Authors:
Bulla Rajesh,
Mohammed Javed,
P Nagabhushan
Abstract:
JPEG is one of the popular image compression algorithms that provide efficient storage and transmission in consumer electronics, and hence it is the most preferred image format on the internet. In the present digital and big-data era, a huge volume of JPEG compressed document images is archived and communicated through consumer electronics on a daily basis. Although it is advantageous to have data in compressed form, processing it with off-the-shelf methods becomes computationally expensive because it requires decompression and recompression. Therefore, it would be novel and efficient if the compressed data were processed directly in the compressed domain. In the present research paper, we demonstrate this idea through the case study of printed text-line segmentation. JPEG achieves compression by dividing the image into non-overlapping 8x8 blocks in the pixel domain and applying the Discrete Cosine Transform (DCT), so it is very likely that a partitioned 8x8 DCT block overlaps the contents of two adjacent text-lines without leaving any clue for the line separator, which makes text-line segmentation a challenging problem. Two segmentation approaches are proposed here, using the DC projection profile and the AC coefficients of each 8x8 DCT block. The first approach is based on partial decompression of selected DCT blocks, and the second relies on intelligent analysis of the F10 and F11 AC coefficients without any decompression. The proposed methods have been tested with varying font sizes, font styles and line spacing, and good performance is reported.
Submitted 29 July, 2019;
originally announced July 2019.
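The DC projection profile mentioned in the abstract above can be illustrated with a short sketch. It assumes the DC coefficient of every 8x8 block is already available as a 2-D grid (one value per block) and that block rows whose mean DC value stays close to a known background level are candidate line gaps. The function names, the background level, the tolerance and the sample values are all hypothetical; the paper's handling of blocks that straddle two text-lines (partial decompression or AC coefficient analysis) is not reproduced here.

```python
def dc_projection_profile(dc_grid):
    """Mean DC coefficient of each block row (one value per row of 8x8 blocks)."""
    if any(len(row) == 0 for row in dc_grid):
        raise ValueError("every block row must contain at least one block")
    return [sum(row) / len(row) for row in dc_grid]


def blank_block_rows(profile, background_dc, tolerance):
    """Indices of block rows whose mean DC is within `tolerance` of the background."""
    return [i for i, v in enumerate(profile) if abs(v - background_dc) <= tolerance]


# Illustrative DC values only: bright (blank) rows alternate with text rows.
grid = [
    [1000, 1000, 1000],
    [400, 650, 500],
    [1000, 990, 1000],
    [300, 700, 450],
]
profile = dc_projection_profile(grid)
gaps = blank_block_rows(profile, background_dc=1000, tolerance=10)
```

Because the DC coefficient is proportional to the mean intensity of its block, this profile behaves like a coarse horizontal projection profile at one-eighth of the vertical resolution.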
-
Text line Segmentation in Compressed Representation of Handwritten Document using Tunneling Algorithm
Authors:
Amarnath R,
P Nagabhushan
Abstract:
In this research work, we perform text line segmentation directly in the compressed representation of an unconstrained handwritten document image. To this end, we make use of text line terminal points, which is the current state of the art. The terminal points spotted along the left and right margins of a document image for every text line are considered as source and target respectively. The tunneling algorithm uses a single agent (or robot) to identify the coordinate positions in the compressed representation to perform text-line segmentation of the document. The agent starts at a source point, progressively tunnels a path between two adjacent text lines, and reaches the probable target. The agent's navigation path from source to target, bypassing obstacles if any, segregates the two adjacent text lines. However, the target point is known only when the agent reaches the destination; this applies to all source points, and hence we can analyze the correspondence between source and target nodes. Artificial Intelligence in Expert systems, dynamic programming and greedy strategies are employed for every search space while tunneling. Exhaustive experimentation is carried out on various benchmark datasets including ICDAR13, and the performance is reported.
Submitted 3 January, 2019;
originally announced January 2019.
-
Spotting Separator Points at Line Terminals in Compressed Document Images for Text-line Segmentation
Authors:
Amarnath R,
P. Nagabhushan
Abstract:
Line separators are used to segregate text-lines from one another in document image analysis. Finding the separator points at every line terminal in a document image would enable text-line segmentation. Identifying the separators is particularly challenging in handwritten text, and more so in the compressed version of a document image, which is the objective of this research. Such an effort avoids the computational burden of decompressing a document for text-line segmentation. Since document images are generally compressed using the run-length encoding (RLE) technique as per the CCITT standards, the first column in the RLE data is a white column. The value (depth) in the white column is very low when a row belongs to a text line and can be larger at the point of text-line separation. A longer consecutive sequence of such larger depths indicates the gap between text lines, which provides the separator region. For over-separation and under-separation issues, corrective actions of deletion and insertion are suggested respectively. Extensive experimentation is conducted on the compressed images of the benchmark datasets of ICDAR13 and Alireza et al. [17] to demonstrate the efficacy.
Submitted 18 August, 2017;
originally announced August 2017.
-
Direct Processing of Document Images in Compressed Domain
Authors:
Mohammed Javed,
P. Nagabhushan,
B. B. Chaudhuri
Abstract:
With the rapid increase in the volume of big data in this digital era, fax documents, invoices, receipts, etc. are traditionally compressed for efficient data storage and transfer. However, to process these documents, they must first be decompressed, which demands additional computing resources. This limitation motivates research into the possibility of processing compressed images directly. In this research paper, we summarize the research work carried out to perform different operations straight from run-length compressed documents without going through the stage of decompression. The operations demonstrated are feature extraction; text-line, word and character segmentation; document block segmentation; and font size detection, all carried out in the compressed version of the document. The feature extraction methods demonstrate how to extract conventionally defined features such as projection profile, run-histogram and entropy directly from the compressed document data. Document segmentation involves extracting compressed segments of text-lines, words and characters using the vertical and horizontal projection profile features. Further, an attempt is made to segment a randomly chosen block of interest from the compressed document and subsequently facilitate absolute and relative characterization of the segmented block, which finds real-time applications in the automatic processing of bank cheques, challans, etc. in the compressed domain. Finally, an application to detect font size at the text-line level is also investigated. All the proposed algorithms are validated experimentally with a sufficient data set of compressed documents.
Submitted 14 October, 2014; v1 submitted 11 October, 2014;
originally announced October 2014.
-
Automatic Removal of Marginal Annotations in Printed Text Document
Authors:
Abdessamad Elboushaki,
Rachida Hannane,
P. Nagabhushan,
Mohammed Javed
Abstract:
Recovering the original printed text from a document with handwritten annotations added in the marginal area is a challenging problem, especially when the original document is not available. This paper therefore aims to automatically salvage the original document from the annotated document by detecting and removing any handwritten annotations that appear in the marginal area, without any loss of information. A two-stage algorithm is proposed. In the first stage, the marginal boundary is detected approximately with horizontal and vertical projection profiles, so all of the marginal annotations are removed along with some parts of the original printed text that lie very close to the marginal boundary. In the second stage, a connected-component-based strategy brings back the printed text components cropped during the first stage. The proposed method is validated on a dataset of 50 documents with complex handwritten annotations, giving an overall accuracy of 89.01% in removing the marginal annotations and 97.74% in retrieving the original printed text document.
Submitted 8 August, 2014;
originally announced August 2014.
-
Entropy Computation of Document Images in Run-Length Compressed Domain
Authors:
P. Nagabhushan,
Mohammed Javed,
B. B. Chaudhuri
Abstract:
Compression of documents, images, audio and video has traditionally been practiced to increase the efficiency of data storage and transfer. However, in order to process or carry out any analytical computations, decompression has become an unavoidable prerequisite. In this research work, we attempt to compute the entropy, which is an important document analytic, directly from the compressed documents. We use Conventional Entropy Quantifier (CEQ) and Spatial Entropy Quantifiers (SEQ) for entropy computations [1]. The entropies obtained are useful in applications like establishing equivalence, word spotting and document retrieval. Experiments have been performed with all the data sets of [1], at character, word and line levels, taking compressed documents in the run-length compressed domain. The algorithms developed are computationally and space efficient, and the results obtained match 100% with the results reported in [1].
Submitted 8 April, 2014;
originally announced April 2014.
-
Extraction of Projection Profile, Run-Histogram and Entropy Features Straight from Run-Length Compressed Text-Documents
Authors:
Mohammed Javed,
P. Nagabhushan,
B. B. Chaudhuri
Abstract:
Document Image Analysis, like any Digital Image Analysis, requires identification and extraction of proper features, which are generally extracted from uncompressed images, though in reality images are made available in compressed form for reasons such as transmission and storage efficiency. However, this implies that the compressed image must be decompressed, which demands additional computing resources. This limitation motivates research into extracting features directly from the compressed image. In this research, we propose to extract essential features such as projection profile, run-histogram and entropy for text document analysis directly from run-length compressed text-documents. The experiments show that features are extracted directly from the compressed image without going through the stage of decompression, which reduces the computing time. The feature values so extracted are exactly identical to those extracted from uncompressed images.
Submitted 2 April, 2014;
originally announced April 2014.
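As a rough illustration of the idea in the abstract above, the sketch below computes a horizontal projection profile from run lengths alone and checks it against the profile of the decoded pixels. The row format (alternating white/black run lengths, starting with a white run) and the function names are assumptions made for this example, not the format or code used in the paper.

```python
def horizontal_projection_profile(rle_rows):
    """Count foreground pixels per row using only the run lengths.

    Assumed (hypothetical) format: each row is a list of run lengths that
    alternates white, black, white, ... and always starts with a white run,
    which may have length 0.
    """
    # Under the assumed format, black runs sit at the odd indices.
    return [sum(row[1::2]) for row in rle_rows]


def decode_row(row):
    """Expand one run-length row back to pixels (0 = white, 1 = black)."""
    pixels = []
    for index, length in enumerate(row):
        pixels.extend([index % 2] * length)
    return pixels


# Three rows of an 8-pixel-wide toy image.
rle_rows = [
    [8],              # blank row
    [2, 3, 3],        # one black run of 3 pixels
    [0, 4, 1, 2, 1],  # black runs of 4 and 2 pixels
]

profile = horizontal_projection_profile(rle_rows)
reference = [sum(decode_row(row)) for row in rle_rows]
```

On this toy input the compressed-domain profile equals the one computed after decoding, which is the kind of equivalence the abstract reports for its features.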
-
Extraction of Line Word Character Segments Directly from Run Length Compressed Printed Text Documents
Authors:
Mohammed Javed,
P. Nagabhushan,
B. B. Chaudhuri
Abstract:
Segmentation of a text-document into lines, words and characters, which is considered to be the crucial pre-processing stage in Optical Character Recognition (OCR), is traditionally carried out on uncompressed documents, although most documents in real life are available in compressed form for reasons such as transmission and storage efficiency. However, this implies that the compressed image must be decompressed, which demands additional computing resources. This limitation has motivated us to take up research in document image analysis using compressed documents. In this paper, we propose a new way to carry out segmentation at line, word and character level in run-length compressed printed-text-documents. We extract the horizontal projection profile curve from the compressed file and perform line segmentation using its local minima points. However, tracing vertical information, which leads to tracking words and characters in a run-length compressed file, is not straightforward. Therefore, we propose a novel technique for carrying out simultaneous word and character segmentation by popping out column runs from each row in an intelligent sequence. The proposed algorithms have been validated with 1101 text-lines, 1409 words and 7582 characters from a data-set of 35 noise and skew free compressed documents of Bengali, Kannada and English scripts.
Submitted 30 March, 2014;
originally announced March 2014.
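The line-segmentation step in the abstract above can be sketched in a simplified form. The example below cuts a horizontal projection profile wherever it drops to a threshold, which is a simpler stand-in for the local-minima analysis the abstract mentions. The function name, the threshold parameter and the sample profile are all hypothetical.

```python
def segment_lines(profile, threshold=0):
    """Return (first_row, last_row) pairs for runs of rows above threshold.

    Simplifying assumption: text lines are separated by rows whose
    foreground count is at or below `threshold`.
    """
    lines = []
    start = None
    for row, value in enumerate(profile):
        if value > threshold and start is None:
            start = row                      # a text line begins
        elif value <= threshold and start is not None:
            lines.append((start, row - 1))   # the text line just ended
            start = None
    if start is not None:
        # The profile ended while still inside a text line.
        lines.append((start, len(profile) - 1))
    return lines


# Hypothetical profile: foreground pixel count for each of ten rows.
profile = [0, 5, 9, 4, 0, 0, 7, 8, 0, 3]
lines = segment_lines(profile)
```

Here `lines` holds three row ranges, one per text line. Touching or skewed lines would need the valley analysis that this threshold-based sketch leaves out.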
-
Texture Defect Detection in Gradient Space
Authors:
V. Asha,
N. U. Bhajantri,
P. Nagabhushan
Abstract:
In this paper, we propose a machine vision algorithm for automatically detecting defects in patterned textures with the help of gradient space and its energy. Experiments on real fabric images with defects show that the proposed method can be used for automatic detection of fabric defects in textile industries.
Submitted 9 March, 2014;
originally announced March 2014.
-
Automatic Detection of Font Size Straight from Run Length Compressed Text Documents
Authors:
Mohammed Javed,
P. Nagabhushan,
B. B. Chaudhuri
Abstract:
Automatic detection of font size finds many applications in the area of intelligent OCRing and document image analysis, which has been traditionally practiced over uncompressed documents, although in real life the documents exist in compressed form for efficient storage and transmission. It would be novel and intelligent if the task of font size detection could be carried out directly from the compressed data of these documents without decompressing, which would result in saving of considerable amount of processing time and space. Therefore, in this paper we present a novel idea of learning and detecting font size directly from run-length compressed text documents at line level using simple line height features, which paves the way for intelligent OCRing and document analysis directly from compressed documents. In the proposed model, the given mixed-case text documents of different font size are segmented into compressed text lines and the features extracted such as line height and ascender height are used to capture the pattern of font size in the form of a regression line, using which the automatic detection of font size is done during the recognition stage. The method is experimented with a dataset of 50 compressed documents consisting of 780 text lines of single font size and 375 text lines of mixed font size resulting in an overall accuracy of 99.67%.
Submitted 18 February, 2014;
originally announced February 2014.
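The regression idea in the abstract above can be illustrated with an ordinary least-squares line. This sketch uses line height as the only feature, and the training pairs are invented for the example. The paper's actual features, data and fitting procedure may differ.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical training pairs: line height in pixels, font size in points.
heights = [20, 24, 28, 32]
sizes = [10, 12, 14, 16]

slope, intercept = fit_line(heights, sizes)

# Predict the font size of an unseen text line that is 26 pixels high.
predicted = round(slope * 26 + intercept)
```

A single-feature line keeps the sketch small. Adding ascender height, as the abstract mentions, would turn this into a multi-feature regression.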
-
Direct Processing of Run Length Compressed Document Image for Segmentation and Characterization of a Specified Block
Authors:
Mohammed Javed,
P. Nagabhushan,
B. B. Chaudhuri
Abstract:
Extracting a block of interest, referred to as segmenting a specified block in an image, and studying its characteristics is of general research interest, and could be challenging if such a segmentation task has to be carried out directly in a compressed image. This is the objective of the present research work. The proposal is to evolve a method which would segment and extract a specified block, and carry out its characterization without decompressing a compressed image, for two major reasons: most image archives contain images in compressed format, and decompressing an image demands additional computing time and space. Specifically, in this research work, the proposal is to work on run-length compressed document images.
Submitted 18 February, 2014; v1 submitted 9 February, 2014;
originally announced February 2014.
-
Periodicity Extraction using Superposition of Distance Matching Function and One-dimensional Haar Wavelet Transform
Authors:
V. Asha,
N. U. Bhajantri,
P. Nagabhushan
Abstract:
Periodicity of a texture is one of the important visual characteristics and is often used as a measure for textural discrimination at the structural level. Knowledge about periodicity of a texture is very essential in the field of texture synthesis and texture compression and also in the design of frieze and wall papers. In this paper, we propose a method of periodicity extraction from noisy images based on superposition of distance matching function (DMF) and wavelet decomposition without de-noising the test images. Overall DMFs are subjected to single-level Haar wavelet decomposition to obtain approximate and detailed coefficients. Extracted coefficients help in determination of periodicities in row and column directions. We illustrate the usefulness and the effectiveness of the proposed method in a texture synthesis application.
Submitted 15 November, 2013;
originally announced November 2013.
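The single-level Haar decomposition mentioned in the abstract above can be sketched for a one-dimensional sequence. The orthonormal scaling by the square root of 2 and the sample values are assumptions for this example. The abstract does not specify the normalization, and the distance matching function itself is not reproduced here.

```python
import math


def haar_level1(signal):
    """Single-level 1-D Haar decomposition of an even-length sequence.

    Returns (approximation, detail) coefficients using the orthonormal
    convention, in which each pair is scaled by 1 / sqrt(2).
    """
    if len(signal) % 2 != 0:
        raise ValueError("signal length must be even")
    scale = math.sqrt(2.0)
    pairs = range(0, len(signal), 2)
    approx = [(signal[i] + signal[i + 1]) / scale for i in pairs]
    detail = [(signal[i] - signal[i + 1]) / scale for i in pairs]
    return approx, detail


# Hypothetical input standing in for a sampled distance matching function.
signal = [4.0, 2.0, 6.0, 6.0]
approx, detail = haar_level1(signal)

# With the orthonormal scaling the transform preserves signal energy.
energy_in = sum(v * v for v in signal)
energy_out = sum(v * v for v in approx) + sum(v * v for v in detail)
```

The approximation coefficients carry the smoothed trend and the detail coefficients carry the local differences. A detail coefficient of zero marks a pair of equal neighbours.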
-
Automatic Detection of Texture Defects Using Texture-Periodicity and Gabor Wavelets
Authors:
V. Asha,
N. U. Bhajantri,
P. Nagabhushan
Abstract:
In this paper, we propose a machine vision algorithm for automatically detecting defects in textures belonging to 16 out of 17 wallpaper groups using texture-periodicity and a family of Gabor wavelets. Input defective images are subjected to Gabor wavelet transformation in multi-scales and multi-orientations and a resultant image is obtained in L2 norm. The resultant image is split into several periodic blocks and energy of each block is used as a feature space to automatically identify defective and defect-free blocks using Ward's hierarchical clustering. Experiments on defective fabric images of three major wallpaper groups, namely, pmm, p2 and p4m, show that the proposed method is robust in finding fabric defects without human intervention and can be used for automatic defect detection in fabric industries.
Submitted 6 December, 2012;
originally announced December 2012.
-
GLCM-based chi-square histogram distance for automatic detection of defects on patterned textures
Authors:
V. Asha,
N. U. Bhajantri,
P. Nagabhushan
Abstract:
Chi-square histogram distance is one of the distance measures that can be used to find dissimilarity between two histograms. Motivated by the fact that texture discrimination by human vision system is based on second-order statistics, we make use of histogram of gray-level co-occurrence matrix (GLCM) that is based on second-order statistics and propose a new machine vision algorithm for automatic defect detection on patterned textures. Input defective images are split into several periodic blocks and GLCMs are computed after quantizing the gray levels from 0-255 to 0-63 to keep the size of GLCM compact and to reduce computation time. Dissimilarity matrix derived from chi-square distances of the GLCMs is subjected to hierarchical clustering to automatically identify defective and defect-free blocks. Effectiveness of the proposed method is demonstrated through experiments on defective real-fabric images of 2 major wallpaper groups (pmm and p4m groups).
Submitted 3 December, 2012;
originally announced December 2012.
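The pipeline in the abstract above (quantize gray levels from 0-255 to 0-63, build a GLCM per block, compare GLCMs with a chi-square distance) can be sketched on toy data. The horizontal-neighbour offset, the symmetric chi-square form with a factor of 0.5, the helper names and the two tiny blocks are assumptions for this example. The paper may use different offsets or a different normalization.

```python
def quantize(image, levels=64, max_value=256):
    """Map gray levels 0..max_value-1 onto 0..levels-1 by integer division."""
    step = max_value // levels
    return [[pixel // step for pixel in row] for row in image]


def glcm_horizontal(image, levels=64):
    """Co-occurrence counts for each pixel and its right-hand neighbour."""
    matrix = [[0] * levels for _ in range(levels)]
    for row in image:
        for left, right in zip(row, row[1:]):
            matrix[left][right] += 1
    return matrix


def chi_square_distance(hist_a, hist_b):
    """Symmetric chi-square distance: 0.5 * sum((a - b)^2 / (a + b))."""
    total = 0.0
    for a, b in zip(hist_a, hist_b):
        if a + b > 0:  # skip bins that are empty in both histograms
            total += (a - b) ** 2 / (a + b)
    return 0.5 * total


def flatten(matrix):
    return [value for row in matrix for value in row]


# Two hypothetical 2x4 blocks; block_b contains one outlier pixel (255).
block_a = [[0, 4, 0, 4], [4, 0, 4, 0]]
block_b = [[0, 4, 0, 4], [4, 0, 255, 0]]

hist_a = flatten(glcm_horizontal(quantize(block_a)))
hist_b = flatten(glcm_horizontal(quantize(block_b)))

distance_ab = chi_square_distance(hist_a, hist_b)
distance_aa = chi_square_distance(hist_a, hist_a)
```

A block compared with itself gives a distance of zero, while the block with the outlier pixel gives a positive distance. A matrix of such pairwise distances is the kind of dissimilarity matrix the abstract passes to hierarchical clustering.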