-
Towards Detecting, Recognizing, and Parsing the Address Information from Bangla Signboard: A Deep Learning-based Approach
Authors:
Hasan Murad,
Mohammed Eunus Ali
Abstract:
Retrieving textual information from natural scene images is an active research area in the field of computer vision with numerous practical applications. Detecting text regions and extracting text from signboards is a challenging problem due to characteristics such as reflected light, uneven illumination, and shadows found in real-life natural scene images. With the advent of deep learning-based methods, various sophisticated techniques have been proposed for text detection and text recognition in natural scenes. Though a significant amount of effort has been devoted to extracting natural scene text for high-resource languages like English, little has been done for low-resource languages like Bangla. In this research work, we propose an end-to-end system with deep learning-based models for efficiently detecting, recognizing, correcting, and parsing address information from Bangla signboards. We have created manually annotated datasets and synthetic datasets to train signboard detection, address text detection, address text recognition, address text correction, and address text parser models. We have conducted a comparative study of different CTC-based and Encoder-Decoder model architectures for Bangla address text recognition. Moreover, we have designed a novel address text correction model using a sequence-to-sequence transformer-based network to improve the performance of the Bangla address text recognition model through post-correction. Finally, we have developed a Bangla address text parser using a state-of-the-art transformer-based pre-trained language model.
Submitted 22 November, 2023;
originally announced November 2023.
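The abstract above compares CTC-based recognizers with Encoder-Decoder ones. As a rough illustration of what the CTC side involves at inference time, the following is a minimal sketch of greedy CTC decoding: take the best label per frame, merge consecutive repeats, then drop the blank symbol. The function name `ctc_greedy_decode`, the blank index `0`, and the integer label ids are assumptions made for illustration; they are not taken from the paper, which may well use a different decoding strategy such as beam search.

```python
def ctc_greedy_decode(frame_labels, blank=0):
    """Collapse a per-frame label sequence the way greedy CTC decoding does.

    frame_labels: the arg-max label id for each time step (hypothetical input).
    blank: id reserved for the CTC blank symbol (assumed to be 0 here).
    """
    decoded = []
    previous = None
    for label in frame_labels:
        # Keep a label only when it starts a new run and is not the blank.
        if label != previous and label != blank:
            decoded.append(label)
        previous = label
    return decoded


# A blank between two identical labels keeps both of them.
frames = [0, 1, 1, 0, 1, 2, 2, 0]
result = ctc_greedy_decode(frames)
```

The blank symbol is what lets a CTC model emit a doubled character: without the blank between the two runs of `1`, they would merge into a single label.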
-
BDSL 49: A Comprehensive Dataset of Bangla Sign Language
Authors:
Ayman Hasib,
Saqib Sizan Khan,
Jannatul Ferdous Eva,
Mst. Nipa Khatun,
Ashraful Haque,
Nishat Shahrin,
Rashik Rahman,
Hasan Murad,
Md. Rajibul Islam,
Molla Rashied Hussein
Abstract:
Language is a method by which individuals express their thoughts. Each language has its own set of alphabetic and numeric characters. People can communicate with one another through either oral or written communication. However, each language also has a sign language counterpart. Individuals who are deaf and/or mute communicate through sign language. The Bangla language also has a sign language, which is called BDSL. The dataset consists of Bangla hand sign images covering 49 individual Bangla alphabet signs. BDSL49 consists of 29,490 images with 49 labels. Images of 14 different adult individuals, each with a distinct background and appearance, were recorded during data collection. Several strategies were used to eliminate noise from the dataset during preparation. The dataset is freely available to researchers, who can use it to develop automated systems using machine learning, computer vision, and deep learning techniques. In addition, two models were applied to this dataset: the first for detection and the second for recognition.
Submitted 14 August, 2022;
originally announced August 2022.
-
In-BoXBART: Get Instructions into Biomedical Multi-Task Learning
Authors:
Mihir Parmar,
Swaroop Mishra,
Mirali Purohit,
Man Luo,
M. Hassan Murad,
Chitta Baral
Abstract:
Single-task models have proven pivotal in solving specific tasks; however, they have limitations in real-world applications where multi-tasking is necessary and domain shifts are exhibited. Recently, instructional prompts have shown significant improvement towards multi-task generalization; however, the effect of instructional prompts and Multi-Task Learning (MTL) has not been systematically studied in the biomedical domain. Motivated by this, this paper explores the impact of instructional prompts for biomedical MTL. We introduce the BoX, a collection of 32 instruction tasks for Biomedical NLP across (X) various categories. Using this meta-dataset, we propose a unified model termed In-BoXBART, which can jointly learn all tasks of the BoX without any task-specific modules. To the best of our knowledge, this is the first attempt to propose a unified model in the biomedical domain and use instructions to achieve generalization across several biomedical tasks. Experimental results indicate that the proposed model: 1) outperforms the single-task baseline by ~3% and the multi-task (without instruction) baseline by ~18% on average, and 2) shows ~23% improvement compared to the single-task baseline in few-shot learning (i.e., 32 instances per task) on average. Our analysis indicates that there is significant room for improvement across tasks in the BoX, leaving scope for future research.
Submitted 15 April, 2022;
originally announced April 2022.
-
Enhanced Security of Symmetric Encryption Using Combination of Steganography with Visual Cryptography
Authors:
Sherief H. Murad,
Amr M. Gody,
Tamer M. Barakat
Abstract:
Data security is required when communication takes place over untrusted networks. Security tools such as cryptography and steganography are applied to achieve this objective, but each has limitations and is susceptible to attacks when used individually. To overcome these limitations, we propose a powerful and secure system based on the integration of cryptography and steganography. The secret message is encrypted with the Blowfish cipher and visual cryptography. Finally, the encrypted data is embedded into two innocent cover images for future transmission. An extended analysis was made to demonstrate the efficiency of the proposed model by measuring Mean Squared Error (MSE), Peak Signal-to-Noise Ratio (PSNR), and image histograms. The robustness was examined by launching statistical and 8-bit plane visual attacks. The proposed model provides a secure means to transmit or store highly classified data and could be applied in the public security sector.
Submitted 28 February, 2019;
originally announced February 2019.
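The abstract above evaluates the stego images with MSE and PSNR. For readers unfamiliar with these metrics, here is a minimal sketch of their standard definitions for 8-bit grayscale images, where PSNR = 10 * log10(MAX^2 / MSE) and MAX is 255. The function names `mse` and `psnr`, the nested-list image representation, and the sample pixel values are assumptions chosen for illustration; they do not come from the paper.

```python
import math


def mse(cover, stego):
    """Mean squared error between two same-sized grayscale images.

    Images are given as lists of rows of pixel intensities (an assumed format).
    """
    if len(cover) != len(stego) or any(
        len(row_c) != len(row_s) for row_c, row_s in zip(cover, stego)
    ):
        raise ValueError("images must have the same dimensions")
    total = 0
    count = 0
    for row_c, row_s in zip(cover, stego):
        for c, s in zip(row_c, row_s):
            total += (c - s) ** 2
            count += 1
    return total / count


def psnr(cover, stego, max_value=255):
    """Peak signal-to-noise ratio in decibels; infinite for identical images."""
    err = mse(cover, stego)
    if err == 0:
        return math.inf
    return 10 * math.log10(max_value ** 2 / err)


# Hypothetical 2x2 cover image and a stego version with two pixels changed by 1.
cover_image = [[100, 150], [200, 250]]
stego_image = [[101, 150], [200, 249]]
error = mse(cover_image, stego_image)
quality = psnr(cover_image, stego_image)
```

A lower MSE and a higher PSNR both mean the stego image is closer to the cover image, which is why the two metrics are commonly reported together when judging how visible an embedding is.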