Showing 1–50 of 61 results for author: Roy, P P

  1. arXiv:2405.04040  [pdf, other]

    math.CV

    Bohr radius for invariant families of bounded analytic functions and certain Integral transforms

    Authors: Molla Basir Ahamed, Partha Pratim Roy, Sabir Ahammed

    Abstract: In this paper, we first obtain a refined Bohr radius for invariant families of bounded analytic functions on the unit disk $ \mathbb{D} $. Then, we obtain a Bohr inequality for certain integral transforms, namely Fourier (discrete) and Laplace (discrete) transforms of bounded analytic functions $ f(z)=\sum_{n=0}^{\infty}a_nz^n $, in a simply connected domain \begin{align*} Ω_γ=\biggl\{z\in\mathbb{C}:…

    Submitted 7 May, 2024; originally announced May 2024.

    Comments: 18 pages, 1 figure

    MSC Class: 30C45; 30C50; 30C65; 30C80

  2. arXiv:2405.01895  [pdf, other]

    math.CV

    The Bohr inequality on a simply connected domain and its applications

    Authors: Sabir Ahammed, Molla Basir Ahamed, Partha Pratim Roy

    Abstract: In this article, we first establish a generalized Bohr inequality and examine its sharpness for a class of analytic functions $f$ in a simply connected domain $Ω_γ,$ where $0\leq γ<1$ with a sequence $\{\varphi_n(r) \}^{\infty}_{n=0}$ of non-negative continuous functions defined on $[0,1)$ such that the series $\sum_{n=0}^{\infty}\varphi_n(r)$ converges locally uniformly on $[0,1)$. Our results re…

    Submitted 3 May, 2024; originally announced May 2024.

  3. arXiv:2402.15689  [pdf, ps, other]

    math.CV

    Revisiting Bohr Inequalities with Analytic and Harmonic Mappings on unit disk

    Authors: Molla Basir Ahamed, Partha Pratim Roy

    Abstract: In this paper, we study some improved and refined versions of the classical Bohr inequality applicable to the class $\mathcal{B}$, which consists of self-analytic mappings defined on the unit disk $\mathbb{D}$. First, we improve the Bohr inequality for the class $\mathcal{B}$ of analytic self-maps, incorporating the area measurements of sub-disks $\mathbb{D}_r$ of $\mathbb{D}$. Secondly, we establ…

    Submitted 23 February, 2024; originally announced February 2024.

    Comments: 25 pages, 0 figures

    MSC Class: Primary 30A10; 30H05; 30C35; 30C50 Secondary 30C45

  4. arXiv:2402.11808  [pdf, other]

    math.CV

    Bohr inequalities via proper combinations for a certain class of close-to-convex harmonic mappings

    Authors: Molla Basir Ahamed, Partha Pratim Roy

    Abstract: Let $ \mathcal{H}(Ω) $ be the class of complex-valued functions harmonic in $ Ω\subset\mathbb{C} $ and each $f=h+\overline{g}\in \mathcal{H}(Ω)$, where $ h $ and $ g $ are analytic. In the study of Bohr phenomenon for certain class of harmonic mappings, it is to find a constant $ r_f\in (0, 1) $ such that the inequality \begin{align*} M_f(r):=r+\sum_{n=2}^{\infty}\left(|a_n|+|b_n|\right)r^n\le…

    Submitted 18 February, 2024; originally announced February 2024.

    Comments: 26 pages, 9 figures

    MSC Class: Primary 30C45; 30C50; 30C80

  5. arXiv:2401.16878  [pdf, other]

    cs.HC

    Enhancing EEG Signal-Based Emotion Recognition with Synthetic Data: Diffusion Model Approach

    Authors: Gourav Siddhad, Masakazu Iwamura, Partha Pratim Roy

    Abstract: Emotions are crucial in human life, influencing perceptions, relationships, behaviour, and choices. Emotion recognition using Electroencephalography (EEG) in the Brain-Computer Interface (BCI) domain presents significant challenges, particularly the need for extensive datasets. This study aims to generate synthetic EEG samples that are similar to real samples but are distinct by augmenting noise t…

    Submitted 30 January, 2024; originally announced January 2024.

    Comments: 8 Pages, 3 Figures, 2 Tables

  6. arXiv:2311.11250  [pdf, other]

    cs.AI

    A Comprehensive Review on Sentiment Analysis: Tasks, Approaches and Applications

    Authors: Sudhanshu Kumar, Partha Pratim Roy, Debi Prosad Dogra, Byung-Gyu Kim

    Abstract: Sentiment analysis (SA) is an emerging field in text mining. It is the process of computationally identifying and categorizing opinions expressed in a piece of text over different social media platforms. Social media plays an essential role in knowing the customer mindset towards a product, services, and the latest market trends. Most organizations depend on the customer's response and feedback to…

    Submitted 19 November, 2023; originally announced November 2023.

  7. arXiv:2310.16527  [pdf, other]

    cs.CV cs.LG

    Enhancing Document Information Analysis with Multi-Task Pre-training: A Robust Approach for Information Extraction in Visually-Rich Documents

    Authors: Tofik Ali, Partha Pratim Roy

    Abstract: This paper introduces a deep learning model tailored for document information analysis, emphasizing document classification, entity relation extraction, and document visual question answering. The proposed model leverages transformer-based models to encode all the information present in a document image, including textual, visual, and layout information. The model is pre-trained and subsequently f…

    Submitted 25 October, 2023; originally announced October 2023.

  8. arXiv:2308.02515  [pdf, other]

    cs.LG cs.HC eess.SP

    Feature Reweighting for EEG-based Motor Imagery Classification

    Authors: Taveena Lotey, Prateek Keserwani, Debi Prosad Dogra, Partha Pratim Roy

    Abstract: Classification of motor imagery (MI) using non-invasive electroencephalographic (EEG) signals is a critical objective as it is used to predict the intention of limb movements of a subject. In recent research, convolutional neural network (CNN) based methods have been widely utilized for MI-EEG classification. The challenges of training neural networks for MI-EEG signals classification include low…

    Submitted 29 July, 2023; originally announced August 2023.

  9. arXiv:2308.01548  [pdf, ps, other]

    math.CV

    Hankel and Toeplitz determinants of logarithmic coefficients of Inverse functions for certain classes of univalent functions

    Authors: Sanju Mandal, Partha Pratim Roy, Molla Basir Ahamed

    Abstract: The Hankel and Toeplitz determinants $H_{2,1}(F_{f^{-1}}/2)$ and $T_{2,1}(F_{f^{-1}}/2)$ are defined as: \begin{align*} H_{2,1}(F_{f^{-1}}/2):= \begin{vmatrix} Γ_1 & Γ_2 \\ Γ_2 & Γ_3 \end{vmatrix} \;\;\mbox{and} \;\; T_{2,1}(F_{f^{-1}}/2):= \begin{vmatrix} Γ_1 & Γ_2 \\ Γ_2 & Γ_1 \end{vmatrix} \end{align*} where $Γ_1, Γ_2,$ and $Γ_3$ are the first, second and third logarithmic coefficie…

    Submitted 3 August, 2023; originally announced August 2023.

    Comments: 15 pages. arXiv admin note: substantial text overlap with arXiv:2305.12500, arXiv:2307.14365

    MSC Class: Primary 30A10; 30H05; 30C35; Secondary 30C45

  10. arXiv:2307.15991  [pdf, other]

    cs.CV

    Separate Scene Text Detector for Unseen Scripts is Not All You Need

    Authors: Prateek Keserwani, Taveena Lotey, Rohit Keshari, Partha Pratim Roy

    Abstract: Text detection in the wild is a well-known problem that becomes more challenging while handling multiple scripts. In the last decade, some scripts have gained the attention of the research community and achieved good detection performance. However, many scripts are low-resourced for training deep learning-based scene text detectors. It raises a critical question: Is there a need for separate train…

    Submitted 29 July, 2023; originally announced July 2023.

  11. arXiv:2307.02746  [pdf, ps, other]

    math.CV

    The third Hankel determinant for inverse coefficients of starlike functions of order 1/2

    Authors: Molla Basir Ahamed, Partha Pratim Roy

    Abstract: The sharp bound for the third Hankel determinant for the coefficients of the inverse function of a starlike function of order $1/2$ is obtained. In light of this, we can deduce that the functionals $|H_3(1)(f)|$ and $|H_3(1)(f^{-1})|$ exhibit invariance on the class $\mathcal{S}^*(1/2)$.

    Submitted 5 July, 2023; originally announced July 2023.

    Comments: 9 pages

    MSC Class: Primary 30A10; 30H05; 30C35; Secondary 30C45

  12. arXiv:2305.12500  [pdf, ps, other]

    math.CV

    Sharp bounds for second Hankel determinant of logarithmic coefficients for certain classes of univalent functions

    Authors: Sanju Mandal, Partha Pratim Roy, Molla Basir Ahamed

    Abstract: The Hankel determinant $H_{2,2}(F_{f}/2)$ is defined as: \begin{align*} H_{2,2}(F_{f}/2):= \begin{vmatrix} γ_2 & γ_3 \\ γ_3 & γ_4 \end{vmatrix}, \end{align*} where $γ_2, γ_3,$ and $γ_4$ are the second, third, and fourth logarithmic coefficients of functions belonging to the class $\mathcal{S}$ of normalized univalent functions. In this article, we establish sharp inequalities…

    Submitted 21 May, 2023; originally announced May 2023.

    Comments: 10 pages, 0 figures

    MSC Class: Primary 30A10; 30H05; 30C35; Secondary 30C45

  13. arXiv:2204.09019  [pdf, other]

    eess.SP cs.LG eess.SY physics.ao-ph

    Hybrid Transformer Network for Different Horizons-based Enriched Wind Speed Forecasting

    Authors: Dr. M. Madhiarasan, Prof. Partha Pratim Roy

    Abstract: Highly accurate different horizon-based wind speed forecasting facilitates a better modern power system. This paper proposed a novel astute hybrid wind speed forecasting model and applied it to different horizons. The proposed hybrid forecasting model decomposes the original wind speed data into IMFs (Intrinsic Mode Function) using Improved Complete Ensemble Empirical Mode Decomposition with Adapt…

    Submitted 7 April, 2022; originally announced April 2022.

    Comments: Communicated to IEEE Transactions on Power Systems status Under Review

  14. arXiv:2204.03328  [pdf, other]

    cs.CV cs.AI cs.CL

    A Comprehensive Review of Sign Language Recognition: Different Types, Modalities, and Datasets

    Authors: Dr. M. Madhiarasan, Prof. Partha Pratim Roy

    Abstract: A machine can understand human activities, and the meaning of signs can help overcome the communication barriers between the inaudible and ordinary people. Sign Language Recognition (SLR) is a fascinating research area and a crucial task concerning computer vision and pattern recognition. Recently, SLR usage has increased in many applications, but the environment, background image resolution, moda…

    Submitted 7 April, 2022; originally announced April 2022.

    Comments: communicated to the Computer Science Review (Elsevier) status With Editor

  15. arXiv:2202.05170  [pdf, ps, other]

    eess.SP cs.AI cs.LG

    Efficacy of Transformer Networks for Classification of Raw EEG Data

    Authors: Gourav Siddhad, Anmol Gupta, Debi Prosad Dogra, Partha Pratim Roy

    Abstract: With the unprecedented success of transformer networks in natural language processing (NLP), recently, they have been successfully adapted to areas like computer vision, generative adversarial networks (GAN), and reinforcement learning. Classifying electroencephalogram (EEG) data has been challenging and researchers have been overly dependent on pre-processing and hand-crafted feature extraction.…

    Submitted 8 February, 2022; originally announced February 2022.

    Journal ref: Biomedical Signal Processing and Control, Vol 87, 2023

  16. arXiv:2106.15989  [pdf, other]

    cs.CV cs.MM

    Word-level Sign Language Recognition with Multi-stream Neural Networks Focusing on Local Regions

    Authors: Mizuki Maruyama, Shuvozit Ghose, Katsufumi Inoue, Partha Pratim Roy, Masakazu Iwamura, Michifumi Yoshioka

    Abstract: In recent years, Word-level Sign Language Recognition (WSLR) research has gained popularity in the computer vision community, and thus various approaches have been proposed. Among these approaches, the method using I3D network achieves the highest recognition accuracy on large public datasets for WSLR. However, the method with I3D only utilizes appearance information of the upper body of the signe…

    Submitted 30 June, 2021; originally announced June 2021.

  17. arXiv:2010.12669  [pdf, other]

    cs.CV cs.HC

    Position and Rotation Invariant Sign Language Recognition from 3D Kinect Data with Recurrent Neural Networks

    Authors: Prasun Roy, Saumik Bhattacharya, Partha Pratim Roy, Umapada Pal

    Abstract: Sign language is a gesture-based symbolic communication medium among speech and hearing impaired people. It also serves as a communication bridge between non-impaired and impaired populations. Unfortunately, in most situations, a non-impaired person is not well conversant in such symbolic languages restricting the natural information flow between these two categories. Therefore, an automated trans…

    Submitted 14 March, 2023; v1 submitted 23 October, 2020; originally announced October 2020.

    Comments: 10 pages

  18. arXiv:2010.06200  [pdf, other]

    cs.SD eess.AS

    End-to-end Triplet Loss based Emotion Embedding System for Speech Emotion Recognition

    Authors: Puneet Kumar, Sidharth Jain, Balasubramanian Raman, Partha Pratim Roy, Masakazu Iwamura

    Abstract: In this paper, an end-to-end neural embedding system based on triplet loss and residual learning has been proposed for speech emotion recognition. The proposed system learns the embeddings from the emotional information of the speech utterances. The learned embeddings are used to recognize the emotions portrayed by given speech samples of various lengths. The proposed system implements Residual Ne…

    Submitted 13 October, 2020; originally announced October 2020.

    Comments: Accepted in ICPR 2020

  19. arXiv:2007.07075  [pdf, other]

    cs.CV

    UDBNET: Unsupervised Document Binarization Network via Adversarial Game

    Authors: Amandeep Kumar, Shuvozit Ghose, Pinaki Nath Chowdhury, Partha Pratim Roy, Umapada Pal

    Abstract: Degraded document image binarization is one of the most challenging tasks in the domain of document image analysis. In this paper, we present a novel approach towards document image binarization by introducing three-player min-max adversarial game. We train the network in an unsupervised setup by assuming that we do not have any paired-training data. In our approach, an Adversarial Texture Augment…

    Submitted 27 October, 2020; v1 submitted 14 July, 2020; originally announced July 2020.

    Comments: Accepted in ICPR 2020

  20. arXiv:2007.05764  [pdf, ps, other]

    eess.AS cs.MM

    Fast Griffin Lim based Waveform Generation Strategy for Text-to-Speech Synthesis

    Authors: Ankit Sharma, Puneet Kumar, Vikas Maddukuri, Nagasai Madamshettib, Kishore KG, Sahit Sai Sriram Kavurub, Balasubramanian Raman, Partha Pratim Roy

    Abstract: The performance of text-to-speech (TTS) systems heavily depends on spectrogram to waveform generation, also known as the speech reconstruction phase. The time required for the same is known as synthesis delay. In this paper, an approach to reduce speech synthesis delay has been proposed. It aims to enhance the TTS systems for real-time applications such as digital assistants, mobile phones, embedd…

    Submitted 11 July, 2020; originally announced July 2020.

    Comments: Accepted for publication in Springer Multimedia Tools and Applications Journal

  21. arXiv:2004.08141  [pdf, other]

    cs.CV

    Modeling Extent-of-Texture Information for Ground Terrain Recognition

    Authors: Shuvozit Ghose, Pinaki Nath Chowdhury, Partha Pratim Roy, Umapada Pal

    Abstract: Ground Terrain Recognition is a difficult task as the context information varies significantly over the regions of a ground terrain image. In this paper, we propose a novel approach towards ground-terrain recognition via modeling the Extent-of-Texture information to establish a balance between the order-less texture component and ordered-spatial information locally. At first, the proposed method u…

    Submitted 27 October, 2020; v1 submitted 17 April, 2020; originally announced April 2020.

    Comments: Accepted in ICPR 2020

  22. arXiv:2003.05626  [pdf, other]

    cs.CV

    Understanding Crowd Flow Movements Using Active-Langevin Model

    Authors: Shreetam Behera, Debi Prosad Dogra, Malay Kumar Bandyopadhyay, Partha Pratim Roy

    Abstract: Crowd flow describes the elementary group behavior of crowds. Understanding the dynamics behind these movements can help to identify various abnormalities in crowds. However, developing a crowd model describing these flows is a challenging task. In this paper, a physics-based model is proposed to describe the movements in dense crowds. The crowd model is based on active Langevin equation where the…

    Submitted 18 August, 2020; v1 submitted 12 March, 2020; originally announced March 2020.

  23. arXiv:1904.07233  [pdf, other]

    cs.CV

    Estimation of Linear Motion in Dense Crowd Videos using Langevin Model

    Authors: Shreetam Behera, Debi Prosad Dogra, Malay Kumar Bandyopadhyay, Partha Pratim Roy

    Abstract: Crowd gatherings at social and cultural events are increasing in leaps and bounds with the increase in population. Surveillance through computer vision and expert decision making systems can help to understand the crowd phenomena at large gatherings. Understanding crowd phenomena can be helpful in early identification of unwanted incidents and their prevention. Motion flow is one of the important…

    Submitted 15 April, 2019; originally announced April 2019.

  24. arXiv:1902.04955  [pdf, other]

    cs.CV

    Can We Automate Diagrammatic Reasoning?

    Authors: Sk. Arif Ahmed, Debi Prosad Dogra, Samarjit Kar, Partha Pratim Roy, Dilip K. Prasad

    Abstract: Learning to solve diagrammatic reasoning (DR) can be a challenging but interesting problem to the computer vision research community. It is believed that next generation pattern recognition applications should be able to simulate human brain to understand and analyze reasoning of images. However, due to the lack of benchmarks of diagrammatic reasoning, the present research primarily focuses on vis…

    Submitted 13 February, 2019; originally announced February 2019.

  25. Facial Micro-Expression Spotting and Recognition using Time Contrasted Feature with Visual Memory

    Authors: Sauradip Nag, Ayan Kumar Bhunia, Aishik Konwer, Partha Pratim Roy

    Abstract: Facial micro-expressions are sudden involuntary minute muscle movements which reveal true emotions that people try to conceal. Spotting a micro-expression and recognizing it is a major challenge owing to its short duration and intensity. Many works pursued traditional and deep learning based approaches to solve this issue but compromised on learning low-level features and higher accuracy due to un…

    Submitted 18 April, 2019; v1 submitted 9 February, 2019; originally announced February 2019.

    Comments: International Conference on Acoustics, Speech, and Signal Processing(ICASSP), 2019

  26. Anomaly Detection in Road Traffic Using Visual Surveillance: A Survey

    Authors: Santhosh Kelathodi Kumaran, Debi Prosad Dogra, Partha Pratim Roy

    Abstract: Computer vision has evolved in the last decade as a key technology for numerous applications replacing human supervision. In this paper, we present a survey on relevant visual surveillance related researches for anomaly detection in public places, focusing primarily on roads. Firstly, we revisit the surveys done in the last 10 years in this field. Since the underlying building block of a typical a…

    Submitted 24 January, 2019; originally announced January 2019.

    Journal ref: ACM Computing Surveys (2020), 6(53):Article 119, 2020

  27. arXiv:1812.07203  [pdf, other]

    cs.CV

    Video Trajectory Classification and Anomaly Detection Using Hybrid CNN-VAE

    Authors: Santhosh Kelathodi Kumaran, Debi Prosad Dogra, Partha Pratim Roy, Adway Mitra

    Abstract: Classifying time series data using neural networks is a challenging problem when the length of the data varies. Video object trajectories, which are key to many of the visual surveillance applications, are often found to be of varying length. If such trajectories are used to understand the behavior (normal or anomalous) of moving objects, they need to be represented correctly. In this paper, we pr…

    Submitted 18 December, 2018; originally announced December 2018.

    Comments: First version submitted to a journal on 8-10-2018

  28. arXiv:1811.10804  [pdf, other]

    cs.IR cs.SI

    Movie Recommendation System using Sentiment Analysis from Microblogging Data

    Authors: Sudhanshu Kumar, Shirsendu Sukanta Halder, Kanjar De, Partha Pratim Roy

    Abstract: Recommendation systems are important intelligent systems that play a vital role in providing selective information to users. Traditional approaches in recommendation systems include collaborative filtering and content-based filtering. However, these approaches have certain limitations like the necessity of prior user history and habits for performing the task of recommendation. In order to reduce…

    Submitted 26 November, 2018; originally announced November 2018.

    Comments: 19 pages, 7 tables, 5 figures

  29. arXiv:1811.10801  [pdf, other]

    cs.CV

    Perceptual Conditional Generative Adversarial Networks for End-to-End Image Colourization

    Authors: Shirsendu Sukanta Halder, Kanjar De, Partha Pratim Roy

    Abstract: Colours are everywhere. They embody a significant part of human visual perception. In this paper, we explore the paradigm of hallucinating colours from a given gray-scale image. The problem of colourization has been dealt in previous literature but mostly in a supervised manner involving user-interference. With the emergence of Deep Learning methods numerous tasks related to computer vision and pa…

    Submitted 26 November, 2018; originally announced November 2018.

    Comments: 16 pages, 8 figures, 3 tables

  30. Texture Synthesis Guided Deep Hashing for Texture Image Retrieval

    Authors: Ayan Kumar Bhunia, Perla Sai Raj Kishore, Pranay Mukherjee, Abhirup Das, Partha Pratim Roy

    Abstract: With the large-scale explosion of images and videos over the internet, efficient hashing methods have been developed to facilitate memory and time efficient retrieval of similar images. However, none of the existing works uses hashing to address texture image retrieval mostly because of the lack of sufficiently large texture image databases. Our work addresses this problem by developing a novel de…

    Submitted 5 June, 2019; v1 submitted 4 November, 2018; originally announced November 2018.

    Comments: IEEE Winter Conference on Applications of Computer Vision (WACV), 2019 Video Presentation: https://www.youtube.com/watch?v=tXaXTGhzaJo

  31. arXiv:1811.01396  [pdf, other]

    cs.CV

    Handwriting Recognition in Low-resource Scripts using Adversarial Learning

    Authors: Ayan Kumar Bhunia, Abhirup Das, Ankan Kumar Bhunia, Perla Sai Raj Kishore, Partha Pratim Roy

    Abstract: Handwritten Word Recognition and Spotting is a challenging field dealing with handwritten text possessing irregular and complex shapes. The design of deep neural network models makes it necessary to extend training datasets in order to introduce variations and increase the number of samples; word-retrieval is therefore very difficult in low-resource scripts. Much of the existing literature compris…

    Submitted 25 February, 2019; v1 submitted 4 November, 2018; originally announced November 2018.

    Comments: IEEE Conf. on Computer Vision and Pattern Recognition (CVPR), 2019

  32. A Deep One-Shot Network for Query-based Logo Retrieval

    Authors: Ayan Kumar Bhunia, Ankan Kumar Bhunia, Shuvozit Ghose, Abhirup Das, Partha Pratim Roy, Umapada Pal

    Abstract: Logo detection in real-world scene images is an important problem with applications in advertisement and marketing. Existing general-purpose object detection methods require large training data with annotations for every logo class. These methods do not satisfy the incremental demand of logo classes necessary for practical deployment since it is practically impossible to have such annotated data f…

    Submitted 13 July, 2019; v1 submitted 4 November, 2018; originally announced November 2018.

    Comments: Accepted in Pattern Recognition, Elsevier(2019)

  33. arXiv:1811.00201  [pdf, other]

    cs.CV

    Cogni-Net: Cognitive Feature Learning through Deep Visual Perception

    Authors: Pranay Mukherjee, Abhirup Das, Ayan Kumar Bhunia, Partha Pratim Roy

    Abstract: Can we ask computers to recognize what we see from brain signals alone? Our paper seeks to utilize the knowledge learnt in the visual domain by popular pre-trained vision models and use it to teach a recurrent model being trained on brain signals to learn a discriminative manifold of the human brain's cognition of different visual object categories in response to perceived visual cues. For this we…

    Submitted 1 May, 2019; v1 submitted 31 October, 2018; originally announced November 2018.

    Comments: IEEE International Conference on Image Processing (ICIP), 2019

  34. User Constrained Thumbnail Generation using Adaptive Convolutions

    Authors: Perla Sai Raj Kishore, Ayan Kumar Bhunia, Shuvozit Ghose, Partha Pratim Roy

    Abstract: Thumbnails are widely used all over the world as a preview for digital images. In this work we propose a deep neural framework to generate thumbnails of any size and aspect ratio, even for unseen values during training, with high accuracy and precision. We use Global Context Aggregation (GCA) and a modified Region Proposal Network (RPN) with adaptive convolutions to generate thumbnails in real tim…

    Submitted 18 April, 2019; v1 submitted 30 October, 2018; originally announced October 2018.

    Comments: International Conference on Acoustics, Speech, and Signal Processing(ICASSP), 2019

  35. arXiv:1810.11120  [pdf, other]

    cs.CV

    Improving Document Binarization via Adversarial Noise-Texture Augmentation

    Authors: Ankan Kumar Bhunia, Ayan Kumar Bhunia, Aneeshan Sain, Partha Pratim Roy

    Abstract: Binarization of degraded document images is an elementary step in most of the problems in document image analysis domain. The paper re-visits the binarization problem by introducing an adversarial learning approach. We construct a Texture Augmentation Network that transfers the texture element of a degraded reference document image to a clean binary image. In this way, the network creates multiple…

    Submitted 1 May, 2019; v1 submitted 25 October, 2018; originally announced October 2018.

    Comments: IEEE International Conference on Image Processing (ICIP), 2019. The full source code of the proposed system is publicly available at https://github.com/ankanbhunia/AdverseBiNet

  36. arXiv:1810.10581  [pdf, other]

    cs.HC cs.CV cs.LG stat.ML

    Visual Rendering of Shapes on 2D Display Devices Guided by Hand Gestures

    Authors: Abhik Singla, Partha Pratim Roy, Debi Prosad Dogra

    Abstract: Designing of touchless user interface is gaining popularity in various contexts. Using such interfaces, users can interact with electronic devices even when the hands are dirty or non-conductive. Also, user with partial physical disability can interact with electronic devices using such systems. Research in this direction has got major boost because of the emergence of low-cost sensors such as Lea…

    Submitted 22 October, 2018; originally announced October 2018.

    Comments: Submitted to Elsevier Displays Journal, 32 pages, 18 figures, 7 tables

  37. Fingertip Detection and Tracking for Recognition of Air-Writing in Videos

    Authors: Sohom Mukherjee, Arif Ahmed, Debi Prosad Dogra, Samarjit Kar, Partha Pratim Roy

    Abstract: Air-writing is the process of writing characters or words in free space using finger or hand movements without the aid of any hand-held device. In this work, we address the problem of mid-air finger writing using web-cam video as input. In spite of recent advances in object detection and tracking, accurate and robust detection and tracking of the fingertip remains a challenging task, primarily due…

    Submitted 9 September, 2018; originally announced September 2018.

    Comments: 32 pages, 10 figures, 2 tables. Submitted to Journal of Expert Systems with Applications

    Journal ref: Expert Systems with Applications Volume 136, 1 December 2019, Pages 217-229

  38. arXiv:1807.06772  [pdf, ps, other]

    cs.CV

    Bag-of-Visual-Words for Signature-Based Multi-Script Document Retrieval

    Authors: Ranju Mandal, Partha Pratim Roy, Umapada Pal, Michael Blumenstein

    Abstract: An end-to-end architecture for multi-script document retrieval using handwritten signatures is proposed in this paper. The user supplies a query signature sample and the system exclusively returns a set of documents that contain the query signature. In the first stage, a component-wise classification technique separates the potential signature components from all other components. A bag-of-visual-…

    Submitted 18 July, 2018; originally announced July 2018.

  39. Temporal Unknown Incremental Clustering (TUIC) Model for Analysis of Traffic Surveillance Videos

    Authors: Santhosh Kelathodi Kumaran, Debi Prosad Dogra, Partha Pratim Roy

    Abstract: Optimized scene representation is an important characteristic of a framework for detecting abnormalities on live videos. One of the challenges for detecting abnormalities in live videos is real-time detection of objects in a non-parametric way. Another challenge is to efficiently represent the state of objects temporally across frames. In this paper, a Gibbs sampling based heuristic model referred…

    Submitted 18 April, 2018; originally announced April 2018.

  40. arXiv:1804.06254  [pdf]

    cs.CV

    Synthetic data generation for Indic handwritten text recognition

    Authors: Partha Pratim Roy, Akash Mohta, Bidyut B. Chaudhuri

    Abstract: This paper presents a novel approach to generate synthetic dataset for handwritten word recognition systems. It is difficult to recognize handwritten scripts for which sufficient training data is not readily available or it may be expensive to collect such data. Hence, it becomes hard to train recognition systems owing to lack of proper dataset. To overcome such problems, synthetic data could be u…

    Submitted 17 April, 2018; originally announced April 2018.

  41. arXiv:1803.06613  [pdf, other]

    cs.CV

    Trajectory-based Scene Understanding using Dirichlet Process Mixture Model

    Authors: Santhosh Kelathodi Kumaran, Debi Prosad Dogra, Partha Pratim Roy, Bidyut Baran Chaudhuri

    Abstract: Appropriate modeling of a surveillance scene is essential for detection of anomalies in road traffic. Learning usual paths can provide valuable insight into road traffic conditions and thus can help in identifying unusual routes taken by commuters/vehicles. If usual traffic paths are learned in a nonparametric way, manual interventions in road marking can be avoided. In this paper, we propose…

    Submitted 16 June, 2019; v1 submitted 18 March, 2018; originally announced March 2018.

    Comments: 14 pages, 27 figures

  42. Queuing Theory Guided Intelligent Traffic Scheduling through Video Analysis using Dirichlet Process Mixture Model

    Authors: Santhosh Kelathodi Kumaran, Debi Prosad Dogra, Partha Pratim Roy

    Abstract: Accurate prediction of traffic signal duration for roadway junction is a challenging problem due to the dynamic nature of traffic flows. Though supervised learning can be used, parameters may vary across roadway junctions. In this paper, we present a computer vision guided expert system that can learn the departure rate of a given traffic junction modeled using traditional queuing theory. First, w…

    Submitted 17 March, 2018; originally announced March 2018.

    Journal ref: Expert Systems with Applications Volume 118, 15 March 2019, Pages 169-181

  43. arXiv:1802.08568  [pdf, other]

    cs.CV

    Indic Handwritten Script Identification using Offline-Online Multimodal Deep Network

    Authors: Ayan Kumar Bhunia, Subham Mukherjee, Aneeshan Sain, Ankan Kumar Bhunia, Partha Pratim Roy, Umapada Pal

    Abstract: In this paper, we propose a novel approach to word-level Indic script identification using only character-level data in the training stage. The advantages of using character-level data for training have been outlined in section I. Our method uses a multimodal deep network which takes both the offline and online modalities of the data as input in order to explore the information from both modalities join…

    Submitted 15 October, 2019; v1 submitted 23 February, 2018; originally announced February 2018.

    Comments: Accepted in Information Fusion, Elsevier

  44. arXiv:1801.07211  [pdf]

    cs.CV

    Handwriting Trajectory Recovery using End-to-End Deep Encoder-Decoder Network

    Authors: Ayan Kumar Bhunia, Abir Bhowmick, Ankan Kumar Bhunia, Aishik Konwer, Prithaj Banerjee, Partha Pratim Roy, Umapada Pal

    Abstract: In this paper, we introduce a novel technique to recover the pen trajectory of offline characters, which is a crucial step for handwritten character recognition. Generally, the online acquisition approach has an advantage over its offline counterpart, as the online technique keeps track of the pen movement. Hence, pen tip trajectory retrieval from offline text can bridge the gap between online and off…

    Submitted 3 June, 2018; v1 submitted 22 January, 2018; originally announced January 2018.

    Comments: To appear in ICPR 2018, 2018 International Conference on Pattern Recognition. Code link: https://drive.google.com/file/d/1clT-UuXgPp6uFn1tmIXx481qvPUcY0fV/view

  45. arXiv:1801.07156  [pdf]

    cs.CV

    Word Level Font-to-Font Image Translation using Convolutional Recurrent Generative Adversarial Networks

    Authors: Ankan Kumar Bhunia, Ayan Kumar Bhunia, Prithaj Banerjee, Aishik Konwer, Abir Bhowmick, Partha Pratim Roy, Umapada Pal

    Abstract: Conversion of one font to another is very useful in real-life applications. In this paper, we propose a Convolutional Recurrent Generative model to solve the word-level font transfer problem. Our network is able to convert the font style of any printed text image from its current font to the required font. The network is trained end-to-end for the complete word images. Thus it eliminates the…

    Submitted 23 May, 2018; v1 submitted 22 January, 2018; originally announced January 2018.

    Comments: To appear in ICPR 2018, 2018 International Conference on Pattern Recognition

  46. arXiv:1801.07141  [pdf]

    cs.CV

    Staff line Removal using Generative Adversarial Networks

    Authors: Aishik Konwer, Ayan Kumar Bhunia, Abir Bhowmick, Ankan Kumar Bhunia, Prithaj Banerjee, Partha Pratim Roy, Umapada Pal

    Abstract: Staff line removal is a crucial pre-processing step in Optical Music Recognition. It is a challenging task to simultaneously reduce the noise and also retain the quality of music symbol context in ancient degraded music score images. In this paper we propose a novel approach for staff line removal, based on Generative Adversarial Networks. We convert staff line images into patches and feed them in…

    Submitted 5 June, 2018; v1 submitted 22 January, 2018; originally announced January 2018.

    Comments: To appear in ICPR 2018, 2018 International Conference on Pattern Recognition (Oral)

  47. arXiv:1801.00879  [pdf]

    cs.CV

    A Novel Feature Descriptor for Image Retrieval by Combining Modified Color Histogram and Diagonally Symmetric Co-occurrence Texture Pattern

    Authors: Ayan Kumar Bhunia, Avirup Bhattacharyya, Prithaj Banerjee, Partha Pratim Roy, Subrahmanyam Murala

    Abstract: In this paper, we propose a novel feature descriptor that combines color and texture information. In the proposed color descriptor component, we explore the inter-channel relationship between the Hue (H) and Saturation (S) channels in the HSV color space, which had not been done earlier. We have quantized the H channel into a number of bins and performed the voting with saturation val…

    Submitted 2 January, 2018; originally announced January 2018.

    Comments: Preprint Submitted

  48. Script Identification in Natural Scene Image and Video Frame using Attention based Convolutional-LSTM Network

    Authors: Ankan Kumar Bhunia, Aishik Konwer, Ayan Kumar Bhunia, Abir Bhowmick, Partha P. Roy, Umapada Pal

    Abstract: Script identification plays a significant role in analysing documents and videos. In this paper, we focus on the problem of script identification in scene text images and video scripts. Because of low image quality, complex backgrounds, and the similar character layouts shared by some scripts such as Greek and Latin, text recognition in those cases becomes challenging. In this paper, we propose a nove…

    Submitted 7 August, 2018; v1 submitted 1 January, 2018; originally announced January 2018.

    Comments: The first and second authors contributed equally. Accepted in Pattern Recognition Journal

  49. arXiv:1801.00187  [pdf]

    cs.CV

    Fractional Local Neighborhood Intensity Pattern for Image Retrieval using Genetic Algorithm

    Authors: Shuvozit Ghose, Abhirup Das, Ayan Kumar Bhunia, Partha Pratim Roy

    Abstract: In this paper, a new texture descriptor named "Fractional Local Neighborhood Intensity Pattern" (FLNIP) has been proposed for content based image retrieval (CBIR). It is an extension of the Local Neighborhood Intensity Pattern (LNIP)[1]. FLNIP calculates the relative intensity difference between a particular pixel and the center pixel of a 3x3 window by considering the relationship with adjacent n…

    Submitted 20 November, 2019; v1 submitted 30 December, 2017; originally announced January 2018.

    Comments: MTAP, Springer (Minor Revision)

  50. Cross-language Framework for Word Recognition and Spotting of Indic Scripts

    Authors: Ayan Kumar Bhunia, Partha Pratim Roy, Akash Mohta, Umapada Pal

    Abstract: Handwritten word recognition and spotting of low-resource scripts are difficult because sufficient training data is not available and collecting data for such scripts is often expensive. This paper presents a novel cross-language platform for handwritten word recognition and spotting for such low-resource scripts, where training is performed with a sufficiently large dataset of an available script…

    Submitted 28 January, 2018; v1 submitted 19 December, 2017; originally announced December 2017.

    Comments: Accepted in Pattern Recognition, Elsevier (2018)