-
Part-of-Speech Tagging of Odia Language Using statistical and Deep Learning-Based Approaches
Authors:
Tusarkanta Dalai,
Tapas Kumar Mishra,
Pankaj K Sa
Abstract:
Automatic part-of-speech (POS) tagging is a preprocessing step of many natural language processing (NLP) tasks such as named entity recognition (NER), speech processing, information extraction, word sense disambiguation, and machine translation. It has already achieved promising results in English and European languages, but in Indian languages, particularly in Odia, it is not yet well explored because of the lack of supporting tools and resources and the morphological richness of the language. Unfortunately, we were unable to locate an open source POS tagger for Odia, and only a handful of attempts have been made to develop POS taggers for the Odia language. The main contribution of this research work is to present conditional random field (CRF) and deep learning-based approaches (CNN and Bidirectional Long Short-Term Memory) to develop an Odia part-of-speech tagger. We used a publicly accessible corpus, and the dataset is annotated with the Bureau of Indian Standards (BIS) tagset. However, most languages around the globe have used datasets annotated with the Universal Dependencies (UD) tagset. Hence, to maintain uniformity, the Odia dataset should use the same tagset, so we have constructed a simple mapping from the BIS tagset to the UD tagset. We experimented with various feature set inputs to the CRF model and observed the impact of the constructed feature sets. The deep learning-based model includes a Bi-LSTM network, a CNN network, a CRF layer, character sequence information, and pre-trained word vectors. Character sequence information was extracted by using a convolutional neural network (CNN) and a Bi-LSTM network. Six different combinations of neural sequence labelling models are implemented, and their performance measures are investigated. It has been observed that the Bi-LSTM model with character sequence features and pre-trained word vectors achieved a significant state-of-the-art result.
Submitted 7 July, 2022;
originally announced July 2022.
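The tagset mapping mentioned in the abstract above can be pictured as a simple lookup table. The following is a minimal sketch only: the dictionary `BIS_TO_UD`, the function `map_tags`, and the specific tag pairs are illustrative assumptions, not the mapping the authors constructed.

```python
# Hypothetical, partial BIS -> UD lookup table.
# The pairs below are illustrative assumptions, not the authors' published mapping.
BIS_TO_UD = {
    "N_NN": "NOUN",      # common noun
    "N_NNP": "PROPN",    # proper noun
    "PR_PRP": "PRON",    # personal pronoun
    "V_VM": "VERB",      # main verb
    "V_VAUX": "AUX",     # auxiliary verb
    "JJ": "ADJ",         # adjective
    "RB": "ADV",         # adverb
    "PSP": "ADP",        # postposition
    "CC_CCD": "CCONJ",   # coordinating conjunction
    "CC_CCS": "SCONJ",   # subordinating conjunction
    "QT_QTC": "NUM",     # cardinal quantifier
    "RD_PUNC": "PUNCT",  # punctuation
}


def map_tags(tagged_sentence, fallback="X"):
    """Convert (word, BIS tag) pairs to (word, UD tag) pairs.

    Tags missing from the table are mapped to `fallback` (UD's "X" by default).
    """
    return [(word, BIS_TO_UD.get(tag, fallback)) for word, tag in tagged_sentence]


if __name__ == "__main__":
    sentence = [("w1", "N_NNP"), ("w2", "V_VM"), ("w3", "RD_PUNC")]
    print(map_tags(sentence))
```

Falling back to `X` keeps the conversion total, so an unseen BIS tag does not abort a corpus-wide conversion; a real mapping would need to cover the full BIS tag inventory.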
-
Telegram Monitor: Monitoring Brazilian Political Groups and Channels on Telegram
Authors:
Manoel Júnior,
Philipe Melo,
Daniel Kansaon,
Vitor Mafra,
Kaio Sá,
Fabrício Benevenuto
Abstract:
Instant messaging platforms such as Telegram have become one of the main means of communication used by people all over the world. Most of them are home to many groups and channels that connect thousands of people focused on political topics. However, they have suffered from misinformation campaigns with a direct impact on electoral processes around the world. While some platforms, such as WhatsApp, adopted restrictive policies and measures to attenuate the issues arising from the abuse of their systems, others have emerged as alternatives, with few or no restrictions on content and little or no action to combat misinformation. Telegram is one of those systems, and it has been attracting more users and gaining popularity. In this work, we present the "Telegram Monitor", a web-based system that monitors the political debate in this environment and enables the analysis of the most shared content in multiple channels and public groups. Our system aims to allow journalists, researchers, and fact-checking agencies to identify trending conspiracy theories and misinformation campaigns, or simply to monitor the political debate in this space during the 2022 Brazilian elections. We hope our system can assist in combating the spread of misinformation through Telegram in Brazil.
Submitted 9 February, 2022;
originally announced February 2022.
-
Weakly-supervised Joint Anomaly Detection and Classification
Authors:
Snehashis Majhi,
Srijan Das,
Francois Bremond,
Ratnakar Dash,
Pankaj Kumar Sa
Abstract:
Anomalous activities such as robbery, explosions, and accidents need immediate action to prevent loss of human life and property in real-world surveillance systems. Although recent automated surveillance systems are capable of detecting anomalies, they still need human effort to categorize the anomalies and take the necessary preventive actions. This is due to the lack of a methodology that performs both anomaly detection and classification in real-world scenarios. For a fully automated surveillance system that is capable of both detecting and classifying the anomalies that need immediate action, a joint anomaly detection and classification method is a pressing need. The task of joint detection and classification of anomalies is challenging due to the unavailability of densely annotated videos pertaining to anomalous classes, which are crucial for training modern deep architectures. Furthermore, producing such annotations through manual human effort seems infeasible. Thus, we propose a method that jointly handles anomaly detection and classification in a single framework by adopting a weakly-supervised learning paradigm. In weakly-supervised learning, video-level labels are sufficient for learning, instead of dense temporal annotations. The proposed model is validated on the large-scale, publicly available UCF-Crime dataset, achieving state-of-the-art results.
Submitted 20 August, 2021;
originally announced August 2021.
-
Blind Deblurring using Deep Learning: A Survey
Authors:
Siddhant Sahu,
Manoj Kumar Lenka,
Pankaj Kumar Sa
Abstract:
We inspect all the deep learning based solutions and provide a holistic understanding of the various architectures that have evolved over the past few years to solve blind deblurring. The introductory work used deep learning to estimate some features of the blur kernel and then moved on to predicting the blur kernel entirely, which converts the problem into non-blind deblurring. The recent state-of-the-art techniques are end to end, i.e., they do not estimate the blur kernel but instead try to estimate the latent sharp image directly from the blurred image. The benchmark PSNR and SSIM values on the standard GOPRO and Kohler datasets using various architectures are also provided.
Submitted 23 July, 2019;
originally announced July 2019.
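The PSNR metric named in the abstract above has a standard definition, PSNR = 10 log10(MAX^2 / MSE). The following is a minimal sketch of that definition, assuming images are supplied as flat sequences of pixel intensities; the function name `psnr` and the default peak value of 255 are choices made for this illustration, not details taken from the survey.

```python
import math


def psnr(reference, restored, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized images.

    Both images are flat sequences of pixel intensities; `max_value` is the
    largest possible intensity (255 for 8-bit images, an assumption here).
    """
    if len(reference) != len(restored) or len(reference) == 0:
        raise ValueError("images must be non-empty and of equal size")
    # Mean squared error over all pixels.
    mse = sum((r - d) ** 2 for r, d in zip(reference, restored)) / len(reference)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)


if __name__ == "__main__":
    sharp = [10, 20, 30, 40]
    deblurred = [12, 18, 33, 39]
    print(f"PSNR: {psnr(sharp, deblurred):.2f} dB")
```

Higher values indicate a restored image closer to the reference. SSIM, the other metric mentioned, compares local structure rather than per-pixel error and is not covered by this sketch.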
-
Localization of Unmanned Aerial Vehicles in Corridor Environments using Deep Learning
Authors:
Ram Prasad Padhy,
Shahzad Ahmad,
Sachin Verma,
Pankaj Kumar Sa,
Sambit Bakshi
Abstract:
Vision-based pose estimation of Unmanned Aerial Vehicles (UAV) in unknown environments is a rapidly growing research area in the field of robot vision. The task becomes more complex when the only available sensor is a static single camera (monocular vision). In this regard, we propose a monocular vision assisted localization algorithm that will help a UAV to navigate safely in indoor corridor environments. The aim is always to navigate the UAV through a corridor in the forward direction by keeping it at the center with no orientation either to the left or right side. The algorithm makes use of the RGB image, captured from the UAV front camera, and passes it through a trained deep neural network (DNN) to predict the position of the UAV as either on the left or center or right side of the corridor. Depending upon the divergence of the UAV with respect to the central bisector line (CBL) of the corridor, a suitable command is generated to bring the UAV to the center. When the UAV is at the center of the corridor, a new image is passed through another trained DNN to predict the orientation of the UAV with respect to the CBL of the corridor. If the UAV is tilted either left or right, an appropriate command is generated to rectify the orientation. We also propose a new corridor dataset, named NITRCorrV1, which contains images as captured by the UAV front camera when the UAV is at all possible locations of a variety of corridors. An exhaustive set of experiments in different corridors reveals the efficacy of the proposed algorithm.
Submitted 21 March, 2019;
originally announced March 2019.
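The two-stage decision logic described in the abstract above (correct the lateral position first, then the orientation) can be sketched as a small function. This is an illustration under stated assumptions: the label strings, the command names, and the function `next_command` are hypothetical, and the two trained DNNs are stood in for by a position label and a callable that returns an orientation label.

```python
def next_command(position, predict_orientation):
    """Choose a corrective command from the two classifier outputs.

    position: 'left', 'center', or 'right', standing in for the first DNN's output.
    predict_orientation: zero-argument callable standing in for the second DNN;
        it returns 'left', 'straight', or 'right' and is only called once the
        UAV is centered, mirroring the order described in the abstract.
    All label and command names are hypothetical.
    """
    if position == "left":
        return "move_right"   # drifted left of the central bisector line
    if position == "right":
        return "move_left"    # drifted right of the central bisector line
    if position != "center":
        raise ValueError(f"unknown position label: {position!r}")

    orientation = predict_orientation()
    if orientation == "left":
        return "rotate_right"  # tilted left, so rotate back toward the CBL
    if orientation == "right":
        return "rotate_left"   # tilted right, so rotate back toward the CBL
    if orientation != "straight":
        raise ValueError(f"unknown orientation label: {orientation!r}")
    return "move_forward"      # centered and aligned


if __name__ == "__main__":
    print(next_command("left", lambda: "straight"))
    print(next_command("center", lambda: "right"))
    print(next_command("center", lambda: "straight"))
```

Passing the orientation predictor as a callable keeps the second network from being evaluated until the position check has succeeded.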
-
The Unconstrained Ear Recognition Challenge 2019 - ArXiv Version With Appendix
Authors:
Žiga Emeršič,
Aruna Kumar S. V.,
B. S. Harish,
Weronika Gutfeter,
Jalil Nourmohammadi Khiarak,
Andrzej Pacut,
Earnest Hansley,
Mauricio Pamplona Segundo,
Sudeep Sarkar,
Hyeonjung Park,
Gi Pyo Nam,
Ig-Jae Kim,
Sagar G. Sangodkar,
Ümit Kaçar,
Murvet Kirci,
Li Yuan,
Jishou Yuan,
Haonan Zhao,
Fei Lu,
Junying Mao,
Xiaoshuang Zhang,
Dogucan Yaman,
Fevziye Irem Eyiokur,
Kadir Bulut Özler,
Hazım Kemal Ekenel
, et al. (6 additional authors not shown)
Abstract:
This paper presents a summary of the 2019 Unconstrained Ear Recognition Challenge (UERC), the second in a series of group benchmarking efforts centered around the problem of person recognition from ear images captured in uncontrolled settings. The goal of the challenge is to assess the performance of existing ear recognition techniques on a challenging large-scale ear dataset and to analyze the performance of the technology from various viewpoints, such as generalization abilities to unseen data characteristics, sensitivity to rotations, occlusions and image resolution, and performance bias on sub-groups of subjects, selected based on demographic criteria, i.e. gender and ethnicity. Research groups from 12 institutions entered the competition and submitted a total of 13 recognition approaches ranging from descriptor-based methods to deep-learning models. The majority of submissions focused on ensemble-based methods combining either representations from multiple deep models or hand-crafted with learned image descriptors. Our analysis shows that methods incorporating deep learning models clearly outperform techniques relying solely on hand-crafted descriptors, even though both groups of techniques exhibit similar behaviour when it comes to robustness to various covariates, such as the presence of occlusions, changes in (head) pose, or variability in image resolution. The results of the challenge also show that there has been considerable progress since the first UERC in 2017, but that there is still ample room for further research in this area.
Submitted 14 March, 2019; v1 submitted 11 March, 2019;
originally announced March 2019.
-
Considerations for a PAP Smear Image Analysis System with CNN Features
Authors:
Srishti Gautam,
Harinarayan K. K.,
Nirmal Jith,
Anil K. Sao,
Arnav Bhavsar,
Adarsh Natarajan
Abstract:
It has been shown that for automated PAP-smear image classification, nucleus features can be very informative. Therefore, the primary step for automated screening can be cell-nuclei detection followed by segmentation of nuclei in the resulting single cell PAP-smear images. We propose a patch based approach using CNN for segmentation of nuclei in single cell images. We then pose the question of the necessity of segmentation for classification using representation learning with CNN, and whether low-level CNN features may be useful for classification. We suggest a CNN-based feature level analysis and a transfer learning based approach for classification using both segmented as well as full single cell images. We also propose a decision-tree based approach for classification. Experimental results demonstrate the effectiveness of the proposed algorithms individually (with low-level CNN features), while also demonstrating the sufficiency of cell-nuclei detection (rather than accurate segmentation) for classification. Thus, we propose a system for analysis of multi-cell PAP-smear images consisting of a simple nuclei detection algorithm followed by classification using transfer learning.
Submitted 23 June, 2018;
originally announced June 2018.
-
Proceedings of the third "international Traveling Workshop on Interactions between Sparse models and Technology" (iTWIST'16)
Authors:
V. Abrol,
O. Absil,
P. -A. Absil,
S. Anthoine,
P. Antoine,
T. Arildsen,
N. Bertin,
F. Bleichrodt,
J. Bobin,
A. Bol,
A. Bonnefoy,
F. Caltagirone,
V. Cambareri,
C. Chenot,
V. Crnojević,
M. Daňková,
K. Degraux,
J. Eisert,
J. M. Fadili,
M. Gabrié,
N. Gac,
D. Giacobello,
A. Gonzalez,
C. A. Gomez Gonzalez,
A. González
, et al. (36 additional authors not shown)
Abstract:
The third edition of the "international - Traveling Workshop on Interactions between Sparse models and Technology" (iTWIST) took place in Aalborg, the 4th largest city in Denmark situated beautifully in the northern part of the country, from the 24th to 26th of August 2016. The workshop venue was at the Aalborg University campus. One implicit objective of this biennial workshop is to foster collaboration between international scientific teams by disseminating ideas through both specific oral/poster presentations and free discussions. For this third edition, iTWIST'16 gathered about 50 international participants and features 8 invited talks, 12 oral presentations, and 12 posters on the following themes, all related to the theory, application and generalization of the "sparsity paradigm": Sparsity-driven data sensing and processing (e.g., optics, computer vision, genomics, biomedical, digital communication, channel estimation, astronomy); Application of sparse models in non-convex/non-linear inverse problems (e.g., phase retrieval, blind deconvolution, self calibration); Approximate probabilistic inference for sparse problems; Sparse machine learning and inference; "Blind" inverse problems and dictionary learning; Optimization for sparse modelling; Information theory, geometry and randomness; Sparsity? What's next? (Discrete-valued signals; Union of low-dimensional spaces, Cosparsity, mixed/group norm, model-based, low-complexity models, ...); Matrix/manifold sensing/processing (graph, low-rank approximation, ...); Complexity/accuracy tradeoffs in numerical methods/optimization; Electronic/optical compressive sensors (hardware).
Submitted 14 September, 2016;
originally announced September 2016.
-
Validity and reliability of free software for bidimensional gait analysis
Authors:
Ana Paula Quixadá,
Andrea Naomi Onodera,
Norberto Peña,
José Garcia Vivas Miranda,
Katia Nunes Sá
Abstract:
Although systems for evaluating human movement have advanced in recent decades, their use is not feasible in clinical practice because of their high cost and the scarcity of trained operators to interpret their results. An ideal videogrammetry system should be easy to use, low cost, require minimal equipment, and be fast to perform. CvMob is a free tool for dynamic evaluation of human movements that expresses measurements in figures, tables, and graphics. This paper aims to determine whether CvMob is a reliable tool for the evaluation of two-dimensional human gait. This is a validity and reliability study. The sample was composed of 56 healthy individuals who walked on a 9-meter-long walkway and were simultaneously filmed by CvMob and Vicon system cameras. Linear trajectories and angular measurements were compared to validate the CvMob system, and inter- and intrarater findings of the same measurements were used to determine reliability. A strong correlation (rs mean = 0.988) of the linear trajectories was found between systems and in the inter- and intrarater analyses. According to the Bland-Altman method, the measurements that had good agreement between systems were maximum flexion and extension (stance and swing) of the knee, dorsiflexion range of motion, and stride length. CvMob is a reliable tool for analysis of linear motion and lengths in two-dimensional evaluations of human gait. The angular measurements demonstrate high agreement for the knee joint; however, the hip and ankle measurements were limited by differences between systems.
Submitted 14 February, 2016;
originally announced February 2016.
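The Bland-Altman method cited in the abstract above summarizes agreement between two measurement systems by the mean of their paired differences (the bias) and the 95% limits of agreement, bias ± 1.96 standard deviations. The following is a minimal sketch of that calculation; the function name `bland_altman` and the sample values are invented for illustration and are not data from the study.

```python
from statistics import mean, stdev


def bland_altman(system_a, system_b):
    """Return (bias, lower_limit, upper_limit) for paired measurements.

    bias is the mean of the paired differences; the limits of agreement are
    bias -/+ 1.96 times the sample standard deviation of the differences.
    """
    if len(system_a) != len(system_b) or len(system_a) < 2:
        raise ValueError("need at least two paired measurements")
    diffs = [a - b for a, b in zip(system_a, system_b)]
    bias = mean(diffs)
    sd = stdev(diffs)
    return bias, bias - 1.96 * sd, bias + 1.96 * sd


if __name__ == "__main__":
    # Invented knee-angle readings (degrees) from two hypothetical systems.
    readings_a = [61.0, 58.5, 63.2, 60.1, 59.4]
    readings_b = [60.2, 59.0, 62.5, 61.0, 58.8]
    bias, lower, upper = bland_altman(readings_a, readings_b)
    print(f"bias={bias:.2f}, limits of agreement=({lower:.2f}, {upper:.2f})")
```

Narrow limits around a bias close to zero indicate that the two systems can be used interchangeably for that measurement; what counts as narrow enough is a clinical judgment, not something the calculation decides.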
-
Making sense of randomness: an approach for fast recovery of compressively sensed signals
Authors:
V. Abrol,
P. Sharma,
A. K Sao
Abstract:
In the compressed sensing (CS) framework, a signal is sampled below the Nyquist rate, and the acquired compressed samples are generally random in nature. However, for efficient estimation of the actual signal, the sensing matrix must preserve the relative distances among the acquired compressed samples. Provided this condition is fulfilled, we show that CS samples will preserve the envelope of the actual signal even at different compression ratios. Exploiting this envelope-preserving property of CS samples, we propose a new fast dictionary learning (DL) algorithm which is able to extract prototype signals from compressive samples for efficient sparse representation and recovery of signals. These prototype signals are orthogonal intrinsic mode functions (IMFs) extracted using empirical mode decomposition (EMD), which is one of the popular methods to capture the envelope of a signal. The extracted IMFs are used to build the dictionary without even knowing the original signal or the sensing matrix. Moreover, one can build the dictionary on-line as new CS samples become available. In particular, to recover the first $L$ signals ($\in\mathbb{R}^n$) at the decoder, one can build the dictionary in just $\mathcal{O}(nL\log n)$ operations, which is far fewer than existing approaches require. The efficiency of the proposed approach is demonstrated experimentally for recovery of speech signals.
Submitted 25 July, 2015;
originally announced July 2015.
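The sampling step described in the abstract above, in which a length-$n$ signal is reduced to $m < n$ random measurements, can be sketched as follows. This illustrates only generic compressive sampling with a Gaussian sensing matrix, which is an assumption made here; it is not the paper's dictionary learning algorithm, and all function names are hypothetical.

```python
import math
import random


def gaussian_sensing_matrix(m, n, seed=0):
    """Return an m x n matrix with i.i.d. N(0, 1/m) entries (assumed sensing model)."""
    rng = random.Random(seed)
    scale = 1.0 / math.sqrt(m)
    return [[rng.gauss(0.0, 1.0) * scale for _ in range(n)] for _ in range(m)]


def compress(phi, x):
    """Compute the compressed samples y = phi @ x."""
    return [sum(p * v for p, v in zip(row, x)) for row in phi]


def distance(u, v):
    """Euclidean distance between two equally sized vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


if __name__ == "__main__":
    n, m = 256, 64
    phi = gaussian_sensing_matrix(m, n, seed=1)
    # Two toy signals: sinusoids of different frequency.
    x1 = [math.sin(2 * math.pi * 3 * t / n) for t in range(n)]
    x2 = [math.sin(2 * math.pi * 5 * t / n) for t in range(n)]
    print("distance between signals:     ", distance(x1, x2))
    print("distance between measurements:", distance(compress(phi, x1), compress(phi, x2)))
```

Comparing the two printed distances gives an informal check of the distance-preservation condition the abstract refers to; how closely they match depends on $m$, $n$, and the random draw.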