Search | arXiv e-print repository

Towards Multilingual Audio-Visual Question Answering

Authors: Orchid Chetia Phukan, Priyabrata Mallick, Swarup Ranjan Behera, Aalekhya Satya Narayani, Arun Balaji Buduru, Rajesh Sharma

Abstract: In this paper, we work towards extending Audio-Visual Question Answering (AVQA) to multilingual settings. Existing AVQA research has predominantly revolved around English and replicating it for addressing AVQA in other languages requires a substantial allocation of resources. As a scalable solution, we leverage machine translation and present two multilingual AVQA datasets for eight languages created from existing benchmark AVQA datasets. This prevents extra human annotation efforts of collecting questions and answers manually. To this end, we propose, MERA framework, by leveraging state-of-the-art (SOTA) video, audio, and textual foundation models for AVQA in multiple languages. We introduce a suite of models namely MERA-L, MERA-C, MERA-T with varied model architectures to benchmark the proposed datasets. We believe our work will open new research directions and act as a reference benchmark for future works in multilingual AVQA.

Submitted 13 June, 2024; originally announced June 2024.

Comments: Accepted to Interspeech 2024

MSC Class: 68T45

arXiv:2406.07676 [pdf, other]

FastAST: Accelerating Audio Spectrogram Transformer via Token Merging and Cross-Model Knowledge Distillation

Authors: Swarup Ranjan Behera, Abhishek Dhiman, Karthik Gowda, Aalekhya Satya Narayani

Abstract: Audio classification models, particularly the Audio Spectrogram Transformer (AST), play a crucial role in efficient audio analysis. However, optimizing their efficiency without compromising accuracy remains a challenge. In this paper, we introduce FastAST, a framework that integrates Token Merging (ToMe) into the AST framework. FastAST enhances inference speed without requiring extensive retraining by merging similar tokens in audio spectrograms. Furthermore, during training, FastAST brings about significant speed improvements. The experiments indicate that FastAST can increase audio classification throughput with minimal impact on accuracy. To mitigate the accuracy impact, we integrate Cross-Model Knowledge Distillation (CMKD) into the FastAST framework. Integrating ToMe and CMKD into AST results in improved accuracy compared to AST while maintaining faster inference speeds. FastAST represents a step towards real-time, resource-efficient audio analysis.

Submitted 11 June, 2024; originally announced June 2024.

Comments: Accepted to Interspeech 2024

MSC Class: 68T10

arXiv:2404.03012 [pdf, other]

Spectral Clustering in Convex and Constrained Settings

Authors: Swarup Ranjan Behera, Vijaya V. Saradhi

Abstract: Spectral clustering methods have gained widespread recognition for their effectiveness in clustering high-dimensional data. Among these techniques, constrained spectral clustering has emerged as a prominent approach, demonstrating enhanced performance by integrating pairwise constraints. However, the application of such constraints to semidefinite spectral clustering, a variant that leverages semidefinite programming to optimize clustering objectives, remains largely unexplored. In this paper, we introduce a novel framework for seamlessly integrating pairwise constraints into semidefinite spectral clustering. Our methodology systematically extends the capabilities of semidefinite spectral clustering to capture complex data structures, thereby addressing real-world clustering challenges more effectively. Additionally, we extend this framework to encompass both active and self-taught learning scenarios, further enhancing its versatility and applicability. Empirical studies conducted on well-known datasets demonstrate the superiority of our proposed framework over existing spectral clustering methods, showcasing its robustness and scalability across diverse datasets and learning settings. By bridging the gap between constrained learning and semidefinite spectral clustering, our work contributes to the advancement of spectral clustering techniques, offering researchers and practitioners a versatile tool for addressing complex clustering challenges in various real-world applications.
Access to the data, code, and experimental results is provided for further exploration (https://github.com/swarupbehera/SCCCS).

Submitted 3 April, 2024; originally announced April 2024.

ACM Class: I.2.7

arXiv:2404.00030 [pdf, other]

Visualization of Unstructured Sports Data -- An Example of Cricket Short Text Commentary

Authors: Swarup Ranjan Behera, Vijaya V Saradhi

Abstract: Sports visualization focuses on the use of structured data, such as box-score data and tracking data. Unstructured data sources pertaining to sports are available in various places such as blogs, social media posts, and online news articles. Sports visualization methods either not fully exploited the information present in these sources or the proposed visualizations through the use of these sources did not augment to the body of sports visualization methods. We propose the use of unstructured data, namely cricket short text commentary for visualization. The short text commentary data is used for constructing individual player's strength rules and weakness rules. A computationally feasible definition for player's strength rule and weakness rule is proposed. A visualization method for the constructed rules is presented. In addition, players having similar strength rules or weakness rules is computed and visualized. We demonstrate the usefulness of short text commentary in visualization by analyzing the strengths and weaknesses of cricket players using more than one million text commentaries. We validate the constructed rules through two validation methods. The collected data, source code, and obtained results on more than 500 players are made publicly available.

Submitted 22 March, 2024; originally announced April 2024.

ACM Class: I.2.7

arXiv:2403.18580 [pdf, other]

MisGUIDE : Defense Against Data-Free Deep Learning Model Extraction

Authors: Mahendra Gurve, Sankar Behera, Satyadev Ahlawat, Yamuna Prasad

Abstract: The rise of Machine Learning as a Service (MLaaS) has led to the widespread deployment of machine learning models trained on diverse datasets. These models are employed for predictive services through APIs, raising concerns about the security and confidentiality of the models due to emerging vulnerabilities in prediction APIs. Of particular concern are model cloning attacks, where individuals with limited data and no knowledge of the training dataset manage to replicate a victim model's functionality through black-box query access. This commonly entails generating adversarial queries to query the victim model, thereby creating a labeled dataset. This paper proposes "MisGUIDE", a two-step defense framework for Deep Learning models that disrupts the adversarial sample generation process by providing a probabilistic response when the query is deemed OOD. The first step employs a Vision Transformer-based framework to identify OOD queries, while the second step perturbs the response for such queries, introducing a probabilistic loss function to MisGUIDE the attackers. The aim of the proposed defense method is to reduce the accuracy of the cloned model while maintaining accuracy on authentic queries. Extensive experiments conducted on two benchmark datasets demonstrate that the proposed framework significantly enhances the resistance against state-of-the-art data-free model extraction in black-box settings.

Submitted 27 March, 2024; originally announced March 2024.

Comments: Under Review

MSC Class: Under Review

arXiv:2402.06646 [pdf]

Diffusion Model-based Probabilistic Downscaling for 180-year East Asian Climate Reconstruction

Authors: Fenghua Ling, Zeyu Lu, Jing-Jia Luo, Lei Bai, Swadhin K. Behera, Dachao Jin, Baoxiang Pan, Huidong Jiang, Toshio Yamagata

Abstract: As our planet is entering into the "global boiling" era, understanding regional climate change becomes imperative. Effective downscaling methods that provide localized insights are crucial for this target. Traditional approaches, including computationally-demanding regional dynamical models or statistical downscaling frameworks, are often susceptible to the influence of downscaling uncertainty. Here, we address these limitations by introducing a diffusion probabilistic downscaling model (DPDM) into the meteorological field. This model can efficiently transform data from 1° to 0.1° resolution. Compared with deterministic downscaling schemes, it not only has more accurate local details, but also can generate a large number of ensemble members based on probability distribution sampling to evaluate the uncertainty of downscaling. Additionally, we apply the model to generate a 180-year dataset of monthly surface variables in East Asia, offering a more detailed perspective for understanding local scale climate change over the past centuries.

Submitted 5 April, 2024; v1 submitted 1 February, 2024; originally announced February 2024.

arXiv:2312.17343 [pdf, other]

AQUALLM: Audio Question Answering Data Generation Using Large Language Models

Authors: Swarup Ranjan Behera, Krishna Mohan Injeti, Jaya Sai Kiran Patibandla, Praveen Kumar Pokala, Balakrishna Reddy Pailla

Abstract: Audio Question Answering (AQA) constitutes a pivotal task in which machines analyze both audio signals and natural language questions to produce precise natural language answers. The significance of possessing high-quality, diverse, and extensive AQA datasets cannot be overstated when aiming for the precision of an AQA system. While there has been notable focus on developing accurate and efficient AQA models, the creation of high-quality, diverse, and extensive datasets for the specific task at hand has not garnered considerable attention. To address this challenge, this work makes several contributions. We introduce a scalable AQA data generation pipeline, denoted as the AQUALLM framework, which relies on Large Language Models (LLMs). This framework utilizes existing audio-caption annotations and incorporates state-of-the-art LLMs to generate expansive, high-quality AQA datasets. Additionally, we present three extensive and high-quality benchmark datasets for AQA, contributing significantly to the progression of AQA research. AQA models trained on the proposed datasets set superior benchmarks compared to the existing state-of-the-art. Moreover, models trained on our datasets demonstrate enhanced generalizability when compared to models trained using human-annotated AQA data. Code and datasets will be accessible on GitHub (https://github.com/swarupbehera/AQUALLM).

Submitted 28 December, 2023; originally announced December 2023.

ACM Class: I.2.7

arXiv:2311.06818 [pdf, other]

Cricket Player Profiling: Unraveling Strengths and Weaknesses Using Text Commentary Data

Authors: Swarup Ranjan Behera, Vijaya V. Saradhi

Abstract: Devising player-specific strategies in cricket necessitates a meticulous understanding of each player's unique strengths and weaknesses. Nevertheless, the absence of a definitive computational approach to extract such insights from cricket players poses a significant challenge. This paper seeks to address this gap by establishing computational models designed to extract the rules governing player strengths and weaknesses, thereby facilitating the development of tailored strategies for individual players. The complexity of this endeavor lies in several key areas: the selection of a suitable dataset, the precise definition of strength and weakness rules, the identification of an appropriate learning algorithm, and the validation of the derived rules. To tackle these challenges, we propose the utilization of unstructured data, specifically cricket text commentary, as a valuable resource for constructing comprehensive strength and weakness rules for cricket players. We also introduce computationally feasible definitions for the construction of these rules, and present a dimensionality reduction technique for the rule-building process. In order to showcase the practicality of this approach, we conduct an in-depth analysis of cricket player strengths and weaknesses using a vast corpus of more than one million text commentaries. Furthermore, we validate the constructed rules through two distinct methodologies: intrinsic and extrinsic.
The outcomes of this research are made openly accessible, including the collected data, source code, and results for over 250 cricket players, which can be accessed at https://bit.ly/2PKuzx8.

Submitted 12 November, 2023; originally announced November 2023.

Comments: The initial work was published in the ICMLA 2019 conference

ACM Class: I.2.7

arXiv:2306.10741 [pdf, ps, other]

doi 10.1109/ICTON59386.2023.10207370

Machine Learning for Real-Time Anomaly Detection in Optical Networks

Authors: Sadananda Behera, Tania Panayiotou, Georgios Ellinas

Abstract: This work proposes a real-time anomaly detection scheme that leverages the multi-step ahead prediction capabilities of encoder-decoder (ED) deep learning models with recurrent units. Specifically, an encoder-decoder is used to model soft-failure evolution over a long future horizon (i.e., for several days ahead) by analyzing past quality-of-transmission (QoT) observations. This information is subsequently used for real-time anomaly detection (e.g., of attack incidents), as the knowledge of how the QoT is expected to evolve allows capturing unexpected network behavior. Specifically, for anomaly detection, a statistical hypothesis testing scheme is used, alleviating the limitations of supervised (SL) and unsupervised learning (UL) schemes, usually applied for this purpose. Indicatively, the proposed scheme eliminates the need for labeled anomalies, required when SL is applied, and the need for on-line analyzing entire datasets to identify abnormal instances (i.e., UL). Overall, it is shown that by utilizing QoT evolution information, the proposed approach can effectively detect abnormal deviations in real-time. Importantly, it is shown that the information concerning soft-failure evolution (i.e., QoT predictions) is essential to accurately detect anomalies.

Submitted 19 June, 2023; originally announced June 2023.

Comments: accepted for publication in IEEE ICTON conference 2023

arXiv:2208.14535 [pdf, other]

doi 10.1109/GLOBECOM48099.2022.10000690

Modeling Soft-Failure Evolution for Triggering Timely Repair with Low QoT Margins

Authors: Sadananda Behera, Tania Panayiotou, Georgios Ellinas

Abstract: In this work, the capabilities of an encoder-decoder learning framework are leveraged to predict soft-failure evolution over a long future horizon. This enables the triggering of timely repair actions with low quality-of-transmission (QoT) margins before a costly hard-failure occurs, ultimately reducing the frequency of repair actions and associated operational expenses. Specifically, it is shown that the proposed scheme is capable of triggering a repair action several days prior to the expected day of a hard-failure, contrary to soft-failure detection schemes utilizing rule-based fixed QoT margins, that may lead either to premature repair actions (i.e., several months before the event of a hard-failure) or to repair actions that are taken too late (i.e., after the hard failure has occurred). Both frameworks are evaluated and compared for a lightpath established in an elastic optical network, where soft-failure evolution can be modeled by analyzing bit-error-rate information monitored at the coherent receivers.

Submitted 30 August, 2022; originally announced August 2022.

Comments: accepted for presentation at the IEEE GLOBECOM 2022 conference

arXiv:2208.10784 [pdf, other]

doi 10.1088/2632-2153/acac01

Building Robust Machine Learning Models for Small Chemical Science Data: The Case of Shear Viscosity

Authors: Nikhil V. S. Avula, Shivanand K. Veesam, Sudarshan Behera, Sundaram Balasubramanian

Abstract: Shear viscosity, though being a fundamental property of all liquids, is computationally expensive to estimate from equilibrium molecular dynamics simulations. Recently, Machine Learning (ML) methods have been used to augment molecular simulations in many contexts, thus showing promise to estimate viscosity too in a relatively inexpensive manner. However, ML methods face significant challenges like overfitting when the size of the data set is small, as is the case with viscosity. In this work, we train several ML models to predict the shear viscosity of a Lennard-Jones (LJ) fluid, with particular emphasis on addressing issues arising from a small data set. Specifically, the issues related to model selection, performance estimation and uncertainty quantification were investigated. First, we show that the widely used performance estimation procedure of using a single unseen data set shows a wide variability on small data sets. In this context, the common practice of using Cross validation (CV) to select the hyperparameters (model selection) can be adapted to estimate the generalization error (performance estimation) as well. We compare two simple CV procedures for their ability to do both model selection and performance estimation, and find that k-fold CV based procedure shows a lower variance of error estimates. We discuss the role of performance metrics in training and evaluation. Finally, Gaussian Process Regression (GPR) and ensemble methods were used to estimate the uncertainty on individual predictions.
The uncertainty estimates from GPR were also used to construct an applicability domain using which the ML models provided more reliable predictions on another small data set generated in this work. Overall, the procedures prescribed in this work, together, lead to robust ML models for small data sets.

Submitted 23 August, 2022; originally announced August 2022.

Comments: main: 17 pages, 11 figures ; SI: 55 pages, 29 figures ; to be submitted to Journal of Chemical Physics

Journal ref: Mach. Learn.: Sci. Technol. 3 (2022) 045032

arXiv:2201.05270 [pdf, other]

Robust QoT Assured Resource Allocation in Shared Backup Path Protection Based EONs

Authors: Venkatesh Chebolu, Sadananda Behera, Goutam Das

Abstract: Survivability is mission-critical for elastic optical networks (EONs) as they are expected to carry an enormous amount of data. In this paper, we consider the problem of designing shared backup path protection (SBPP) based EON that facilitates the minimum quality-of-transmission (QoT) assured allocation against physical layer impairments (PLIs) under any single link/shared risk link group (SRLG) failure for static and dynamic traffic scenarios. In general, the effect of PLIs on lightpath varies based on the location of failure of a link as it introduces different active working and backup paths. To address these issues in the design of SBPP EON, we formulate a mixed integer linear programming (MILP) based robust optimization framework for static traffic with the objective of minimizing overall fragmentation. In this process, we use the efficient bitloading technique for spectrum allocation for the first time in survivable EONs. In addition, we propose a novel SBPP-impairment aware (SBPP-IA) algorithm considering the limitations of MILP for larger networks. For this purpose, we introduce a novel sorting technique named most congested working-least congested backup first (MCW-LCBF) to sort the given set of static requests. Next, we employ our SBPP-IA algorithm for dynamic traffic scenario and compare it with existing algorithms in terms of different QoT parameters. We demonstrated through simulations that our study provides around 40% more QoT guaranteed requests compared to existing ones.

Submitted 13 January, 2022; originally announced January 2022.

arXiv:2108.01394 [pdf]

doi 10.1109/ICPECTS49113.2020.9337012

AI Based Waste classifier with Thermo-Rapid Composting

Authors: Saswati kumari behera, Aouthithiye Barathwaj SR Y, Vasundhara L, Saisudha G, Haariharan N C

Abstract: Waste management is a certainly a very complex and difficult process especially in very large cities. It needs immense man power and also uses up other resources such as electricity and fuel. This creates a need to use a novel method with help of latest technologies. Here in this article we present a new waste classification technique using Computer Vision (CV) and deep learning (DL). To further improve waste classification ability, support machine vectors (SVM) are used. We also decompose the degradable waste with help of rapid composting. In this article we have mainly worked on segregation of municipal solid waste (MSW). For this model, we use YOLOv3 (You Only Look Once) a computer vision-based algorithm popularly used to detect objects which is developed based on Convolution Neural Networks (CNNs) which is a machine learning (ML) based tool. They are extensively used to extract features from a data especially image-oriented data. In this article we propose a waste classification technique which will be faster and more efficient. And we decompose the biodegradable waste by Berkley Method of composting (BKC)

Submitted 3 August, 2021; originally announced August 2021.

Comments: 4 pages, 8 figures, conference

arXiv:2003.05626 [pdf, other]

Understanding Crowd Flow Movements Using Active-Langevin Model

Authors: Shreetam Behera, Debi Prosad Dogra, Malay Kumar Bandyopadhyay, Partha Pratim Roy

Abstract: Crowd flow describes the elementary group behavior of crowds. Understanding the dynamics behind these movements can help to identify various abnormalities in crowds. However, developing a crowd model describing these flows is a challenging task. In this paper, a physics-based model is proposed to describe the movements in dense crowds. The crowd model is based on active Langevin equation where the motion points are assumed to be similar to active colloidal particles in fluids. The model is further augmented with computer-vision techniques to segment both linear and non-linear motion flows in a dense crowd. The evaluation of the active Langevin equation-based crowd segmentation has been done on publicly available crowd videos and on our own videos. The proposed method is able to segment the flow with lesser optical flow error and better accuracy in comparison to existing state-of-the-art methods.

Submitted 18 August, 2020; v1 submitted 12 March, 2020; originally announced March 2020.

arXiv:1904.07233 [pdf, other]

Estimation of Linear Motion in Dense Crowd Videos using Langevin Model

Authors: Shreetam Behera, Debi Prosad Dogra, Malay Kumar Bandyopadhyay, Partha Pratim Roy

Abstract: Crowd gatherings at social and cultural events are increasing in leaps and bounds with the increase in population. Surveillance through computer vision and expert decision making systems can help to understand the crowd phenomena at large gatherings. Understanding crowd phenomena can be helpful in early identification of unwanted incidents and their prevention. Motion flow is one of the important crowd phenomena that can be instrumental in describing the crowd behavior. Flows can be useful in understanding instabilities in the crowd. However, extracting motion flows is a challenging task due to randomness in crowd movement and limitations of the sensing device. Moreover, low-level features such as optical flow can be misleading if the randomness is high. In this paper, we propose a new model based on Langevin equation to analyze the linear dominant flows in videos of densely crowded scenarios. We assume a force model with three components, namely external force, confinement/drift force, and disturbance force. These forces are found to be sufficient to describe the linear or near-linear motion in dense crowd videos. The method is significantly faster as compared to existing popular crowd segmentation methods. The evaluation of the proposed model has been carried out on publicly available datasets as well as using our dataset. It has been observed that the proposed method is able to estimate and segment the linear flows in the dense crowd with better accuracy as compared to state-of-the-art techniques with substantial decrease in the computational overhead.

Submitted 15 April, 2019; originally announced April 2019.

arXiv:1812.01465 [pdf, other]

Cross-spectral Periocular Recognition: A Survey

Authors: S. S. Behera, Bappaditya Mandal, N. B. Puhan

Abstract: Among many biometrics such as face, iris, fingerprint and others, periocular region has the advantages over other biometrics because it is non-intrusive and serves as a balance between iris or eye region (very stringent, small area) and the whole face region (very relaxed large area). Research have shown that this is the region which does not get affected much because of various poses, aging, expression, facial changes and other artifacts, which otherwise would change to a large variation. Active research has been carried out on this topic since past few years due to its obvious advantages over face and iris biometrics in unconstrained and uncooperative scenarios. Many researchers have explored periocular biometrics involving both visible (VIS) and infra-red (IR) spectrum images. For a system to work for 24/7 (such as in surveillance scenarios), the registration process may depend on the day time VIS periocular images (or any mug shot image) and the testing or recognition process may occur in the night time involving only IR periocular images. This gives rise to a challenging research problem called the cross-spectral matching of images where VIS images are used for registration or as gallery images and IR images are used for testing or recognition process and vice versa. After intensive research of more than two decades on face and iris biometrics in cross-spectral domain, a number of researchers have now focused their work on matching heterogeneous (cross-spectral) periocular images.
Though a number of surveys have been made on existing periocular biometric research, no study has been done on its cross-spectral aspect. This paper analyses and reviews current state-of-the-art techniques in cross-spectral periocular recognition including various methodologies, databases, their protocols and current-state-of-the-art recognition performances.

Submitted 4 December, 2018; originally announced December 2018.

Comments: 12 pages, 4 figures, 1 table, accepted in the Third International Conference on Emerging Research in Electronics, Computer science and Technology (ICERECT), during August 2018

arXiv:1709.04616 [pdf, other]

Effect of Transmission Impairments in CO-OFDM Based Elastic Optical Network Design

Authors: Sadananda Behera, Jithin George, Goutam Das

Abstract: Coherent Optical Orthogonal Frequency Division Multiplexing (CO-OFDM) based Elastic Optical Network (EON) is one of the emerging technologies being considered for next generation high data rate optical network systems. Routing and Spectrum Allocation (RSA) is an important aspect of EON. Apart from spectral fragmentation created due to spectrum continuity and contiguity constraints of RSA, transmission impairments such as shot noise, amplified spontaneous emission (ASE) beat noise due to coherent detection, crosstalk in cross-connect (XC), nonlinear interference, and filter narrowing limit the transmission reach of optical signals in EON. This paper focuses on the cross-layer joint optimization of delay-bandwidth product, fragmentation and link congestion for RSA in CO-OFDM EON while considering the effect of physical layer impairments. First, we formulate an optimal Integer Linear Programming (ILP) formulation that achieves load-balancing in the presence of transmission impairments and minimizes delay-bandwidth product along with fragmentation. We next propose a heuristic algorithm for large networks with two different demand ordering techniques. We show the benefits of our algorithm compared to the existing load balancing algorithm.

Submitted 14 September, 2017; originally announced September 2017.

arXiv:1604.00493 [pdf]

Steganography -- A Game of Hide and Seek in Information Communication

Authors: Sanjeeb Kumar Behera, Minati Mishra

Abstract: With the growth of communication over computer networks, maintaining the confidentiality and security of transmitted information has become an important issue. In order to transfer data securely to the destination without unwanted disclosure or damage, nature-inspired hide and seek tricks such as cryptography and steganography are heavily in use. Just like the chameleon and many other species that change their body color and hide themselves in the background in order to protect themselves from external attacks, cryptography and steganography are techniques that are used to encrypt and hide the secret data inside other media to ensure data security. This paper discusses the concept of a simple spatial domain LSB steganography that encrypts the secrets using the Fibonacci-Lucas transformation, before hiding, for better security.

Submitted 2 April, 2016; originally announced April 2016.

Comments: 5 pages, 4 figures, National Conference on Recent Innovations in Engineering and Management Sciences (RIEMS-2016)

arXiv:1601.03481 [pdf]

A Fuzzy MLP Approach for Non-linear Pattern Classification

Authors: Tirtharaj Dash, H. S. Behera

Abstract: In decision making problems, classification of patterns is a complex and crucial task. Pattern classification using a multilayer perceptron (MLP) trained with back propagation learning becomes much more complex with an increase in the number of layers, number of nodes and number of epochs, which ultimately increases computational time [31]. In this paper, an attempt has been made to use fuzzy MLP and its learning algorithm for pattern classification. The time and space complexities of the algorithm have been analyzed. A training performance comparison has been carried out between MLP and the proposed fuzzy-MLP model by considering six cases. Results are noted against different learning rates ranging from 0 to 1. A new performance evaluation factor, 'convergence gain', has been introduced. It is observed that the number of epochs is drastically reduced and performance increased compared to MLP. The average and minimum gain have been found to be 93% and 75% respectively. The best gain is found to be 95% and is obtained by setting the learning rate to 0.55.

Submitted 19 September, 2015; originally announced January 2016.

Comments: The final version of this paper has been published in "International Conference on Communication and Computing (ICC-2014)" [http://www.elsevierst.com/conference_book_download_chapter.php?cbid=86#chapter41]

Journal ref: In Proc: K.R. Venugopal, S.C. Lingareddy (eds.) International Conference on Communication and Computing (ICC-2014), Bangalore, India (June 12-14, 2014), Computer Networks and Security, 314-323

arXiv:1109.3076 [pdf]

Comparative performance analysis of multi dynamic time quantum Round Robin(MDTQRR) algorithm with arrival time

Authors: H. S. Behera, Rakesh Mohanty, Sabyasachi Sahu, Sourav Kumar Bhoi

Abstract: Since the CPU is considered a primary computer resource, its scheduling is central to operating-system design. A thorough performance evaluation of various scheduling algorithms shows that the Round Robin algorithm is considered optimal in a time-shared environment because the static time is equally shared among the processes. We have proposed an efficient technique in the process scheduling algorithm by using a dynamic time quantum in Round Robin. Our approach is based on calculating the time quantum twice in a single round robin cycle. We implement the algorithm taking the arrival time into consideration. Experimental analysis shows better performance of this improved algorithm over the Round Robin algorithm and the Shortest Remaining Burst Round Robin algorithm. It minimizes the overall number of context switches, average waiting time and average turnaround time. Consequently, the throughput and CPU utilization are better.

Submitted 14 September, 2011; originally announced September 2011.

Comments: 10 pages, 18 Figures, Indian Journal of Computer Science and Engineering vol. 2 no. 2 April-May 2011

arXiv:1105.1736 [pdf]

Priority Based Dynamic Round Robin (PBDRR) Algorithm with Intelligent Time Slice for Soft Real Time Systems

Authors: Rakesh Mohanty, H. S. Behera, Khusbu Patwari, Monisha Dash, M. Lakshmi Prasanna

Abstract: In this paper, a new variant of the Round Robin (RR) algorithm is proposed which is suitable for soft real time systems. The RR algorithm performs optimally in timeshared systems, but it is not suitable for soft real time systems because it gives a larger number of context switches, larger waiting time and larger response time. We have proposed a novel algorithm, known as the Priority Based Dynamic Round Robin Algorithm (PBDRR), which calculates an intelligent time slice for individual processes that changes after every round of execution. The proposed scheduling algorithm is developed by taking the dynamic time quantum concept into account. Our experimental results show that our proposed algorithm performs better than the algorithm in [8] in terms of reducing the number of context switches, average waiting time and average turnaround time.

Submitted 9 May, 2011; originally announced May 2011.

Comments: 5 pages

Journal ref: International Journal of Advanced Computer Science and Applications (IJACSA), Vol. 2 No. 2, February 2011, 46-50

arXiv:1103.3832 [pdf]

doi 10.5120/2037-2648

A New Dynamic Round Robin and SRTN Algorithm with Variable Original Time Slice and Intelligent Time Slice for Soft Real Time Systems

Authors: H. S. Behera, Simpi Patel, Bijayalakshmi Panda

Abstract: The main objective of the paper is to improve the Round Robin (RR) algorithm using dynamic ITS by coalescing it with the Shortest Remaining Time Next (SRTN) algorithm, thus reducing the average waiting time, average turnaround time and the number of context switches. The original time slice has been calculated for each process based on its burst time. This is mostly suited for soft real time systems where meeting of deadlines is desirable to increase performance. The advantage is that processes that are closer to their remaining completion time will get more chances to execute and leave the ready queue. This will reduce the number of processes in the ready queue by knocking out short jobs relatively faster, in the hope of reducing the average waiting time, turnaround time and number of context switches. This paper improves the algorithm [8], and the experimental analysis shows that the proposed algorithm performs better than algorithms [6] and [8] when the processes have increasing, decreasing and random order of burst time.

Submitted 20 March, 2011; originally announced March 2011.

Comments: 07 pages; International Journal of Computer Applications, Vol 16, No. 1(9) February 2011

arXiv:1103.3831 [pdf]

A New Proposed Dynamic Quantum with Re-Adjusted Round Robin Scheduling Algorithm and Its Performance Analysis

Authors: H. S. Behera, Rakesh Mohanty, Debashree Nayak

Abstract: Scheduling is a central concept used frequently in operating systems. It helps in choosing the processes for execution. Round Robin (RR) is one of the most widely used CPU scheduling algorithms. However, its performance degrades with respect to context switching, which is an overhead that occurs during each scheduling. Overall performance of the system depends on the choice of an optimal time quantum, so that context switching can be reduced. In this paper, we have proposed a new variant of the RR scheduling algorithm, known as the Dynamic Quantum with Readjusted Round Robin (DQRRR) algorithm. We have experimentally shown that the performance of DQRRR is better than RR by reducing the number of context switches, average waiting time and average turnaround time.

Submitted 20 March, 2011; originally announced March 2011.

Comments: 06 pages; International Journal of Computer Applications, Vol. 5, No. 5, August 2010

arXiv:1010.4007 [pdf]

Colour Guided Colour Image Steganography

Authors: R. Amirtharajan, Sandeep Kumar Behera, Motamarri Abhilash Swarup, Mohamed Ashfaaq K, John Bosco Balaguru Rayappan

Abstract: Information security has become a cause of concern because of electronic eavesdropping. Capacity, robustness and invisibility are important parameters in information hiding and are quite difficult to achieve in a single algorithm. This paper proposes a novel steganography technique for digital colour images which achieves the purported targets. The professed methodology employs a completely random scheme for pixel selection and embedding of data. Of the three colour channels (Red, Green, Blue) in a given colour image, the two least significant bits of any one of the channels are used to channelize the embedding capacity of the remaining two channels. We have devised three approaches to achieve various levels of our desired targets. In the first approach, Red is the default guide, but it results in localization of MSE in the remaining two channels, which makes it slightly vulnerable. In the second approach, the user gets the liberty to select the guiding channel (Red, Green or Blue) to guide the remaining two channels. This increases the robustness and imperceptibility of the embedded image; however, the MSE factor still remains a drawback. The third approach improves the performance factor, as a cyclic methodology is employed and the guiding channel is selected in a cyclic fashion. This ensures uniform distribution of MSE, which gives better robustness and imperceptibility along with enhanced embedding capacity. The imperceptibility has been enhanced by suitably adapting the optimal pixel adjustment process (OPAP) on the stego covers.

Submitted 19 October, 2010; originally announced October 2010.

Comments: Universal Journal of Computer Science and Engineering Technology (UniCSE)

Journal ref: 1 (1), 16-23, Oct. 2010

Showing 1–24 of 24 results for author: Behera, S