Showing 1–3 of 3 results for author: Pailla, B

  1. arXiv:2402.00918  [pdf, other]

    cs.CV cs.AI

    MUSTAN: Multi-scale Temporal Context as Attention for Robust Video Foreground Segmentation

    Authors: Praveen Kumar Pokala, Jaya Sai Kiran Patibandla, Naveen Kumar Pandey, Balakrishna Reddy Pailla

    Abstract: Video foreground segmentation (VFS) is an important computer vision task wherein one aims to segment the objects under motion from the background. Most of the current methods are image-based, i.e., rely only on spatial cues while ignoring motion cues. Therefore, they tend to overfit the training data and don't generalize well to out-of-domain (OOD) distribution. To solve the above problem, prior w…

    Submitted 1 February, 2024; originally announced February 2024.

    Comments: 10 pages, 8 figures

  2. arXiv:2312.17343  [pdf, other]

    cs.CL cs.AI cs.LG cs.MM cs.SD eess.AS

    AQUALLM: Audio Question Answering Data Generation Using Large Language Models

    Authors: Swarup Ranjan Behera, Krishna Mohan Injeti, Jaya Sai Kiran Patibandla, Praveen Kumar Pokala, Balakrishna Reddy Pailla

    Abstract: Audio Question Answering (AQA) constitutes a pivotal task in which machines analyze both audio signals and natural language questions to produce precise natural language answers. The significance of possessing high-quality, diverse, and extensive AQA datasets cannot be overstated when aiming for the precision of an AQA system. While there has been notable focus on developing accurate and efficient…

    Submitted 28 December, 2023; originally announced December 2023.

    ACM Class: I.2.7

  3. arXiv:2109.14796  [pdf, other]

    cs.CL

    Phonetic Word Embeddings

    Authors: Rahul Sharma, Kunal Dhawan, Balakrishna Pailla

    Abstract: This work presents a novel methodology for calculating the phonetic similarity between words taking motivation from the human perception of sounds. This metric is employed to learn a continuous vector embedding space that groups similar sounding words together and can be used for various downstream computational phonology tasks. The efficacy of the method is presented for two different languages (…

    Submitted 29 September, 2021; originally announced September 2021.