Showing 1–3 of 3 results for author: Chiniya, P

  1. arXiv:2406.04432

    eess.AS cs.AI cs.CL

    LipGER: Visually-Conditioned Generative Error Correction for Robust Automatic Speech Recognition

    Authors: Sreyan Ghosh, Sonal Kumar, Ashish Seth, Purva Chiniya, Utkarsh Tyagi, Ramani Duraiswami, Dinesh Manocha

    Abstract: Visual cues, like lip motion, have been shown to improve the performance of Automatic Speech Recognition (ASR) systems in noisy environments. We propose LipGER (Lip Motion aided Generative Error Correction), a novel framework for leveraging visual cues for noise-robust ASR. Instead of learning the cross-modal correlation between the audio and visual modalities, we make an LLM learn the task of vis…

    Submitted 6 June, 2024; originally announced June 2024.

    Comments: InterSpeech 2024. Code and Data: https://github.com/Sreyan88/LipGER

  2. arXiv:2312.00834

    cs.SD cs.CV

    AV-RIR: Audio-Visual Room Impulse Response Estimation

    Authors: Anton Ratnarajah, Sreyan Ghosh, Sonal Kumar, Purva Chiniya, Dinesh Manocha

    Abstract: Accurate estimation of Room Impulse Response (RIR), which captures an environment's acoustic properties, is important for speech processing and AR/VR applications. We propose AV-RIR, a novel multi-modal multi-task learning approach to accurately estimate the RIR from a given reverberant speech signal and the visual cues of its corresponding environment. AV-RIR builds on a novel neural codec-based…

    Submitted 23 April, 2024; v1 submitted 30 November, 2023; originally announced December 2023.

    Comments: Accepted to CVPR 2024

  3. arXiv:2303.03387

    cs.LG cs.AI cs.CL cs.SI

    CoSyn: Detecting Implicit Hate Speech in Online Conversations Using a Context Synergized Hyperbolic Network

    Authors: Sreyan Ghosh, Manan Suri, Purva Chiniya, Utkarsh Tyagi, Sonal Kumar, Dinesh Manocha

    Abstract: The tremendous growth of social media users interacting in online conversations has led to significant growth in hate speech, affecting people from various demographics. Most of the prior works focus on detecting explicit hate speech, which is overt and leverages hateful phrases, with very little work focusing on detecting hate speech that is implicit or denotes hatred through indirect or coded la…

    Submitted 24 October, 2023; v1 submitted 2 March, 2023; originally announced March 2023.

    Comments: Accepted to EMNLP 2023 Main Conference. Code: https://github.com/Sreyan88/CoSyn