Showing 1–4 of 4 results for author: Parikh, R

  1. arXiv:2206.13476

    cs.SD cs.AI cs.LG eess.AS eess.SP

    Impact of Acoustic Event Tagging on Scene Classification in a Multi-Task Learning Framework

    Authors: Rahil Parikh, Harshavardhan Sundar, Ming Sun, Chao Wang, Spyros Matsoukas

    Abstract: Acoustic events are sounds with well-defined spectro-temporal characteristics which can be associated with the physical objects generating them. Acoustic scenes are collections of such acoustic events in no specific temporal order. Given this natural linkage between events and scenes, a common belief is that the ability to classify events must help in the classification of scenes. This has led to…

    Submitted 27 June, 2022; originally announced June 2022.

    Comments: Accepted at ISCA Interspeech 2022

  2. arXiv:2206.09556

    eess.AS cs.AI cs.LG cs.SD eess.SP

    An Empirical Analysis on the Vulnerabilities of End-to-End Speech Segregation Models

    Authors: Rahil Parikh, Gaspar Rochette, Carol Espy-Wilson, Shihab Shamma

    Abstract: End-to-end learning models have demonstrated a remarkable capability in performing speech segregation. Despite their wide scope of real-world applications, little is known about the mechanisms they employ to group and consequently segregate individual speakers. Knowing that harmonicity is a critical cue for these networks to group sources, in this work, we perform a thorough investigation on ConvT…

    Submitted 19 June, 2022; originally announced June 2022.

    Comments: Accepted at Interspeech 2022

  3. arXiv:2203.05780

    eess.AS cs.AI cs.LG cs.SD eess.SP

    Acoustic To Articulatory Speech Inversion Using Multi-Resolution Spectro-Temporal Representations Of Speech Signals

    Authors: Rahil Parikh, Nadee Seneviratne, Ganesh Sivaraman, Shihab Shamma, Carol Espy-Wilson

    Abstract: Multi-resolution spectro-temporal features of a speech signal represent how the brain perceives sounds by tuning cortical cells to different spectral and temporal modulations. These features produce a higher dimensional representation of the speech signals. The purpose of this paper is to evaluate how well the auditory cortex representation of speech signals contributes to estimating articulatory fea…

    Submitted 25 June, 2022; v1 submitted 11 March, 2022; originally announced March 2022.

    Comments: Accepted at ISCA Interspeech 2022

  4. arXiv:2203.04420

    eess.AS cs.AI cs.LG cs.SD eess.SP

    Harmonicity Plays a Critical Role in DNN Based Versus in Biologically-Inspired Monaural Speech Segregation Systems

    Authors: Rahil Parikh, Ilya Kavalerov, Carol Espy-Wilson, Shihab Shamma

    Abstract: Recent advancements in deep learning have led to drastic improvements in speech segregation models. Despite their success and growing applicability, few efforts have been made to analyze the underlying principles that these networks learn to perform segregation. Here we analyze the role of harmonicity on two state-of-the-art Deep Neural Network (DNN)-based models: Conv-TasNet and DPT-Net. We eval…

    Submitted 8 March, 2022; originally announced March 2022.

    Comments: 5 pages, IEEE International Conference on Acoustics, Speech, & Signal Processing (ICASSP), 2022