Showing 1–2 of 2 results for author: Bansal, L

  1. arXiv:2305.12540

    eess.AS cs.AI cs.SD

    On the Efficacy and Noise-Robustness of Jointly Learned Speech Emotion and Automatic Speech Recognition

    Authors: Lokesh Bansal, S. Pavankumar Dubagunta, Malolan Chetlur, Pushpak Jagtap, Aravind Ganapathiraju

    Abstract: New-age conversational agent systems perform both speech emotion recognition (SER) and automatic speech recognition (ASR) using two separate and often independent approaches for real-world application in noisy environments. In this paper, we investigate a joint ASR-SER multitask learning approach in a low-resource setting and show that improvements are observed not only in SER, but also in ASR. We…

    Submitted 25 May, 2023; v1 submitted 21 May, 2023; originally announced May 2023.

    Comments: accepted to be part of INTERSPEECH 2023

  2. arXiv:2304.05866

    cs.CV cs.LG

    NoisyTwins: Class-Consistent and Diverse Image Generation through StyleGANs

    Authors: Harsh Rangwani, Lavish Bansal, Kartik Sharma, Tejan Karmali, Varun Jampani, R. Venkatesh Babu

    Abstract: StyleGANs are at the forefront of controllable image generation as they produce a latent space that is semantically disentangled, making it suitable for image editing and manipulation. However, the performance of StyleGANs severely degrades when trained via class-conditioning on large-scale long-tailed datasets. We find that one reason for degradation is the collapse of latents for each class in t…

    Submitted 12 April, 2023; originally announced April 2023.

    Comments: CVPR 2023. Project Page: https://rangwani-harsh.github.io/NoisyTwins/