Showing 1–6 of 6 results for author: Jati, A

  1. arXiv:2010.16038  [pdf, ps, other]

    eess.AS

    Adversarial defense for deep speaker recognition using hybrid adversarial training

    Authors: Monisankha Pal, Arindam Jati, Raghuveer Peri, Chin-Cheng Hsu, Wael AbdAlmageed, Shrikanth Narayanan

    Abstract: Deep neural network based speaker recognition systems can easily be deceived by an adversary using minuscule imperceptible perturbations to the input speech samples. These adversarial attacks pose serious security threats to the speaker recognition systems that use speech biometric. To address this concern, in this work, we propose a new defense mechanism based on a hybrid adversarial training (HA…

    Submitted 29 October, 2020; originally announced October 2020.

    Comments: Submitted to ICASSP 2021

  2. arXiv:2008.07685  [pdf, other]

    eess.AS cs.LG cs.SD

    Adversarial Attack and Defense Strategies for Deep Speaker Recognition Systems

    Authors: Arindam Jati, Chin-Cheng Hsu, Monisankha Pal, Raghuveer Peri, Wael AbdAlmageed, Shrikanth Narayanan

    Abstract: Robust speaker recognition, including in the presence of malicious attacks, is becoming increasingly important and essential, especially due to the proliferation of several smart speakers and personal agents that interact with an individual's voice commands to perform diverse, and even sensitive tasks. Adversarial attack is a recently revived domain which is shown to be effective in breaking deep…

    Submitted 17 August, 2020; originally announced August 2020.

  3. arXiv:2002.03520  [pdf, other]

    eess.AS cs.SD

    An empirical analysis of information encoded in disentangled neural speaker representations

    Authors: Raghuveer Peri, Haoqi Li, Krishna Somandepalli, Arindam Jati, Shrikanth Narayanan

    Abstract: The primary characteristic of robust speaker representations is that they are invariant to factors of variability not related to speaker identity. Disentanglement of speaker representations is one of the techniques used to improve robustness of speaker representations to both intrinsic factors that are acquired during speech production (e.g., emotion, lexical content) and extrinsic factors that ar…

    Submitted 7 April, 2020; v1 submitted 9 February, 2020; originally announced February 2020.

    Comments: Submitted to Speaker Odyssey 2020

  4. arXiv:1911.03843  [pdf, other]

    eess.AS cs.LG cs.SD

    Characterizing dynamically varying acoustic scenes from egocentric audio recordings in workplace setting

    Authors: Arindam Jati, Amrutha Nadarajan, Karel Mundnich, Shrikanth Narayanan

    Abstract: Devices capable of detecting and categorizing acoustic scenes have numerous applications such as providing context-aware user experiences. In this paper, we address the task of characterizing acoustic scenes in a workplace setting from audio recordings collected with wearable microphones. The acoustic scenes, tracked with Bluetooth transceivers, vary dynamically with time from the egocentric persp…

    Submitted 9 November, 2019; originally announced November 2019.

    Comments: The paper is submitted to IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2020

  5. arXiv:1911.00940  [pdf, other]

    eess.AS cs.SD eess.SP

    Robust speaker recognition using unsupervised adversarial invariance

    Authors: Raghuveer Peri, Monisankha Pal, Arindam Jati, Krishna Somandepalli, Shrikanth Narayanan

    Abstract: In this paper, we address the problem of speaker recognition in challenging acoustic conditions using a novel method to extract robust speaker-discriminative speech representations. We adopt a recently proposed unsupervised adversarial invariance architecture to train a network that maps speaker embeddings extracted using a pre-trained model onto two lower dimensional embedding spaces. The embeddi…

    Submitted 3 November, 2019; originally announced November 2019.

    Comments: Submitted to ICASSP 2020

  6. arXiv:1802.07860  [pdf, other]

    cs.SD cs.CL eess.AS

    Neural Predictive Coding using Convolutional Neural Networks towards Unsupervised Learning of Speaker Characteristics

    Authors: Arindam Jati, Panayiotis Georgiou

    Abstract: Learning speaker-specific features is vital in many applications like speaker recognition, diarization and speech recognition. This paper provides a novel approach, we term Neural Predictive Coding (NPC), to learn speaker-specific characteristics in a completely unsupervised manner from large amounts of unlabeled training data that even contain many non-speech events and multi-speaker audio stream…

    Submitted 25 April, 2019; v1 submitted 21 February, 2018; originally announced February 2018.

    Journal ref: IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol. 27, no. 10, pp. 1577-1589, Oct. 2019