Showing 1–4 of 4 results for author: Sekhar, C C

  1. arXiv:2108.02850  [pdf, other]

    eess.AS cs.LG cs.SD

    Unsupervised Domain Adaptation in Speech Recognition using Phonetic Features

    Authors: Rupam Ojha, C Chandra Sekhar

    Abstract: Automatic speech recognition is a difficult problem in pattern recognition because several sources of variability exist in the speech input like the channel variations, the input might be clean or noisy, the speakers may have different accent and variations in the gender, etc. As a result, domain adaptation is important in speech recognition where we train the model for a particular source domain…

    Submitted 4 August, 2021; originally announced August 2021.

    Comments: 5 pages, 3 figures

  2. arXiv:2103.03215  [pdf, other]

    eess.AS cs.SD

    Front-end Diarization for Percussion Separation in Taniavartanam of Carnatic Music Concerts

    Authors: Nauman Dawalatabad, Jilt Sebastian, Jom Kuriakose, C. Chandra Sekhar, Shrikanth Narayanan, Hema A. Murthy

    Abstract: Instrument separation in an ensemble is a challenging task. In this work, we address the problem of separating the percussive voices in the taniavartanam segments of Carnatic music. In taniavartanam, a number of percussive instruments play together or in tandem. Separation of instruments in regions where only one percussion is present leads to interference and artifacts at the output, as source se…

    Submitted 4 March, 2021; originally announced March 2021.

  3. Novel Architectures for Unsupervised Information Bottleneck based Speaker Diarization of Meetings

    Authors: Nauman Dawalatabad, Srikanth Madikeri, C. Chandra Sekhar, Hema A. Murthy

    Abstract: Speaker diarization is an important problem that is topical, and is especially useful as a preprocessor for conversational speech related applications. The objective of this paper is two-fold: (i) segment initialization by uniformly distributing speaker information across the initial segments, and (ii) incorporating speaker discriminative features within the unsupervised diarization framework. In…

    Submitted 13 October, 2020; originally announced October 2020.

    Comments: Accepted in IEEE/ACM Transactions on Audio, Speech, and Language Processing

    Journal ref: IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol. 29, 2021, pp 14-27

  4. Incremental Transfer Learning in Two-pass Information Bottleneck based Speaker Diarization System for Meetings

    Authors: Nauman Dawalatabad, Srikanth Madikeri, C Chandra Sekhar, Hema A Murthy

    Abstract: The two-pass information bottleneck (TPIB) based speaker diarization system operates independently on different conversational recordings. TPIB system does not consider previously learned speaker discriminative information while diarizing new conversations. Hence, the real time factor (RTF) of TPIB system is high owing to the training time required for the artificial neural network (ANN). This pap…

    Submitted 21 February, 2019; originally announced February 2019.

    Comments: 5 pages, 2 figures, To appear in Proc. ICASSP 2019, May 12-17, 2019, Brighton, UK