Showing 1–2 of 2 results for author: Katta, S V

  1. arXiv:2110.08895

    cs.SD cs.CL eess.AS

    DECAR: Deep Clustering for learning general-purpose Audio Representations

    Authors: Sreyan Ghosh, Sandesh V Katta, Ashish Seth, S. Umesh

    Abstract: We introduce DECAR, a self-supervised pre-training approach for learning general-purpose audio representations. Our system is based on clustering: it utilizes an offline clustering step to provide target labels that act as pseudo-labels for solving a prediction task. We develop on top of recent advances in self-supervised learning for computer vision and design a lightweight, easy-to-use self-supe…

    Submitted 14 March, 2023; v1 submitted 17 October, 2021; originally announced October 2021.

  2. arXiv:2008.04659

    eess.AS cs.SD

    S-vectors and TESA: Speaker Embeddings and a Speaker Authenticator Based on Transformer Encoder

    Authors: N J Metilda Sagaya Mary, S Umesh, Sandesh V Katta

    Abstract: One of the most popular speaker embeddings is x-vectors, which are obtained from an architecture that gradually builds a larger temporal context with layers. In this paper, we propose to derive speaker embeddings from Transformer's encoder trained for speaker classification. Self-attention, on which Transformer's encoder is built, attends to all the features over the entire utterance and might be…

    Submitted 12 December, 2021; v1 submitted 11 August, 2020; originally announced August 2020.

    Comments: Version 2, Accepted for publication in IEEE TASLP