Showing 1–3 of 3 results for author: Talnikar, C

  1. arXiv:2101.00390  [pdf, other]

    cs.CL eess.AS

    VoxPopuli: A Large-Scale Multilingual Speech Corpus for Representation Learning, Semi-Supervised Learning and Interpretation

    Authors: Changhan Wang, Morgane Rivière, Ann Lee, Anne Wu, Chaitanya Talnikar, Daniel Haziza, Mary Williamson, Juan Pino, Emmanuel Dupoux

    Abstract: We introduce VoxPopuli, a large-scale multilingual corpus providing 100K hours of unlabelled speech data in 23 languages. It is the largest open data to date for unsupervised representation learning as well as semi-supervised learning. VoxPopuli also contains 1.8K hours of transcribed speeches in 16 languages and their aligned oral interpretations into 5 other languages totaling 5.1K hours. We pro…

    Submitted 27 July, 2021; v1 submitted 2 January, 2021; originally announced January 2021.

    Comments: Accepted to ACL 2021 (long paper)

  2. arXiv:2011.00093  [pdf, other]

    cs.CL cs.LG cs.SD

    Joint Masked CPC and CTC Training for ASR

    Authors: Chaitanya Talnikar, Tatiana Likhomanenko, Ronan Collobert, Gabriel Synnaeve

    Abstract: Self-supervised learning (SSL) has shown promise in learning representations of audio that are useful for automatic speech recognition (ASR). But, training SSL models like wav2vec 2.0 requires a two-stage pipeline. In this paper we demonstrate a single-stage training of ASR models that can utilize both unlabeled and labeled data. During training, we alternately minimize two losses: an unsupervised…

    Submitted 13 February, 2021; v1 submitted 30 October, 2020; originally announced November 2020.

    Comments: ICASSP 2021

  3. arXiv:1905.04561  [pdf, other]

    math.OC cs.LG

    Linear Range in Gradient Descent

    Authors: Angxiu Ni, Chaitanya Talnikar

    Abstract: This paper defines linear range as the range of parameter perturbations which lead to approximately linear perturbations in the states of a network. We compute linear range from the difference between actual perturbations in states and the tangent solution. Linear range is a new criterion for estimating the effectiveness of gradients and thus has many possible applications. In particular, we pro…

    Submitted 23 May, 2019; v1 submitted 11 May, 2019; originally announced May 2019.

    Comments: 9 pages, 4 figures