Skip to main content

Showing 1–3 of 3 results for author: Kesiraju, S

Searching in archive eess. Search in all archives.
.
  1. arXiv:2403.07767  [pdf, ps, other

    eess.AS cs.LG eess.SP

    Beyond the Labels: Unveiling Text-Dependency in Paralinguistic Speech Recognition Datasets

    Authors: Jan Pešán, Santosh Kesiraju, Lukáš Burget, Jan ''Honza'' Černocký

    Abstract: Paralinguistic traits like cognitive load and emotion are increasingly recognized as pivotal areas in speech recognition research, often examined through specialized datasets like CLSE and IEMOCAP. However, the integrity of these datasets is seldom scrutinized for text-dependency. This paper critically evaluates the prevalent assumption that machine learning models trained on such datasets genuine… ▽ More

    Submitted 12 March, 2024; originally announced March 2024.

  2. Strategies for improving low resource speech to text translation relying on pre-trained ASR models

    Authors: Santosh Kesiraju, Marek Sarvas, Tomas Pavlicek, Cecile Macaire, Alejandro Ciuba

    Abstract: This paper presents techniques and findings for improving the performance of low-resource speech to text translation (ST). We conducted experiments on both simulated and real-low resource setups, on language pairs English - Portuguese, and Tamasheq - French respectively. Using the encoder-decoder framework for ST, our results show that a multilingual automatic speech recognition system acts as a g… ▽ More

    Submitted 31 May, 2023; originally announced June 2023.

  3. arXiv:2104.02332  [pdf, other

    eess.AS

    Detecting English Speech in the Air Traffic Control Voice Communication

    Authors: Igor Szoke, Santosh Kesiraju, Ondrej Novotny, Martin Kocour, Karel Vesely, Jan "Honza" Cernocky

    Abstract: We launched a community platform for collecting the ATC speech world-wide in the ATCO2 project. Filtering out unseen non-English speech is one of the main components in the data processing pipeline. The proposed English Language Detection (ELD) system is based on the embeddings from Bayesian subspace multinomial model. It is trained on the word confusion network from an ASR system. It is robust, e… ▽ More

    Submitted 6 April, 2021; originally announced April 2021.