Showing 1–6 of 6 results for author: Karafiát, M

Searching in archive eess.
  1. arXiv:2310.11921  [pdf, other]

    cs.SD eess.AS

    BUT CHiME-7 system description

    Authors: Martin Karafiát, Karel Veselý, Igor Szöke, Ladislav Mošner, Karel Beneš, Marcin Witkowski, Germán Barchi, Leonardo Pepino

    Abstract: This paper describes the joint effort of Brno University of Technology (BUT), AGH University of Krakow and University of Buenos Aires on the development of Automatic Speech Recognition systems for the CHiME-7 Challenge. We train and evaluate various end-to-end models with several toolkits. We heavily relied on Guided Source Separation (GSS) to convert multi-channel audio to single channel. The ASR…

    Submitted 18 October, 2023; originally announced October 2023.

    Comments: 6 pages, CHiME-7 challenge 2023

  2. arXiv:2101.12729  [pdf, other]

    eess.AS cs.CL

    BCN2BRNO: ASR System Fusion for Albayzin 2020 Speech to Text Challenge

    Authors: Martin Kocour, Guillermo Cámbara, Jordi Luque, David Bonet, Mireia Farrús, Martin Karafiát, Karel Veselý, Jan "Honza" Černocký

    Abstract: This paper describes joint effort of BUT and Telefónica Research on development of Automatic Speech Recognition systems for Albayzin 2020 Challenge. We compare approaches based on either hybrid or end-to-end models. In hybrid modelling, we explore the impact of SpecAugment layer on performance. For end-to-end modelling, we used a convolutional neural network with gated linear units (GLUs). The per…

    Submitted 29 January, 2021; originally announced January 2021.

    Comments: fusion, end-to-end model, hybrid model, semisupervised, automatic speech recognition, convolutional neural network

  3. arXiv:2001.11360  [pdf, ps, other]

    eess.AS cs.LG cs.SD

    BUT Opensat 2019 Speech Recognition System

    Authors: Martin Karafiát, Murali Karthick Baskar, Igor Szöke, Hari Krishna Vydana, Karel Veselý, Jan "Honza" Černocký

    Abstract: The paper describes the BUT Automatic Speech Recognition (ASR) systems submitted for OpenSAT evaluations under two domain categories such as low resourced languages and public safety communications. The first was challenging due to lack of training data, therefore various architectures and multilingual approaches were employed. The combination led to superior performance. The second domain was cha…

    Submitted 30 January, 2020; originally announced January 2020.

    Comments: REJECTED in ICASSP 2020

  4. arXiv:1811.03451  [pdf, other]

    eess.AS cs.CL cs.LG

    Analysis of Multilingual Sequence-to-Sequence speech recognition systems

    Authors: Martin Karafiát, Murali Karthick Baskar, Shinji Watanabe, Takaaki Hori, Matthew Wiesner, Jan "Honza" Černocký

    Abstract: This paper investigates the applications of various multilingual approaches developed in conventional hidden Markov model (HMM) systems to sequence-to-sequence (seq2seq) automatic speech recognition (ASR). On a set composed of Babel data, we first show the effectiveness of multi-lingual training with stacked bottle-neck (SBN) features. Then we explore various architectures and training strategies…

    Submitted 7 November, 2018; originally announced November 2018.

    Comments: arXiv admin note: text overlap with arXiv:1810.03459

  5. arXiv:1811.02770  [pdf, other]

    eess.AS cs.CL cs.LG cs.SD

    Promising Accurate Prefix Boosting for sequence-to-sequence ASR

    Authors: Murali Karthick Baskar, Lukáš Burget, Shinji Watanabe, Martin Karafiát, Takaaki Hori, Jan Honza Černocký

    Abstract: In this paper, we present promising accurate prefix boosting (PAPB), a discriminative training technique for attention based sequence-to-sequence (seq2seq) ASR. PAPB is devised to unify the training and testing scheme in an effective manner. The training procedure involves maximizing the score of each partial correct sequence obtained during beam search compared to other hypotheses. The training o…

    Submitted 7 November, 2018; originally announced November 2018.

  6. arXiv:1810.03459  [pdf, other]

    cs.CL cs.LG cs.SD eess.AS

    Multilingual sequence-to-sequence speech recognition: architecture, transfer learning, and language modeling

    Authors: Jaejin Cho, Murali Karthick Baskar, Ruizhi Li, Matthew Wiesner, Sri Harish Mallidi, Nelson Yalta, Martin Karafiat, Shinji Watanabe, Takaaki Hori

    Abstract: Sequence-to-sequence (seq2seq) approach for low-resource ASR is a relatively new direction in speech research. The approach benefits by performing model training without using lexicon and alignments. However, this poses a new problem of requiring more data compared to conventional DNN-HMM systems. In this work, we attempt to use data from 10 BABEL languages to build a multi-lingual seq2seq model a…

    Submitted 4 October, 2018; originally announced October 2018.