Showing 1–4 of 4 results for author: Szaszák, G

  1. arXiv:2007.06949

    eess.AS cs.CL

    Deep Transformer based Data Augmentation with Subword Units for Morphologically Rich Online ASR

    Authors: Balázs Tarján, György Szaszák, Tibor Fegyó, Péter Mihajlik

    Abstract: Recently Deep Transformer models have proven to be particularly powerful in language modeling tasks for ASR. Their high complexity, however, makes them very difficult to apply in the first (single) pass of an online system. Recent studies showed that a considerable part of the knowledge of neural network Language Models (LM) can be transferred to traditional n-grams by using neural text generation…

    Submitted 4 November, 2020; v1 submitted 14 July, 2020; originally announced July 2020.

    Comments: 7 pages, 4 figures

  2. On the Effectiveness of Neural Text Generation based Data Augmentation for Recognition of Morphologically Rich Speech

    Authors: Balázs Tarján, György Szaszák, Tibor Fegyó, Péter Mihajlik

    Abstract: Advanced neural network models have penetrated Automatic Speech Recognition (ASR) in recent years; however, in language modeling many systems still rely on traditional Back-off N-gram Language Models (BNLM) partly or entirely. The reason for this is the high cost and complexity of training and using neural language models, mostly possible by adding a second decoding pass (rescoring). In our recen…

    Submitted 9 June, 2020; originally announced June 2020.

    Comments: 8 pages, 2 figures, accepted for publication at TSD 2020

  3. arXiv:1911.06615

    eess.AS cs.LG cs.SD stat.ML

    Deep learning methods in speaker recognition: a review

    Authors: Dávid Sztahó, György Szaszák, András Beke

    Abstract: This paper summarizes the applied deep learning practices in the field of speaker recognition, both verification and identification. Speaker recognition has been a widely used topic of speech technology. Many research works have been carried out, and little progress has been achieved in the past 5-6 years. However, as deep learning techniques advance in most machine learning fields, the fo…

    Submitted 14 November, 2019; originally announced November 2019.

  4. Investigation on N-gram Approximated RNNLMs for Recognition of Morphologically Rich Speech

    Authors: Balázs Tarján, György Szaszák, Tibor Fegyó, Péter Mihajlik

    Abstract: Recognition of Hungarian conversational telephone speech is challenging due to the informal style and morphological richness of the language. A Recurrent Neural Network Language Model (RNNLM) can provide a remedy for the high perplexity of the task; however, two-pass decoding introduces a considerable processing delay. In order to eliminate this delay, we investigate approaches aiming at the complexity…

    Submitted 19 September, 2019; v1 submitted 15 July, 2019; originally announced July 2019.

    Comments: 12 pages, 2 figures, accepted for publication at SLSP 2019