Showing 1–3 of 3 results for author: Lamanov, D

Searching in archive cs.
  1. arXiv:2312.01092  [pdf, other]

    cs.SD cs.LG eess.AS

    A Semi-Supervised Deep Learning Approach to Dataset Collection for Query-By-Humming Task

    Authors: Amantur Amatov, Dmitry Lamanov, Maksim Titov, Ivan Vovk, Ilya Makarov, Mikhail Kudinov

    Abstract: Query-by-Humming (QbH) is a task that involves finding the most relevant song based on a hummed or sung fragment. Despite recent successful commercial solutions, implementing QbH systems remains challenging due to the lack of high-quality datasets for training machine learning models. In this paper, we propose a deep learning data collection technique and introduce Covers and Hummings Aligned Data…

    Submitted 2 December, 2023; originally announced December 2023.

  2. arXiv:2206.10914  [pdf, other]

    cs.CL

    Template-based Approach to Zero-shot Intent Recognition

    Authors: Dmitry Lamanov, Pavel Burnyshev, Ekaterina Artemova, Valentin Malykh, Andrey Bout, Irina Piontkovskaya

    Abstract: The recent advances in transfer learning techniques and pre-training of large contextualized encoders foster innovation in real-life applications, including dialog assistants. Practical needs of intent recognition require effective data usage and the ability to constantly update supported intents, adopting new ones, and abandoning outdated ones. In particular, the generalized zero-shot paradigm, i…

    Submitted 22 June, 2022; originally announced June 2022.

    Comments: accepted to INLG 2022

  3. arXiv:2104.10121  [pdf, other]

    cs.SD cs.CL eess.AS

    On the Impact of Word Error Rate on Acoustic-Linguistic Speech Emotion Recognition: An Update for the Deep Learning Era

    Authors: Shahin Amiriparian, Artem Sokolov, Ilhan Aslan, Lukas Christ, Maurice Gerczuk, Tobias Hübner, Dmitry Lamanov, Manuel Milling, Sandra Ottl, Ilya Poduremennykh, Evgeniy Shuranov, Björn W. Schuller

    Abstract: Text encodings from automatic speech recognition (ASR) transcripts and audio representations have shown promise in speech emotion recognition (SER) ever since. Yet, it is challenging to explain the effect of each information stream on the SER systems. Further, more clarification is required for analysing the impact of ASR's word error rate (WER) on linguistic emotion recognition per se and in the…

    Submitted 20 April, 2021; originally announced April 2021.

    Comments: 5 pages, 1 figure

    ACM Class: I.2.7; I.5.0