Showing 1–4 of 4 results for author: Mitrofanov, A

  1. arXiv:2107.10882  [pdf, other]

    cs.LG

    Size doesn't matter: predicting physico- or biochemical properties based on dozens of molecules

    Authors: Kirill Karpov, Artem Mitrofanov, Vadim Korolev, Valery Tkachenko

    Abstract: The use of machine learning in chemistry has become a common practice. At the same time, despite the success of modern machine learning methods, the lack of data limits their use. Using a transfer learning methodology can help solve this problem. This methodology assumes that a model built on a sufficient amount of data captures general features of the chemical compound structure on which it was t…

    Submitted 22 July, 2021; originally announced July 2021.

    Comments: 9 pages, 6 figures

  2. arXiv:2104.02526  [pdf, ps, other]

    eess.AS cs.CL cs.LG

    LT-LM: a novel non-autoregressive language model for single-shot lattice rescoring

    Authors: Anton Mitrofanov, Mariya Korenevskaya, Ivan Podluzhny, Yuri Khokhlov, Aleksandr Laptev, Andrei Andrusenko, Aleksei Ilin, Maxim Korenevsky, Ivan Medennikov, Aleksei Romanenko

    Abstract: Neural network-based language models are commonly used in rescoring approaches to improve the quality of modern automatic speech recognition (ASR) systems. Most of the existing methods are computationally expensive since they use autoregressive language models. We propose a novel rescoring approach, which processes the entire lattice in a single call to the model. The key feature of our rescoring…

    Submitted 6 April, 2021; originally announced April 2021.

    Comments: Submitted to InterSpeech 2021

  3. arXiv:2103.07186  [pdf, ps, other]

    eess.AS cs.CL cs.LG cs.SD

    Dynamic Acoustic Unit Augmentation With BPE-Dropout for Low-Resource End-to-End Speech Recognition

    Authors: Aleksandr Laptev, Andrei Andrusenko, Ivan Podluzhny, Anton Mitrofanov, Ivan Medennikov, Yuri Matveev

    Abstract: With the rapid development of speech assistants, adapting server-intended automatic speech recognition (ASR) solutions to a direct device has become crucial. Researchers and industry prefer to use end-to-end ASR systems for on-device speech recognition tasks. This is because end-to-end systems can be made resource-efficient while maintaining a higher quality compared to hybrid systems. However, bu…

    Submitted 12 March, 2021; originally announced March 2021.

    Comments: 16 pages, 7 figures

  4. Target-Speaker Voice Activity Detection: a Novel Approach for Multi-Speaker Diarization in a Dinner Party Scenario

    Authors: Ivan Medennikov, Maxim Korenevsky, Tatiana Prisyach, Yuri Khokhlov, Mariya Korenevskaya, Ivan Sorokin, Tatiana Timofeeva, Anton Mitrofanov, Andrei Andrusenko, Ivan Podluzhny, Aleksandr Laptev, Aleksei Romanenko

    Abstract: Speaker diarization for real-life scenarios is an extremely challenging problem. Widely used clustering-based diarization approaches perform rather poorly in such conditions, mainly due to the limited ability to handle overlapping speech. We propose a novel Target-Speaker Voice Activity Detection (TS-VAD) approach, which directly predicts an activity of each speaker on each time frame. TS-VAD mode…

    Submitted 27 July, 2020; v1 submitted 14 May, 2020; originally announced May 2020.

    Comments: Accepted to Interspeech 2020