Showing 1–7 of 7 results for author: Shonenkov, A

  1. arXiv:2202.10784  [pdf, other]

    cs.CV cs.AI

    RuCLIP -- new models and experiments: a technical report

    Authors: Alex Shonenkov, Andrey Kuznetsov, Denis Dimitrov, Tatyana Shavrina, Daniil Chesakov, Anastasia Maltseva, Alena Fenogenova, Igor Pavlov, Anton Emelyanov, Sergey Markov, Daria Bakshandaeva, Vera Shybaeva, Andrey Chertok

    Abstract: In the report we propose six new implementations of the ruCLIP model trained on our 240M pairs. The accuracy results are compared with the original CLIP model with Ru-En translation (OPUS-MT) on 16 datasets from different domains. Our best implementations outperform the CLIP + OPUS-MT solution on most of the datasets in few-shot and zero-shot tasks. In the report we briefly describe the implementations and co…

    Submitted 22 February, 2022; originally announced February 2022.

  2. arXiv:2202.00441  [pdf, other]

    cs.LG cs.AI

    Few-Bit Backward: Quantized Gradients of Activation Functions for Memory Footprint Reduction

    Authors: Georgii Novikov, Daniel Bershatsky, Julia Gusak, Alex Shonenkov, Denis Dimitrov, Ivan Oseledets

    Abstract: Memory footprint is one of the main limiting factors for large neural network training. In backpropagation, one needs to store the input to each operation in the computational graph. Every modern neural network model has quite a few pointwise nonlinearities in its architecture, and such operations induce additional memory costs which, as we show, can be significantly reduced by quantization of…

    Submitted 2 February, 2022; v1 submitted 1 February, 2022; originally announced February 2022.

    Comments: Submitted

  3. arXiv:2112.07395  [pdf, other]

    cs.CV

    Handwritten text generation and strikethrough characters augmentation

    Authors: Alex Shonenkov, Denis Karachev, Max Novopoltsev, Mark Potanin, Denis Dimitrov, Andrey Chertok

    Abstract: We introduce two data augmentation techniques, which, used with a Resnet-BiLSTM-CTC network, significantly reduce Word Error Rate (WER) and Character Error Rate (CER) beyond best-reported results on handwriting text recognition (HTR) tasks. We apply a novel augmentation that simulates strikethrough text (HandWritten Blots) and a handwritten text generation method based on printed text (StackMix),…

    Submitted 14 December, 2021; originally announced December 2021.

    Comments: 16 pages, 15 figures. arXiv admin note: substantial text overlap with arXiv:2108.11667

    MSC Class: 68-04 ACM Class: I.7.5; I.4.6

  4. arXiv:2112.02448  [pdf, other]

    cs.CL cs.AI cs.LG

    Emojich -- zero-shot emoji generation using Russian language: a technical report

    Authors: Alex Shonenkov, Daria Bakshandaeva, Denis Dimitrov, Aleksandr Nikolich

    Abstract: This technical report presents a text-to-image neural network "Emojich" that generates emojis using captions in the Russian language as a condition. We aim to keep the generalization ability of the large pretrained model ruDALL-E Malevich (XL), with 1.3B parameters, at the fine-tuning stage, while giving a special style to the generated images. Here are presented some engineering methods, code realization, all hyp…

    Submitted 12 January, 2022; v1 submitted 4 December, 2021; originally announced December 2021.

    Comments: 5 pages, 4 figures and big figure at appendix, technical report

  5. arXiv:2111.10974  [pdf, other]

    cs.CV cs.AI cs.CL

    Many Heads but One Brain: Fusion Brain -- a Competition and a Single Multimodal Multitask Architecture

    Authors: Daria Bakshandaeva, Denis Dimitrov, Vladimir Arkhipkin, Alex Shonenkov, Mark Potanin, Denis Karachev, Andrey Kuznetsov, Anton Voronov, Vera Davydova, Elena Tutubalina, Aleksandr Petiushko

    Abstract: Supporting the current trend in the AI community, we present the AI Journey 2021 Challenge called Fusion Brain, the first competition aimed at creating a universal architecture that can process different modalities (in this case, images, texts, and code) and solve multiple tasks for vision and language. The Fusion Brain Challenge combines the following specific tasks: Code2code Transl…

    Submitted 28 December, 2022; v1 submitted 21 November, 2021; originally announced November 2021.

  6. arXiv:2108.11667  [pdf, other]

    cs.CV

    StackMix and Blot Augmentations for Handwritten Text Recognition

    Authors: Alex Shonenkov, Denis Karachev, Maxim Novopoltsev, Mark Potanin, Denis Dimitrov

    Abstract: This paper proposes a handwritten text recognition (HTR) system that outperforms current state-of-the-art methods. The comparison was carried out on three of the datasets most frequently used in HTR tasks, namely Bentham, IAM, and Saint Gall. In addition, results on two recently presented datasets, Peter the Great's manuscripts and the HKR Dataset, are provided. The paper describes the architecture of th…

    Submitted 26 August, 2021; originally announced August 2021.

    Comments: 17 pages, 9 figures

    MSC Class: 68-04 ACM Class: I.7.5; I.4.6

  7. arXiv:2103.09354  [pdf, other]

    cs.CV cs.AI cs.LG

    Digital Peter: Dataset, Competition and Handwriting Recognition Methods

    Authors: Mark Potanin, Denis Dimitrov, Alex Shonenkov, Vladimir Bataev, Denis Karachev, Maxim Novopoltsev

    Abstract: This paper presents a new dataset of Peter the Great's manuscripts and describes a segmentation procedure that converts initial images of documents into lines. The new dataset may be useful for researchers to train handwriting text recognition models and as a benchmark for comparing different models. It consists of 9 694 images and text files corresponding to lines in historical documents. The ope…

    Submitted 27 August, 2021; v1 submitted 16 March, 2021; originally announced March 2021.

    Comments: 17 pages, 7 figures, submitted to ICDAR 2021

    ACM Class: I.7.5; I.4.6