Skip to main content

Showing 1–16 of 16 results for author: Shavrina, T

Searching in archive cs. Search in all archives.
.
  1. arXiv:2309.10931  [pdf, ps, other

    cs.CL

    A Family of Pretrained Transformer Language Models for Russian

    Authors: Dmitry Zmitrovich, Alexander Abramov, Andrey Kalmykov, Maria Tikhonova, Ekaterina Taktasheva, Danil Astafurov, Mark Baushenko, Artem Snegirev, Vitalii Kadulin, Sergey Markov, Tatiana Shavrina, Vladislav Mikhailov, Alena Fenogenova

    Abstract: Transformer language models (LMs) are fundamental to NLP research methodologies and applications in various languages. However, develo** such models specifically for the Russian language has received little attention. This paper introduces a collection of 13 Russian Transformer LMs, which spans encoder (ruBERT, ruRoBERTa, ruELECTRA), decoder (ruGPT-3), and encoder-decoder (ruT5, FRED-T5) archite… ▽ More

    Submitted 18 April, 2024; v1 submitted 19 September, 2023; originally announced September 2023.

    Comments: to appear in LREC-COLING-2024

  2. arXiv:2211.05100  [pdf, other

    cs.CL

    BLOOM: A 176B-Parameter Open-Access Multilingual Language Model

    Authors: BigScience Workshop, :, Teven Le Scao, Angela Fan, Christopher Akiki, Ellie Pavlick, Suzana Ilić, Daniel Hesslow, Roman Castagné, Alexandra Sasha Luccioni, François Yvon, Matthias Gallé, Jonathan Tow, Alexander M. Rush, Stella Biderman, Albert Webson, Pawan Sasanka Ammanamanchi, Thomas Wang, Benoît Sagot, Niklas Muennighoff, Albert Villanova del Moral, Olatunji Ruwase, Rachel Bawden, Stas Bekman, Angelina McMillan-Major , et al. (369 additional authors not shown)

    Abstract: Large language models (LLMs) have been shown to be able to perform new tasks based on a few demonstrations or natural language instructions. While these capabilities have led to widespread adoption, most LLMs are developed by resource-rich organizations and are frequently kept from the public. As a step towards democratizing this powerful technology, we present BLOOM, a 176B-parameter open-access… ▽ More

    Submitted 27 June, 2023; v1 submitted 9 November, 2022; originally announced November 2022.

  3. arXiv:2210.13236  [pdf, other

    cs.CL cs.AI

    Universal and Independent: Multilingual Probing Framework for Exhaustive Model Interpretation and Evaluation

    Authors: Oleg Serikov, Vitaly Protasov, Ekaterina Voloshina, Viktoria Knyazkova, Tatiana Shavrina

    Abstract: Linguistic analysis of language models is one of the ways to explain and describe their reasoning, weaknesses, and limitations. In the probing part of the model interpretability research, studies concern individual languages as well as individual linguistic structures. The question arises: are the detected regularities linguistically coherent, or on the contrary, do they dissonate at the typologic… ▽ More

    Submitted 24 October, 2022; originally announced October 2022.

    Comments: Accepted to BlackBoxNLP, EMNLP 2022

    MSC Class: 68-04; 68-06; 68T50 ACM Class: G.3; I.2.7

  4. TAPE: Assessing Few-shot Russian Language Understanding

    Authors: Ekaterina Taktasheva, Tatiana Shavrina, Alena Fenogenova, Denis Shevelev, Nadezhda Katricheva, Maria Tikhonova, Albina Akhmetgareeva, Oleg Zinkevich, Anastasiia Bashmakova, Svetlana Iordanskaia, Alena Spiridonova, Valentina Kurenshchikova, Ekaterina Artemova, Vladislav Mikhailov

    Abstract: Recent advances in zero-shot and few-shot learning have shown promise for a scope of research and practical purposes. However, this fast-growing area lacks standardized evaluation suites for non-English languages, hindering progress outside the Anglo-centric paradigm. To address this line of research, we propose TAPE (Text Attack and Perturbation Evaluation), a novel benchmark that includes six mo… ▽ More

    Submitted 23 October, 2022; originally announced October 2022.

    Comments: Accepted to EMNLP 2022 Findings

  5. Vote'n'Rank: Revision of Benchmarking with Social Choice Theory

    Authors: Mark Rofin, Vladislav Mikhailov, Mikhail Florinskiy, Andrey Kravchenko, Elena Tutubalina, Tatiana Shavrina, Daniel Karabekyan, Ekaterina Artemova

    Abstract: The development of state-of-the-art systems in different applied areas of machine learning (ML) is driven by benchmarks, which have shaped the paradigm of evaluating generalisation capabilities from multiple perspectives. Although the paradigm is shifting towards more fine-grained evaluation across diverse tasks, the delicate question of how to aggregate the performances has received particular in… ▽ More

    Submitted 12 February, 2023; v1 submitted 11 October, 2022; originally announced October 2022.

    Comments: To appear in EACL 2023 (main)

  6. Is neural language acquisition similar to natural? A chronological probing study

    Authors: Ekaterina Voloshina, Oleg Serikov, Tatiana Shavrina

    Abstract: The probing methodology allows one to obtain a partial representation of linguistic phenomena stored in the inner layers of the neural network, using external classifiers and statistical analysis. Pre-trained transformer-based language models are widely used both for natural language understanding (NLU) and natural language generation (NLG) tasks making them most commonly used for downstream appli… ▽ More

    Submitted 1 July, 2022; originally announced July 2022.

    Comments: Published in proceedings of Dialogue-2022 "Computational Linguistics and Intellectual Technologies"

  7. Findings of the The RuATD Shared Task 2022 on Artificial Text Detection in Russian

    Authors: Tatiana Shamardina, Vladislav Mikhailov, Daniil Chernianskii, Alena Fenogenova, Marat Saidov, Anastasiya Valeeva, Tatiana Shavrina, Ivan Smurov, Elena Tutubalina, Ekaterina Artemova

    Abstract: We present the shared task on artificial text detection in Russian, which is organized as a part of the Dialogue Evaluation initiative, held in 2022. The shared task dataset includes texts from 14 text generators, i.e., one human writer and 13 text generative models fine-tuned for one or more of the following generation tasks: machine translation, paraphrase generation, text summarization, text si… ▽ More

    Submitted 3 June, 2022; originally announced June 2022.

    Comments: Accepted to Dialogue-22

  8. arXiv:2204.08009  [pdf, other

    cs.CL cs.AI

    WikiOmnia: generative QA corpus on the whole Russian Wikipedia

    Authors: Dina Pisarevskaya, Tatiana Shavrina

    Abstract: The General QA field has been develo** the methodology referencing the Stanford Question answering dataset (SQuAD) as the significant benchmark. However, compiling factual questions is accompanied by time- and labour-consuming annotation, limiting the training data's potential size. We present the WikiOmnia dataset, a new publicly available set of QA-pairs and corresponding Russian Wikipedia art… ▽ More

    Submitted 9 November, 2022; v1 submitted 17 April, 2022; originally announced April 2022.

    Comments: Accepted to GEM Workshop, EMNLP 2022

  9. arXiv:2204.07580  [pdf, other

    cs.CL cs.AI

    mGPT: Few-Shot Learners Go Multilingual

    Authors: Oleh Shliazhko, Alena Fenogenova, Maria Tikhonova, Vladislav Mikhailov, Anastasia Kozlova, Tatiana Shavrina

    Abstract: Recent studies report that autoregressive language models can successfully solve many NLP tasks via zero- and few-shot learning paradigms, which opens up new possibilities for using the pre-trained language models. This paper introduces two autoregressive GPT-like models with 1.3 billion and 13 billion parameters trained on 60 languages from 25 language families using Wikipedia and Colossal Clean… ▽ More

    Submitted 12 October, 2023; v1 submitted 15 April, 2022; originally announced April 2022.

    Comments: Accepted for publication at Transactions of the Association for Computational Linguistics (TACL) To be presented at the Conference on Empirical Methods in Natural Language Processing (EMNLP 2023)

    MSC Class: 68-06; 68-04; 68T50; 68T01 ACM Class: I.2; I.2.7

  10. arXiv:2202.10784  [pdf, other

    cs.CV cs.AI

    RuCLIP -- new models and experiments: a technical report

    Authors: Alex Shonenkov, Andrey Kuznetsov, Denis Dimitrov, Tatyana Shavrina, Daniil Chesakov, Anastasia Maltseva, Alena Fenogenova, Igor Pavlov, Anton Emelyanov, Sergey Markov, Daria Bakshandaeva, Vera Shybaeva, Andrey Chertok

    Abstract: In the report we propose six new implementations of ruCLIP model trained on our 240M pairs. The accuracy results are compared with original CLIP model with Ru-En translation (OPUS-MT) on 16 datasets from different domains. Our best implementations outperform CLIP + OPUS-MT solution on most of the datasets in few-show and zero-shot tasks. In the report we briefly describe the implementations and co… ▽ More

    Submitted 22 February, 2022; originally announced February 2022.

  11. arXiv:2202.07791  [pdf, other

    cs.CL cs.AI

    Russian SuperGLUE 1.1: Revising the Lessons not Learned by Russian NLP models

    Authors: Alena Fenogenova, Maria Tikhonova, Vladislav Mikhailov, Tatiana Shavrina, Anton Emelyanov, Denis Shevelev, Alexandr Kukushkin, Valentin Malykh, Ekaterina Artemova

    Abstract: In the last year, new neural architectures and multilingual pre-trained models have been released for Russian, which led to performance evaluation problems across a range of language understanding tasks. This paper presents Russian SuperGLUE 1.1, an updated benchmark styled after GLUE for Russian NLP models. The new version includes a number of technical, user experience and methodological impro… ▽ More

    Submitted 15 February, 2022; originally announced February 2022.

    Comments: Computational Linguistics and Intellectual Technologies Papers from the Annual International Conference "Dialogue" (2021) Issue 20

    MSC Class: 68-06; 68T50; 68T01 ACM Class: G.3; I.2.7

  12. arXiv:2104.14314  [pdf, other

    cs.CL

    MOROCCO: Model Resource Comparison Framework

    Authors: Valentin Malykh, Alexander Kukushkin, Ekaterina Artemova, Vladislav Mikhailov, Maria Tikhonova, Tatiana Shavrina

    Abstract: The new generation of pre-trained NLP models push the SOTA to the new limits, but at the cost of computational resources, to the point that their use in real production environments is often prohibitively expensive. We tackle this problem by evaluating not only the standard quality metrics on downstream tasks but also the memory footprint and inference time. We present MOROCCO, a framework to comp… ▽ More

    Submitted 29 April, 2021; originally announced April 2021.

  13. arXiv:2011.12170  [pdf

    cs.CL

    Domain-Transferable Method for Named Entity Recognition Task

    Authors: Vladislav Mikhailov, Tatiana Shavrina

    Abstract: Named Entity Recognition (NER) is a fundamental task in the fields of natural language processing and information extraction. NER has been widely used as a standalone tool or an essential component in a variety of applications such as question answering, dialogue assistants and knowledge graphs development. However, training reliable NER models requires a large amount of labelled data which is exp… ▽ More

    Submitted 24 November, 2020; originally announced November 2020.

  14. RussianSuperGLUE: A Russian Language Understanding Evaluation Benchmark

    Authors: Tatiana Shavrina, Alena Fenogenova, Anton Emelyanov, Denis Shevelev, Ekaterina Artemova, Valentin Malykh, Vladislav Mikhailov, Maria Tikhonova, Andrey Chertok, Andrey Evlampiev

    Abstract: In this paper, we introduce an advanced Russian general language understanding evaluation benchmark -- RussianGLUE. Recent advances in the field of universal language models and transformers require the development of a methodology for their broad diagnostics and testing for general intellectual skills - detection of natural language inference, commonsense reasoning, ability to perform simple logi… ▽ More

    Submitted 2 November, 2020; v1 submitted 29 October, 2020; originally announced October 2020.

    Comments: to appear in EMNLP 2020

  15. DaNetQA: a yes/no Question Answering Dataset for the Russian Language

    Authors: Taisia Glushkova, Alexey Machnev, Alena Fenogenova, Tatiana Shavrina, Ekaterina Artemova, Dmitry I. Ignatov

    Abstract: DaNetQA, a new question-answering corpus, follows (Clark et. al, 2019) design: it comprises natural yes/no questions. Each question is paired with a paragraph from Wikipedia and an answer, derived from the paragraph. The task is to take both the question and a paragraph as input and come up with a yes/no answer, i.e. to produce a binary output. In this paper, we present a reproducible approach to… ▽ More

    Submitted 15 October, 2020; v1 submitted 6 October, 2020; originally announced October 2020.

    Comments: Analysis of Images, Social Networks and Texts - 9 th International Conference, AIST 2020, Skolkovo, Russia, October 15-16, 2020, Revised Selected Papers. Lecture Notes in Computer Science (https://dblp.org/db/series/lncs/index.html), Springer 2020

  16. arXiv:1906.04099  [pdf, ps, other

    cs.CL

    AGRR-2019: A Corpus for Gap** Resolution in Russian

    Authors: Maria Ponomareva, Kira Droganova, Ivan Smurov, Tatiana Shavrina

    Abstract: This paper provides a comprehensive overview of the gap** dataset for Russian that consists of 7.5k sentences with gap** (as well as 15k relevant negative sentences) and comprises data from various genres: news, fiction, social media and technical texts. The dataset was prepared for the Automatic Gap** Resolution Shared Task for Russian (AGRR-2019) - a competition aimed at stimulating the de… ▽ More

    Submitted 10 June, 2019; originally announced June 2019.

    Comments: Accepted to BSNLP at ACL 2019