Showing 1–11 of 11 results for author: Logacheva, V

  1. Studying the role of named entities for content preservation in text style transfer

    Authors: Nikolay Babakov, David Dale, Varvara Logacheva, Irina Krotova, Alexander Panchenko

    Abstract: Text style transfer techniques are gaining popularity in Natural Language Processing, finding various applications such as text detoxification, sentiment, or formality transfer. However, the majority of the existing approaches were tested on such domains as online communications on public platforms, music, or entertainment yet none of them were applied to the domains which are typical for task-ori…

    Submitted 20 June, 2022; originally announced June 2022.

    Journal ref: Natural Language Processing and Information Systems. NLDB 2022. Lecture Notes in Computer Science, vol 13286. Springer, Cham, p.437--448

  2. arXiv:2203.02392  [pdf, other]

    cs.CL

    Beyond Plain Toxic: Detection of Inappropriate Statements on Flammable Topics for the Russian Language

    Authors: Nikolay Babakov, Varvara Logacheva, Alexander Panchenko

    Abstract: Toxicity on the Internet, such as hate speech, offenses towards particular users or groups of people, or the use of obscene words, is an acknowledged problem. However, there also exist other types of inappropriate messages which are usually not viewed as toxic, e.g. as they do not contain explicit offences. Such messages can contain covered toxicity or generalizations, incite harmful actions (crim…

    Submitted 4 March, 2022; originally announced March 2022.

    Comments: arXiv admin note: text overlap with arXiv:2103.05345

  3. arXiv:2201.08598  [pdf, other]

    cs.CL

    Taxonomy Enrichment with Text and Graph Vector Representations

    Authors: Irina Nikishina, Mikhail Tikhomirov, Varvara Logacheva, Yuriy Nazarov, Alexander Panchenko, Natalia Loukachevitch

    Abstract: Knowledge graphs such as DBpedia, Freebase or Wikidata always contain a taxonomic backbone that allows the arrangement and structuring of various concepts in accordance with the hypo-hypernym ("class-subclass") relationship. With the rapid growth of lexical resources for specific domains, the problem of automatic extension of the existing knowledge bases with new words is becoming more and more wi…

    Submitted 21 January, 2022; originally announced January 2022.

  4. arXiv:2109.08914  [pdf, other]

    cs.CL cs.LG

    Text Detoxification using Large Pre-trained Neural Models

    Authors: David Dale, Anton Voronov, Daryna Dementieva, Varvara Logacheva, Olga Kozlova, Nikita Semenov, Alexander Panchenko

    Abstract: We present two novel unsupervised methods for eliminating toxicity in text. Our first method combines two recent ideas: (1) guidance of the generation process with small style-conditional language models and (2) use of paraphrasing models to perform style transfer. We use a well-performing paraphraser guided by style-trained language models to keep the text content and remove toxicity. Our second…

    Submitted 3 November, 2021; v1 submitted 18 September, 2021; originally announced September 2021.

    Comments: Accepted to the EMNLP 2021 conference

  5. arXiv:2105.09052  [pdf, other]

    cs.CL cs.LG

    Methods for Detoxification of Texts for the Russian Language

    Authors: Daryna Dementieva, Daniil Moskovskiy, Varvara Logacheva, David Dale, Olga Kozlova, Nikita Semenov, Alexander Panchenko

    Abstract: We introduce the first study of automatic detoxification of Russian texts to combat offensive language. Such a kind of textual style transfer can be used, for instance, for processing toxic content in social media. While much work has been done for the English language in this field, it has never been solved for the Russian language yet. We test two types of models - unsupervised approach based on…

    Submitted 19 May, 2021; originally announced May 2021.

  6. arXiv:2103.05345  [pdf, other]

    cs.CL

    Detecting Inappropriate Messages on Sensitive Topics that Could Harm a Company's Reputation

    Authors: Nikolay Babakov, Varvara Logacheva, Olga Kozlova, Nikita Semenov, Alexander Panchenko

    Abstract: Not all topics are equally "flammable" in terms of toxicity: a calm discussion of turtles or fishing less often fuels inappropriate toxic dialogues than a discussion of politics or sexual minorities. We define a set of sensitive topics that can yield inappropriate and toxic messages and describe the methodology of collecting and labeling a dataset for appropriateness. While toxicity in user-genera…

    Submitted 9 March, 2021; originally announced March 2021.

    Comments: Accepted to the Balto-Slavic NLP workshop 2021 co-located with EACL-2021

  7. arXiv:2011.11536  [pdf, other]

    cs.CL

    Studying Taxonomy Enrichment on Diachronic WordNet Versions

    Authors: Irina Nikishina, Alexander Panchenko, Varvara Logacheva, Natalia Loukachevitch

    Abstract: Ontologies, taxonomies, and thesauri are used in many NLP tasks. However, most studies are focused on the creation of these lexical resources rather than the maintenance of the existing ones. Thus, we address the problem of taxonomy enrichment. We explore the possibilities of taxonomy extension in a resource-poor setting and present methods which are applicable to a large number of languages. We c…

    Submitted 23 November, 2020; originally announced November 2020.

  8. arXiv:2005.11176  [pdf, other]

    cs.CL cs.AI

    RUSSE'2020: Findings of the First Taxonomy Enrichment Task for the Russian language

    Authors: Irina Nikishina, Varvara Logacheva, Alexander Panchenko, Natalia Loukachevitch

    Abstract: This paper describes the results of the first shared task on taxonomy enrichment for the Russian language. The participants were asked to extend an existing taxonomy with previously unseen words: for each new word their systems should provide a ranked list of possible (candidate) hypernyms. In comparison to the previous tasks for other languages, our competition has a more realistic task setting:…

    Submitted 22 May, 2020; originally announced May 2020.

  9. arXiv:2003.06651  [pdf, other]

    cs.CL

    Word Sense Disambiguation for 158 Languages using Word Embeddings Only

    Authors: Varvara Logacheva, Denis Teslenko, Artem Shelmanov, Steffen Remus, Dmitry Ustalov, Andrey Kutuzov, Ekaterina Artemova, Chris Biemann, Simone Paolo Ponzetto, Alexander Panchenko

    Abstract: Disambiguation of word senses in context is easy for humans, but is a major challenge for automatic approaches. Sophisticated supervised and knowledge-based models were developed to solve this task. However, (i) the inherent Zipfian distribution of supervised training instances for a given word and/or (ii) the quality of linguistic knowledge representations motivate the development of completely u…

    Submitted 14 March, 2020; originally announced March 2020.

    Comments: 10 pages, 5 figures, 4 tables, accepted at LREC 2020

  10. arXiv:1902.00098  [pdf, other]

    cs.AI cs.CL cs.HC

    The Second Conversational Intelligence Challenge (ConvAI2)

    Authors: Emily Dinan, Varvara Logacheva, Valentin Malykh, Alexander Miller, Kurt Shuster, Jack Urbanek, Douwe Kiela, Arthur Szlam, Iulian Serban, Ryan Lowe, Shrimai Prabhumoye, Alan W Black, Alexander Rudnicky, Jason Williams, Joelle Pineau, Mikhail Burtsev, Jason Weston

    Abstract: We describe the setting and results of the ConvAI2 NeurIPS competition that aims to further the state-of-the-art in open-domain chatbots. Some key takeaways from the competition are: (i) pretrained Transformer variants are currently the best performing models on this task, (ii) but to improve performance on multi-turn conversations with humans, future systems must go beyond single word metrics lik…

    Submitted 31 January, 2019; originally announced February 2019.

  11. arXiv:1812.06158  [pdf, other]

    cs.CL cs.LG stat.ML

    Few-shot classification in Named Entity Recognition Task

    Authors: Alexander Fritzler, Varvara Logacheva, Maksim Kretov

    Abstract: For many natural language processing (NLP) tasks the amount of annotated data is limited. This urges a need to apply semi-supervised learning techniques, such as transfer learning or meta-learning. In this work we tackle Named Entity Recognition (NER) task using Prototypical Network - a metric learning technique. It learns intermediate representations of words which cluster well into named entity…

    Submitted 14 December, 2018; originally announced December 2018.

    Comments: In proceedings of the 34th ACM/SIGAPP Symposium on Applied Computing