
Showing 1–6 of 6 results for author: Mouromtsev, D

Searching in archive cs.
  1. arXiv:2111.14192

    cs.CL cs.AI

    Zero-Shot Cross-Lingual Transfer in Legal Domain Using Transformer Models

    Authors: Zein Shaheen, Gerhard Wohlgenannt, Dmitry Mouromtsev

    Abstract: Zero-shot cross-lingual transfer is an important feature in modern NLP models and architectures to support low-resource languages. In this work, we study zero-shot cross-lingual transfer from English to French and German under Multi-Label Text Classification, where we train a classifier using an English training set and test it using French and German test sets. We extend the EURLEX57K dataset, the Engl…

    Submitted 11 December, 2021; v1 submitted 28 November, 2021; originally announced November 2021.

    Comments: Accepted at the CSCI2021 conference

  2. arXiv:2005.02470

    cs.CL cs.LG

    Russian Natural Language Generation: Creation of a Language Modelling Dataset and Evaluation with Modern Neural Architectures

    Authors: Zein Shaheen, Gerhard Wohlgenannt, Bassel Zaity, Dmitry Mouromtsev, Vadim Pak

    Abstract: Generating coherent, grammatically correct, and meaningful text is very challenging; however, it is crucial to many modern NLP systems. So far, research has mostly focused on the English language; for other languages, both standardized datasets and experiments with state-of-the-art models are rare. In this work, we i) provide a novel reference dataset for Russian language modeling, ii) experim…

    Submitted 5 May, 2020; originally announced May 2020.

  3. arXiv:1907.08501

    cs.IR cs.CL

    A Comparative Evaluation of Visual and Natural Language Question Answering Over Linked Data

    Authors: Gerhard Wohlgenannt, Dmitry Mouromtsev, Dmitry Pavlov, Yury Emelyanov, Alexey Morozov

    Abstract: With the growing number and size of Linked Data datasets, it is crucial to make the data accessible and useful for users without knowledge of formal query languages. Two approaches towards this goal are knowledge graph visualization and natural language interfaces. Here, we specifically investigate question answering (QA) over Linked Data by comparing a diagrammatic visual approach with existing n…

    Submitted 19 July, 2019; originally announced July 2019.

    Comments: KEOD 2019

  4. Relation Extraction Datasets in the Digital Humanities Domain and their Evaluation with Word Embeddings

    Authors: Gerhard Wohlgenannt, Ekaterina Chernyak, Dmitry Ilvovsky, Ariadna Barinova, Dmitry Mouromtsev

    Abstract: In this research, we manually create high-quality datasets in the digital humanities domain for the evaluation of language models, specifically word embedding models. The first step comprises the creation of unigram and n-gram datasets for two fantasy novel book series for two task types each, analogy and doesn't-match. This is followed by the training of models on the two book series with various…

    Submitted 4 March, 2019; originally announced March 2019.

  5. arXiv:1903.01275

    cs.CL cs.IR

    Using Word Embeddings for Visual Data Exploration with Ontodia and Wikidata

    Authors: Gerhard Wohlgenannt, Nikolay Klimov, Dmitry Mouromtsev, Daniil Razdyakonov, Dmitry Pavlov, Yury Emelyanov

    Abstract: One of the big challenges in Linked Data consumption is to create visual and natural language interfaces to the data that are usable by non-technical users. Ontodia provides support for diagrammatic data exploration, showcased in this publication in combination with the Wikidata dataset. We present improvements to the natural language interface regarding exploring and querying Linked Data entities. The me…

    Submitted 4 March, 2019; originally announced March 2019.

  6. arXiv:1503.06598

    cs.IR

    Identifying Web Tables - Supporting a Neglected Type of Content on the Web

    Authors: Mikhail Galkin, Dmitry Mouromtsev, Sören Auer

    Abstract: The abundance of data on the Internet facilitates the improvement of extraction and processing tools. The trend in open data publishing encourages the adoption of structured formats like CSV and RDF. However, there is still a plethora of unstructured data on the Web which we assume contains semantics. For this reason, we propose an approach to derive semantics from web tables which are stil…

    Submitted 23 March, 2015; originally announced March 2015.

    Comments: 9 pages, 4 figures