Showing 1–4 of 4 results for author: Mroczkowski, R

  1. arXiv:2305.05474  [pdf, other]

    cs.CL cs.LG

    Going beyond research datasets: Novel intent discovery in the industry setting

    Authors: Aleksandra Chrabrowa, Tsimur Hadeliya, Dariusz Kajtoch, Robert Mroczkowski, Piotr Rybak

    Abstract: Novel intent discovery automates the process of grouping similar messages (questions) to identify previously unknown intents. However, current research focuses on publicly available datasets which have only the question field and significantly differ from real-life datasets. This paper proposes methods to improve the intent discovery pipeline deployed in a large e-commerce platform. We show the be…

    Submitted 9 May, 2023; originally announced May 2023.

    Comments: Accepted to Findings of EACL 2023

  2. arXiv:2205.08808  [pdf, other]

    cs.CL cs.LG

    Evaluation of Transfer Learning for Polish with a Text-to-Text Model

    Authors: Aleksandra Chrabrowa, Łukasz Dragan, Karol Grzegorczyk, Dariusz Kajtoch, Mikołaj Koszowski, Robert Mroczkowski, Piotr Rybak

    Abstract: We introduce a new benchmark for assessing the quality of text-to-text models for Polish. The benchmark consists of diverse tasks and datasets: KLEJ benchmark adapted for text-to-text, en-pl translation, summarization, and question answering. In particular, since summarization and question answering lack benchmark datasets for the Polish language, we describe their construction and make them publi…

    Submitted 18 May, 2022; originally announced May 2022.

    Comments: Accepted at LREC 2022

  3. arXiv:2105.01735  [pdf, other]

    cs.CL cs.LG

    HerBERT: Efficiently Pretrained Transformer-based Language Model for Polish

    Authors: Robert Mroczkowski, Piotr Rybak, Alina Wróblewska, Ireneusz Gawlik

    Abstract: BERT-based models are currently used for solving nearly all Natural Language Processing (NLP) tasks and most often achieve state-of-the-art results. Therefore, the NLP community conducts extensive research on understanding these models, but above all on designing effective and efficient training procedures. Several ablation studies investigating how to train BERT-like models have been carried out,…

    Submitted 4 May, 2021; originally announced May 2021.

    Comments: Published in Proceedings of the 8th Workshop on Balto-Slavic Natural Language Processing

  4. arXiv:2005.00630  [pdf, other]

    cs.CL

    KLEJ: Comprehensive Benchmark for Polish Language Understanding

    Authors: Piotr Rybak, Robert Mroczkowski, Janusz Tracz, Ireneusz Gawlik

    Abstract: In recent years, a series of Transformer-based models unlocked major improvements in general natural language understanding (NLU) tasks. Such a fast pace of research would not be possible without general NLU benchmarks, which allow for a fair comparison of the proposed methods. However, such benchmarks are available only for a handful of languages. To alleviate this issue, we introduce a comprehen…

    Submitted 1 May, 2020; originally announced May 2020.