Showing 1–2 of 2 results for author: Jentoft, M

  1. arXiv:2306.07790  [pdf, other]

    cs.CL cs.AI

    NoCoLA: The Norwegian Corpus of Linguistic Acceptability

    Authors: Matias Jentoft, David Samuel

    Abstract: While there has been a surge of large language models for Norwegian in recent years, we lack any tool to evaluate their understanding of grammaticality. We present two new Norwegian datasets for this task. NoCoLA_class is a supervised binary classification task where the goal is to discriminate between acceptable and non-acceptable sentences. On the other hand, NoCoLA_zero is a purely diagnostic t…

    Submitted 13 June, 2023; originally announced June 2023.

    Comments: Published at NoDaLiDa 2023

  2. arXiv:2305.01957  [pdf, other]

    cs.CL

    NorQuAD: Norwegian Question Answering Dataset

    Authors: Sardana Ivanova, Fredrik Aas Andreassen, Matias Jentoft, Sondre Wold, Lilja Øvrelid

    Abstract: In this paper we present NorQuAD: the first Norwegian question answering dataset for machine reading comprehension. The dataset consists of 4,752 manually created question-answer pairs. We here detail the data collection procedure and present statistics of the dataset. We also benchmark several multilingual and Norwegian monolingual language models on the dataset and compare them against human per…

    Submitted 3 May, 2023; originally announced May 2023.

    Comments: Accepted to NoDaLiDa 2023