
Showing 1–3 of 3 results for author: Jaff, D

Searching in archive cs.
  1. arXiv:2403.01983

    cs.CL

    Language and Speech Technology for Central Kurdish Varieties

    Authors: Sina Ahmadi, Daban Q. Jaff, Md Mahfuz Ibn Alam, Antonios Anastasopoulos

    Abstract: Kurdish, an Indo-European language spoken by over 30 million speakers, is considered a dialect continuum and known for its diversity in language varieties. Previous studies addressing language and speech technology for Kurdish handle it in a monolithic way as a macro-language, resulting in disparities for dialects and varieties for which there are few resources and tools available. In this paper,…

    Submitted 4 March, 2024; originally announced March 2024.

    Comments: Accepted to LREC-COLING 2024

  2. arXiv:2106.09325

    cs.AI

    Central Kurdish machine translation: First large scale parallel corpus and experiments

    Authors: Zhila Amini, Mohammad Mohammadamini, Hawre Hosseini, Mehran Mansouri, Daban Jaff

    Abstract: While the computational processing of Kurdish has experienced a relative increase, the machine translation of this language seems to be lacking a considerable body of scientific work. This is in part due to the lack of resources especially curated for this task. In this paper, we present the first large scale parallel corpus of Central Kurdish-English, Awta, containing 229,222 pairs of manually al…

    Submitted 17 June, 2021; originally announced June 2021.

  3. arXiv:2010.01554

    cs.CL

    Leveraging Multilingual News Websites for Building a Kurdish Parallel Corpus

    Authors: Sina Ahmadi, Hossein Hassani, Daban Q. Jaff

    Abstract: Machine translation has been a major motivation of development in natural language processing. Despite the burgeoning achievements in creating more efficient machine translation systems thanks to deep learning methods, parallel corpora have remained indispensable for progress in the field. In an attempt to create parallel corpora for the Kurdish language, in this paper, we describe our approach in…

    Submitted 4 October, 2020; originally announced October 2020.

    Comments: 11 pages, under review in the ACM Transactions on Asian and Low-Resource Language Information Processing (TALLIP). Corpus available at https://github.com/KurdishBLARK/InterdialectCorpus