Skip to main content

Showing 1–3 of 3 results for author: Okoh, I

.
  1. arXiv:2405.00997  [pdf, other

    cs.CL

    The IgboAPI Dataset: Empowering Igbo Language Technologies through Multi-dialectal Enrichment

    Authors: Chris Chinenye Emezue, Ifeoma Okoh, Chinedu Mbonu, Chiamaka Chukwuneke, Daisy Lal, Ignatius Ezeani, Paul Rayson, Ijemma Onwuzulike, Chukwuma Okeke, Gerald Nweya, Bright Ogbonna, Chukwuebuka Oraegbunam, Esther Chidinma Awo-Ndubuisi, Akudo Amarachukwu Osuagwu, Obioha Nmezi

    Abstract: The Igbo language is facing a risk of becoming endangered, as indicated by a 2025 UNESCO study. This highlights the need to develop language technologies for Igbo to foster communication, learning and preservation. To create robust, impactful, and widely adopted language technologies for Igbo, it is essential to incorporate the multi-dialectal nature of the language. The primary obstacle in achiev… ▽ More

    Submitted 2 May, 2024; originally announced May 2024.

    Comments: Accepted to the LREC-COLING 2024 conference

  2. arXiv:2402.06619  [pdf, other

    cs.CL cs.AI

    Aya Dataset: An Open-Access Collection for Multilingual Instruction Tuning

    Authors: Shivalika Singh, Freddie Vargus, Daniel Dsouza, Börje F. Karlsson, Abinaya Mahendiran, Wei-Yin Ko, Herumb Shandilya, Jay Patel, Deividas Mataciunas, Laura OMahony, Mike Zhang, Ramith Hettiarachchi, Joseph Wilson, Marina Machado, Luisa Souza Moura, Dominik Krzemiński, Hakimeh Fadaei, Irem Ergün, Ifeoma Okoh, Aisha Alaagib, Oshan Mudannayake, Zaid Alyafeai, Vu Minh Chien, Sebastian Ruder, Surya Guthikonda , et al. (8 additional authors not shown)

    Abstract: Datasets are foundational to many breakthroughs in modern artificial intelligence. Many recent achievements in the space of natural language processing (NLP) can be attributed to the finetuning of pre-trained models on a diverse set of tasks that enables a large language model (LLM) to respond to instructions. Instruction fine-tuning (IFT) requires specifically constructed and annotated datasets.… ▽ More

    Submitted 9 February, 2024; originally announced February 2024.

  3. arXiv:2105.00810  [pdf, other

    cs.CL

    NaijaNER : Comprehensive Named Entity Recognition for 5 Nigerian Languages

    Authors: Wuraola Fisayo Oyewusi, Olubayo Adekanmbi, Ifeoma Okoh, Vitus Onuigwe, Mary Idera Salami, Opeyemi Osakuade, Sharon Ibejih, Usman Abdullahi Musa

    Abstract: Most of the common applications of Named Entity Recognition (NER) is on English and other highly available languages. In this work, we present our findings on Named Entity Recognition for 5 Nigerian Languages (Nigerian English, Nigerian Pidgin English, Igbo, Yoruba and Hausa). These languages are considered low-resourced, and very little openly available Natural Language Processing work has been d… ▽ More

    Submitted 30 March, 2021; originally announced May 2021.

    Comments: Accepted at the AfricaNLP Workshop, EACL 2021