Showing 1–6 of 6 results for author: Azunre, P

Searching in archive cs.
  1. arXiv:2103.15963  [pdf, ps, other]

    cs.CL cs.AI

    Contextual Text Embeddings for Twi

    Authors: Paul Azunre, Salomey Osei, Salomey Addo, Lawrence Asamoah Adu-Gyamfi, Stephen Moore, Bernard Adabankah, Bernard Opoku, Clara Asare-Nyarko, Samuel Nyarko, Cynthia Amoaba, Esther Dansoa Appiah, Felix Akwerh, Richard Nii Lante Lawson, Joel Budu, Emmanuel Debrah, Nana Boateng, Wisdom Ofori, Edwin Buabeng-Munkoh, Franklin Adjei, Isaac Kojo Essel Ampomah, Joseph Otoo, Reindorf Borkor, Standylove Birago Mensah, Lucien Mensah, Mark Amoako Marcel, et al. (2 additional authors not shown)

    Abstract: Transformer-based language models have been changing the modern Natural Language Processing (NLP) landscape for high-resource languages such as English, Chinese, Russian, etc. However, this technology does not yet exist for any Ghanaian language. In this paper, we introduce the first of such models for Twi or Akan, the most widely spoken Ghanaian language. The specific contribution of this researc…

    Submitted 31 March, 2021; v1 submitted 29 March, 2021; originally announced March 2021.

    Comments: 10 pages paper; Accepted at African NLP Workshop @ EACL 2021

  2. arXiv:2103.15625  [pdf, other]

    cs.CL cs.AI

    English-Twi Parallel Corpus for Machine Translation

    Authors: Paul Azunre, Salomey Osei, Salomey Addo, Lawrence Asamoah Adu-Gyamfi, Stephen Moore, Bernard Adabankah, Bernard Opoku, Clara Asare-Nyarko, Samuel Nyarko, Cynthia Amoaba, Esther Dansoa Appiah, Felix Akwerh, Richard Nii Lante Lawson, Joel Budu, Emmanuel Debrah, Nana Boateng, Wisdom Ofori, Edwin Buabeng-Munkoh, Franklin Adjei, Isaac Kojo Essel Ampomah, Joseph Otoo, Reindorf Borkor, Standylove Birago Mensah, Lucien Mensah, Mark Amoako Marcel, et al. (2 additional authors not shown)

    Abstract: We present a parallel machine translation training corpus for English and Akuapem Twi of 25,421 sentence pairs. We used a transformer-based translator to generate initial translations in Akuapem Twi, which were later verified and corrected where necessary by native speakers to eliminate any occurrence of translationese. In addition, 697 higher quality crowd-sourced sentences are provided for use a…

    Submitted 1 April, 2021; v1 submitted 29 March, 2021; originally announced March 2021.

    Comments: 9 pages paper, Accepted at African NLP workshop @EACL 2021

  3. arXiv:2103.15475  [pdf, ps, other]

    cs.CL cs.AI

    NLP for Ghanaian Languages

    Authors: Paul Azunre, Salomey Osei, Salomey Addo, Lawrence Asamoah Adu-Gyamfi, Stephen Moore, Bernard Adabankah, Bernard Opoku, Clara Asare-Nyarko, Samuel Nyarko, Cynthia Amoaba, Esther Dansoa Appiah, Felix Akwerh, Richard Nii Lante Lawson, Joel Budu, Emmanuel Debrah, Nana Boateng, Wisdom Ofori, Edwin Buabeng-Munkoh, Franklin Adjei, Isaac Kojo Essel Ampomah, Joseph Otoo, Reindorf Borkor, Standylove Birago Mensah, Lucien Mensah, Mark Amoako Marcel, et al. (2 additional authors not shown)

    Abstract: NLP Ghana is an open-source non-profit organization aiming to advance the development and adoption of state-of-the-art NLP techniques and digital language tools to Ghanaian languages and problems. In this paper, we first present the motivation and necessity for the efforts of the organization; by introducing some popular Ghanaian languages while presenting the state of NLP in Ghana. We then presen…

    Submitted 1 April, 2021; v1 submitted 29 March, 2021; originally announced March 2021.

    Comments: 6 pages paper; Accepted at AfricaNLP @EACL 2021

  4. arXiv:1905.10412  [pdf, other]

    cs.CL cs.AI cs.LG

    Using Deep Networks and Transfer Learning to Address Disinformation

    Authors: Numa Dhamani, Paul Azunre, Jeffrey L. Gleason, Craig Corcoran, Garrett Honke, Steve Kramer, Jonathon Morgan

    Abstract: We apply an ensemble pipeline composed of a character-level convolutional neural network (CNN) and a long short-term memory (LSTM) as a general tool for addressing a range of disinformation problems. We also demonstrate the ability to use this architecture to transfer knowledge from labeled data in one domain to related (supervised and unsupervised) tasks. Character-level neural networks and trans…

    Submitted 24 May, 2019; originally announced May 2019.

    Comments: AI for Social Good Workshop at the International Conference on Machine Learning, Long Beach, United States (2019)

  5. arXiv:1901.08456  [pdf, ps, other]

    cs.CL cs.LG

    Semantic Classification of Tabular Datasets via Character-Level Convolutional Neural Networks

    Authors: Paul Azunre, Craig Corcoran, Numa Dhamani, Jeffrey Gleason, Garrett Honke, David Sullivan, Rebecca Ruppel, Sandeep Verma, Jonathon Morgan

    Abstract: A character-level convolutional neural network (CNN) motivated by applications in "automated machine learning" (AutoML) is proposed to semantically classify columns in tabular data. Simulated data containing a set of base classes is first used to learn an initial set of weights. Hand-labeled data from the CKAN repository is then used in a transfer-learning paradigm to adapt the initial weights to…

    Submitted 24 January, 2019; originally announced January 2019.

  6. arXiv:1804.01503  [pdf, ps, other]

    cs.AI cs.CL

    Abstractive Tabular Dataset Summarization via Knowledge Base Semantic Embeddings

    Authors: Paul Azunre, Craig Corcoran, David Sullivan, Garrett Honke, Rebecca Ruppel, Sandeep Verma, Jonathon Morgan

    Abstract: This paper describes an abstractive summarization method for tabular data which employs a knowledge base semantic embedding to generate the summary. Assuming the dataset contains descriptive text in headers, columns and/or some augmenting metadata, the system employs the embedding to recommend a subject/type for each text segment. Recommendations are aggregated into a small collection of super typ…

    Submitted 5 April, 2018; v1 submitted 4 April, 2018; originally announced April 2018.