
Showing 1–2 of 2 results for author: Njie, M

  1. arXiv:2404.00565  [pdf, other]

    cs.CL

    Leveraging Corpus Metadata to Detect Template-based Translation: An Exploratory Case Study of the Egyptian Arabic Wikipedia Edition

    Authors: Saied Alshahrani, Hesham Haroon, Ali Elfilali, Mariama Njie, Jeanna Matthews

    Abstract: Wikipedia articles (content pages) are commonly used corpora in Natural Language Processing (NLP) research, especially in low-resource languages other than English. Yet, a few research studies have studied the three Arabic Wikipedia editions, Arabic Wikipedia (AR), Egyptian Arabic Wikipedia (ARZ), and Moroccan Arabic Wikipedia (ARY), and documented issues in the Egyptian Arabic Wikipedia edition r…

    Submitted 31 March, 2024; originally announced April 2024.

    Comments: This paper has been accepted at LREC-COLING 2024: The 6th Workshop on Open-Source Arabic Corpora and Processing Tools (OSACT6)

  2. arXiv:2007.05872  [pdf]

    cs.CL cs.LG

    Is Machine Learning Speaking my Language? A Critical Look at the NLP-Pipeline Across 8 Human Languages

    Authors: Esma Wali, Yan Chen, Christopher Mahoney, Thomas Middleton, Marzieh Babaeianjelodar, Mariama Njie, Jeanna Neefe Matthews

    Abstract: Natural Language Processing (NLP) is increasingly used as a key ingredient in critical decision-making systems such as resume parsers used in sorting a list of job candidates. NLP systems often ingest large corpora of human text, attempting to learn from past human behavior and decisions in order to produce systems that will make recommendations about our future world. Over 7000 human languages ar…

    Submitted 11 July, 2020; originally announced July 2020.

    Comments: Participatory Approaches to Machine Learning Workshop, 37th International Conference on Machine Learning

    ACM Class: I.2.7