AfriMTE and AfriCOMET: Enhancing COMET to Embrace Under-resourced African Languages
Authors:
Jiayi Wang,
David Ifeoluwa Adelani,
Sweta Agrawal,
Marek Masiak,
Ricardo Rei,
Eleftheria Briakou,
Marine Carpuat,
Xuanli He,
Sofia Bourhim,
Andiswa Bukula,
Muhidin Mohamed,
Temitayo Olatoye,
Tosin Adewumi,
Hamam Mokayed,
Christine Mwase,
Wangui Kimotho,
Foutse Yuehgoh,
Anuoluwapo Aremu,
Jessica Ojo,
Shamsuddeen Hassan Muhammad,
Salomey Osei,
Abdul-Hakeem Omotayo,
Chiamaka Chukwuneke,
Perez Ogayo,
Oumaima Hourrane, et al. (33 additional authors not shown)
Abstract:
Despite the recent progress on scaling multilingual machine translation (MT) to several under-resourced African languages, accurately measuring this progress remains challenging, since evaluation is often performed with n-gram matching metrics such as BLEU, which typically show a weaker correlation with human judgments. Learned metrics such as COMET have higher correlation; however, the lack of evaluation data with human ratings for under-resourced languages, the complexity of annotation guidelines like Multidimensional Quality Metrics (MQM), and the limited language coverage of multilingual encoders have hampered their applicability to African languages. In this paper, we address these challenges by creating high-quality human evaluation data with simplified MQM guidelines for error detection and direct assessment (DA) scoring for 13 typologically diverse African languages. Furthermore, we develop AfriCOMET: COMET evaluation metrics for African languages by leveraging DA data from well-resourced languages and an African-centric multilingual encoder (AfroXLM-R) to create the state-of-the-art MT evaluation metrics for African languages with respect to Spearman rank correlation with human judgments (0.441).
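The abstract reports metric quality as a Spearman rank correlation with human judgments. As a rough illustration of what that measurement involves, the sketch below ranks two score lists and correlates the ranks. The function names and the sample scores are hypothetical and are not taken from the paper or from the COMET codebase.

```python
def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        # Extend j over the run of values equal to values[order[i]].
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(xs, ys):
    """Spearman rank correlation: Pearson correlation of the rank vectors."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equally long lists with at least 2 items")
    rx, ry = average_ranks(xs), average_ranks(ys)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    if sx == 0 or sy == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (sx * sy)


# Hypothetical data: one metric score and one human DA score per translation.
metric_scores = [0.82, 0.41, 0.77, 0.35, 0.60]
human_da = [85, 40, 70, 55, 60]

rho = spearman(metric_scores, human_da)
print(f"Spearman rank correlation: {rho:.3f}")
```

Because only the ordering of the scores matters, the metric and the human ratings can be on different scales, as they are in this toy data.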
Submitted 23 April, 2024; v1 submitted 16 November, 2023;
originally announced November 2023.
Towards A Sign Language Gloss Representation Of Modern Standard Arabic
Authors:
Salma El Anigri,
Mohammed Majid Himmi,
Abdelhak Mahmoudi
Abstract:
Over 5% of the world's population (466 million people) has disabling hearing loss, including 4 million children. People with hearing loss may be hard of hearing or deaf; deaf people mostly have profound hearing loss, which implies very little or no hearing. Around the world, deaf people often communicate using a sign language, with gestures of both hands and facial expressions. A sign language is a full-fledged natural language with its own grammar and lexicon. Therefore, there is a need for translation models from and to sign languages. In this work, we are interested in the translation of Modern Standard Arabic (MSAr) into sign language. We generated a gloss representation from MSAr that extracts the features required to generate sign animations. Our approach identifies the most pertinent features that preserve the meaning of the input Arabic sentence.
Submitted 4 May, 2020;
originally announced May 2020.