Showing 1–2 of 2 results for author: Abandah, G

Searching in archive cs.
  1. arXiv:2303.14588

    cs.CL

    Fine-Tashkeel: Finetuning Byte-Level Models for Accurate Arabic Text Diacritization

    Authors: Bashar Al-Rfooh, Gheith Abandah, Rami Al-Rfou

    Abstract: Most of previous work on learning diacritization of the Arabic language relied on training models from scratch. In this paper, we investigate how to leverage pre-trained language models to learn diacritization. We finetune token-free pre-trained multilingual models (ByT5) to learn to predict and insert missing diacritics in Arabic text, a complex task that requires understanding the sentence seman…

    Submitted 25 March, 2023; originally announced March 2023.

  2. arXiv:2108.01141

    cs.CL cs.LG

    Correcting Arabic Soft Spelling Mistakes using BiLSTM-based Machine Learning

    Authors: Gheith A. Abandah, Ashraf Suyyagh, Mohammed Z. Khedher

    Abstract: Soft spelling errors are a class of spelling mistakes that is widespread among native Arabic speakers and foreign learners alike. Some of these errors are typographical in nature. They occur due to orthographic variations of some Arabic letters and the complex rules that dictate their correct usage. Many people forgo these rules, and given the identical phonetic sounds, they often confuse such let…

    Submitted 2 August, 2021; originally announced August 2021.

    Comments: A preprint research paper of 25 pages, 15 figures, and 8 tables all included in one pdf file

    MSC Class: 68T50 (Primary) 68T07 (Secondary)

    ACM Class: I.2.7; I.5.1; I.7.1