Showing 1–6 of 6 results for author: Talafha, B

  1. arXiv:2310.16153  [pdf, other]

    cs.CL

    WojoodNER 2023: The First Arabic Named Entity Recognition Shared Task

    Authors: Mustafa Jarrar, Muhammad Abdul-Mageed, Mohammed Khalilia, Bashar Talafha, AbdelRahim Elmadany, Nagham Hamad, Alaa' Omar

    Abstract: We present WojoodNER-2023, the first Arabic Named Entity Recognition (NER) Shared Task. The primary focus of WojoodNER-2023 is on Arabic NER, offering novel NER datasets (i.e., Wojood) and the definition of subtasks designed to facilitate meaningful comparisons between different NER approaches. WojoodNER-2023 encompassed two Subtasks: FlatNER and NestedNER. A total of 45 unique teams registered fo…

    Submitted 24 October, 2023; originally announced October 2023.

  2. arXiv:2310.11069  [pdf, other]

    cs.CL cs.SD eess.AS

    VoxArabica: A Robust Dialect-Aware Arabic Speech Recognition System

    Authors: Abdul Waheed, Bashar Talafha, Peter Sullivan, AbdelRahim Elmadany, Muhammad Abdul-Mageed

    Abstract: Arabic is a complex language with many varieties and dialects, spoken by over 450 million people around the world. Because of this linguistic diversity and variation, it is challenging to build a robust and generalized ASR system for Arabic. In this work, we address this gap by developing and demoing a system, dubbed VoxArabica, for dialect identification (DID) as well as automatic speech recognition (AS…

    Submitted 27 October, 2023; v1 submitted 17 October, 2023; originally announced October 2023.

    Comments: Accepted at the ArabicNLP conference, co-located with EMNLP'23. The first three authors contributed equally.

  3. arXiv:2306.02902  [pdf, ps, other]

    cs.CL cs.SD eess.AS

    N-Shot Benchmarking of Whisper on Diverse Arabic Speech Recognition

    Authors: Bashar Talafha, Abdul Waheed, Muhammad Abdul-Mageed

    Abstract: Whisper, the recently developed multilingual weakly supervised model, is reported to perform well on multiple speech recognition benchmarks in both monolingual and multilingual settings. However, it is not clear how Whisper would fare under diverse conditions even on languages it was evaluated on such as Arabic. In this work, we address this gap by comprehensively evaluating Whisper on several var…

    Submitted 5 June, 2023; originally announced June 2023.

    Comments: 4 pages, INTERSPEECH 2023

  4. arXiv:2202.05474  [pdf, other]

    cs.CV

    Bench-Marking And Improving Arabic Automatic Image Captioning Through The Use Of Multi-Task Learning Paradigm

    Authors: Muhy Eddin Za'ter, Bashar Talafha

    Abstract: The continuous increase in the use of social media and in visual content on the internet has accelerated research in computer vision in general and in the image captioning task in particular. Generating a caption that best describes an image is useful for various applications, such as image indexing and as a hearing aid for the visually impaired. In re…

    Submitted 10 March, 2022; v1 submitted 11 February, 2022; originally announced February 2022.

  5. arXiv:2108.01425  [pdf, other]

    cs.CL cs.AI

    Sarcasm Detection and Quantification in Arabic Tweets

    Authors: Bashar Talafha, Muhy Eddin Za'ter, Samer Suleiman, Mahmoud Al-Ayyoub, Mohammed N. Al-Kabi

    Abstract: The task of predicting sarcasm in text is known as automatic sarcasm detection. Given the prevalence and challenges of sarcasm in sentiment-bearing text, this is a critical phase in most sentiment analysis tasks. With the increasing popularity and usage of different social media platforms among users around the world, people are using sarcasm more and more in their day-to-day conversations, so…

    Submitted 3 August, 2021; originally announced August 2021.

  6. arXiv:2007.05612  [pdf, other]

    cs.CL cs.LG

    Multi-Dialect Arabic BERT for Country-Level Dialect Identification

    Authors: Bashar Talafha, Mohammad Ali, Muhy Eddin Za'ter, Haitham Seelawi, Ibraheem Tuffaha, Mostafa Samir, Wael Farhan, Hussein T. Al-Natsheh

    Abstract: Arabic dialect identification is a complex problem owing to a number of inherent properties of the language itself. In this paper, we present the experiments conducted and the models developed by our competing team, Mawdoo3 AI, along the way to achieving our winning solution to subtask 1 of the Nuanced Arabic Dialect Identification (NADI) shared task. The dialect identification subtask provides 21,000…

    Submitted 10 July, 2020; originally announced July 2020.

    Comments: Accepted at the Fifth Arabic Natural Language Processing Workshop (WANLP2020) co-located with the 28th International Conference on Computational Linguistics (COLING'2020), Barcelona, Spain, 12 Dec. 2020