Showing 1–2 of 2 results for author: Salameh, R

Search v0.5.6 released 2020-02-24

arXiv:2405.02675 [pdf, other]

cs.SD cs.AI eess.AS

Quranic Audio Dataset: Crowdsourced and Labeled Recitation from Non-Arabic Speakers

Authors: Raghad Salameh, Mohamad Al Mdfaa, Nursultan Askarbekuly, Manuel Mazzara

Abstract: This paper addresses the challenge of learning to recite the Quran for non-Arabic speakers. We explore the possibility of crowdsourcing a carefully annotated Quranic dataset, on top of which AI models can be built to simplify the learning process. In particular, we use the volunteer-based crowdsourcing genre and implement a crowdsourcing API to gather audio assets. We integrated the API into an existing mobile application called NamazApp to collect audio recitations. We developed a crowdsourcing platform called Quran Voice for annotating the gathered audio assets. As a result, we have collected around 7000 Quranic recitations from a pool of 1287 participants across more than 11 non-Arabic countries, and we have annotated 1166 recitations from the dataset in six categories. We have achieved a crowd accuracy of 0.77, an inter-rater agreement of 0.63 between the annotators, and 0.89 between the labels assigned by the algorithm and the expert judgments.

Submitted 4 May, 2024; originally announced May 2024.

arXiv:2405.02162 [pdf, other]

cs.CV cs.AI cs.RO

Mapping the Unseen: Unified Promptable Panoptic Mapping with Dynamic Labeling using Foundation Models

Authors: Mohamad Al Mdfaa, Raghad Salameh, Sergey Zagoruyko, Gonzalo Ferrer

Abstract: In the field of robotics and computer vision, efficient and accurate semantic mapping remains a significant challenge due to the growing demand for intelligent machines that can comprehend and interact with complex environments. Conventional panoptic mapping methods, however, are limited by predefined semantic classes, thus making them ineffective for handling novel or unforeseen objects. In response to this limitation, we introduce the Unified Promptable Panoptic Mapping (UPPM) method. UPPM utilizes recent advances in foundation models to enable real-time, on-demand label generation using natural language prompts. By incorporating a dynamic labeling strategy into traditional panoptic mapping techniques, UPPM provides significant improvements in adaptability and versatility while maintaining high performance levels in map reconstruction. We demonstrate our approach on real-world and simulated datasets. Results show that UPPM can accurately reconstruct scenes and segment objects while generating rich semantic labels through natural language interactions. A series of ablation experiments validated the advantages of foundation model-based labeling over fixed label sets.

Submitted 3 May, 2024; originally announced May 2024.
