
Showing 1–9 of 9 results for author: Kheir, Y E

Searching in archive cs.
  1. arXiv:2406.16099

    cs.SD eess.AS

    Speech Representation Analysis based on Inter- and Intra-Model Similarities

    Authors: Yassine El Kheir, Ahmed Ali, Shammur Absar Chowdhury

    Abstract: Self-supervised models have revolutionized speech processing, achieving new levels of performance in a wide variety of tasks with limited resources. However, the inner workings of these models are still opaque. In this paper, we aim to analyze the encoded contextual representation of these foundation models based on their inter- and intra-model similarity, independent of any external annotation an…

    Submitted 23 June, 2024; originally announced June 2024.

    Comments: 5 pages, Accepted to appear in ICASSP XAI-SA Workshop

  2. arXiv:2310.13974

    cs.CL cs.SD eess.AS

    Automatic Pronunciation Assessment -- A Review

    Authors: Yassine El Kheir, Ahmed Ali, Shammur Absar Chowdhury

    Abstract: Pronunciation assessment and its application in computer-aided pronunciation training (CAPT) have seen impressive progress in recent years. With the rapid growth in language processing and deep learning over the past few years, there is a need for an updated review. In this paper, we review methods employed in pronunciation assessment for both phonemic and prosodic aspects. We categorize the main challeng…

    Submitted 21 October, 2023; originally announced October 2023.

    Comments: 9 pages, accepted to EMNLP Findings

  3. arXiv:2309.07739

    cs.CL cs.SD eess.AS

    The complementary roles of non-verbal cues for Robust Pronunciation Assessment

    Authors: Yassine El Kheir, Shammur Absar Chowdhury, Ahmed Ali

    Abstract: Research on pronunciation assessment systems focuses on utilizing phonetic and phonological aspects of non-native (L2) speech, often neglecting the rich layer of information hidden within the non-verbal cues. In this study, we proposed a novel pronunciation assessment framework, IntraVerbalPA. The framework innovatively incorporates both fine-grained frame- and abstract utterance-level non-verba…

    Submitted 14 September, 2023; originally announced September 2023.

    Comments: 5 pages, submitted to ICASSP 2024

  4. arXiv:2309.07719

    cs.CL cs.SD eess.AS

    L1-aware Multilingual Mispronunciation Detection Framework

    Authors: Yassine El Kheir, Shammur Absar Chowdhury, Ahmed Ali

    Abstract: The phonological discrepancies between a speaker's native (L1) and the non-native language (L2) serve as a major factor for mispronunciation. This paper introduces a novel multilingual MDD architecture, L1-MultiMDD, enriched with L1-aware speech representation. An end-to-end speech encoder is trained on the input signal and its corresponding reference phoneme sequence. First, an attention mechani…

    Submitted 21 September, 2023; v1 submitted 14 September, 2023; originally announced September 2023.

    Comments: 5 pages, submitted to ICASSP 2024

  5. arXiv:2308.02503

    eess.AS cs.CL cs.SD

    MyVoice: Arabic Speech Resource Collaboration Platform

    Authors: Yousseif Elshahawy, Yassine El Kheir, Shammur Absar Chowdhury, Ahmed Ali

    Abstract: We introduce MyVoice, a crowdsourcing platform designed to collect Arabic speech to enhance dialectal speech technologies. This platform offers an opportunity to design large dialectal speech datasets; and makes them publicly available. MyVoice allows contributors to select city/country-level fine-grained dialect and record the displayed utterances. Users can switch roles between contributors and…

    Submitted 23 July, 2023; originally announced August 2023.

    Comments: 2 pages, accepted at InterSpeech23 Show and Tell Session

  6. arXiv:2306.01845

    cs.SD eess.AS

    Multi-View Multi-Task Representation Learning for Mispronunciation Detection

    Authors: Yassine El Kheir, Shammur Absar Chowdhury, Ahmed Ali

    Abstract: The disparity in phonology between a learner's native (L1) and target (L2) language poses a significant challenge for mispronunciation detection and diagnosis (MDD) systems. This challenge is further intensified by the lack of annotated L2 data. This paper proposes a novel MDD architecture that exploits multiple 'views' of the same input data assisted by auxiliary tasks to learn more distinctive phoneti…

    Submitted 7 August, 2023; v1 submitted 2 June, 2023; originally announced June 2023.

    Comments: 5 pages, Accepted SLaTE23

  7. arXiv:2305.14982

    cs.CL cs.AI

    LAraBench: Benchmarking Arabic AI with Large Language Models

    Authors: Ahmed Abdelali, Hamdy Mubarak, Shammur Absar Chowdhury, Maram Hasanain, Basel Mousi, Sabri Boughorbel, Yassine El Kheir, Daniel Izham, Fahim Dalvi, Majd Hawasly, Nizi Nazar, Yousseif Elshahawy, Ahmed Ali, Nadir Durrani, Natasa Milic-Frayling, Firoj Alam

    Abstract: Recent advancements in Large Language Models (LLMs) have significantly influenced the landscape of language and speech research. Despite this progress, these models lack specific benchmarking against state-of-the-art (SOTA) models tailored to particular languages and tasks. LAraBench addresses this gap for Arabic Natural Language Processing (NLP) and Speech Processing tasks, including sequence tag…

    Submitted 5 February, 2024; v1 submitted 24 May, 2023; originally announced May 2023.

    Comments: Foundation Models, Large Language Models, Arabic NLP, Arabic Speech, Arabic AI, GPT3.5 Evaluation, USM Evaluation, Whisper Evaluation, GPT-4, BLOOMZ, Jais13b

    MSC Class: 68T50

    ACM Class: F.2.2; I.2.7

  8. arXiv:2305.07445

    eess.AS cs.CL cs.SD

    QVoice: Arabic Speech Pronunciation Learning Application

    Authors: Yassine El Kheir, Fouad Khnaisser, Shammur Absar Chowdhury, Hamdy Mubarak, Shazia Afzal, Ahmed Ali

    Abstract: This paper introduces a novel Arabic pronunciation learning application QVoice, powered with end-to-end mispronunciation detection and feedback generator module. The application is designed to support non-native Arabic speakers in enhancing their pronunciation skills, while also helping native speakers mitigate any potential influence from regional dialects on their Modern Standard Arabic (MSA) pr…

    Submitted 9 May, 2023; originally announced May 2023.

    Comments: 2 pages, Accepted InterSpeech23 Show & Tell Demo Session

    Journal ref: InterSpeech 2023

  9. arXiv:2211.00923

    cs.SD cs.CL eess.AS

    SpeechBlender: Speech Augmentation Framework for Mispronunciation Data Generation

    Authors: Yassine El Kheir, Shammur Absar Chowdhury, Ahmed Ali, Hamdy Mubarak, Shazia Afzal

    Abstract: The lack of labeled second language (L2) speech data is a major challenge in designing mispronunciation detection models. We introduce SpeechBlender - a fine-grained data augmentation pipeline for generating mispronunciation errors to overcome such data scarcity. The SpeechBlender utilizes varieties of masks to target different regions of phonetic units, and uses the mixing factors to linearly inte…

    Submitted 12 July, 2023; v1 submitted 2 November, 2022; originally announced November 2022.

    Comments: 5 pages