Showing 1–12 of 12 results for author: Abdullah, B M

  1. arXiv:2406.09855  [pdf, other]

    cs.CL

    On the Encoding of Gender in Transformer-based ASR Representations

    Authors: Aravind Krishnan, Badr M. Abdullah, Dietrich Klakow

    Abstract: While existing literature relies on performance differences to uncover gender biases in ASR models, a deeper analysis is essential to understand how gender is encoded and utilized during transcript generation. This work investigates the encoding and utilization of gender in the latent representations of two transformer-based ASR models, Wav2Vec2 and HuBERT. Using linear erasure, we demonstrate the…

    Submitted 14 June, 2024; originally announced June 2024.

    Comments: Accepted at Interspeech 2024

  2. arXiv:2312.07338  [pdf, other]

    cs.CL cs.SD eess.AS

    Self-supervised Adaptive Pre-training of Multilingual Speech Models for Language and Dialect Identification

    Authors: Mohammed Maqsood Shaik, Dietrich Klakow, Badr M. Abdullah

    Abstract: Pre-trained Transformer-based speech models have shown striking performance when fine-tuned on various downstream tasks such as automatic speech recognition and spoken language identification (SLID). However, the problem of domain mismatch remains a challenge in this area, where the domain of the pre-training data might differ from that of the downstream labeled data used for fine-tuning. In multi…

    Submitted 12 December, 2023; originally announced December 2023.

    Comments: Submitted to ICASSP 2024

  3. arXiv:2306.02405  [pdf, other]

    cs.CL

    An Information-Theoretic Analysis of Self-supervised Discrete Representations of Speech

    Authors: Badr M. Abdullah, Mohammed Maqsood Shaik, Bernd Möbius, Dietrich Klakow

    Abstract: Self-supervised representation learning for speech often involves a quantization step that transforms the acoustic input into discrete units. However, it remains unclear how to characterize the relationship between these discrete units and abstract phonetic categories such as phonemes. In this paper, we develop an information-theoretic framework whereby we represent each phonetic category as a dis…

    Submitted 4 June, 2023; originally announced June 2023.

    Comments: Accepted in Interspeech 2023

  4. arXiv:2303.03873  [pdf]

    eess.SY cs.AI cs.LG

    Developing the Reliable Shallow Supervised Learning for Thermal Comfort using ASHRAE RP-884 and ASHRAE Global Thermal Comfort Database II

    Authors: Kanisius Karyono, Badr M. Abdullah, Alison J. Cotgrave, Ana Bras, Jeff Cullen

    Abstract: The artificial intelligence (AI) system designer for thermal comfort faces insufficient data recorded from the current user or overfitting due to unreliable training data. This work introduces the reliable data set for training the AI subsystem for thermal comfort. This paper presents the control algorithm based on shallow supervised learning, which is simple enough to be implemented in the Intern…

    Submitted 3 March, 2023; originally announced March 2023.

    Comments: 15 pages with Appendix

    Report number: https://ieeexplore.ieee.org/document/10471265

    MSC Class: 93

    ACM Class: I.2.1; I.2.6

    Journal ref: 2024, Aug Vol 1

  5. arXiv:2301.03012  [pdf, other]

    cs.CL

    Analyzing the Representational Geometry of Acoustic Word Embeddings

    Authors: Badr M. Abdullah, Dietrich Klakow

    Abstract: Acoustic word embeddings (AWEs) are vector representations such that different acoustic exemplars of the same word are projected nearby in the embedding space. In addition to their use in speech technology applications such as spoken term discovery and keyword spotting, AWE models have been adopted as models of spoken-word processing in several cognitively motivated studies and have been shown to…

    Submitted 8 January, 2023; originally announced January 2023.

    Comments: In BlackboxNLP workshop, EMNLP 2022 [oral presentation]

  6. arXiv:2209.06633  [pdf, other]

    cs.CL eess.AS

    Integrating Form and Meaning: A Multi-Task Learning Model for Acoustic Word Embeddings

    Authors: Badr M. Abdullah, Bernd Möbius, Dietrich Klakow

    Abstract: Models of acoustic word embeddings (AWEs) learn to map variable-length spoken word segments onto fixed-dimensionality vector representations such that different acoustic exemplars of the same word are projected nearby in the embedding space. In addition to their speech technology applications, AWE models have been shown to predict human performance on a variety of auditory lexical processing tasks…

    Submitted 18 September, 2022; v1 submitted 14 September, 2022; originally announced September 2022.

    Comments: Accepted in INTERSPEECH 2022

  7. arXiv:2109.10179  [pdf, other]

    cs.CL

    How Familiar Does That Sound? Cross-Lingual Representational Similarity Analysis of Acoustic Word Embeddings

    Authors: Badr M. Abdullah, Iuliia Zaitova, Tania Avgustinova, Bernd Möbius, Dietrich Klakow

    Abstract: How do neural networks "perceive" speech sounds from unknown languages? Does the typological similarity between the model's training language (L1) and an unknown language (L2) have an impact on the model representations of L2 speech signals? To answer these questions, we present a novel experimental design based on representational similarity analysis (RSA) to analyze acoustic word embeddings (AWE…

    Submitted 21 September, 2021; originally announced September 2021.

    Comments: BlackboxNLP 2021

  8. arXiv:2106.08686  [pdf, other]

    cs.CL cs.SD eess.AS

    Do Acoustic Word Embeddings Capture Phonological Similarity? An Empirical Study

    Authors: Badr M. Abdullah, Marius Mosbach, Iuliia Zaitova, Bernd Möbius, Dietrich Klakow

    Abstract: Several variants of deep neural networks have been successfully employed for building parametric models that project variable-duration spoken word segments onto fixed-size vector representations, or acoustic word embeddings (AWEs). However, it remains unclear to what degree we can rely on the distance in the emerging AWE space as an estimate of word-form similarity. In this paper, we ask: does the…

    Submitted 16 June, 2021; originally announced June 2021.

    Comments: Accepted in Interspeech 2021

  9. arXiv:2106.03895  [pdf, other]

    cs.CL cs.SD eess.AS

    SIGTYP 2021 Shared Task: Robust Spoken Language Identification

    Authors: Elizabeth Salesky, Badr M. Abdullah, Sabrina J. Mielke, Elena Klyachko, Oleg Serikov, Edoardo Ponti, Ritesh Kumar, Ryan Cotterell, Ekaterina Vylomova

    Abstract: While language identification is a fundamental speech and language processing task, for many languages and language families it remains a challenging task. For many low-resource and endangered languages this is in part due to resource availability: where larger datasets exist, they may be single-speaker or have different domains than desired application scenarios, demanding a need for domain and s…

    Submitted 7 June, 2021; originally announced June 2021.

    Comments: The first three authors contributed equally

  10. arXiv:2011.00960  [pdf, other]

    cs.CL

    A Closer Look at Linguistic Knowledge in Masked Language Models: The Case of Relative Clauses in American English

    Authors: Marius Mosbach, Stefania Degaetano-Ortlieb, Marie-Pauline Krielke, Badr M. Abdullah, Dietrich Klakow

    Abstract: Transformer-based language models achieve high performance on various tasks, but we still lack understanding of the kind of linguistic knowledge they learn and rely on. We evaluate three models (BERT, RoBERTa, and ALBERT), testing their grammatical and semantic knowledge by sentence-level probing, diagnostic cases, and masked prediction tasks. We focus on relative clauses (in American English) as…

    Submitted 2 November, 2020; originally announced November 2020.

    Comments: Accepted to COLING 2020

  11. arXiv:2010.11973  [pdf, other]

    cs.CL

    Rediscovering the Slavic Continuum in Representations Emerging from Neural Models of Spoken Language Identification

    Authors: Badr M. Abdullah, Jacek Kudera, Tania Avgustinova, Bernd Möbius, Dietrich Klakow

    Abstract: Deep neural networks have been employed for various spoken language recognition tasks, including tasks that are multilingual by definition such as spoken language identification. In this paper, we present a neural model for Slavic language identification in speech signals and analyze its emergent representations to investigate whether they reflect objective measures of language relatedness and/or…

    Submitted 22 October, 2020; originally announced October 2020.

    Comments: Accepted in VarDial 2020 Workshop

  12. arXiv:2008.00545  [pdf, other]

    eess.AS cs.CL

    Cross-Domain Adaptation of Spoken Language Identification for Related Languages: The Curious Case of Slavic Languages

    Authors: Badr M. Abdullah, Tania Avgustinova, Bernd Möbius, Dietrich Klakow

    Abstract: State-of-the-art spoken language identification (LID) systems, which are based on end-to-end deep neural networks, have shown remarkable success not only in discriminating between distant languages but also between closely-related languages or even different spoken varieties of the same language. However, it is still unclear to what extent neural LID models generalize to speech samples with differ…

    Submitted 6 August, 2020; v1 submitted 2 August, 2020; originally announced August 2020.

    Comments: To appear in INTERSPEECH 2020