Showing 1–5 of 5 results for author: Abavisani, A

Searching in archive eess.
  1. arXiv:2309.13343

    cs.SD eess.AS

    Two vs. Four-Channel Sound Event Localization and Detection

    Authors: Julia Wilkins, Magdalena Fuentes, Luca Bondi, Shabnam Ghaffarzadegan, Ali Abavisani, Juan Pablo Bello

    Abstract: Sound event localization and detection (SELD) systems estimate both the direction-of-arrival (DOA) and class of sound sources over time. In the DCASE 2022 SELD Challenge (Task 3), models are designed to operate in a 4-channel setting. While beneficial to further the development of SELD systems using a multichannel recording setup such as first-order Ambisonics (FOA), most consumer electronics devi…

    Submitted 23 September, 2023; originally announced September 2023.

  2. arXiv:2201.11207

    cs.SD cs.CL eess.AS

    Discovering Phonetic Inventories with Crosslingual Automatic Speech Recognition

    Authors: Piotr Żelasko, Siyuan Feng, Laureano Moro Velazquez, Ali Abavisani, Saurabhchand Bhati, Odette Scharenborg, Mark Hasegawa-Johnson, Najim Dehak

    Abstract: The high cost of data acquisition makes Automatic Speech Recognition (ASR) model training problematic for most existing languages, including languages that do not even have a written script, or for which the phone inventories remain unknown. Past works explored multilingual training, transfer learning, as well as zero-shot learning in order to build ASR systems for these low-resource languages. Wh…

    Submitted 27 January, 2022; v1 submitted 26 January, 2022; originally announced January 2022.

    Comments: Accepted for publication in Computer Speech and Language

  3. How Phonotactics Affect Multilingual and Zero-shot ASR Performance

    Authors: Siyuan Feng, Piotr Żelasko, Laureano Moro-Velázquez, Ali Abavisani, Mark Hasegawa-Johnson, Odette Scharenborg, Najim Dehak

    Abstract: The idea of combining multiple languages' recordings to train a single automatic speech recognition (ASR) model brings the promise of the emergence of universal speech representation. Recently, a Transformer encoder-decoder model has been shown to leverage multilingual data well in IPA transcriptions of languages presented during training. However, the representations it learned were not successfu…

    Submitted 10 February, 2021; v1 submitted 22 October, 2020; originally announced October 2020.

    Comments: Accepted for publication in IEEE ICASSP 2021. The first 2 authors contributed equally to this work

  4. Automatic Estimation of Intelligibility Measure for Consonants in Speech

    Authors: Ali Abavisani, Mark Hasegawa-Johnson

    Abstract: In this article, we provide a model to estimate a real-valued measure of the intelligibility of individual speech segments. We trained regression models based on Convolutional Neural Networks (CNN) for stop consonants /p, t, k, b, d, g/ associated with vowel /ɑ/, to estimate the corresponding Signal to Noise Ratio (SNR) at which the Consonant-Vowel (CV) sound becomes intelligible fo…

    Submitted 28 June, 2020; v1 submitted 12 May, 2020; originally announced May 2020.

    Comments: 5 pages, 1 figure, 7 tables, submitted to the Interspeech 2020 conference

  5. arXiv:1908.04751

    q-bio.QM cs.LG cs.SD eess.AS eess.SP

    The role of cue enhancement and frequency fine-tuning in hearing impaired phone recognition

    Authors: Ali Abavisani, Mark A Hasegawa-Johnson

    Abstract: A speech-based hearing test is designed to identify the susceptible error-prone phones for individual hearing impaired (HI) ear. Only robust tokens in the experiment noise levels had been chosen for the test. The noise-robustness of tokens is measured as SNR90 of the token, which is the signal to the speech-weighted noise ratio where a normal hearing (NH) listener would recognize the token with an…

    Submitted 9 August, 2019; originally announced August 2019.

    Comments: 16 pages, 10 figures, proceedings of the Acoustical Society of America meeting, May 2019, Louisville, KY