Showing 1–5 of 5 results for author: Bahmaninezhad, F

Searching in archive eess.
  1. arXiv:2308.06327  [pdf, other]

    eess.AS cs.CL cs.SD

    Bilingual Streaming ASR with Grapheme units and Auxiliary Monolingual Loss

    Authors: Mohammad Soleymanpour, Mahmoud Al Ismail, Fahimeh Bahmaninezhad, Kshitiz Kumar, Jian Wu

    Abstract: We introduce a bilingual solution to support English as secondary locale for most primary locales in hybrid automatic speech recognition (ASR) settings. Our key developments constitute: (a) pronunciation lexicon with grapheme units instead of phone units, (b) a fully bilingual alignment model and subsequently bilingual streaming transformer model, (c) a parallel encoder structure with language ide…

    Submitted 11 August, 2023; originally announced August 2023.

  2. arXiv:1912.07814  [pdf, other]

    cs.LG eess.AS stat.ML

    A Unified Framework for Speech Separation

    Authors: Fahimeh Bahmaninezhad, Shi-Xiong Zhang, Yong Xu, Meng Yu, John H. L. Hansen, Dong Yu

    Abstract: Speech separation refers to extracting each individual speech source in a given mixed signal. Recent advancements in speech separation and ongoing research in this area, have made these approaches as promising techniques for pre-processing of naturalistic audio streams. After incorporating deep learning techniques into speech separation, performance on these systems is improving faster. The initia…

    Submitted 16 December, 2019; originally announced December 2019.

  3. arXiv:1905.07497  [pdf, other]

    cs.SD cs.LG eess.AS

    A comprehensive study of speech separation: spectrogram vs waveform separation

    Authors: Fahimeh Bahmaninezhad, Jian Wu, Rongzhi Gu, Shi-Xiong Zhang, Yong Xu, Meng Yu, Dong Yu

    Abstract: Speech separation has been studied widely for single-channel close-talk microphone recordings over the past few years; developed solutions are mostly in frequency-domain. Recently, a raw audio waveform separation network (TasNet) is introduced for single-channel data, with achieving high Si-SNR (scale-invariant source-to-noise ratio) and SDR (source-to-distortion ratio) comparing against the state…

    Submitted 23 July, 2019; v1 submitted 17 May, 2019; originally announced May 2019.

    Comments: INTERSPEECH 2019

  4. arXiv:1904.07386  [pdf, other]

    eess.AS cs.CL cs.SD

    I4U Submission to NIST SRE 2018: Leveraging from a Decade of Shared Experiences

    Authors: Kong Aik Lee, Ville Hautamaki, Tomi Kinnunen, Hitoshi Yamamoto, Koji Okabe, Ville Vestman, Jing Huang, Guohong Ding, Hanwu Sun, Anthony Larcher, Rohan Kumar Das, Haizhou Li, Mickael Rouvier, Pierre-Michel Bousquet, Wei Rao, Qing Wang, Chunlei Zhang, Fahimeh Bahmaninezhad, Hector Delgado, Jose Patino, Qiongqiong Wang, Ling Guo, Takafumi Koshinaka, Jiacen Zhang, Koichi Shinoda, et al. (21 additional authors not shown)

    Abstract: The I4U consortium was established to facilitate a joint entry to NIST speaker recognition evaluations (SRE). The latest edition of such joint submission was in SRE 2018, in which the I4U submission was among the best-performing systems. SRE'18 also marks the 10-year anniversary of I4U consortium into NIST SRE series of evaluation. The primary objective of the current paper is to summarize the res…

    Submitted 15 April, 2019; originally announced April 2019.

    Comments: 5 pages

  5. arXiv:1710.00113  [pdf, ps, other]

    eess.AS cs.SD

    UTD-CRSS Submission for MGB-3 Arabic Dialect Identification: Front-end and Back-end Advancements on Broadcast Speech

    Authors: Ahmet E. Bulut, Qian Zhang, Chunlei Zhang, Fahimeh Bahmaninezhad, John H. L. Hansen

    Abstract: This study presents systems submitted by the University of Texas at Dallas, Center for Robust Speech Systems (UTD-CRSS) to the MGB-3 Arabic Dialect Identification (ADI) subtask. This task is defined to discriminate between five dialects of Arabic, including Egyptian, Gulf, Levantine, North African, and Modern Standard Arabic. We develop multiple single systems with different front-end representati…

    Submitted 29 September, 2017; originally announced October 2017.