
Showing 1–4 of 4 results for author: Siong, C E

Searching in archive eess.
  1. arXiv:2309.07458  [pdf, other]

    cs.SD eess.AS

    Analysis of Speech Separation Performance Degradation on Emotional Speech Mixtures

    Authors: Jia Qi Yip, Dianwen Ng, Bin Ma, Chng Eng Siong

    Abstract: Despite recent strides made in Speech Separation, most models are trained on datasets with neutral emotions. Emotional speech has been known to degrade performance of models in a variety of speech tasks, which reduces the effectiveness of these models when deployed in real-world scenarios. In this paper we perform analysis to differentiate the performance degradation arising from the emotions in s…

    Submitted 14 September, 2023; originally announced September 2023.

    Comments: Accepted by APSIPA ASC 2023

  2. arXiv:2305.12460  [pdf, other]

    cs.SD eess.AS

    Study of GANs for Noisy Speech Simulation from Clean Speech

    Authors: Leander Melroy Maben, Zixun Guo, Chen Chen, Utkarsh Chudiwal, Chng Eng Siong

    Abstract: The performance of speech processing models trained on clean speech drops significantly in noisy conditions. Training with noisy datasets alleviates the problem, but procuring such datasets is not always feasible. Noisy speech simulation models that generate noisy speech from clean speech help remedy this issue. In our work, we study the ability of Generative Adversarial Networks (GANs) to simulat…

    Submitted 21 May, 2023; originally announced May 2023.

  3. arXiv:2203.11774  [pdf, other]

    cs.SD cs.LG eess.AS

    Estimation of speaker age and height from speech signal using bi-encoder transformer mixture model

    Authors: Tarun Gupta, Duc-Tuan Truong, Tran The Anh, Chng Eng Siong

    Abstract: The estimation of speaker characteristics such as age and height is a challenging task, having numerous applications in voice forensic analysis. In this work, we propose a bi-encoder transformer mixture model for speaker age and height estimation. Considering the wide differences in male and female voice characteristics such as differences in formant and fundamental frequencies, we propose the use…

    Submitted 22 March, 2022; originally announced March 2022.

    Comments: Submitted to Interspeech 2022

  4. arXiv:2110.13653  [pdf, other]

    eess.AS cs.LG cs.SD

    Learning Speaker Representation with Semi-supervised Learning approach for Speaker Profiling

    Authors: Shangeth Rajaa, Pham Van Tung, Chng Eng Siong

    Abstract: Speaker profiling, which aims to estimate speaker characteristics such as age and height, has a wide range of applications in forensics, recommendation systems, etc. In this work, we propose a semi-supervised learning approach to mitigate the issue of low training data for speaker profiling. This is done by utilizing an external corpus with speaker information to train a better representation which can…

    Submitted 24 October, 2021; originally announced October 2021.

    Comments: 5 pages, 4 figures