Showing 1–4 of 4 results for author: Shifas, M P

  1. arXiv:2011.06548

    eess.AS cs.SD

    Evaluating the Intelligibility Benefits of Neural Speech Enrichment for Listeners with Normal Hearing and Hearing Impairment using the Greek Harvard Corpus

    Authors: Muhammed PV Shifas, Anna Sfakianaki, Theognosia Chimona, Yannis Stylianou

    Abstract: In this work we evaluate a neural-based speech intelligibility booster based on spectral shaping and dynamic range compression (SSDRC), referred to as WaveNet-based SSDRC (wSSDRC), using a recently designed Greek Harvard-style corpus. The corpus has been developed according to the format of the Harvard/IEEE sentences and offers the opportunity to apply neural speech enhancement models and examine…

    Submitted 12 November, 2020; originally announced November 2020.

  2. arXiv:2008.05809

    cs.SD cs.LG eess.AS

    Enhancing Speech Intelligibility in Text-To-Speech Synthesis using Speaking Style Conversion

    Authors: Dipjyoti Paul, Muhammed PV Shifas, Yannis Pantazis, Yannis Stylianou

    Abstract: The increased adoption of digital assistants makes text-to-speech (TTS) synthesis systems an indispensable feature of modern mobile devices. It is hence desirable to build a system capable of generating highly intelligible speech in the presence of noise. Past studies have investigated style conversion in TTS synthesis, yet degraded synthesized quality often leads to worse intelligibility. To over…

    Submitted 13 August, 2020; originally announced August 2020.

    Comments: Accepted in INTERSPEECH 2020

  3. arXiv:2006.05233

    eess.AS cs.SD

    A fully recurrent feature extraction for single channel speech enhancement

    Authors: Muhammed PV Shifas, Santelli Claudio, Vassilis Tsiaras, Yannis Stylianou

    Abstract: Convolutional neural network (CNN) modules are widely being used to build high-end speech enhancement neural models. However, the feature extraction power of vanilla CNN modules has been limited by the dimensionality constraint of the convolution kernels that are integrated - thereby, they have limitations to adequately model the noise context information at the feature extraction stage. To this e…

    Submitted 3 June, 2021; v1 submitted 9 June, 2020; originally announced June 2020.

    Comments: 5 pages

  4. A non-causal FFTNet architecture for speech enhancement

    Authors: Muhammed PV Shifas, Nagaraj Adiga, Vassilis Tsiaras, Yannis Stylianou

    Abstract: In this paper, we suggest a new parallel, non-causal and shallow waveform domain architecture for speech enhancement based on FFTNet, a neural network for generating high quality audio waveform. In contrast to other waveform based approaches like WaveNet, FFTNet uses an initial wide dilation pattern. Such an architecture better represents the long term correlated structure of speech in the time do…

    Submitted 8 June, 2020; originally announced June 2020.

    Comments: 5 pages