Showing 1–7 of 7 results for author: Paliwal, K K

  1. arXiv:2009.02940  [pdf, ps, other]

    eess.AS cs.SD eess.SP

    Deep Learning-Based Single-Ended Objective Quality Measures for Time-Scale Modified Audio

    Authors: Timothy Roberts, Aaron Nicolson, Kuldip K. Paliwal

    Abstract: Objective evaluation of audio processed with Time-Scale Modification (TSM) is seeing a resurgence of interest. Recently, a labelled time-scaled audio dataset was used to train an objective measure for TSM evaluation. This DE measure was an extension of Perceptual Evaluation of Audio Quality, and required reference and test signals. In this paper, two single-ended objective quality measures for tim…

    Submitted 7 September, 2020; originally announced September 2020.

    Comments: 13 pages, 11 figures, Submitted to The Journal of the Acoustical Society of America

  2. arXiv:2006.06153  [pdf, ps, other]

    eess.AS cs.SD

    An Objective Measure of Quality for Time-Scale Modification of Audio

    Authors: Timothy Roberts, Kuldip K. Paliwal

    Abstract: Objective evaluation of audio processed with Time-Scale Modification (TSM) remains an open problem. Recently, a dataset of time-scaled audio with subjective quality labels was published and used to create an initial objective measure of quality. In this paper, an improved objective measure of quality for time-scaled audio is proposed. The measure uses hand-crafted features and a fully connected ne…

    Submitted 10 June, 2020; originally announced June 2020.

    Comments: 12 pages, 7 figures, Submitted to The Journal of the Acoustical Society of America, Currently under review

  3. arXiv:2006.00848  [pdf, ps, other]

    eess.AS cs.SD

    A time-scale modification dataset with subjective quality labels

    Authors: Timothy Roberts, Kuldip K. Paliwal

    Abstract: Time Scale Modification (TSM) is a well-researched field; however, no effective objective measure of quality exists. This paper details the creation, subjective evaluation, and analysis of a dataset for use in the development of an objective measure of quality for TSM. Comprised of two parts, the training component contains 88 source files processed using six TSM methods at 10 time scales, while t…

    Submitted 15 July, 2020; v1 submitted 1 June, 2020; originally announced June 2020.

    Comments: 12 Pages, 13 Figures, Published in The Journal of the Acoustical Society of America (Vol.148, Issue 1), For associated dataset, see http://ieee-dataport.org/1987

    Journal ref: J. Acoust. Soc. Am. 148(1). pp. 201-210 (2020)

  4. arXiv:2002.12794  [pdf, other]

    eess.AS cs.LG cs.SD stat.ML

    Deep Residual-Dense Lattice Network for Speech Enhancement

    Authors: Mohammad Nikzad, Aaron Nicolson, Yongsheng Gao, Jun Zhou, Kuldip K. Paliwal, Fanhua Shang

    Abstract: Convolutional neural networks (CNNs) with residual links (ResNets) and causal dilated convolutional units have been the network of choice for deep learning approaches to speech enhancement. While residual links improve gradient flow during training, feature diminution of shallow layer outputs can occur due to repetitive summations with deeper layer outputs. One strategy to improve feature re-usage…

    Submitted 26 February, 2020; originally announced February 2020.

    Comments: 8 pages, Accepted by AAAI-2020

  5. arXiv:1912.12023  [pdf, other]

    eess.AS eess.SP

    Monaural Speech Enhancement Using a Multi-Branch Temporal Convolutional Network

    Authors: Qiquan Zhang, Aaron Nicolson, Mingjiang Wang, Kuldip K. Paliwal, Chenxu Wang

    Abstract: Deep learning has achieved substantial improvement on single-channel speech enhancement tasks. However, the performance of multi-layer perceptron (MLP)-based methods is limited by their ability to capture long-term history information. Recurrent neural networks (RNNs), e.g., the long short-term memory (LSTM) model, are able to capture long-term temporal dependencies, but come wit…

    Submitted 17 May, 2020; v1 submitted 27 December, 2019; originally announced December 2019.

    Comments: There are some inappropriate descriptions. These descriptions exist on many pages

  6. arXiv:1910.11969  [pdf, other]

    eess.AS cs.SD

    Sum-Product Networks for Robust Automatic Speaker Identification

    Authors: Aaron Nicolson, Kuldip K. Paliwal

    Abstract: We introduce sum-product networks (SPNs) for robust speech processing through a simple robust automatic speaker identification (ASI) task. SPNs are deep probabilistic graphical models capable of answering multiple probabilistic queries. We show that SPNs are able to remain robust by using the marginal probability density function (PDF) of the spectral features that reliably represent speech. Thoug…

    Submitted 13 August, 2020; v1 submitted 25 October, 2019; originally announced October 2019.

    Comments: Proc. Interspeech 2020

  7. arXiv:1906.07319  [pdf, other]

    eess.AS cs.SD eess.SP

    Deep Xi as a Front-End for Robust Automatic Speech Recognition

    Authors: Aaron Nicolson, Kuldip K. Paliwal

    Abstract: Current front-ends for robust automatic speech recognition (ASR) include masking- and mapping-based deep learning approaches to speech enhancement. A recently proposed deep learning approach to a priori SNR estimation, called Deep Xi, … mapping-based approaches. Motivated by this, we investigate Deep Xi…

    Submitted 27 January, 2020; v1 submitted 17 June, 2019; originally announced June 2019.