Showing 1–9 of 9 results for author: López-Espejo, I

  1. arXiv:2405.07641

    eess.AS cs.SD

    Evaluating Speech Enhancement Systems Through Listening Effort

    Authors: Femke B. Gelderblom, Tron V. Tronstad, Iván López-Espejo

    Abstract: Understanding degraded speech is demanding, requiring increased listening effort (LE). Evaluating processed and unprocessed speech with respect to LE can objectively indicate if speech enhancement systems benefit listeners. However, existing methods for measuring LE are complex and not widely applicable. In this study, we propose a simple method to evaluate speech intelligibility and LE simultaneo…

    Submitted 13 May, 2024; originally announced May 2024.

  2. arXiv:2401.09315

    eess.AS

    On Speech Pre-emphasis as a Simple and Inexpensive Method to Boost Speech Enhancement

    Authors: Iván López-Espejo, Aditya Joglekar, Antonio M. Peinado, Jesper Jensen

    Abstract: Pre-emphasis filtering, compensating for the natural energy decay of speech at higher frequencies, has been considered as a common pre-processing step in a number of speech processing tasks over the years. In this work, we demonstrate, for the first time, that pre-emphasis filtering may also be used as a simple and computationally-inexpensive way to leverage deep neural network-based speech enhanc…

    Submitted 17 January, 2024; originally announced January 2024.

  3. arXiv:2305.02147

    eess.AS cs.HC

    Improved Vocal Effort Transfer Vector Estimation for Vocal Effort-Robust Speaker Verification

    Authors: Iván López-Espejo, Santi Prieto, Alfonso Ortega, Eduardo Lleida

    Abstract: Despite the maturity of modern speaker verification technology, its performance still significantly degrades when facing non-neutrally-phonated (e.g., shouted and whispered) speech. To address this issue, in this paper, we propose a new speaker embedding compensation method based on a minimum mean square error (MMSE) estimator. This method models the joint distribution of the vocal effort transfer…

    Submitted 4 July, 2023; v1 submitted 3 May, 2023; originally announced May 2023.

  4. arXiv:2211.10565

    eess.AS cs.HC cs.LG cs.SD

    Filterbank Learning for Noise-Robust Small-Footprint Keyword Spotting

    Authors: Iván López-Espejo, Ram C. M. C. Shekar, Zheng-Hua Tan, Jesper Jensen, John H. L. Hansen

    Abstract: In the context of keyword spotting (KWS), the replacement of handcrafted speech features by learnable features has not yielded superior KWS performance. In this study, we demonstrate that filterbank learning outperforms handcrafted speech features for KWS whenever the number of filterbank channels is severely decreased. Reducing the number of channels might yield certain KWS performance drop, but…

    Submitted 23 February, 2023; v1 submitted 18 November, 2022; originally announced November 2022.

  5. arXiv:2111.10592

    cs.SD cs.HC cs.LG eess.AS

    Deep Spoken Keyword Spotting: An Overview

    Authors: Iván López-Espejo, Zheng-Hua Tan, John Hansen, Jesper Jensen

    Abstract: Spoken keyword spotting (KWS) deals with the identification of keywords in audio streams and has become a fast-growing technology thanks to the paradigm shift introduced by deep learning a few years ago. This has allowed the rapid embedding of deep KWS in a myriad of small electronic devices with different purposes like the activation of voice assistants. Prospects suggest a sustained growth in te…

    Submitted 20 November, 2021; originally announced November 2021.

  6. arXiv:2008.02487

    eess.AS cs.HC cs.LG cs.SD

    Shouted Speech Compensation for Speaker Verification Robust to Vocal Effort Conditions

    Authors: Santi Prieto, Alfonso Ortega, Iván López-Espejo, Eduardo Lleida

    Abstract: The performance of speaker verification systems degrades when vocal effort conditions between enrollment and test (e.g., shouted vs. normal speech) are different. This is a potential situation in non-cooperative speaker verification tasks. In this paper, we present a study on different methods for linear compensation of embeddings making use of Gaussian mixture models to cluster shouted and normal…

    Submitted 6 August, 2020; originally announced August 2020.

  7. arXiv:2006.00217

    eess.AS cs.LG cs.SD

    Exploring Filterbank Learning for Keyword Spotting

    Authors: Iván López-Espejo, Zheng-Hua Tan, Jesper Jensen

    Abstract: Despite their great performance over the years, handcrafted speech features are not necessarily optimal for any particular speech application. Consequently, with greater or lesser success, optimal filterbank learning has been studied for different speech processing tasks. In this paper, we fill in a gap by exploring filterbank learning for keyword spotting (KWS). Two approaches are examined: filte…

    Submitted 30 May, 2020; originally announced June 2020.

  8. arXiv:1909.12923

    eess.IV cs.CV

    End-to-End Deep Residual Learning with Dilated Convolutions for Myocardial Infarction Detection and Localization

    Authors: Iván López-Espejo

    Abstract: In this report, I investigate the use of end-to-end deep residual learning with dilated convolutions for myocardial infarction (MI) detection and localization from electrocardiogram (ECG) signals. Although deep residual learning has already been applied to MI detection and localization, I propose a more accurate system that distinguishes among a higher number (i.e., six) of MI locations. Inspired…

    Submitted 15 September, 2019; originally announced September 2019.

  9. arXiv:1906.09417

    cs.SD cs.HC cs.LG eess.AS

    Keyword Spotting for Hearing Assistive Devices Robust to External Speakers

    Authors: Iván López-Espejo, Zheng-Hua Tan, Jesper Jensen

    Abstract: Keyword spotting (KWS) is experiencing an upswing due to the pervasiveness of small electronic devices that allow interaction with them via speech. Often, KWS systems are speaker-independent, which means that any person --user or not-- might trigger them. For applications like KWS for hearing assistive devices this is unacceptable, as only the user must be allowed to handle them. In this paper we…

    Submitted 26 June, 2019; v1 submitted 22 June, 2019; originally announced June 2019.