
Showing 1–10 of 10 results for author: Desplanques, B

  1. Post-Training Embedding Alignment for Decoupling Enrollment and Runtime Speaker Recognition Models

    Authors: Chenyang Gao, Brecht Desplanques, Chelsea J.-T. Ju, Aman Chadha, Andreas Stolcke

    Abstract: Automated speaker identification (SID) is a crucial step for the personalization of a wide range of speech-enabled services. Typical SID systems use a symmetric enrollment-verification framework with a single model to derive embeddings both offline for voice profiles extracted from enrollment utterances, and online from runtime utterances. Due to the distinct circumstances of enrollment and runtim…

    Submitted 22 January, 2024; originally announced January 2024.

    Comments: Accepted to ICASSP 2024

  2. Tackling the Score Shift in Cross-Lingual Speaker Verification by Exploiting Language Information

    Authors: Jenthe Thienpondt, Brecht Desplanques, Kris Demuynck

    Abstract: This paper contains a post-challenge performance analysis on cross-lingual speaker verification of the IDLab submission to the VoxCeleb Speaker Recognition Challenge 2021 (VoxSRC-21). We show that current speaker embedding extractors consistently underestimate speaker similarity in within-speaker cross-lingual trials. Consequently, the typical training and scoring protocols do not put enough empha…

    Submitted 19 June, 2022; v1 submitted 18 October, 2021; originally announced October 2021.

    Comments: proceedings of ICASSP 2022

  3. arXiv:2109.04070 [pdf, other]

    eess.AS cs.SD

    The IDLAB VoxCeleb Speaker Recognition Challenge 2021 System Description

    Authors: Jenthe Thienpondt, Brecht Desplanques, Kris Demuynck

    Abstract: This technical report describes the IDLab submission for track 1 and 2 of the VoxCeleb Speaker Recognition Challenge 2021 (VoxSRC-21). This speaker verification competition focuses on short duration test recordings and cross-lingual trials. Currently, both Time Delay Neural Networks (TDNNs) and ResNets achieve state-of-the-art results in speaker verification. We opt to use a system fusion of hybri…

    Submitted 9 September, 2021; originally announced September 2021.

    Comments: arXiv admin note: substantial text overlap with arXiv:2104.02370

  4. arXiv:2108.00912 [pdf, other]

    eess.AS cs.SD

    Robust Acoustic Scene Classification in the Presence of Active Foreground Speech

    Authors: Siyuan Song, Brecht Desplanques, Celest De Moor, Kris Demuynck, Nilesh Madhu

    Abstract: We present an iVector based Acoustic Scene Classification (ASC) system suited for real life settings where active foreground speech can be present. In the proposed system, each recording is represented by a fixed-length iVector that models the recording's important properties. A regularized Gaussian backend classifier with class-specific covariance models is used to extract the relevant acoustic s…

    Submitted 2 August, 2021; originally announced August 2021.

  5. Integrating Frequency Translational Invariance in TDNNs and Frequency Positional Information in 2D ResNets to Enhance Speaker Verification

    Authors: Jenthe Thienpondt, Brecht Desplanques, Kris Demuynck

    Abstract: This paper describes the IDLab submission for the text-independent task of the Short-duration Speaker Verification Challenge 2021 (SdSVC-21). This speaker verification competition focuses on short duration test recordings and cross-lingual trials, along with the constraint of limited availability of in-domain DeepMine Farsi training data. Currently, both Time Delay Neural Networks (TDNNs) and ResN…

    Submitted 9 September, 2021; v1 submitted 6 April, 2021; originally announced April 2021.

    Comments: proceedings of INTERSPEECH 2021

  6. ECAPA-TDNN Embeddings for Speaker Diarization

    Authors: Nauman Dawalatabad, Mirco Ravanelli, François Grondin, Jenthe Thienpondt, Brecht Desplanques, Hwidong Na

    Abstract: Learning robust speaker embeddings is a crucial step in speaker diarization. Deep neural networks can accurately capture speaker discriminative characteristics and popular deep embeddings such as x-vectors are nowadays a fundamental component of modern diarization systems. Recently, some improvements over the standard TDNN architecture used for x-vectors have been proposed. The ECAPA-TDNN model, f…

    Submitted 3 April, 2021; originally announced April 2021.

  7. arXiv:2010.12468 [pdf, other]

    eess.AS cs.SD

    The IDLAB VoxCeleb Speaker Recognition Challenge 2020 System Description

    Authors: Jenthe Thienpondt, Brecht Desplanques, Kris Demuynck

    Abstract: In this technical report we describe the IDLAB top-scoring submissions for the VoxCeleb Speaker Recognition Challenge 2020 (VoxSRC-20) in the supervised and unsupervised speaker verification tracks. For the supervised verification tracks we trained 6 state-of-the-art ECAPA-TDNN systems and 4 Resnet34 based systems with architectural variations. On all models we apply a large margin fine-tuning str…

    Submitted 23 October, 2020; originally announced October 2020.

  8. The IDLAB VoxSRC-20 Submission: Large Margin Fine-Tuning and Quality-Aware Score Calibration in DNN Based Speaker Verification

    Authors: Jenthe Thienpondt, Brecht Desplanques, Kris Demuynck

    Abstract: In this paper we propose and analyse a large margin fine-tuning strategy and a quality-aware score calibration in text-independent speaker verification. Large margin fine-tuning is a secondary training stage for DNN based speaker verification systems trained with margin-based loss functions. It enables the network to create more robust speaker embeddings by enabling the use of longer training utte…

    Submitted 6 April, 2021; v1 submitted 21 October, 2020; originally announced October 2020.

    Comments: proceedings of ICASSP 2021

  9. Cross-Lingual Speaker Verification with Domain-Balanced Hard Prototype Mining and Language-Dependent Score Normalization

    Authors: Jenthe Thienpondt, Brecht Desplanques, Kris Demuynck

    Abstract: In this paper we describe the top-scoring IDLab submission for the text-independent task of the Short-duration Speaker Verification (SdSV) Challenge 2020. The main difficulty of the challenge exists in the large degree of varying phonetic overlap between the potentially cross-lingual trials, along with the limited availability of in-domain DeepMine Farsi training data. We introduce domain-balanced…

    Submitted 10 August, 2020; v1 submitted 15 July, 2020; originally announced July 2020.

    Comments: proceedings of INTERSPEECH 2020

  10. ECAPA-TDNN: Emphasized Channel Attention, Propagation and Aggregation in TDNN Based Speaker Verification

    Authors: Brecht Desplanques, Jenthe Thienpondt, Kris Demuynck

    Abstract: Current speaker verification techniques rely on a neural network to extract speaker representations. The successful x-vector architecture is a Time Delay Neural Network (TDNN) that applies statistics pooling to project variable-length utterances into fixed-length speaker characterizing embeddings. In this paper, we propose multiple enhancements to this architecture based on recent trends in the re…

    Submitted 10 August, 2020; v1 submitted 14 May, 2020; originally announced May 2020.

    Comments: proceedings of INTERSPEECH 2020