Showing 1–4 of 4 results for author: Bredin, H

Searching in archive stat.
  1. arXiv:2306.01506

    cs.CL eess.AS stat.ML

    BabySLM: language-acquisition-friendly benchmark of self-supervised spoken language models

    Authors: Marvin Lavechin, Yaya Sy, Hadrien Titeux, María Andrea Cruz Blandón, Okko Räsänen, Hervé Bredin, Emmanuel Dupoux, Alejandrina Cristia

    Abstract: Self-supervised techniques for learning speech representations have been shown to develop linguistic competence from exposure to speech without the need for human labels. In order to fully realize the potential of these approaches and further our understanding of how infants learn language, simulations must closely emulate real-life situations by training on developmentally plausible corpora and b…

    Submitted 8 June, 2023; v1 submitted 2 June, 2023; originally announced June 2023.

    Comments: Proceedings of Interspeech 2023

  2. arXiv:2003.14021

    cs.LG cs.SD eess.AS stat.ML

    A Comparison of Metric Learning Loss Functions for End-To-End Speaker Verification

    Authors: Juan M. Coria, Hervé Bredin, Sahar Ghannay, Sophie Rosset

    Abstract: Despite the growing popularity of metric learning approaches, very little work has attempted to perform a fair comparison of these techniques for speaker verification. We try to fill this gap and compare several metric learning loss functions in a systematic manner on the VoxCeleb dataset. The first family of loss functions is derived from the cross entropy loss (usually used for supervised classi…

    Submitted 31 March, 2020; originally announced March 2020.

  3. arXiv:1907.10393

    eess.AS cs.LG cs.SD stat.ML

    LSTM based Similarity Measurement with Spectral Clustering for Speaker Diarization

    Authors: Qingjian Lin, Ruiqing Yin, Ming Li, Hervé Bredin, Claude Barras

    Abstract: More and more neural network approaches have achieved considerable improvement upon submodules of speaker diarization system, including speaker change detection and segment-wise speaker embedding extraction. Still, in the clustering stage, traditional algorithms like probabilistic linear discriminant analysis (PLDA) are widely used for scoring the similarity between two speech segments. In this pa…

    Submitted 23 July, 2019; originally announced July 2019.

    Comments: Accepted for INTERSPEECH 2019

  4. arXiv:1609.04301

    cs.SD stat.ML

    TristouNet: Triplet Loss for Speaker Turn Embedding

    Authors: Hervé Bredin

    Abstract: TristouNet is a neural network architecture based on Long Short-Term Memory recurrent networks, meant to project speech sequences into a fixed-dimensional euclidean space. Thanks to the triplet loss paradigm used for training, the resulting sequence embeddings can be compared directly with the euclidean distance, for speaker comparison purposes. Experiments on short (between 500ms and 5s) speech t…

    Submitted 11 April, 2017; v1 submitted 14 September, 2016; originally announced September 2016.

    Comments: ICASSP 2017 (42nd IEEE International Conference on Acoustics, Speech and Signal Processing). Code available at http://github.com/hbredin/TristouNet