
Showing 1–3 of 3 results for author: Lebourdais, M

  1. arXiv:2406.13385  [pdf, other]

    eess.AS cs.AI cs.SD

    Explainable by-design Audio Segmentation through Non-Negative Matrix Factorization and Probing

    Authors: Martin Lebourdais, Théo Mariotte, Antonio Almudévar, Marie Tahon, Alfonso Ortega

    Abstract: Audio segmentation is a key task for many speech technologies, most of which are based on neural networks, usually considered as black boxes, with high-level performances. However, in many domains, among which health or forensics, there is not only a need for good performance but also for explanations about the output decision. Explanations derived directly from latent representations need to sati…

    Submitted 19 June, 2024; originally announced June 2024.

    Comments: Accepted at Interspeech 2024, 5 pages, 2 figures, 3 tables

  2. arXiv:2307.13012  [pdf, other]

    cs.SD cs.AI cs.NE eess.AS eess.SP

    Joint speech and overlap detection: a benchmark over multiple audio setup and speech domains

    Authors: Martin Lebourdais, Théo Mariotte, Marie Tahon, Anthony Larcher, Antoine Laurent, Silvio Montresor, Sylvain Meignier, Jean-Hugh Thomas

    Abstract: Voice activity and overlapped speech detection (respectively VAD and OSD) are key pre-processing tasks for speaker diarization. The final segmentation performance highly relies on the robustness of these sub-tasks. Recent studies have shown VAD and OSD can be trained jointly using a multi-class classification model. However, these works are often restricted to a specific speech domain, lacking inf…

    Submitted 24 July, 2023; originally announced July 2023.

  3. arXiv:2209.04167  [pdf, other]

    cs.SD cs.AI eess.AS

    Overlapped speech and gender detection with WavLM pre-trained features

    Authors: Martin Lebourdais, Marie Tahon, Antoine Laurent, Sylvain Meignier

    Abstract: This article focuses on overlapped speech and gender detection in order to study interactions between women and men in French audiovisual media (Gender Equality Monitoring project). In this application context, we need to automatically segment the speech signal according to speakers gender, and to identify when at least two speakers speak at the same time. We propose to use WavLM model which has t…

    Submitted 9 September, 2022; originally announced September 2022.

    Comments: Submitted and accepted to Interspeech 2022