Skip to main content

Showing 1–3 of 3 results for author: Thebaud, T

Searching in archive cs. Search in all archives.
.
  1. arXiv:2402.19355  [pdf, other

    cs.SD cs.CR cs.LG eess.AS

    Unraveling Adversarial Examples against Speaker Identification -- Techniques for Attack Detection and Victim Model Classification

    Authors: Sonal Joshi, Thomas Thebaud, Jesús Villalba, Najim Dehak

    Abstract: Adversarial examples have proven to threaten speaker identification systems, and several countermeasures against them have been proposed. In this paper, we propose a method to detect the presence of adversarial examples, i.e., a binary classifier distinguishing between benign and adversarial examples. We build upon and extend previous work on attack type classification by exploring new architectur… ▽ More

    Submitted 29 February, 2024; originally announced February 2024.

  2. arXiv:2309.04628  [pdf, other

    eess.AS cs.SD

    Leveraging Pretrained Image-text Models for Improving Audio-Visual Learning

    Authors: Saurabhchand Bhati, Jesús Villalba, Laureano Moro-Velazquez, Thomas Thebaud, Najim Dehak

    Abstract: Visually grounded speech systems learn from paired images and their spoken captions. Recently, there have been attempts to utilize the visually grounded models trained from images and their corresponding text captions, such as CLIP, to improve speech-based visually grounded models' performance. However, the majority of these models only utilize the pretrained image encoder. Cascaded SpeechCLIP att… ▽ More

    Submitted 8 September, 2023; originally announced September 2023.

  3. arXiv:2110.05431  [pdf, other

    eess.AS cs.CR cs.LG cs.SD

    On the invertibility of a voice privacy system using embedding alignement

    Authors: Pierre Champion, Thomas Thebaud, Gaël Le Lan, Anthony Larcher, Denis Jouvet

    Abstract: This paper explores various attack scenarios on a voice anonymization system using embeddings alignment techniques. We use Wasserstein-Procrustes (an algorithm initially designed for unsupervised translation) or Procrustes analysis to match two sets of x-vectors, before and after voice anonymization, to mimic this transformation as a rotation function. We compute the optimal rotation and compare t… ▽ More

    Submitted 8 October, 2021; originally announced October 2021.

    Journal ref: ASRU 2021 - IEEE Automatic Speech Recognition and Understanding Workshop, Dec 2021, Cartagena, Colombia