Skip to main content

Showing 1–20 of 20 results for author: Bonastre, J

Searching in archive eess. Search in all archives.
.
  1. arXiv:2403.01355  [pdf, ps, other

    eess.AS cs.LG

    a-DCF: an architecture agnostic metric with application to spoofing-robust speaker verification

    Authors: Hye-** Shim, Jee-weon Jung, Tomi Kinnunen, Nicholas Evans, Jean-Francois Bonastre, Itshak Lapidot

    Abstract: Spoofing detection is today a mainstream research topic. Standard metrics can be applied to evaluate the performance of isolated spoofing detection solutions and others have been proposed to support their evaluation when they are combined with speaker detection. These either have well-known deficiencies or restrict the architectural approach to combine speaker and spoof detectors. In this paper, w… ▽ More

    Submitted 2 March, 2024; originally announced March 2024.

    Comments: 8 pages, submitted to Speaker Odyssey 2024

  2. arXiv:2402.16429  [pdf

    cs.IR eess.SP

    Effect of utterance duration and phonetic content on speaker identification using second-order statistical methods

    Authors: Ivan Magrin-Chagnolleau, Jean François Bonastre, Frédéric Bimbot

    Abstract: Second-order statistical methods show very good results for automatic speaker identification in controlled recording conditions. These approaches are generally used on the entire speech material available. In this paper, we study the influence of the content of the test speech material on the performances of such methods, i.e. under a more analytical approach. The goal is to investigate on the kin… ▽ More

    Submitted 26 February, 2024; originally announced February 2024.

    Journal ref: Eurospeech 1995, Sep 1995, Madrid, Spain. pp.337-340

  3. arXiv:2310.05534  [pdf, other

    eess.AS cs.SD

    Thech. Report: Genuinization of Speech waveform PMF for speaker detection spoofing and countermeasures

    Authors: Itshak Lapidot, Jean-Francois Bonastre

    Abstract: In the context of spoofing attacks in speaker recognition systems, we observed that the waveform probability mass function (PMF) of genuine speech differs significantly from the PMF of speech resulting from the attacks. This is true for synthesized or converted speech as well as replayed speech. We also noticed that this observation seems to have a significant impact on spoofing detection performa… ▽ More

    Submitted 9 October, 2023; originally announced October 2023.

    Comments: 17 pages, 11 figures

  4. arXiv:2309.06141  [pdf, other

    cs.SD eess.AS

    SynVox2: Towards a privacy-friendly VoxCeleb2 dataset

    Authors: Xiaoxiao Miao, Xin Wang, Erica Cooper, Junichi Yamagishi, Nicholas Evans, Massimiliano Todisco, Jean-François Bonastre, Mickael Rouvier

    Abstract: The success of deep learning in speaker recognition relies heavily on the use of large datasets. However, the data-hungry nature of deep learning methods has already being questioned on account the ethical, privacy, and legal concerns that arise when using large-scale datasets of natural speech collected from real human speakers. For example, the widely-used VoxCeleb2 dataset for speaker recogniti… ▽ More

    Submitted 12 September, 2023; originally announced September 2023.

    Comments: conference

  5. arXiv:2302.10790  [pdf, other

    eess.AS cs.LG cs.SD

    Federated Learning for ASR based on Wav2vec 2.0

    Authors: Tuan Nguyen, Salima Mdhaffar, Natalia Tomashenko, Jean-François Bonastre, Yannick Estève

    Abstract: This paper presents a study on the use of federated learning to train an ASR model based on a wav2vec 2.0 model pre-trained by self supervision. Carried out on the well-known TED-LIUM 3 dataset, our experiments show that such a model can obtain, with no use of a language model, a word error rate of 10.92% on the official TED-LIUM 3 test set, without sharing any data from the different users. We al… ▽ More

    Submitted 20 February, 2023; originally announced February 2023.

    Comments: 5 pages, accepted in ICASSP 2023

  6. arXiv:2211.16065  [pdf, other

    eess.AS cs.SD

    Hiding speaker's sex in speech using zero-evidence speaker representation in an analysis/synthesis pipeline

    Authors: Paul-Gauthier Noé, Xiaoxiao Miao, Xin Wang, Junichi Yamagishi, Jean-François Bonastre, Driss Matrouf

    Abstract: The use of modern vocoders in an analysis/synthesis pipeline allows us to investigate high-quality voice conversion that can be used for privacy purposes. Here, we propose to transform the speaker embedding and the pitch in order to hide the sex of the speaker. ECAPA-TDNN-based speaker representation fed into a HiFiGAN vocoder is protected using a neural-discriminant analysis approach, which is co… ▽ More

    Submitted 24 March, 2023; v1 submitted 29 November, 2022; originally announced November 2022.

    Comments: Accepted to ICASSP 2023

  7. arXiv:2205.07123  [pdf, other

    cs.CL cs.CR eess.AS

    The VoicePrivacy 2020 Challenge Evaluation Plan

    Authors: Natalia Tomashenko, Brij Mohan Lal Srivastava, Xin Wang, Emmanuel Vincent, Andreas Nautsch, Junichi Yamagishi, Nicholas Evans, Jose Patino, Jean-François Bonastre, Paul-Gauthier Noé, Massimiliano Todisco

    Abstract: The VoicePrivacy Challenge aims to promote the development of privacy preservation tools for speech technology by gathering a new community to define the tasks of interest and the evaluation methodology, and benchmarking solutions through a series of challenges. In this document, we formulate the voice anonymization task selected for the VoicePrivacy 2020 Challenge and describe the datasets used f… ▽ More

    Submitted 14 May, 2022; originally announced May 2022.

    Comments: arXiv admin note: text overlap with arXiv:2203.12468

  8. arXiv:2203.12468  [pdf, other

    eess.AS cs.CL cs.CR

    The VoicePrivacy 2022 Challenge Evaluation Plan

    Authors: Natalia Tomashenko, Xin Wang, Xiaoxiao Miao, Hubert Nourtel, Pierre Champion, Massimiliano Todisco, Emmanuel Vincent, Nicholas Evans, Junichi Yamagishi, Jean-François Bonastre

    Abstract: For new participants - Executive summary: (1) The task is to develop a voice anonymization system for speech data which conceals the speaker's voice identity while protecting linguistic content, paralinguistic attributes, intelligibility and naturalness. (2) Training, development and evaluation datasets are provided in addition to 3 different baseline anonymization systems, evaluation scripts, and… ▽ More

    Submitted 28 September, 2022; v1 submitted 23 March, 2022; originally announced March 2022.

    Comments: the file is unchanged; minor correction in metadata

  9. arXiv:2111.04194  [pdf, other

    cs.CL cs.SD eess.AS

    Retrieving Speaker Information from Personalized Acoustic Models for Speech Recognition

    Authors: Salima Mdhaffar, Jean-François Bonastre, Marc Tommasi, Natalia Tomashenko, Yannick Estève

    Abstract: The widespread of powerful personal devices capable of collecting voice of their users has opened the opportunity to build speaker adapted speech recognition system (ASR) or to participate to collaborative learning of ASR. In both cases, personalized acoustic models (AM), i.e. fine-tuned AM with specific speaker data, can be built. A question that naturally arises is whether the dissemination of p… ▽ More

    Submitted 7 November, 2021; originally announced November 2021.

  10. arXiv:2111.03777  [pdf, other

    cs.CL cs.CR cs.SD eess.AS

    Privacy attacks for automatic speech recognition acoustic models in a federated learning framework

    Authors: Natalia Tomashenko, Salima Mdhaffar, Marc Tommasi, Yannick Estève, Jean-François Bonastre

    Abstract: This paper investigates methods to effectively retrieve speaker information from the personalized speaker adapted neural network acoustic models (AMs) in automatic speech recognition (ASR). This problem is especially important in the context of federated learning of ASR acoustic models where a global model is learnt on the server based on the updates received from multiple clients. We propose an a… ▽ More

    Submitted 14 January, 2022; v1 submitted 5 November, 2021; originally announced November 2021.

    Comments: Submitted to ICASSP 2022

    Journal ref: ICASSP 2022 - 2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2022, pp. 6972-6976

  11. arXiv:2110.06760  [pdf, other

    eess.AS

    Revisiting Speech Content Privacy

    Authors: Jennifer Williams, Junichi Yamagishi, Paul-Gauthier Noe, Cassia Valentini Botinhao, Jean-Francois Bonastre

    Abstract: In this paper, we discuss an important aspect of speech privacy: protecting spoken content. New capabilities from the field of machine learning provide a unique and timely opportunity to revisit speech content protection. There are many different applications of content privacy, even though this area has been under-explored in speech technology research. This paper presents several scenarios that… ▽ More

    Submitted 13 October, 2021; originally announced October 2021.

    Comments: Accepted to ISCA Security and Privacy in Speech Communication (1st SPSC Symposium)

  12. arXiv:2110.05840  [pdf, other

    cs.CR eess.AS

    A bridge between features and evidence for binary attribute-driven perfect privacy

    Authors: Paul-Gauthier Noé, Andreas Nautsch, Driss Matrouf, Pierre-Michel Bousquet, Jean-François Bonastre

    Abstract: Attribute-driven privacy aims to conceal a single user's attribute, contrary to anonymisation that tries to hide the full identity of the user in some data. When the attribute to protect from malicious inferences is binary, perfect privacy requires the log-likelihood-ratio to be zero resulting in no strength-of-evidence. This work presents an approach based on normalizing flow that maps a feature… ▽ More

    Submitted 23 January, 2022; v1 submitted 12 October, 2021; originally announced October 2021.

    Comments: ICASSP 2022

  13. arXiv:2109.00648  [pdf, other

    cs.CL cs.SD eess.AS

    The VoicePrivacy 2020 Challenge: Results and findings

    Authors: Natalia Tomashenko, Xin Wang, Emmanuel Vincent, Jose Patino, Brij Mohan Lal Srivastava, Paul-Gauthier Noé, Andreas Nautsch, Nicholas Evans, Junichi Yamagishi, Benjamin O'Brien, Anaïs Chanclu, Jean-François Bonastre, Massimiliano Todisco, Mohamed Maouche

    Abstract: This paper presents the results and analyses stemming from the first VoicePrivacy 2020 Challenge which focuses on develo** anonymization solutions for speech technology. We provide a systematic overview of the challenge design with an analysis of submitted systems and evaluation results. In particular, we describe the voice anonymization task and datasets used for system development and evaluati… ▽ More

    Submitted 26 September, 2022; v1 submitted 1 September, 2021; originally announced September 2021.

    Comments: Submitted to the Special Issue on Voice Privacy (Computer Speech and Language Journal - Elsevier); under review

  14. arXiv:2109.00281  [pdf, other

    cs.CR cs.SD eess.AS

    Benchmarking and challenges in security and privacy for voice biometrics

    Authors: Jean-Francois Bonastre, Hector Delgado, Nicholas Evans, Tomi Kinnunen, Kong Aik Lee, Xuechen Liu, Andreas Nautsch, Paul-Gauthier Noe, Jose Patino, Md Sahidullah, Brij Mohan Lal Srivastava, Massimiliano Todisco, Natalia Tomashenko, Emmanuel Vincent, Xin Wang, Junichi Yamagishi

    Abstract: For many decades, research in speech technologies has focused upon improving reliability. With this now meeting user expectations for a range of diverse applications, speech technology is today omni-present. As result, a focus on security and privacy has now come to the fore. Here, the research effort is in its relative infancy and progress calls for greater, multidisciplinary collaboration with s… ▽ More

    Submitted 1 September, 2021; originally announced September 2021.

    Comments: Submitted to the symposium of the ISCA Security & Privacy in Speech Communications (SPSC) special interest group

  15. arXiv:2012.04454  [pdf, other

    eess.AS cs.AI cs.CR

    Adversarial Disentanglement of Speaker Representation for Attribute-Driven Privacy Preservation

    Authors: Paul-Gauthier Noé, Mohammad Mohammadamini, Driss Matrouf, Titouan Parcollet, Andreas Nautsch, Jean-François Bonastre

    Abstract: In speech technologies, speaker's voice representation is used in many applications such as speech recognition, voice conversion, speech synthesis and, obviously, user authentication. Modern vocal representations of the speaker are based on neural embeddings. In addition to the targeted information, these representations usually contain sensitive information about the speaker, like the age, sex, p… ▽ More

    Submitted 16 June, 2021; v1 submitted 8 December, 2020; originally announced December 2020.

    Comments: Accepted to Interspeech 2021

  16. arXiv:2008.13144  [pdf, other

    eess.AS cs.CR

    Speech Pseudonymisation Assessment Using Voice Similarity Matrices

    Authors: Paul-Gauthier Noé, Jean-François Bonastre, Driss Matrouf, Natalia Tomashenko, Andreas Nautsch, Nicholas Evans

    Abstract: The proliferation of speech technologies and rising privacy legislation calls for the development of privacy preservation solutions for speech applications. These are essential since speech signals convey a wealth of rich, personal and potentially sensitive information. Anonymisation, the focus of the recent VoicePrivacy initiative, is one strategy to protect speaker identity information. Pseudony… ▽ More

    Submitted 30 August, 2020; originally announced August 2020.

    Comments: Interspeech 2020

  17. The Privacy ZEBRA: Zero Evidence Biometric Recognition Assessment

    Authors: Andreas Nautsch, Jose Patino, Natalia Tomashenko, Junichi Yamagishi, Paul-Gauthier Noe, Jean-Francois Bonastre, Massimiliano Todisco, Nicholas Evans

    Abstract: Mounting privacy legislation calls for the preservation of privacy in speech technology, though solutions are gravely lacking. While evaluation campaigns are long-proven tools to drive progress, the need to consider a privacy adversary implies that traditional approaches to evaluation must be adapted to the assessment of privacy and privacy preservation solutions. This paper presents the first ste… ▽ More

    Submitted 20 May, 2020; v1 submitted 19 May, 2020; originally announced May 2020.

    Comments: submitted to Interspeech 2020

    Journal ref: Proc Interspeech 2020

  18. arXiv:1911.01601  [pdf, other

    eess.AS cs.CR cs.SD eess.SP

    ASVspoof 2019: A large-scale public database of synthesized, converted and replayed speech

    Authors: Xin Wang, Junichi Yamagishi, Massimiliano Todisco, Hector Delgado, Andreas Nautsch, Nicholas Evans, Md Sahidullah, Ville Vestman, Tomi Kinnunen, Kong Aik Lee, Lauri Juvela, Paavo Alku, Yu-Huai Peng, Hsin-Te Hwang, Yu Tsao, Hsin-Min Wang, Sebastien Le Maguer, Markus Becker, Fergus Henderson, Rob Clark, Yu Zhang, Quan Wang, Ye Jia, Kai Onuma, Koji Mushika , et al. (15 additional authors not shown)

    Abstract: Automatic speaker verification (ASV) is one of the most natural and convenient means of biometric person recognition. Unfortunately, just like all other biometric systems, ASV is vulnerable to spoofing, also referred to as "presentation attacks." These vulnerabilities are generally unacceptable and call for spoofing countermeasures or "presentation attack detection" systems. In addition to imperso… ▽ More

    Submitted 14 July, 2020; v1 submitted 4 November, 2019; originally announced November 2019.

    Comments: Accepted, Computer Speech and Language. This manuscript version is made available under the CC-BY-NC-ND 4.0. For the published version on Elsevier website, please visit https://doi.org/10.1016/j.csl.2020.101114

  19. arXiv:1905.13561  [pdf, other

    eess.AS cs.CL cs.LG cs.SD stat.ML

    Speaker Anonymization Using X-vector and Neural Waveform Models

    Authors: Fuming Fang, Xin Wang, Junichi Yamagishi, Isao Echizen, Massimiliano Todisco, Nicholas Evans, Jean-Francois Bonastre

    Abstract: The social media revolution has produced a plethora of web services to which users can easily upload and share multimedia documents. Despite the popularity and convenience of such services, the sharing of such inherently personal data, including speech data, raises obvious security and privacy concerns. In particular, a user's speech data may be acquired and used with speech synthesis systems to p… ▽ More

    Submitted 29 May, 2019; originally announced May 2019.

    Comments: Submitted to the 10th ISCA Speech Synthesis Workshop (SSW10)

  20. arXiv:1904.07386  [pdf, other

    eess.AS cs.CL cs.SD

    I4U Submission to NIST SRE 2018: Leveraging from a Decade of Shared Experiences

    Authors: Kong Aik Lee, Ville Hautamaki, Tomi Kinnunen, Hitoshi Yamamoto, Koji Okabe, Ville Vestman, **g Huang, Guohong Ding, Hanwu Sun, Anthony Larcher, Rohan Kumar Das, Haizhou Li, Mickael Rouvier, Pierre-Michel Bousquet, Wei Rao, Qing Wang, Chunlei Zhang, Fahimeh Bahmaninezhad, Hector Delgado, Jose Patino, Qiongqiong Wang, Ling Guo, Takafumi Koshinaka, Jiacen Zhang, Koichi Shinoda , et al. (21 additional authors not shown)

    Abstract: The I4U consortium was established to facilitate a joint entry to NIST speaker recognition evaluations (SRE). The latest edition of such joint submission was in SRE 2018, in which the I4U submission was among the best-performing systems. SRE'18 also marks the 10-year anniversary of I4U consortium into NIST SRE series of evaluation. The primary objective of the current paper is to summarize the res… ▽ More

    Submitted 15 April, 2019; originally announced April 2019.

    Comments: 5 pages