Showing 1–50 of 61 results for author: Evans, N

  1. arXiv:2406.07816  [pdf, other]

    eess.AS cs.CL cs.SD

    Spoof Diarization: "What Spoofed When" in Partially Spoofed Audio

    Authors: Lin Zhang, Xin Wang, Erica Cooper, Mireia Diez, Federico Landini, Nicholas Evans, Junichi Yamagishi

    Abstract: This paper defines Spoof Diarization as a novel task in the Partial Spoof (PS) scenario. It aims to determine what spoofed when, which includes not only locating spoof regions but also clustering them according to different spoofing methods. As a pioneering study in spoof diarization, we focus on defining the task, establishing evaluation metrics, and proposing a benchmark model, namely the Counte…

    Submitted 11 June, 2024; originally announced June 2024.

    Comments: Accepted to Interspeech 2024

  2. arXiv:2406.05339  [pdf, other]

    eess.AS cs.AI

    To what extent can ASV systems naturally defend against spoofing attacks?

    Authors: Jee-weon Jung, Xin Wang, Nicholas Evans, Shinji Watanabe, Hye-jin Shim, Hemlata Tak, Sidhhant Arora, Junichi Yamagishi, Joon Son Chung

    Abstract: The current automatic speaker verification (ASV) task involves making binary decisions on two types of trials: target and non-target. However, emerging advancements in speech generation technology pose significant threats to the reliability of ASV systems. This study investigates whether ASV effortlessly acquires robustness against spoofing attacks (i.e., zero-shot capability) by systematically ex…

    Submitted 14 June, 2024; v1 submitted 7 June, 2024; originally announced June 2024.

    Comments: 5 pages, 3 figures, 3 tables, Interspeech 2024

  3. arXiv:2406.03512  [pdf, other]

    cs.SD cs.AI eess.AS

    Harder or Different? Understanding Generalization of Audio Deepfake Detection

    Authors: Nicolas M. Müller, Nicholas Evans, Hemlata Tak, Philip Sperl, Konstantin Böttinger

    Abstract: Recent research has highlighted a key issue in speech deepfake detection: models trained on one set of deepfakes perform poorly on others. The question arises: is this due to the continuously improving quality of Text-to-Speech (TTS) models, i.e., are newer DeepFakes just 'harder' to detect? Or, is it because deepfakes generated with one model are fundamentally different to those generated using a…

    Submitted 12 June, 2024; v1 submitted 5 June, 2024; originally announced June 2024.

    Journal ref: Interspeech 2024

  4. arXiv:2404.17810  [pdf, other]

    eess.AS cs.SD

    A Comparison of Differential Performance Metrics for the Evaluation of Automatic Speaker Verification Fairness

    Authors: Oubaida Chouchane, Christoph Busch, Chiara Galdi, Nicholas Evans, Massimiliano Todisco

    Abstract: When decisions are made and when personal data is treated by automated processes, there is an expectation of fairness -- that members of different demographic groups receive equitable treatment. This expectation applies to biometric systems such as automatic speaker verification (ASV). We present a comparison of three candidate fairness metrics and extend previous work performed for face recogniti…

    Submitted 27 April, 2024; originally announced April 2024.

    Comments: 8 pages, 7 figures

  5. arXiv:2404.02677  [pdf, other]

    eess.AS cs.CL cs.CR

    The VoicePrivacy 2024 Challenge Evaluation Plan

    Authors: Natalia Tomashenko, Xiaoxiao Miao, Pierre Champion, Sarina Meyer, Xin Wang, Emmanuel Vincent, Michele Panariello, Nicholas Evans, Junichi Yamagishi, Massimiliano Todisco

    Abstract: The task of the challenge is to develop a voice anonymization system for speech data which conceals the speaker's voice identity while protecting linguistic content and emotional states. The organizers provide development and evaluation datasets and evaluation scripts, as well as baseline anonymization systems and a list of training resources formed on the basis of the participants' requests. Part…

    Submitted 12 June, 2024; v1 submitted 3 April, 2024; originally announced April 2024.

    Comments: 19 pages, https://www.voiceprivacychallenge.org/. arXiv admin note: substantial text overlap with arXiv:2203.12468

  6. arXiv:2403.01355  [pdf, ps, other]

    eess.AS cs.LG

    a-DCF: an architecture agnostic metric with application to spoofing-robust speaker verification

    Authors: Hye-jin Shim, Jee-weon Jung, Tomi Kinnunen, Nicholas Evans, Jean-Francois Bonastre, Itshak Lapidot

    Abstract: Spoofing detection is today a mainstream research topic. Standard metrics can be applied to evaluate the performance of isolated spoofing detection solutions and others have been proposed to support their evaluation when they are combined with speaker detection. These either have well-known deficiencies or restrict the architectural approach to combine speaker and spoof detectors. In this paper, w…

    Submitted 2 March, 2024; originally announced March 2024.

    Comments: 8 pages, submitted to Speaker Odyssey 2024

  7. arXiv:2309.14129  [pdf, other]

    eess.AS cs.SD

    Speaker anonymization using neural audio codec language models

    Authors: Michele Panariello, Francesco Nespoli, Massimiliano Todisco, Nicholas Evans

    Abstract: The vast majority of approaches to speaker anonymization involve the extraction of fundamental frequency estimates, linguistic features and a speaker embedding which is perturbed to obfuscate the speaker identity before an anonymized speech waveform is resynthesized using a vocoder. Recent work has shown that x-vector transformations are difficult to control consistently: other sources of speaker…

    Submitted 12 January, 2024; v1 submitted 25 September, 2023; originally announced September 2023.

    Comments: Accepted at ICASSP 2024

  8. arXiv:2309.12237  [pdf, other]

    cs.CR cs.LG cs.SD eess.AS eess.IV stat.CO

    t-EER: Parameter-Free Tandem Evaluation of Countermeasures and Biometric Comparators

    Authors: Tomi Kinnunen, Kong Aik Lee, Hemlata Tak, Nicholas Evans, Andreas Nautsch

    Abstract: Presentation attack (spoofing) detection (PAD) typically operates alongside biometric verification to improve reliability in the face of spoofing attacks. Even though the two sub-systems operate in tandem to solve the single task of reliable biometric verification, they address different detection tasks and are hence typically evaluated separately. Evidence shows that this approach is suboptimal. W…

    Submitted 21 September, 2023; originally announced September 2023.

    Comments: To appear in IEEE Transactions on Pattern Analysis and Machine Intelligence. For associated codes, see https://github.com/TakHemlata/T-EER (Github) and https://colab.research.google.com/drive/1ga7eiKFP11wOFMuZjThLJlkBcwEG6_4m?usp=sharing (Google Colab)

  9. arXiv:2309.09586  [pdf, ps, other]

    cs.CR cs.SD eess.AS

    Spoofing attack augmentation: can differently-trained attack models improve generalisation?

    Authors: Wanying Ge, Xin Wang, Junichi Yamagishi, Massimiliano Todisco, Nicholas Evans

    Abstract: A reliable deepfake detector or spoofing countermeasure (CM) should be robust in the face of unpredictable spoofing attacks. To encourage the learning of more generalisable artefacts, rather than those specific only to known attacks, CMs are usually exposed to a broad variety of different attacks during training. Even so, the performance of deep-learning-based CM solutions is known to vary, some…

    Submitted 8 January, 2024; v1 submitted 18 September, 2023; originally announced September 2023.

    Comments: Accepted to ICASSP 2024

  10. arXiv:2309.06141  [pdf, other]

    cs.SD eess.AS

    SynVox2: Towards a privacy-friendly VoxCeleb2 dataset

    Authors: Xiaoxiao Miao, Xin Wang, Erica Cooper, Junichi Yamagishi, Nicholas Evans, Massimiliano Todisco, Jean-François Bonastre, Mickael Rouvier

    Abstract: The success of deep learning in speaker recognition relies heavily on the use of large datasets. However, the data-hungry nature of deep learning methods has already been questioned on account of the ethical, privacy, and legal concerns that arise when using large-scale datasets of natural speech collected from real human speakers. For example, the widely-used VoxCeleb2 dataset for speaker recogniti…

    Submitted 12 September, 2023; originally announced September 2023.

    Comments: conference

  11. arXiv:2308.14049  [pdf, other]

    eess.AS cs.SD

    Fairness and Privacy in Voice Biometrics: A Study of Gender Influences Using wav2vec 2.0

    Authors: Oubaida Chouchane, Michele Panariello, Chiara Galdi, Massimiliano Todisco, Nicholas Evans

    Abstract: This study investigates the impact of gender information on utility, privacy, and fairness in voice biometric systems, guided by the General Data Protection Regulation (GDPR) mandates, which underscore the need for minimizing the processing and storage of private and sensitive data, and ensuring fairness in automated decision-making systems. We adopt an approach that involves the fine-tuning of th…

    Submitted 27 August, 2023; originally announced August 2023.

    Comments: 7 pages

  12. arXiv:2307.08403  [pdf, other]

    eess.AS cs.LG cs.SD

    Vocoder drift compensation by x-vector alignment in speaker anonymisation

    Authors: Michele Panariello, Massimiliano Todisco, Nicholas Evans

    Abstract: For the most popular x-vector-based approaches to speaker anonymisation, the bulk of the anonymisation can stem from vocoding rather than from the core anonymisation function which is used to substitute an original speaker x-vector with that of a fictitious pseudo-speaker. This phenomenon can impede the design of better anonymisation systems since there is a lack of fine-grained control over the x…

    Submitted 17 July, 2023; originally announced July 2023.

    Comments: Accepted at the ISCA SPSC Symposium 2023

  13. arXiv:2306.07655  [pdf, other]

    eess.AS cs.CR cs.LG

    Malafide: a novel adversarial convolutive noise attack against deepfake and spoofing detection systems

    Authors: Michele Panariello, Wanying Ge, Hemlata Tak, Massimiliano Todisco, Nicholas Evans

    Abstract: We present Malafide, a universal adversarial attack against automatic speaker verification (ASV) spoofing countermeasures (CMs). By introducing convolutional noise using an optimised linear time-invariant filter, Malafide attacks can be used to compromise CM reliability while preserving other speech attributes such as quality and the speaker's voice. In contrast to other adversarial attacks propos…

    Submitted 13 June, 2023; originally announced June 2023.

    Comments: Accepted at INTERSPEECH 2023

  14. arXiv:2306.02892  [pdf, other]

    eess.AS

    Vocoder drift in x-vector-based speaker anonymization

    Authors: Michele Panariello, Massimiliano Todisco, Nicholas Evans

    Abstract: State-of-the-art approaches to speaker anonymization typically employ some form of perturbation function to conceal speaker information contained within an x-vector embedding, then resynthesize utterances in the voice of a new pseudo-speaker using a vocoder. Strategies to improve the x-vector anonymization function have attracted considerable research effort, whereas vocoder impacts are generally…

    Submitted 5 June, 2023; originally announced June 2023.

    Comments: Accepted at INTERSPEECH 2023

  15. arXiv:2305.19051  [pdf, other]

    eess.AS cs.AI cs.SD

    Towards single integrated spoofing-aware speaker verification embeddings

    Authors: Sung Hwan Mun, Hye-jin Shim, Hemlata Tak, Xin Wang, Xuechen Liu, Md Sahidullah, Myeonghun Jeong, Min Hyun Han, Massimiliano Todisco, Kong Aik Lee, Junichi Yamagishi, Nicholas Evans, Tomi Kinnunen, Nam Soo Kim, Jee-weon Jung

    Abstract: This study aims to develop single integrated spoofing-aware speaker verification (SASV) embeddings that satisfy two aspects. First, rejecting non-target speakers' input as well as target speakers' spoofed inputs should be addressed. Second, competitive performance should be demonstrated compared to the fusion of automatic speaker verification (ASV) and countermeasure (CM) embeddings, which outpe…

    Submitted 1 June, 2023; v1 submitted 30 May, 2023; originally announced May 2023.

    Comments: Accepted by INTERSPEECH 2023. Code and models are available in https://github.com/sasv-challenge/ASVSpoof5-SASVBaseline

  16. arXiv:2305.17739  [pdf, other]

    cs.SD cs.CL eess.AS

    Range-Based Equal Error Rate for Spoof Localization

    Authors: Lin Zhang, Xin Wang, Erica Cooper, Nicholas Evans, Junichi Yamagishi

    Abstract: Spoof localization, also called segment-level detection, is a crucial task that aims to locate spoofs in partially spoofed audio. The equal error rate (EER) is widely used to measure performance for such biometric scenarios. Although EER is the only threshold-free metric, it is usually calculated in a point-based way that uses scores and references with a pre-defined temporal resolution and counts…

    Submitted 28 May, 2023; originally announced May 2023.

    Comments: Accepted to Interspeech 2023

  17. arXiv:2303.07073  [pdf, other]

    eess.AS

    Can spoofing countermeasure and speaker verification systems be jointly optimised?

    Authors: Wanying Ge, Hemlata Tak, Massimiliano Todisco, Nicholas Evans

    Abstract: Spoofing countermeasure (CM) and automatic speaker verification (ASV) sub-systems can be used in tandem with a backend classifier as a solution to the spoofing aware speaker verification (SASV) task. The two sub-systems are typically trained independently to solve different tasks. While our previous work demonstrated the potential of joint optimisation, it also showed a tendency to over-fit to spe…

    Submitted 20 December, 2023; v1 submitted 13 March, 2023; originally announced March 2023.

    Comments: Accepted to ICASSP 2023

  18. arXiv:2210.02437  [pdf, other]

    cs.SD cs.CR cs.MM eess.AS

    ASVspoof 2021: Towards Spoofed and Deepfake Speech Detection in the Wild

    Authors: Xuechen Liu, Xin Wang, Md Sahidullah, Jose Patino, Héctor Delgado, Tomi Kinnunen, Massimiliano Todisco, Junichi Yamagishi, Nicholas Evans, Andreas Nautsch, Kong Aik Lee

    Abstract: Benchmarking initiatives support the meaningful comparison of competing solutions to prominent problems in speech and language processing. Successive benchmarking evaluations typically reflect a progressive evolution from ideal lab conditions towards those encountered in the wild. ASVspoof, the spoofing and deepfake detection initiative and challenge series, has followed the same trend. This ar…

    Submitted 22 June, 2023; v1 submitted 5 October, 2022; originally announced October 2022.

    Comments: IEEE/ACM Transactions on Audio, Speech, and Language Processing

  19. arXiv:2209.00506  [pdf, other]

    eess.AS

    On the potential of jointly-optimised solutions to spoofing attack detection and automatic speaker verification

    Authors: Wanying Ge, Hemlata Tak, Massimiliano Todisco, Nicholas Evans

    Abstract: The spoofing-aware speaker verification (SASV) challenge was designed to promote the study of jointly-optimised solutions to accomplish the traditionally separately-optimised tasks of spoofing detection and speaker verification. Jointly-optimised systems have the potential to operate in synergy as a better performing solution to the single task of reliable speaker verification. However, none of th…

    Submitted 7 October, 2022; v1 submitted 1 September, 2022; originally announced September 2022.

    Comments: Accepted to IberSPEECH 2022 Conference

  20. arXiv:2205.07123  [pdf, other]

    cs.CL cs.CR eess.AS

    The VoicePrivacy 2020 Challenge Evaluation Plan

    Authors: Natalia Tomashenko, Brij Mohan Lal Srivastava, Xin Wang, Emmanuel Vincent, Andreas Nautsch, Junichi Yamagishi, Nicholas Evans, Jose Patino, Jean-François Bonastre, Paul-Gauthier Noé, Massimiliano Todisco

    Abstract: The VoicePrivacy Challenge aims to promote the development of privacy preservation tools for speech technology by gathering a new community to define the tasks of interest and the evaluation methodology, and benchmarking solutions through a series of challenges. In this document, we formulate the voice anonymization task selected for the VoicePrivacy 2020 Challenge and describe the datasets used f…

    Submitted 14 May, 2022; originally announced May 2022.

    Comments: arXiv admin note: text overlap with arXiv:2203.12468

  21. arXiv:2204.09976  [pdf, other]

    cs.SD eess.AS

    Baseline Systems for the First Spoofing-Aware Speaker Verification Challenge: Score and Embedding Fusion

    Authors: Hye-jin Shim, Hemlata Tak, Xuechen Liu, Hee-Soo Heo, Jee-weon Jung, Joon Son Chung, Soo-Whan Chung, Ha-Jin Yu, Bong-Jin Lee, Massimiliano Todisco, Héctor Delgado, Kong Aik Lee, Md Sahidullah, Tomi Kinnunen, Nicholas Evans

    Abstract: Deep learning has brought impressive progress in the study of both automatic speaker verification (ASV) and spoofing countermeasures (CM). Although solutions are mutually dependent, they have typically evolved as standalone sub-systems whereby CM solutions are usually designed for a fixed ASV system. The work reported in this paper aims to gauge the improvements in reliability that can be gained f…

    Submitted 21 April, 2022; originally announced April 2022.

    Comments: 8 pages, accepted by Odyssey 2022

  22. arXiv:2204.05177  [pdf, other]

    eess.AS cs.CR cs.SD

    The PartialSpoof Database and Countermeasures for the Detection of Short Fake Speech Segments Embedded in an Utterance

    Authors: Lin Zhang, Xin Wang, Erica Cooper, Nicholas Evans, Junichi Yamagishi

    Abstract: Automatic speaker verification is susceptible to various manipulations and spoofing, such as text-to-speech synthesis, voice conversion, replay, tampering, adversarial attacks, and so on. We consider a new spoofing scenario called "Partial Spoof" (PS) in which synthesized or transformed speech segments are embedded into a bona fide utterance. While existing countermeasures (CMs) can detect fully s…

    Submitted 30 January, 2023; v1 submitted 11 April, 2022; originally announced April 2022.

    Comments: Published in IEEE/ACM Transactions on Audio, Speech, and Language Processing (DOI: 10.1109/TASLP.2022.3233236)

    Journal ref: IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol. 31, pp. 813-825, 2023

  23. arXiv:2203.14732  [pdf, other]

    eess.AS

    SASV 2022: The First Spoofing-Aware Speaker Verification Challenge

    Authors: Jee-weon Jung, Hemlata Tak, Hye-jin Shim, Hee-Soo Heo, Bong-Jin Lee, Soo-Whan Chung, Ha-Jin Yu, Nicholas Evans, Tomi Kinnunen

    Abstract: The first spoofing-aware speaker verification (SASV) challenge aims to integrate research efforts in speaker verification and anti-spoofing. We extend the speaker verification scenario by introducing spoofed trials to the usual set of target and impostor trials. In contrast to the established ASVspoof challenge where the focus is upon separate, independently optimised spoofing detection and speake…

    Submitted 28 March, 2022; originally announced March 2022.

    Comments: 5 pages, 2 figures, 2 tables, submitted to Interspeech 2022 as a conference paper

  24. arXiv:2203.12468  [pdf, other]

    eess.AS cs.CL cs.CR

    The VoicePrivacy 2022 Challenge Evaluation Plan

    Authors: Natalia Tomashenko, Xin Wang, Xiaoxiao Miao, Hubert Nourtel, Pierre Champion, Massimiliano Todisco, Emmanuel Vincent, Nicholas Evans, Junichi Yamagishi, Jean-François Bonastre

    Abstract: For new participants - Executive summary: (1) The task is to develop a voice anonymization system for speech data which conceals the speaker's voice identity while protecting linguistic content, paralinguistic attributes, intelligibility and naturalness. (2) Training, development and evaluation datasets are provided in addition to 3 different baseline anonymization systems, evaluation scripts, and…

    Submitted 28 September, 2022; v1 submitted 23 March, 2022; originally announced March 2022.

    Comments: the file is unchanged; minor correction in metadata

  25. arXiv:2202.13693  [pdf, other]

    eess.AS cs.SD

    Explainable deepfake and spoofing detection: an attack analysis using SHapley Additive exPlanations

    Authors: Wanying Ge, Massimiliano Todisco, Nicholas Evans

    Abstract: Despite several years of research in deepfake and spoofing detection for automatic speaker verification, little is known about the artefacts that classifiers use to distinguish between bona fide and spoofed utterances. An understanding of these is crucial to the design of trustworthy, explainable solutions. In this paper we report an extension of our previous work to better understand classifier b…

    Submitted 4 May, 2022; v1 submitted 28 February, 2022; originally announced February 2022.

    Comments: Accepted to Speaker Odyssey Workshop 2022

  26. arXiv:2202.12233  [pdf, other]

    eess.AS cs.SD

    Automatic speaker verification spoofing and deepfake detection using wav2vec 2.0 and data augmentation

    Authors: Hemlata Tak, Massimiliano Todisco, Xin Wang, Jee-weon Jung, Junichi Yamagishi, Nicholas Evans

    Abstract: The performance of spoofing countermeasure systems depends fundamentally upon the use of sufficiently representative training data. With this usually being limited, current solutions typically lack generalisation to attacks encountered in the wild. Strategies to improve reliability in the face of uncontrolled, unpredictable attacks are hence needed. We report in this paper our efforts to use self-…

    Submitted 28 February, 2022; v1 submitted 24 February, 2022; originally announced February 2022.

    Comments: Submitted to Speaker Odyssey Workshop 2022

  27. arXiv:2201.10283  [pdf, ps, other]

    cs.SD cs.CR eess.AS

    SASV Challenge 2022: A Spoofing Aware Speaker Verification Challenge Evaluation Plan

    Authors: Jee-weon Jung, Hemlata Tak, Hye-jin Shim, Hee-Soo Heo, Bong-Jin Lee, Soo-Whan Chung, Hong-Goo Kang, Ha-Jin Yu, Nicholas Evans, Tomi Kinnunen

    Abstract: ASV (automatic speaker verification) systems are intrinsically required to reject both non-target (e.g., voice uttered by different speaker) and spoofed (e.g., synthesised or converted) inputs. However, there is little consideration for how ASV systems themselves should be adapted when they are expected to encounter spoofing attacks, nor when they operate in tandem with CMs (spoofing countermeasur…

    Submitted 2 March, 2022; v1 submitted 25 January, 2022; originally announced January 2022.

    Comments: Evaluation plan of the SASV Challenge 2022. See this webpage for more information: https://sasv-challenge.github.io

  28. arXiv:2111.04433  [pdf, other]

    eess.AS cs.CR cs.SD eess.SP

    RawBoost: A Raw Data Boosting and Augmentation Method applied to Automatic Speaker Verification Anti-Spoofing

    Authors: Hemlata Tak, Madhu Kamble, Jose Patino, Massimiliano Todisco, Nicholas Evans

    Abstract: This paper introduces RawBoost, a data boosting and augmentation method for the design of more reliable spoofing detection solutions which operate directly upon raw waveform inputs. While RawBoost requires no additional data sources, e.g. noise recordings or impulse responses and is data, application and model agnostic, it is designed for telephony scenarios. Based upon the combination of linear a…

    Submitted 22 February, 2022; v1 submitted 8 November, 2021; originally announced November 2021.

    Comments: Accepted to IEEE ICASSP 2022

  29. arXiv:2110.03309  [pdf, other]

    eess.AS

    Explaining deep learning models for spoofing and deepfake detection with SHapley Additive exPlanations

    Authors: Wanying Ge, Jose Patino, Massimiliano Todisco, Nicholas Evans

    Abstract: Substantial progress in spoofing and deepfake detection has been made in recent years. Nonetheless, the community has yet to make notable inroads in providing an explanation for how a classifier produces its output. The dominance of black box spoofing detection solutions is at further odds with the drive toward trustworthy, explainable artificial intelligence. This paper describes our use of SHapl…

    Submitted 26 April, 2024; v1 submitted 7 October, 2021; originally announced October 2021.

    Comments: Accepted to ICASSP 2022

  30. arXiv:2110.01200  [pdf, other]

    eess.AS cs.AI cs.LG

    AASIST: Audio Anti-Spoofing using Integrated Spectro-Temporal Graph Attention Networks

    Authors: Jee-weon Jung, Hee-Soo Heo, Hemlata Tak, Hye-jin Shim, Joon Son Chung, Bong-Jin Lee, Ha-Jin Yu, Nicholas Evans

    Abstract: Artefacts that differentiate spoofed from bona-fide utterances can reside in spectral or temporal domains. Their reliable detection usually depends upon computationally demanding ensemble systems where each subsystem is tuned to some specific artefacts. We seek to develop an efficient, single system that can detect a broad range of different spoofing attacks without score-level ensembles. We propo…

    Submitted 4 October, 2021; originally announced October 2021.

    Comments: 5 pages, 1 figure, 3 tables, submitted to ICASSP2022

  31. arXiv:2109.00648  [pdf, other]

    cs.CL cs.SD eess.AS

    The VoicePrivacy 2020 Challenge: Results and findings

    Authors: Natalia Tomashenko, Xin Wang, Emmanuel Vincent, Jose Patino, Brij Mohan Lal Srivastava, Paul-Gauthier Noé, Andreas Nautsch, Nicholas Evans, Junichi Yamagishi, Benjamin O'Brien, Anaïs Chanclu, Jean-François Bonastre, Massimiliano Todisco, Mohamed Maouche

    Abstract: This paper presents the results and analyses stemming from the first VoicePrivacy 2020 Challenge which focuses on developing anonymization solutions for speech technology. We provide a systematic overview of the challenge design with an analysis of submitted systems and evaluation results. In particular, we describe the voice anonymization task and datasets used for system development and evaluati…

    Submitted 26 September, 2022; v1 submitted 1 September, 2021; originally announced September 2021.

    Comments: Submitted to the Special Issue on Voice Privacy (Computer Speech and Language Journal - Elsevier); under review

  32. arXiv:2109.00537  [pdf, other]

    eess.AS cs.CR cs.LG cs.SD

    ASVspoof 2021: accelerating progress in spoofed and deepfake speech detection

    Authors: Junichi Yamagishi, Xin Wang, Massimiliano Todisco, Md Sahidullah, Jose Patino, Andreas Nautsch, Xuechen Liu, Kong Aik Lee, Tomi Kinnunen, Nicholas Evans, Héctor Delgado

    Abstract: ASVspoof 2021 is the fourth edition in the series of bi-annual challenges which aim to promote the study of spoofing and the design of countermeasures to protect automatic speaker verification systems from manipulation. In addition to a continued focus upon logical and physical access tasks in which there are a number of advances compared to previous editions, ASVspoof 2021 introduces a new task in…

    Submitted 1 September, 2021; originally announced September 2021.

    Comments: Accepted to the ASVspoof 2021 Workshop

  33. arXiv:2109.00535  [pdf, other]

    eess.AS cs.CR cs.LG cs.SD

    ASVspoof 2021: Automatic Speaker Verification Spoofing and Countermeasures Challenge Evaluation Plan

    Authors: Héctor Delgado, Nicholas Evans, Tomi Kinnunen, Kong Aik Lee, Xuechen Liu, Andreas Nautsch, Jose Patino, Md Sahidullah, Massimiliano Todisco, Xin Wang, Junichi Yamagishi

    Abstract: The automatic speaker verification spoofing and countermeasures (ASVspoof) challenge series is a community-led initiative which aims to promote the consideration of spoofing and the development of countermeasures. ASVspoof 2021 is the 4th in a series of bi-annual, competitive challenges where the goal is to develop countermeasures capable of discriminating between bona fide and spoofed or deepfake…

    Submitted 1 September, 2021; originally announced September 2021.

    Comments: http://www.asvspoof.org

  34. arXiv:2109.00281  [pdf, other]

    cs.CR cs.SD eess.AS

    Benchmarking and challenges in security and privacy for voice biometrics

    Authors: Jean-Francois Bonastre, Hector Delgado, Nicholas Evans, Tomi Kinnunen, Kong Aik Lee, Xuechen Liu, Andreas Nautsch, Paul-Gauthier Noe, Jose Patino, Md Sahidullah, Brij Mohan Lal Srivastava, Massimiliano Todisco, Natalia Tomashenko, Emmanuel Vincent, Xin Wang, Junichi Yamagishi

    Abstract: For many decades, research in speech technologies has focused upon improving reliability. With this now meeting user expectations for a range of diverse applications, speech technology is today omni-present. As a result, a focus on security and privacy has now come to the fore. Here, the research effort is in its relative infancy and progress calls for greater, multidisciplinary collaboration with s…

    Submitted 1 September, 2021; originally announced September 2021.

    Comments: Submitted to the symposium of the ISCA Security & Privacy in Speech Communications (SPSC) special interest group

  35. arXiv:2107.12710  [pdf, other]

    eess.AS cs.SD

    End-to-End Spectro-Temporal Graph Attention Networks for Speaker Verification Anti-Spoofing and Speech Deepfake Detection

    Authors: Hemlata Tak, Jee-weon Jung, Jose Patino, Madhu Kamble, Massimiliano Todisco, Nicholas Evans

    Abstract: Artefacts that serve to distinguish bona fide speech from spoofed or deepfake speech are known to reside in specific subbands and temporal segments. Various approaches can be used to capture and model such artefacts, however, none works well across a spectrum of diverse spoofing attacks. Reliable detection then often depends upon the fusion of multiple detection systems, each tuned to detect diffe…

    Submitted 23 August, 2021; v1 submitted 27 July, 2021; originally announced July 2021.

    Comments: Accepted in ASVspoof 2021 Workshop

  36. arXiv:2107.12212  [pdf, other]

    eess.AS

    Raw Differentiable Architecture Search for Speech Deepfake and Spoofing Detection

    Authors: Wanying Ge, Jose Patino, Massimiliano Todisco, Nicholas Evans

    Abstract: End-to-end approaches to anti-spoofing, especially those which operate directly upon the raw signal, are starting to be competitive with their more traditional counterparts. Until recently, all such approaches consider only the learning of network parameters; the network architecture is still hand crafted. This too, however, can also be learned. Described in this paper is our attempt to learn auto…

    Submitted 6 October, 2021; v1 submitted 26 July, 2021; originally announced July 2021.

    Comments: Accepted to ASVspoof 2021 Workshop

  37. arXiv:2106.06362  [pdf, other]

    cs.SD cs.LG eess.AS stat.AP

    Visualizing Classifier Adjacency Relations: A Case Study in Speaker Verification and Voice Anti-Spoofing

    Authors: Tomi Kinnunen, Andreas Nautsch, Md Sahidullah, Nicholas Evans, Xin Wang, Massimiliano Todisco, Héctor Delgado, Junichi Yamagishi, Kong Aik Lee

    Abstract: Whether it be for results summarization, or the analysis of classifier fusion, some means to compare different classifiers can often provide illuminating insight into their behaviour, (dis)similarity or complementarity. We propose a simple method to derive 2D representation from detection scores produced by an arbitrary set of binary classifiers in response to a common dataset. Based upon rank cor…

    Submitted 11 June, 2021; originally announced June 2021.

    Comments: Accepted to Interspeech 2021. Example code available at https://github.com/asvspoof-challenge/classifier-adjacency

  38. arXiv:2106.04423  [pdf, other]

    cs.SD eess.AS

    PANACEA cough sound-based diagnosis of COVID-19 for the DiCOVA 2021 Challenge

    Authors: Madhu R. Kamble, Jose A. Gonzalez-Lopez, Teresa Grau, Juan M. Espin, Lorenzo Cascioli, Yiqing Huang, Alejandro Gomez-Alanis, Jose Patino, Roberto Font, Antonio M. Peinado, Angel M. Gomez, Nicholas Evans, Maria A. Zuluaga, Massimiliano Todisco

    Abstract: The COVID-19 pandemic has led to the saturation of public health services worldwide. In this scenario, the early diagnosis of SARS-CoV-2 infections can help to stop or slow the spread of the virus and to manage the demand upon health services. This is especially important when resources are also being stretched by heightened demand linked to other seasonal diseases, such as the flu. In this contex…

    Submitted 7 June, 2021; originally announced June 2021.

    Comments: Accepted in INTERSPEECH 2021

  39. arXiv:2104.03654  [pdf, other]

    eess.AS cs.CR cs.SD

    Graph Attention Networks for Anti-Spoofing

    Authors: Hemlata Tak, Jee-weon Jung, Jose Patino, Massimiliano Todisco, Nicholas Evans

    Abstract: The cues needed to detect spoofing attacks against automatic speaker verification are often located in specific spectral sub-bands or temporal segments. Previous works show the potential to learn these using either spectral or temporal self-attention mechanisms but not the relationships between neighbouring sub-bands or segments. This paper reports our use of graph attention networks (GATs) to mod…

    Submitted 8 April, 2021; originally announced April 2021.

    Comments: Submitted to INTERSPEECH 2021

  40. arXiv:2104.03123  [pdf, other]

    cs.LG cs.SD eess.AS

    Partially-Connected Differentiable Architecture Search for Deepfake and Spoofing Detection

    Authors: Wanying Ge, Michele Panariello, Jose Patino, Massimiliano Todisco, Nicholas Evans

    Abstract: This paper reports the first successful application of a differentiable architecture search (DARTS) approach to the deepfake and spoofing detection problems. An example of neural architecture search, DARTS operates upon a continuous, differentiable search space which enables both the architecture and parameters to be optimised via gradient descent. Solutions based on partially-connected DARTS use…

    Submitted 30 June, 2021; v1 submitted 7 April, 2021; originally announced April 2021.

    Comments: Accepted to INTERSPEECH 2021

  41. arXiv:2104.02518  [pdf, other]

    eess.AS cs.SD

    An Initial Investigation for Detecting Partially Spoofed Audio

    Authors: Lin Zhang, Xin Wang, Erica Cooper, Junichi Yamagishi, Jose Patino, Nicholas Evans

    Abstract: All existing databases of spoofed speech contain attack data that is spoofed in its entirety. In practice, it is entirely plausible that successful attacks can be mounted with utterances that are only partially spoofed. By definition, partially-spoofed utterances contain a mix of both spoofed and bona fide segments, which will likely degrade the performance of countermeasures trained with entirely…

    Submitted 15 June, 2021; v1 submitted 6 April, 2021; originally announced April 2021.

    Comments: INTERSPEECH 2021

  42. arXiv:2102.05889  [pdf, other]

    eess.AS cs.CR cs.SD

    ASVspoof 2019: spoofing countermeasures for the detection of synthesized, converted and replayed speech

    Authors: Andreas Nautsch, Xin Wang, Nicholas Evans, Tomi Kinnunen, Ville Vestman, Massimiliano Todisco, Héctor Delgado, Md Sahidullah, Junichi Yamagishi, Kong Aik Lee

    Abstract: The ASVspoof initiative was conceived to spearhead research in anti-spoofing for automatic speaker verification (ASV). This paper describes the third in a series of bi-annual challenges: ASVspoof 2019. With the challenge database and protocols being described elsewhere, the focus of this paper is on results and the top performing single and ensemble system submissions from 62 teams, all of which o…

    Submitted 11 February, 2021; originally announced February 2021.

    Journal ref: IEEE Transactions on Biometrics, Behavior, and Identity Science 2021

  43. arXiv:2011.01130  [pdf, other]

    eess.AS cs.CL

    Speaker anonymisation using the McAdams coefficient

    Authors: Jose Patino, Natalia Tomashenko, Massimiliano Todisco, Andreas Nautsch, Nicholas Evans

    Abstract: Anonymisation has the goal of manipulating speech signals in order to degrade the reliability of automatic approaches to speaker recognition, while preserving other aspects of speech, such as those relating to intelligibility and naturalness. This paper reports an approach to anonymisation that, unlike other current approaches, requires no training data, is based upon well-known signal processing…

    Submitted 1 September, 2021; v1 submitted 2 November, 2020; originally announced November 2020.

    Comments: Accepted at INTERSPEECH 2021

  44. arXiv:2011.01108  [pdf, ps, other]

    eess.AS

    End-to-end anti-spoofing with RawNet2

    Authors: Hemlata Tak, Jose Patino, Massimiliano Todisco, Andreas Nautsch, Nicholas Evans, Anthony Larcher

    Abstract: Spoofing countermeasures aim to protect automatic speaker verification systems from attempts to manipulate their reliability with the use of spoofed speech signals. While results from the most recent ASVspoof 2019 evaluation show great potential to detect most forms of attack, some continue to evade detection. This paper reports the first application of RawNet2 to anti-spoofing. RawNet2 ingests ra…

    Submitted 16 December, 2021; v1 submitted 2 November, 2020; originally announced November 2020.

    Comments: Accepted to ICASSP 2021

  45. arXiv:2010.04038  [pdf, ps, other]

    cs.SD cs.CV cs.LG eess.AS

    Texture-based Presentation Attack Detection for Automatic Speaker Verification

    Authors: Lazaro J. Gonzalez-Soler, Jose Patino, Marta Gomez-Barrero, Massimiliano Todisco, Christoph Busch, Nicholas Evans

    Abstract: Biometric systems are nowadays employed across a broad range of applications. They provide high security and efficiency and, in many cases, are user friendly. Despite these and other advantages, biometric systems in general and automatic speaker verification (ASV) systems in particular can be vulnerable to attack presentations. The most recent ASVspoof 2019 competition showed that most forms of at…

    Submitted 8 October, 2020; originally announced October 2020.

  46. arXiv:2008.13144  [pdf, other]

    eess.AS cs.CR

    Speech Pseudonymisation Assessment Using Voice Similarity Matrices

    Authors: Paul-Gauthier Noé, Jean-François Bonastre, Driss Matrouf, Natalia Tomashenko, Andreas Nautsch, Nicholas Evans

    Abstract: The proliferation of speech technologies and rising privacy legislation call for the development of privacy preservation solutions for speech applications. These are essential since speech signals convey a wealth of rich, personal and potentially sensitive information. Anonymisation, the focus of the recent VoicePrivacy initiative, is one strategy to protect speaker identity information. Pseudony…

    Submitted 30 August, 2020; originally announced August 2020.

    Comments: Interspeech 2020

  47. arXiv:2007.05979  [pdf, other]

    eess.AS cs.LG cs.SD eess.SP

    Tandem Assessment of Spoofing Countermeasures and Automatic Speaker Verification: Fundamentals

    Authors: Tomi Kinnunen, Héctor Delgado, Nicholas Evans, Kong Aik Lee, Ville Vestman, Andreas Nautsch, Massimiliano Todisco, Xin Wang, Md Sahidullah, Junichi Yamagishi, Douglas A. Reynolds

    Abstract: Recent years have seen growing efforts to develop spoofing countermeasures (CMs) to protect automatic speaker verification (ASV) systems from being deceived by manipulated or artificial inputs. The reliability of spoofing CMs is typically gauged using the equal error rate (EER) metric. The primitive EER fails to reflect application requirements and the impact of spoofing and CMs upon ASV and its u…

    Submitted 25 August, 2020; v1 submitted 12 July, 2020; originally announced July 2020.

    Comments: Published in IEEE/ACM Transactions on Audio, Speech, and Language Processing (doi updated)

  48. arXiv:2005.10393  [pdf, other]

    eess.AS cs.SD

    Spoofing Attack Detection using the Non-linear Fusion of Sub-band Classifiers

    Authors: Hemlata Tak, Jose Patino, Andreas Nautsch, Nicholas Evans, Massimiliano Todisco

    Abstract: The threat of spoofing can pose a risk to the reliability of automatic speaker verification. Results from the bi-annual ASVspoof evaluations show that effective countermeasures demand front-ends designed specifically for the detection of spoofing artefacts. Given the diversity in spoofing attacks, ensemble methods are particularly effective. The work in this paper shows that a bank of very simple…

    Submitted 20 May, 2020; originally announced May 2020.

    Comments: Submitted to Interspeech 2020 conference, 5 pages

  49. The Privacy ZEBRA: Zero Evidence Biometric Recognition Assessment

    Authors: Andreas Nautsch, Jose Patino, Natalia Tomashenko, Junichi Yamagishi, Paul-Gauthier Noe, Jean-Francois Bonastre, Massimiliano Todisco, Nicholas Evans

    Abstract: Mounting privacy legislation calls for the preservation of privacy in speech technology, though solutions are gravely lacking. While evaluation campaigns are long-proven tools to drive progress, the need to consider a privacy adversary implies that traditional approaches to evaluation must be adapted to the assessment of privacy and privacy preservation solutions. This paper presents the first ste…

    Submitted 20 May, 2020; v1 submitted 19 May, 2020; originally announced May 2020.

    Comments: submitted to Interspeech 2020

    Journal ref: Proc Interspeech 2020

  50. arXiv:2005.08245  [pdf]

    eess.SP cs.AI cs.LG

    Dampen the Stop-and-Go Traffic with Connected and Automated Vehicles -- A Deep Reinforcement Learning Approach

    Authors: Liming Jiang, Yuanchang Xie, Danjue Chen, Tienan Li, Nicholas G. Evans

    Abstract: Stop-and-go traffic poses many challenges to transportation systems, but its formation and mechanism are still under exploration. However, it has been proved that by introducing Connected Automated Vehicles (CAVs) with carefully designed controllers one could dampen the stop-and-go waves in the vehicle fleet. Instead of using an analytical model, this study adopts reinforcement learning to control the be…

    Submitted 17 May, 2020; originally announced May 2020.