Showing 1–9 of 9 results for author: Meyer, S

Searching in archive eess.
  1. arXiv:2406.06403  [pdf, other]

    cs.CL cs.LG cs.SD eess.AS

    Meta Learning Text-to-Speech Synthesis in over 7000 Languages

    Authors: Florian Lux, Sarina Meyer, Lyonel Behringer, Frank Zalkow, Phat Do, Matt Coler, Emanuël A. P. Habets, Ngoc Thang Vu

    Abstract: In this work, we take on the challenging task of building a single text-to-speech synthesis system that is capable of generating speech in over 7000 languages, many of which lack sufficient data for traditional TTS development. By leveraging a novel integration of massively multilingual pretraining and meta learning to approximate language representations, our approach enables zero-shot speech syn…

    Submitted 10 June, 2024; originally announced June 2024.

    Comments: accepted at Interspeech 2024

  2. arXiv:2404.02677  [pdf, other]

    eess.AS cs.CL cs.CR

    The VoicePrivacy 2024 Challenge Evaluation Plan

    Authors: Natalia Tomashenko, Xiaoxiao Miao, Pierre Champion, Sarina Meyer, Xin Wang, Emmanuel Vincent, Michele Panariello, Nicholas Evans, Junichi Yamagishi, Massimiliano Todisco

    Abstract: The task of the challenge is to develop a voice anonymization system for speech data which conceals the speaker's voice identity while protecting linguistic content and emotional states. The organizers provide development and evaluation datasets and evaluation scripts, as well as baseline anonymization systems and a list of training resources formed on the basis of the participants' requests. Part…

    Submitted 12 June, 2024; v1 submitted 3 April, 2024; originally announced April 2024.

    Comments: 19 pages, https://www.voiceprivacychallenge.org/. arXiv admin note: substantial text overlap with arXiv:2203.12468

  3. Controllable Generation of Artificial Speaker Embeddings through Discovery of Principal Directions

    Authors: Florian Lux, Pascal Tilli, Sarina Meyer, Ngoc Thang Vu

    Abstract: Customizing voice and speaking style in a speech synthesis system with intuitive and fine-grained controls is challenging, given that little data with appropriate labels is available. Furthermore, editing an existing human's voice also comes with ethical concerns. In this paper, we propose a method to generate artificial speaker embeddings that cannot be linked to a real human while offering intui…

    Submitted 26 October, 2023; originally announced October 2023.

    Comments: Published at ISCA Interspeech 2023 https://www.isca-speech.org/archive/interspeech_2023/lux23_interspeech.html

  4. arXiv:2310.17499  [pdf, other]

    cs.CL cs.LG eess.AS

    The IMS Toucan System for the Blizzard Challenge 2023

    Authors: Florian Lux, Julia Koch, Sarina Meyer, Thomas Bott, Nadja Schauffler, Pavel Denisov, Antje Schweitzer, Ngoc Thang Vu

    Abstract: For our contribution to the Blizzard Challenge 2023, we improved on the system we submitted to the Blizzard Challenge 2021. Our approach entails a rule-based text-to-phoneme processing system that includes rule-based disambiguation of homographs in the French language. It then transforms the phonemes to spectrograms as intermediate representations using a fast and efficient non-autoregressive synt…

    Submitted 26 October, 2023; originally announced October 2023.

    Comments: Published at the Blizzard Challenge Workshop 2023, colocated with the Speech Synthesis Workshop 2023, a satellite event of Interspeech 2023

  5. VoicePAT: An Efficient Open-source Evaluation Toolkit for Voice Privacy Research

    Authors: Sarina Meyer, Xiaoxiao Miao, Ngoc Thang Vu

    Abstract: Speaker anonymization is the task of modifying a speech recording such that the original speaker cannot be identified anymore. Since the first Voice Privacy Challenge in 2020, along with the release of a framework, the popularity of this research topic is continually increasing. However, the comparison and combination of different anonymization approaches remains challenging due to the complexity…

    Submitted 21 December, 2023; v1 submitted 14 September, 2023; originally announced September 2023.

    Comments: Accepted by OJSP-ICASSP 2024 https://ieeexplore.ieee.org/document/10365329

  6. arXiv:2305.01871  [pdf]

    physics.med-ph eess.IV

    Convolutional neural network-based single-shot speckle tracking for x-ray phase-contrast imaging

    Authors: Serena Qinyun Z. Shi, Nadav Shapira, Peter B. Noël, Sebastian Meyer

    Abstract: X-ray phase-contrast imaging offers enhanced sensitivity for weakly-attenuating materials, such as breast and brain tissue, but has yet to be widely implemented clinically due to high coherence requirements and expensive x-ray optics. Speckle-based phase contrast imaging has been proposed as an affordable and simple alternative; however, obtaining high-quality phase-contrast images requires accura…

    Submitted 2 May, 2023; originally announced May 2023.

  7. arXiv:2210.07002  [pdf, other]

    cs.SD cs.CL eess.AS

    Anonymizing Speech with Generative Adversarial Networks to Preserve Speaker Privacy

    Authors: Sarina Meyer, Pascal Tilli, Pavel Denisov, Florian Lux, Julia Koch, Ngoc Thang Vu

    Abstract: In order to protect the privacy of speech data, speaker anonymization aims for hiding the identity of a speaker by changing the voice in speech recordings. This typically comes with a privacy-utility trade-off between protection of individuals and usability of the data for downstream applications. One of the challenges in this context is to create non-existent voices that sound as natural as possi…

    Submitted 20 October, 2022; v1 submitted 13 October, 2022; originally announced October 2022.

    Comments: IEEE Spoken Language Technology Workshop 2022

  8. arXiv:2207.04834  [pdf, other]

    cs.SD cs.CR cs.LG eess.AS

    Speaker Anonymization with Phonetic Intermediate Representations

    Authors: Sarina Meyer, Florian Lux, Pavel Denisov, Julia Koch, Pascal Tilli, Ngoc Thang Vu

    Abstract: In this work, we propose a speaker anonymization pipeline that leverages high quality automatic speech recognition and synthesis systems to generate speech conditioned on phonetic transcriptions and anonymized speaker embeddings. Using phones as the intermediate representation ensures near complete elimination of speaker identity information from the input while preserving the original phonetic co…

    Submitted 11 July, 2022; originally announced July 2022.

    Comments: Accepted at Interspeech 2022

  9. arXiv:1907.07805  [pdf, other]

    eess.SP

    Novel Receiver for Superparamagnetic Iron Oxide Nanoparticles in a Molecular Communication Setting

    Authors: Max Bartunik, Maximilian Lübke, Harald Unterweger, Christoph Alexiou, Sebastian Meyer, Doaa Ahmed, Georg Fischer, Wayan Wicke, Vahid Jamali Kooshkghazi, Robert Schober, Jens Kirchner

    Abstract: Superparamagnetic iron oxide nanoparticles (SPIONs) have recently been introduced as information carriers in a testbed for molecular communication (MC) in duct flow. Here, a new receiver for this testbed is presented, based on the concept of a bridge circuit. The capability for a reliable transmission using the testbed and detection of the proposed receiver was evaluated by sending a text message…

    Submitted 17 July, 2019; originally announced July 2019.

    Comments: ACM NanoCom 2019