Showing 1–2 of 2 results for author: Schauffler, N

Search v0.5.6 released 2020-02-24

arXiv:2310.17499 [pdf, other]

cs.CL cs.LG eess.AS

The IMS Toucan System for the Blizzard Challenge 2023

Authors: Florian Lux, Julia Koch, Sarina Meyer, Thomas Bott, Nadja Schauffler, Pavel Denisov, Antje Schweitzer, Ngoc Thang Vu

Abstract: For our contribution to the Blizzard Challenge 2023, we improved on the system we submitted to the Blizzard Challenge 2021. Our approach entails a rule-based text-to-phoneme processing system that includes rule-based disambiguation of homographs in the French language. It then transforms the phonemes to spectrograms as intermediate representations using a fast and efficient non-autoregressive synt… ▽ More For our contribution to the Blizzard Challenge 2023, we improved on the system we submitted to the Blizzard Challenge 2021. Our approach entails a rule-based text-to-phoneme processing system that includes rule-based disambiguation of homographs in the French language. It then transforms the phonemes to spectrograms as intermediate representations using a fast and efficient non-autoregressive synthesis architecture based on Conformer and Glow. A GAN based neural vocoder that combines recent state-of-the-art approaches converts the spectrogram to the final wave. We carefully designed the data processing, training, and inference procedures for the challenge data. Our system identifier is G. Open source code and demo are available. △ Less

Submitted 26 October, 2023; originally announced October 2023.

Comments: Published at the Blizzard Challenge Workshop 2023, colocated with the Speech Synthesis Workshop 2023, a sattelite event of the Interspeech 2023
arXiv:2207.05549 [pdf, other]

eess.AS cs.CL cs.LG cs.SD

PoeticTTS -- Controllable Poetry Reading for Literary Studies

Authors: Julia Koch, Florian Lux, Nadja Schauffler, Toni Bernhart, Felix Dieterle, Jonas Kuhn, Sandra Richter, Gabriel Viehhauser, Ngoc Thang Vu

Abstract: Speech synthesis for poetry is challenging due to specific intonation patterns inherent to poetic speech. In this work, we propose an approach to synthesise poems with almost human like naturalness in order to enable literary scholars to systematically examine hypotheses on the interplay between text, spoken realisation, and the listener's perception of poems. To meet these special requirements fo… ▽ More Speech synthesis for poetry is challenging due to specific intonation patterns inherent to poetic speech. In this work, we propose an approach to synthesise poems with almost human like naturalness in order to enable literary scholars to systematically examine hypotheses on the interplay between text, spoken realisation, and the listener's perception of poems. To meet these special requirements for literary studies, we resynthesise poems by cloning prosodic values from a human reference recitation, and afterwards make use of fine-grained prosody control to manipulate the synthetic speech in a human-in-the-loop setting to alter the recitation w.r.t. specific phenomena. We find that finetuning our TTS model on poetry captures poetic intonation patterns to a large extent which is beneficial for prosody cloning and manipulation and verify the success of our approach both in an objective evaluation as well as in human studies. △ Less

Submitted 18 October, 2022; v1 submitted 11 July, 2022; originally announced July 2022.

Comments: Presented at Interspeech 2022

Search v0.5.6 released 2020-02-24