Showing 1–5 of 5 results for author: Malisz, Z

Searching in archive eess.
  1. arXiv:2404.10112  [pdf, other]

    cs.CL cs.SD eess.AS

    PRODIS -- a speech database and a phoneme-based language model for the study of predictability effects in Polish

    Authors: Zofia Malisz, Jan Foremski, Małgorzata Kul

    Abstract: We present a speech database and a phoneme-level language model of Polish. The database and model are designed for the analysis of prosodic and discourse factors and their impact on acoustic parameters in interaction with predictability effects. The database is also the first large, publicly available Polish speech corpus of excellent acoustic quality that can be used for phonetic analysis and tra…

    Submitted 15 April, 2024; originally announced April 2024.

    Comments: To appear in the proceedings of LREC2024: Language Resources and Evaluation Conference 2024, Turin, Italy

  2. arXiv:2307.03168  [pdf, other]

    eess.AS

    Recovering implicit pitch contours from formants in whispered speech

    Authors: Pablo Pérez Zarazaga, Zofia Malisz

    Abstract: Whispered speech is characterised by a noise-like excitation that results in the lack of fundamental frequency. Considering that prosodic phenomena such as intonation are perceived through f0 variation, the perception of whispered prosody is relatively difficult. At the same time, studies have shown that speakers do attempt to produce intonation when whispering and that prosodic variability is bei…

    Submitted 6 July, 2023; originally announced July 2023.

    Comments: 5 pages, 3 figures, 2 tables, Accepted at ICPhS 2023

  3. arXiv:2306.01957  [pdf, other]

    eess.AS

    Speaker-independent neural formant synthesis

    Authors: Pablo Pérez Zarazaga, Zofia Malisz, Gustav Eje Henter, Lauri Juvela

    Abstract: We describe speaker-independent speech synthesis driven by a small set of phonetically meaningful speech parameters such as formant frequencies. The intention is to leverage deep-learning advances to provide a highly realistic signal generator that includes control affordances required for stimulus creation in the speech sciences. Our approach turns input speech parameters into predicted mel-spect…

    Submitted 2 June, 2023; originally announced June 2023.

    Comments: 5 pages, 4 figures. Article accepted at INTERSPEECH 2023

  4. arXiv:2303.07442  [pdf, other]

    eess.AS cs.SD

    A processing framework to access large quantities of whispered speech found in ASMR

    Authors: Pablo Perez Zarazaga, Gustav Eje Henter, Zofia Malisz

    Abstract: Whispering is a ubiquitous mode of communication that humans use daily. Despite this, whispered speech has been poorly served by existing speech technology due to a shortage of resources and processing methodology. To remedy this, this paper provides a processing framework that enables access to large and unique data of high-quality whispered speech. We obtain the data from recordings submitted to…

    Submitted 13 March, 2023; originally announced March 2023.

    Comments: Accepted at ICASSP 2023, 5 pages, 2 figures, 2 tables

  5. arXiv:2202.10973  [pdf, other]

    eess.AS cs.HC cs.LG cs.SD

    Wavebender GAN: An architecture for phonetically meaningful speech manipulation

    Authors: Gustavo Teodoro Döhler Beck, Ulme Wennberg, Zofia Malisz, Gustav Eje Henter

    Abstract: Deep learning has revolutionised synthetic speech quality. However, it has thus far delivered little value to the speech science community. The new methods do not meet the controllability demands that practitioners in this area require e.g.: in listening tests with manipulated speech stimuli. Instead, control of different speech properties in such stimuli is achieved by using legacy signal-process…

    Submitted 22 February, 2022; originally announced February 2022.

    Comments: 5 pages, 4 figures; to appear at ICASSP 2022

    MSC Class: 68T07

    ACM Class: I.2.7; I.2.6; J.5; H.5.5