
Showing 1–4 of 4 results for author: Harrison, P M C

  1. arXiv:2203.15379  [pdf, other]

    cs.SD cs.HC eess.AS

    VoiceMe: Personalized voice generation in TTS

    Authors: Pol van Rijn, Silvan Mertes, Dominik Schiller, Piotr Dura, Hubert Siuzdak, Peter M. C. Harrison, Elisabeth André, Nori Jacoby

    Abstract: Novel text-to-speech systems can generate entirely new voices that were not seen during training. However, it remains a difficult task to efficiently create personalized voices from a high-dimensional speaker space. In this work, we use speaker embeddings from a state-of-the-art speaker verification model (SpeakerNet) trained on thousands of speakers to condition a TTS model. We employ a human sam…

    Submitted 11 July, 2022; v1 submitted 29 March, 2022; originally announced March 2022.

    Comments: Accepted to Interspeech'22. Audio and video samples are available at: https://polvanrijn.github.io/VoiceMe/

  2. Exploring emotional prototypes in a high dimensional TTS latent space

    Authors: Pol van Rijn, Silvan Mertes, Dominik Schiller, Peter M. C. Harrison, Pauline Larrouy-Maestri, Elisabeth André, Nori Jacoby

    Abstract: Recent TTS systems are able to generate prosodically varied and realistic speech. However, it is unclear how this prosodic variation contributes to the perception of speakers' emotional states. Here we use the recent psychological paradigm 'Gibbs Sampling with People' to search the prosodic latent space in a trained GST Tacotron model to explore prototypes of emotional prosody. Participants are re…

    Submitted 5 May, 2021; originally announced May 2021.

    Comments: Submitted to INTERSPEECH'21

  3. arXiv:2008.02595  [pdf, other]

    q-bio.NC cs.AI cs.CV stat.AP

    Gibbs Sampling with People

    Authors: Peter M. C. Harrison, Raja Marjieh, Federico Adolfi, Pol van Rijn, Manuel Anglada-Tort, Ofer Tchernichovski, Pauline Larrouy-Maestri, Nori Jacoby

    Abstract: A core problem in cognitive science and machine learning is to understand how humans derive semantic representations from perceptual objects, such as color from an apple, pleasantness from a musical chord, or seriousness from a face. Markov Chain Monte Carlo with People (MCMCP) is a prominent method for studying such representations, in which participants are presented with binary choice trials co…

    Submitted 2 November, 2020; v1 submitted 6 August, 2020; originally announced August 2020.

    Comments: Accepted for oral presentation at NeurIPS 2020

  4. arXiv:1807.00790  [pdf, other]

    cs.SD eess.AS

    An energy-based generative sequence model for testing sensory theories of Western harmony

    Authors: Peter M. C. Harrison, Marcus T. Pearce

    Abstract: The relationship between sensory consonance and Western harmony is an important topic in music theory and psychology. We introduce new methods for analysing this relationship, and apply them to large corpora representing three prominent genres of Western music: classical, popular, and jazz music. These methods centre on a generative sequence model with an exponential-family energy-based form that…

    Submitted 2 July, 2018; originally announced July 2018.

    Comments: 8 pages, 2 figures. To appear in Proceedings of the 19th International Society for Music Information Retrieval Conference (ISMIR), Paris, France, 2018