Showing 1–11 of 11 results for author: van Rijn, P

  1. arXiv:2406.04278

    cs.CL cs.HC

    Characterizing Similarities and Divergences in Conversational Tones in Humans and LLMs by Sampling with People

    Authors: Dun-Ming Huang, Pol Van Rijn, Ilia Sucholutsky, Raja Marjieh, Nori Jacoby

    Abstract: Conversational tones -- the manners and attitudes in which speakers communicate -- are essential to effective communication. Amidst the increasing popularization of Large Language Models (LLMs) over recent years, it becomes necessary to characterize the divergences in their conversational tones relative to humans. However, existing investigations of conversational modalities rely on pre-existing t…

    Submitted 6 June, 2024; originally announced June 2024.

    Comments: Accepted to Main Conference at ACL 2024

  2. arXiv:2402.06992

    q-bio.NC cs.AI cs.CL stat.AP

    A Rational Analysis of the Speech-to-Song Illusion

    Authors: Raja Marjieh, Pol van Rijn, Ilia Sucholutsky, Harin Lee, Thomas L. Griffiths, Nori Jacoby

    Abstract: The speech-to-song illusion is a robust psychological phenomenon whereby a spoken sentence sounds increasingly more musical as it is repeated. Despite decades of research, a complete formal account of this transformation is still lacking, and some of its nuanced characteristics, namely, that certain phrases appear to transform while others do not, is not well understood. Here we provide a formal a…

    Submitted 10 February, 2024; originally announced February 2024.

    Comments: 7 pages, 5 figures

  3. Giving Robots a Voice: Human-in-the-Loop Voice Creation and open-ended Labeling

    Authors: Pol van Rijn, Silvan Mertes, Kathrin Janowski, Katharina Weitz, Nori Jacoby, Elisabeth André

    Abstract: Speech is a natural interface for humans to interact with robots. Yet, aligning a robot's voice to its appearance is challenging due to the rich vocabulary of both modalities. Previous research has explored a few labels to describe robots and tested them on a limited number of robots and existing voices. Here, we develop a robot-voice creation tool followed by large-scale behavioral human experime…

    Submitted 7 February, 2024; originally announced February 2024.

    Comments: Accepted to CHI 2024, May 11 to 16, 2024, Honolulu, HI, USA

  4. arXiv:2302.01614

    cs.CL

    Around the world in 60 words: A generative vocabulary test for online research

    Authors: Pol van Rijn, Yue Sun, Harin Lee, Raja Marjieh, Ilia Sucholutsky, Francesca Lanzarini, Elisabeth André, Nori Jacoby

    Abstract: Conducting experiments with diverse participants in their native languages can uncover insights into culture, cognition, and language that may not be revealed otherwise. However, conducting these experiments online makes it difficult to validate self-reported language proficiency. Furthermore, existing proficiency tests are small and cover only a few languages. We present an automated pipeline to…

    Submitted 3 February, 2023; originally announced February 2023.

  5. arXiv:2302.01308

    cs.CL cs.LG stat.ML

    Large language models predict human sensory judgments across six modalities

    Authors: Raja Marjieh, Ilia Sucholutsky, Pol van Rijn, Nori Jacoby, Thomas L. Griffiths

    Abstract: Determining the extent to which the perceptual world can be recovered from language is a longstanding problem in philosophy and cognitive science. We show that state-of-the-art large language models can unlock new insights into this problem by providing a lower bound on the amount of perceptual information that can be extracted from language. Specifically, we elicit pairwise similarity judgments f…

    Submitted 15 June, 2023; v1 submitted 2 February, 2023; originally announced February 2023.

    Comments: 9 pages, 3 figures

  6. arXiv:2206.04105

    cs.CL cs.LG stat.ML

    Words are all you need? Language as an approximation for human similarity judgments

    Authors: Raja Marjieh, Pol van Rijn, Ilia Sucholutsky, Theodore R. Sumers, Harin Lee, Thomas L. Griffiths, Nori Jacoby

    Abstract: Human similarity judgments are a powerful supervision signal for machine learning applications based on techniques such as contrastive learning, information retrieval, and model alignment, but classical methods for collecting human similarity judgments are too expensive to be used at scale. Recent methods propose using pre-trained deep neural networks (DNNs) to approximate human similarity, but pr…

    Submitted 23 February, 2023; v1 submitted 8 June, 2022; originally announced June 2022.

    Comments: Accepted to ICLR 2023, final revision. https://openreview.net/forum?id=O-G91-4cMdv

  7. arXiv:2205.04820

    cs.CL

    Bridging the prosody GAP: Genetic Algorithm with People to efficiently sample emotional prosody

    Authors: Pol van Rijn, Harin Lee, Nori Jacoby

    Abstract: The human voice effectively communicates a range of emotions with nuanced variations in acoustics. Existing emotional speech corpora are limited in that they are either (a) highly curated to induce specific emotions with predefined categories that may not capture the full extent of emotional experiences, or (b) entangled in their semantic and prosodic cues, limiting the ability to study these cues…

    Submitted 10 May, 2022; originally announced May 2022.

    Comments: Accepted to CogSci'22

  8. arXiv:2203.16930

    cs.SD cs.CL cs.LG eess.AS

    WavThruVec: Latent speech representation as intermediate features for neural speech synthesis

    Authors: Hubert Siuzdak, Piotr Dura, Pol van Rijn, Nori Jacoby

    Abstract: Recent advances in neural text-to-speech research have been dominated by two-stage pipelines utilizing low-level intermediate speech representation such as mel-spectrograms. However, such predetermined features are fundamentally limited, because they do not allow to exploit the full potential of a data-driven approach through learning hidden representations. For this reason, several end-to-end met…

    Submitted 11 July, 2022; v1 submitted 31 March, 2022; originally announced March 2022.

    Comments: Accepted to INTERSPEECH 2022. Audio samples are available at: https://charactr-platform.github.io/WavThruVec/

  9. arXiv:2203.15379

    cs.SD cs.HC eess.AS

    VoiceMe: Personalized voice generation in TTS

    Authors: Pol van Rijn, Silvan Mertes, Dominik Schiller, Piotr Dura, Hubert Siuzdak, Peter M. C. Harrison, Elisabeth André, Nori Jacoby

    Abstract: Novel text-to-speech systems can generate entirely new voices that were not seen during training. However, it remains a difficult task to efficiently create personalized voices from a high-dimensional speaker space. In this work, we use speaker embeddings from a state-of-the-art speaker verification model (SpeakerNet) trained on thousands of speakers to condition a TTS model. We employ a human sam…

    Submitted 11 July, 2022; v1 submitted 29 March, 2022; originally announced March 2022.

    Comments: Accepted to Interspeech'22. Audio and video samples are available at: https://polvanrijn.github.io/VoiceMe/

  10. Exploring emotional prototypes in a high dimensional TTS latent space

    Authors: Pol van Rijn, Silvan Mertes, Dominik Schiller, Peter M. C. Harrison, Pauline Larrouy-Maestri, Elisabeth André, Nori Jacoby

    Abstract: Recent TTS systems are able to generate prosodically varied and realistic speech. However, it is unclear how this prosodic variation contributes to the perception of speakers' emotional states. Here we use the recent psychological paradigm 'Gibbs Sampling with People' to search the prosodic latent space in a trained GST Tacotron model to explore prototypes of emotional prosody. Participants are re…

    Submitted 5 May, 2021; originally announced May 2021.

    Comments: Submitted to INTERSPEECH'21

  11. arXiv:2008.02595

    q-bio.NC cs.AI cs.CV stat.AP

    Gibbs Sampling with People

    Authors: Peter M. C. Harrison, Raja Marjieh, Federico Adolfi, Pol van Rijn, Manuel Anglada-Tort, Ofer Tchernichovski, Pauline Larrouy-Maestri, Nori Jacoby

    Abstract: A core problem in cognitive science and machine learning is to understand how humans derive semantic representations from perceptual objects, such as color from an apple, pleasantness from a musical chord, or seriousness from a face. Markov Chain Monte Carlo with People (MCMCP) is a prominent method for studying such representations, in which participants are presented with binary choice trials co…

    Submitted 2 November, 2020; v1 submitted 6 August, 2020; originally announced August 2020.

    Comments: Accepted for oral presentation at NeurIPS 2020