Skip to main content

Showing 1–6 of 6 results for author: Lord, C

.
  1. arXiv:2307.16398  [pdf, other

    eess.AS

    Robust Self Supervised Speech Embeddings for Child-Adult Classification in Interactions involving Children with Autism

    Authors: Rimita Lahiri, Tiantian Feng, Rajat Hebbar, Catherine Lord, So Hyun Kim, Shrikanth Narayanan

    Abstract: We address the problem of detecting who spoke when in child-inclusive spoken interactions i.e., automatic child-adult speaker classification. Interactions involving children are richly heterogeneous due to developmental differences. The presence of neurodiversity e.g., due to Autism, contributes additional variability. We investigate the impact of additional pre-training with more unlabelled child… ▽ More

    Submitted 31 July, 2023; originally announced July 2023.

  2. arXiv:2211.03279  [pdf, other

    eess.AS cs.SD

    A Context-Aware Computational Approach for Measuring Vocal Entrainment in Dyadic Conversations

    Authors: Rimita Lahiri, Md Nasir, Catherine Lord, So Hyun Kim, Shrikanth Narayanan

    Abstract: Vocal entrainment is a social adaptation mechanism in human interaction, knowledge of which can offer useful insights to an individual's cognitive-behavioral characteristics. We propose a context-aware approach for measuring vocal entrainment in dyadic conversations. We use conformers(a combination of convolutional network and transformer) for capturing both short-term and long-term conversational… ▽ More

    Submitted 6 November, 2022; originally announced November 2022.

  3. arXiv:2109.01568  [pdf, other

    eess.AS cs.SD

    Phone Duration Modeling for Speaker Age Estimation in Children

    Authors: Prashanth Gurunath Shivakumar, Somer Bishop, Catherine Lord, Shrikanth Narayanan

    Abstract: Automatic inference of important paralinguistic information such as age from speech is an important area of research with numerous spoken language technology based applications. Speaker age estimation has applications in enabling personalization and age-appropriate curation of information and content. However, research in speaker age estimation in children is especially challenging due to paucity… ▽ More

    Submitted 3 September, 2021; originally announced September 2021.

  4. arXiv:2007.09635  [pdf, other

    eess.AS cs.SD

    Meta-learning with Latent Space Clustering in Generative Adversarial Network for Speaker Diarization

    Authors: Monisankha Pal, Manoj Kumar, Raghuveer Peri, Tae ** Park, So Hyun Kim, Catherine Lord, Somer Bishop, Shrikanth Narayanan

    Abstract: The performance of most speaker diarization systems with x-vector embeddings is both vulnerable to noisy environments and lacks domain robustness. Earlier work on speaker diarization using generative adversarial network (GAN) with an encoder network (ClusterGAN) to project input x-vectors into a latent space has shown promising performance on meeting data. In this paper, we extend the ClusterGAN n… ▽ More

    Submitted 19 July, 2020; originally announced July 2020.

    Comments: Submitted to IEEE/ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING

  5. arXiv:1910.11400  [pdf, other

    eess.AS cs.SD

    Meta-learning for robust child-adult classification from speech

    Authors: Nithin Rao Koluguri, Manoj Kumar, So Hyun Kim, Catherine Lord, Shrikanth Narayanan

    Abstract: Computational modeling of naturalistic conversations in clinical applications has seen growing interest in the past decade. An important use-case involves child-adult interactions within the autism diagnosis and intervention domain. In this paper, we address a specific sub-problem of speaker diarization, namely child-adult speaker classification in such dyadic conversations with specified roles. T… ▽ More

    Submitted 28 October, 2019; v1 submitted 24 October, 2019; originally announced October 2019.

  6. arXiv:1910.11398  [pdf, ps, other

    eess.AS cs.SD

    Speaker diarization using latent space clustering in generative adversarial network

    Authors: Monisankha Pal, Manoj Kumar, Raghuveer Peri, Tae ** Park, So Hyun Kim, Catherine Lord, Somer Bishop, Shrikanth Narayanan

    Abstract: In this work, we propose deep latent space clustering for speaker diarization using generative adversarial network (GAN) backprojection with the help of an encoder network. The proposed diarization system is trained jointly with GAN loss, latent variable recovery loss, and a clustering-specific loss. It uses x-vector speaker embeddings at the input, while the latent variables are sampled from a co… ▽ More

    Submitted 24 October, 2019; originally announced October 2019.

    Comments: Submitted to ICASSP 2020