Showing 1–6 of 6 results for author: Siriwardena, Y M

  1. arXiv:2309.15136  [pdf, other]

    eess.SP cs.MM cs.SD eess.AS eess.IV

    A multi-modal approach for identifying schizophrenia using cross-modal attention

    Authors: Gowtham Premananth, Yashish M. Siriwardena, Philip Resnik, Carol Espy-Wilson

    Abstract: This study focuses on how different modalities of human communication can be used to distinguish between healthy controls and subjects with schizophrenia who exhibit strong positive symptoms. We developed a multi-modal schizophrenia classification system using audio, video, and text. Facial action units and vocal tract variables were extracted as low-level features from video and audio respectivel…

    Submitted 18 April, 2024; v1 submitted 26 September, 2023; originally announced September 2023.

    Comments: Accepted to Annual International Conference of the IEEE Engineering in Medicine and Biology Society 2024

  2. arXiv:2309.09220  [pdf, other]

    eess.AS cs.AI cs.SD

    Improving Speech Inversion Through Self-Supervised Embeddings and Enhanced Tract Variables

    Authors: Ahmed Adel Attia, Yashish M. Siriwardena, Carol Espy-Wilson

    Abstract: The performance of deep learning models depends significantly on their capacity to encode input features efficiently and decode them into meaningful outputs. Better input and output representation has the potential to boost models' performance and generalization. In the context of acoustic-to-articulatory speech inversion (SI) systems, we study the impact of utilizing speech representations acquir…

    Submitted 17 September, 2023; originally announced September 2023.

  3. arXiv:2210.16454  [pdf, ps, other]

    eess.AS cs.SD

    Learning to Compute the Articulatory Representations of Speech with the MIRRORNET

    Authors: Yashish M. Siriwardena, Carol Espy-Wilson, Shihab Shamma

    Abstract: Most organisms including humans function by coordinating and integrating sensory signals with motor actions to survive and accomplish desired tasks. Learning these complex sensorimotor mappings proceeds simultaneously and often in an unsupervised or semi-supervised fashion. An autoencoder architecture (MirrorNet) inspired by this sensorimotor learning paradigm is explored in this work to control a…

    Submitted 25 May, 2023; v1 submitted 28 October, 2022; originally announced October 2022.

    Comments: Interspeech 2023

    Journal ref: Interspeech 2023

  4. arXiv:2210.16450  [pdf, ps, other]

    eess.AS cs.SD

    The Secret Source : Incorporating Source Features to Improve Acoustic-to-Articulatory Speech Inversion

    Authors: Yashish M. Siriwardena, Carol Espy-Wilson

    Abstract: In this work, we incorporated acoustically derived source features, aperiodicity, periodicity and pitch as additional targets to an acoustic-to-articulatory speech inversion (SI) system. We also propose a Temporal Convolution based SI system, which uses auditory spectrograms as the input speech representation, to learn long-range dependencies and complex interactions between the source and vocal t…

    Submitted 28 October, 2022; originally announced October 2022.

  5. The Mirrornet : Learning Audio Synthesizer Controls Inspired by Sensorimotor Interaction

    Authors: Yashish M. Siriwardena, Guilhem Marion, Shihab Shamma

    Abstract: Experiments to understand the sensorimotor neural interactions in the human cortical speech system support the existence of a bidirectional flow of interactions between the auditory and motor regions. Their key function is to enable the brain to 'learn' how to control the vocal tract for speech production. This idea is the impetus for the recently proposed "MirrorNet", a constrained autoencoder ar…

    Submitted 18 February, 2022; v1 submitted 11 October, 2021; originally announced October 2021.

    Journal ref: ICASSP 2022 - 2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)

  6. arXiv:2110.04440  [pdf, other]

    eess.AS cs.MM cs.SD

    Multimodal Approach for Assessing Neuromotor Coordination in Schizophrenia Using Convolutional Neural Networks

    Authors: Yashish M. Siriwardena, Chris Kitchen, Deanna L. Kelly, Carol Espy-Wilson

    Abstract: This study investigates the speech articulatory coordination in schizophrenia subjects exhibiting strong positive symptoms (e.g. hallucinations and delusions), using two distinct channel-delay correlation methods. We show that the schizophrenic subjects with strong positive symptoms and who are markedly ill pose complex articulatory coordination pattern in facial and speech gestures than what is o…

    Submitted 8 October, 2021; originally announced October 2021.

    Comments: 5 pages. arXiv admin note: text overlap with arXiv:2102.07054

    Journal ref: Proceedings of the 2021 International Conference on Multimodal Interaction