
Showing 1–9 of 9 results for author: Orozco-Arroyave, J R

Searching in archive eess.
  1. arXiv:2406.16128  [pdf, other]

    eess.AS

    Exploiting Foundation Models and Speech Enhancement for Parkinson's Disease Detection from Speech in Real-World Operative Conditions

    Authors: Moreno La Quatra, Maria Francesca Turco, Torbjørn Svendsen, Giampiero Salvi, Juan Rafael Orozco-Arroyave, Sabato Marco Siniscalchi

    Abstract: This work is concerned with devising a robust Parkinson's (PD) disease detector from speech in real-world operating conditions using (i) foundational models, and (ii) speech enhancement (SE) methods. To this end, we first fine-tune several foundational-based models on the standard PC-GITA (s-PC-GITA) clean data. Our results demonstrate superior performance to previously proposed models. Second, we…

    Submitted 23 June, 2024; originally announced June 2024.

    Comments: Accepted at INTERSPEECH 2024

  2. Federated learning for secure development of AI models for Parkinson's disease detection using speech from different languages

    Authors: Soroosh Tayebi Arasteh, Cristian David Rios-Urrego, Elmar Noeth, Andreas Maier, Seung Hee Yang, Jan Rusz, Juan Rafael Orozco-Arroyave

    Abstract: Parkinson's disease (PD) is a neurological disorder impacting a person's speech. Among automatic PD assessment methods, deep learning models have gained particular interest. Recently, the community has explored cross-pathology and cross-language models which can improve diagnostic accuracy even further. However, strict patient data privacy regulations largely prevent institutions from sharing pati…

    Submitted 21 August, 2023; v1 submitted 18 May, 2023; originally announced May 2023.

    Comments: INTERSPEECH 2023, pp. 5003–5007, Dublin, Ireland

    Journal ref: INTERSPEECH 2023

  3. arXiv:2209.08379  [pdf, other]

    eess.AS cs.SD q-bio.QM

    Representation Learning Strategies to Model Pathological Speech: Effect of Multiple Spectral Resolutions

    Authors: Gabriel Figueiredo Miller, Juan Camilo Vásquez-Correa, Juan Rafael Orozco-Arroyave, Elmar Nöth

    Abstract: This paper considers a representation learning strategy to model speech signals from patients with Parkinson's disease and cleft lip and palate. In particular, it compares different parametrized representation types such as wideband and narrowband spectrograms, and wavelet-based scalograms, with the goal of quantifying the representation capacity of each. Methods for quantification include the abi…

    Submitted 17 September, 2022; originally announced September 2022.

    Comments: 7 pages, 3 figures

  4. arXiv:2204.01670  [pdf, other]

    cs.CL cs.SD eess.AS

    Cross-lingual Self-Supervised Speech Representations for Improved Dysarthric Speech Recognition

    Authors: Abner Hernandez, Paula Andrea Pérez-Toro, Elmar Nöth, Juan Rafael Orozco-Arroyave, Andreas Maier, Seung Hee Yang

    Abstract: State-of-the-art automatic speech recognition (ASR) systems perform well on healthy speech. However, the performance on impaired speech still remains an issue. The current study explores the usefulness of using Wav2Vec self-supervised speech representations as features for training an ASR system for dysarthric speech. Dysarthric speech recognition is particularly difficult as several aspects of sp…

    Submitted 4 April, 2022; originally announced April 2022.

    Comments: Submitted for review at Interspeech 2022

  5. arXiv:2201.05912  [pdf, other]

    eess.AS cs.LG cs.SD

    Common Phone: A Multilingual Dataset for Robust Acoustic Modelling

    Authors: Philipp Klumpp, Tomás Arias-Vergara, Paula Andrea Pérez-Toro, Elmar Nöth, Juan Rafael Orozco-Arroyave

    Abstract: Current state of the art acoustic models can easily comprise more than 100 million parameters. This growing complexity demands larger training datasets to maintain a decent generalization of the final decision function. An ideal dataset is not necessarily large in size, but large with respect to the amount of unique speakers, utilized hardware and varying recording conditions. This enables a machi…

    Submitted 31 January, 2022; v1 submitted 15 January, 2022; originally announced January 2022.

    Comments: Pre-print submitted to LREC 2022. Link to Common Phone: https://zenodo.org/record/5846137

  6. arXiv:2112.11514  [pdf, ps, other]

    eess.AS cs.AI cs.LG

    The Phonetic Footprint of Parkinson's Disease

    Authors: Philipp Klumpp, Tomás Arias-Vergara, Juan Camilo Vásquez-Correa, Paula Andrea Pérez-Toro, Juan Rafael Orozco-Arroyave, Anton Batliner, Elmar Nöth

    Abstract: As one of the most prevalent neurodegenerative disorders, Parkinson's disease (PD) has a significant impact on the fine motor skills of patients. The complex interplay of different articulators during speech production and realization of required muscle tension become increasingly difficult, thus leading to a dysarthric speech. Characteristic patterns such as vowel instability, slurred pronunciati…

    Submitted 21 December, 2021; originally announced December 2021.

    Comments: https://www.sciencedirect.com/science/article/abs/pii/S0885230821001169

    Journal ref: Elsevier Computer Speech and Language, Volume 72, March 2022

  7. arXiv:2108.11981  [pdf, other]

    cs.SD cs.LG eess.AS

    Classification of Emotions and Evaluation of Customer Satisfaction from Speech in Real World Acoustic Environments

    Authors: Luis Felipe Parra-Gallego, Juan Rafael Orozco-Arroyave

    Abstract: This paper focuses on finding suitable features to robustly recognize emotions and evaluate customer satisfaction from speech in real acoustic scenarios. The classification of emotions is based on standard and well-known corpora and the evaluation of customer satisfaction is based on recordings of real opinions given by customers about the received service during phone calls with call-center agent…

    Submitted 26 August, 2021; originally announced August 2021.

  8. arXiv:2002.05412  [pdf, other]

    eess.AS cs.LG cs.SD stat.ML

    Comparison of user models based on GMM-UBM and i-vectors for speech, handwriting, and gait assessment of Parkinson's disease patients

    Authors: J. C. Vasquez-Correa, T. Bocklet, J. R. Orozco-Arroyave, E. Nöth

    Abstract: Parkinson's disease is a neurodegenerative disorder characterized by the presence of different motor impairments. Information from speech, handwriting, and gait signals have been considered to evaluate the neurological state of the patients. On the other hand, user models based on Gaussian mixture models - universal background models (GMM-UBM) and i-vectors are considered the state-of-the-art in b…

    Submitted 13 February, 2020; originally announced February 2020.

    Comments: Proceedings of ICASSP (2019)

  9. arXiv:2002.04374  [pdf, other]

    cs.LG cs.CL eess.AS stat.ML

    Convolutional Neural Networks and a Transfer Learning Strategy to Classify Parkinson's Disease from Speech in Three Different Languages

    Authors: J. C. Vásquez-Correa, T. Arias-Vergara, C. D. Rios-Urrego, M. Schuster, J. Rusz, J. R. Orozco-Arroyave, E. Nöth

    Abstract: Parkinson's disease patients develop different speech impairments that affect their communication capabilities. The automatic assessment of the speech of the patients allows the development of computer aided tools to support the diagnosis and the evaluation of the disease severity. This paper introduces a methodology to classify Parkinson's disease from speech in three different languages: Spanish…

    Submitted 11 February, 2020; originally announced February 2020.

    Journal ref: In Iberoamerican Congress on Pattern Recognition (pp. 697-706) 2019