Showing 1–10 of 10 results for author: Kadiri, S

Searching in archive eess.
  1. arXiv:2406.08644

    eess.SP cs.AI cs.SD eess.AS

    Toward Fully-End-to-End Listened Speech Decoding from EEG Signals

    Authors: Jihwan Lee, Aditya Kommineni, Tiantian Feng, Kleanthis Avramidis, Xuan Shi, Sudarsana Kadiri, Shrikanth Narayanan

    Abstract: Speech decoding from EEG signals is a challenging task, where brain activity is modeled to estimate salient characteristics of acoustic stimuli. We propose FESDE, a novel framework for Fully-End-to-end Speech Decoding from EEG signals. Our approach aims to directly reconstruct listened speech waveforms given EEG signals, where no intermediate acoustic feature processing step is required. The propo…

    Submitted 12 June, 2024; originally announced June 2024.

    Comments: Accepted to Interspeech 2024

  2. arXiv:2309.14107

    eess.AS cs.CL cs.LG cs.SD eess.SP

    Wav2vec-based Detection and Severity Level Classification of Dysarthria from Speech

    Authors: Farhad Javanmardi, Saska Tirronen, Manila Kodali, Sudarsana Reddy Kadiri, Paavo Alku

    Abstract: Automatic detection and severity level classification of dysarthria directly from acoustic speech signals can be used as a tool in medical diagnosis. In this work, the pre-trained wav2vec 2.0 model is studied as a feature extractor to build detection and severity level classification systems for dysarthric speech. The experiments were carried out with the popularly used UA-speech database. In the…

    Submitted 17 October, 2023; v1 submitted 25 September, 2023; originally announced September 2023.

    Comments: Copyright 2023 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works

    Journal ref: in Proc. ICASSP, Rhodes Island, Greece, June 4-10, 2023

  3. arXiv:2309.14080

    eess.AS cs.CL cs.LG cs.SD eess.SP

    Analysis and Detection of Pathological Voice using Glottal Source Features

    Authors: Sudarsana Reddy Kadiri, Paavo Alku

    Abstract: Automatic detection of voice pathology enables objective assessment and earlier intervention for the diagnosis. This study provides a systematic analysis of glottal source features and investigates their effectiveness in voice pathology detection. Glottal source features are extracted using glottal flows estimated with the quasi-closed phase (QCP) glottal inverse filtering method, using approximat…

    Submitted 17 October, 2023; v1 submitted 25 September, 2023; originally announced September 2023.

    Comments: Copyright 2020 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works

    Journal ref: IEEE Journal of Selected Topics in Signal Processing, Vol. 14, No. 2, pp. 367-379, February 2020

  4. arXiv:2308.16540

    eess.AS cs.CL cs.SD eess.SP

    Time-Varying Quasi-Closed-Phase Analysis for Accurate Formant Tracking in Speech Signals

    Authors: Dhananjaya Gowda, Sudarsana Reddy Kadiri, Brad Story, Paavo Alku

    Abstract: In this paper, we propose a new method for the accurate estimation and tracking of formants in speech signals using time-varying quasi-closed-phase (TVQCP) analysis. Conventional formant tracking methods typically adopt a two-stage estimate-and-track strategy wherein an initial set of formant candidates are estimated using short-time analysis (e.g., 10–50 ms), followed by a tracking stage based o…

    Submitted 31 August, 2023; originally announced August 2023.

    Journal ref: IEEE/ACM Transactions on Audio, Speech, and Language Processing, Vol. 28, pp. 1901-1914, 2020

  5. arXiv:2308.09051

    eess.AS cs.AI cs.LG cs.SD eess.SP

    Refining a Deep Learning-based Formant Tracker using Linear Prediction Methods

    Authors: Paavo Alku, Sudarsana Reddy Kadiri, Dhananjaya Gowda

    Abstract: In this study, formant tracking is investigated by refining the formants tracked by an existing data-driven tracker, DeepFormants, using the formants estimated in a model-driven manner by linear prediction (LP)-based methods. As LP-based formant estimation methods, conventional covariance analysis (LP-COV) and the recently proposed quasi-closed phase forward-backward (QCP-FB) analysis are used. In…

    Submitted 17 August, 2023; originally announced August 2023.

    Comments: Computer Speech and Language, Vol. 81, Article 101515, June 2023

  6. arXiv:2308.09042

    eess.AS cs.AI cs.SD eess.SP

    Severity Classification of Parkinson's Disease from Speech using Single Frequency Filtering-based Features

    Authors: Sudarsana Reddy Kadiri, Manila Kodali, Paavo Alku

    Abstract: Developing objective methods for assessing the severity of Parkinson's disease (PD) is crucial for improving the diagnosis and treatment. This study proposes two sets of novel features derived from the single frequency filtering (SFF) method: (1) SFF cepstral coefficients (SFFCC) and (2) MFCCs from the SFF (MFCC-SFF) for the severity classification of PD. Prior studies have demonstrated that SFF o…

    Submitted 17 August, 2023; originally announced August 2023.

    Comments: Accepted by INTERSPEECH 2023

  7. arXiv:2308.03226

    eess.AS cs.AI cs.CL cs.MM cs.SD

    Investigation of Self-supervised Pre-trained Models for Classification of Voice Quality from Speech and Neck Surface Accelerometer Signals

    Authors: Sudarsana Reddy Kadiri, Farhad Javanmardi, Paavo Alku

    Abstract: Prior studies in the automatic classification of voice quality have mainly studied the use of the acoustic speech signal as input. Recently, a few studies have been carried out by jointly using both speech and neck surface accelerometer (NSA) signals as inputs, and by extracting MFCCs and glottal source features. This study examines simultaneously-recorded speech and NSA signals in the classificat…

    Submitted 6 August, 2023; originally announced August 2023.

    Comments: Accepted by Computer Speech & Language

  8. arXiv:2210.15978

    eess.AS cs.SD

    End-to-end Ensemble-based Feature Selection for Paralinguistics Tasks

    Authors: Tamás Grósz, Mittul Singh, Sudarsana Reddy Kadiri, Hemant Kathania, Mikko Kurimo

    Abstract: The events of recent years have highlighted the importance of telemedicine solutions which could potentially allow remote treatment and diagnosis. Relatedly, Computational Paralinguistics, a unique subfield of Speech Processing, aims to extract information about the speaker and form an important part of telemedicine applications. In this work, we focus on two paralinguistic problems: mask detectio…

    Submitted 28 October, 2022; originally announced October 2022.

  9. arXiv:2201.01525

    eess.AS cs.LG cs.SD

    Formant Tracking Using Quasi-Closed Phase Forward-Backward Linear Prediction Analysis and Deep Neural Networks

    Authors: Dhananjaya Gowda, Bajibabu Bollepalli, Sudarsana Reddy Kadiri, Paavo Alku

    Abstract: Formant tracking is investigated in this study by using trackers based on dynamic programming (DP) and deep neural nets (DNNs). Using the DP approach, six formant estimation methods were first compared. The six methods include linear prediction (LP) algorithms, weighted LP algorithms and the recently developed quasi-closed phase forward-backward (QCP-FB) method. QCP-FB gave the best performance in…

    Submitted 5 January, 2022; originally announced January 2022.

    Journal ref: Published in IEEE ACCESS. Vol. 9, 2021, pp. 151631-151640

  10. arXiv:2008.02689

    eess.AS cs.LG cs.SD

    Aalto's End-to-End DNN systems for the INTERSPEECH 2020 Computational Paralinguistics Challenge

    Authors: Tamás Grósz, Mittul Singh, Sudarsana Reddy Kadiri, Hemant Kathania, Mikko Kurimo

    Abstract: End-to-end neural network models (E2E) have shown significant performance benefits on different INTERSPEECH ComParE tasks. Prior work has applied either a single instance of an E2E model for a task or the same E2E architecture for different tasks. However, applying a single model is unstable or using the same architecture under-utilizes task-specific information. On ComParE 2020 tasks, we investig…

    Submitted 6 August, 2020; originally announced August 2020.