Inappropriate Pause Detection In Dysarthric Speech Using Large-Scale Speech Recognition

Lee, Jeehyun; Choi, Yerin; Song, Tae-**; Koo, Myoung-Wan

arXiv:2402.18923 (cs)

[Submitted on 29 Feb 2024]

Abstract: Dysarthria, a common condition among stroke patients, severely impairs speech intelligibility. Inappropriate pauses are crucial indicators in severity assessment and speech-language therapy. We propose extending a large-scale speech recognition model to detect inappropriate pauses in dysarthric speech. To this end, we propose a task design, a labeling strategy, and a speech recognition model with an inappropriate pause prediction layer. First, we treat pause detection as speech recognition, using an automatic speech recognition (ASR) model to convert speech into text with pause tags. Following this task design, we label pause locations at the text level together with their appropriateness. We collaborate with speech-language pathologists to establish labeling criteria, ensuring high-quality annotated data. Finally, we extend the ASR model with an inappropriate pause prediction layer for end-to-end inappropriate pause detection. Moreover, we propose a task-tailored metric for evaluating inappropriate pause detection independently of ASR performance. Our experiments show that the proposed method detects inappropriate pauses in dysarthric speech better than baselines, with an Inappropriate Pause Error Rate of 14.47%.
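
To make the idea of text-level pause labels concrete, here is a minimal sketch of one possible encoding. The abstract does not give the tag vocabulary or data format, so the tag strings `<p>` and `<ip>`, the `Pause` record, and the function `parse_tagged_transcript` are all hypothetical names chosen for illustration. They are not the authors' implementation. The sketch only shows how a transcript with inline pause tags could be split into plain words plus a list of pause positions and their appropriateness.

```python
from typing import List, NamedTuple, Tuple

# Hypothetical tag strings: the abstract does not specify the actual tags.
APPROPRIATE_TAG = "<p>"
INAPPROPRIATE_TAG = "<ip>"


class Pause(NamedTuple):
    """A pause located at the text level (hypothetical record type)."""
    after_word: int       # number of words that precede the pause
    inappropriate: bool   # True if the pause is labeled inappropriate


def parse_tagged_transcript(text: str) -> Tuple[List[str], List[Pause]]:
    """Split a whitespace-tokenized, pause-tagged transcript into words and pauses."""
    words: List[str] = []
    pauses: List[Pause] = []
    for token in text.split():
        if token == APPROPRIATE_TAG:
            pauses.append(Pause(len(words), False))
        elif token == INAPPROPRIATE_TAG:
            pauses.append(Pause(len(words), True))
        else:
            words.append(token)
    return words, pauses


# Usage with a made-up sentence (not data from the paper).
words, pauses = parse_tagged_transcript("the weather <ip> is nice <p> today")
inappropriate_positions = [p.after_word for p in pauses if p.inappropriate]
```

Keeping pause positions as word offsets separates the pause labels from the recognized words, which is one way a pause metric could be computed without depending on word-level ASR errors. How the paper's own metric achieves this is not described in the abstract.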

Comments:	Accepted to ICASSP 2024
Subjects:	Computation and Language (cs.CL); Sound (cs.SD); Audio and Speech Processing (eess.AS)
Cite as:	arXiv:2402.18923 [cs.CL]
	(or arXiv:2402.18923v1 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.2402.18923

Submission history

From: Yerin Choi
[v1] Thu, 29 Feb 2024 07:29:42 UTC (672 KB)
