Showing 1–2 of 2 results for author: Fuhs, M C

  1. arXiv:2406.17124

    cs.SD cs.LG eess.AS

    Investigating Confidence Estimation Measures for Speaker Diarization

    Authors: Anurag Chowdhury, Abhinav Misra, Mark C. Fuhs, Monika Woszczyna

    Abstract: Speaker diarization systems segment a conversation recording based on the speakers' identity. Such systems can misclassify the speaker of a portion of audio due to a variety of factors, such as speech pattern variation, background noise, and overlapping speech. These errors propagate to, and can adversely affect, downstream systems that rely on the speaker's identity, such as speaker-adapted speec…

    Submitted 24 June, 2024; originally announced June 2024.

    Comments: Accepted in INTERSPEECH 2024

  2. arXiv:2406.08914

    cs.SD cs.LG eess.AS

    Transcription-Free Fine-Tuning of Speech Separation Models for Noisy and Reverberant Multi-Speaker Automatic Speech Recognition

    Authors: William Ravenscroft, George Close, Stefan Goetze, Thomas Hain, Mohammad Soleymanpour, Anurag Chowdhury, Mark C. Fuhs

    Abstract: One solution to automatic speech recognition (ASR) of overlapping speakers is to separate speech and then perform ASR on the separated signals. Commonly, the separator produces artefacts which often degrade ASR performance. Addressing this issue typically requires reference transcriptions to jointly train the separation and ASR networks. This is often not viable for training on real-world in-domai…

    Submitted 13 June, 2024; originally announced June 2024.

    Comments: 5 pages, 3 figures, 3 tables, accepted for Interspeech 2024