Showing 1–5 of 5 results for author: Maciejewski, M

  1. arXiv:2401.15676  [pdf, other]

    eess.AS cs.SD

    On Speaker Attribution with SURT

    Authors: Desh Raj, Matthew Wiesner, Matthew Maciejewski, Leibny Paola Garcia-Perera, Daniel Povey, Sanjeev Khudanpur

    Abstract: The Streaming Unmixing and Recognition Transducer (SURT) has recently become a popular framework for continuous, streaming, multi-talker speech recognition (ASR). With advances in architecture, objectives, and mixture simulation methods, it was demonstrated that SURT can be an efficient streaming method for speaker-agnostic transcription of real meetings. In this work, we push this framework furth…

    Submitted 28 January, 2024; originally announced January 2024.

    Comments: 8 pages, 6 figures, 6 tables. Submitted to Odyssey 2024

  2. arXiv:2306.13734  [pdf, other]

    eess.AS cs.CL cs.SD

    The CHiME-7 DASR Challenge: Distant Meeting Transcription with Multiple Devices in Diverse Scenarios

    Authors: Samuele Cornell, Matthew Wiesner, Shinji Watanabe, Desh Raj, Xuankai Chang, Paola Garcia, Matthew Maciejewski, Yoshiki Masuyama, Zhong-Qiu Wang, Stefano Squartini, Sanjeev Khudanpur

    Abstract: The CHiME challenges have played a significant role in the development and evaluation of robust automatic speech recognition (ASR) systems. We introduce the CHiME-7 distant ASR (DASR) task, within the 7th CHiME challenge. This task comprises joint ASR and diarization in far-field settings with multiple, and possibly heterogeneous, recording devices. Different from previous challenges, we evaluate…

    Submitted 14 July, 2023; v1 submitted 23 June, 2023; originally announced June 2023.

  3. arXiv:2010.12430  [pdf, other]

    eess.AS cs.SD

    Training Noisy Single-Channel Speech Separation With Noisy Oracle Sources: A Large Gap and A Small Step

    Authors: Matthew Maciejewski, Jing Shi, Shinji Watanabe, Sanjeev Khudanpur

    Abstract: As the performance of single-channel speech separation systems has improved, there has been a desire to move to more challenging conditions than the clean, near-field speech that initial systems were developed on. When training deep learning separation models, a need for ground truth leads to training on synthetic mixtures. As such, training in noisy conditions requires either using noise syntheti…

    Submitted 22 February, 2021; v1 submitted 23 October, 2020; originally announced October 2020.

    Comments: Accepted to ICASSP 2021

  4. arXiv:2006.07898  [pdf, other]

    eess.AS cs.SD

    The JHU Multi-Microphone Multi-Speaker ASR System for the CHiME-6 Challenge

    Authors: Ashish Arora, Desh Raj, Aswin Shanmugam Subramanian, Ke Li, Bar Ben-Yair, Matthew Maciejewski, Piotr Żelasko, Paola García, Shinji Watanabe, Sanjeev Khudanpur

    Abstract: This paper summarizes the JHU team's efforts in tracks 1 and 2 of the CHiME-6 challenge for distant multi-microphone conversational speech diarization and recognition in everyday home environments. We explore multi-array processing techniques at each stage of the pipeline, such as multi-array guided source separation (GSS) for enhancement and acoustic model training data, posterior fusion for spee…

    Submitted 14 June, 2020; originally announced June 2020.

    Comments: Presented at the CHiME-6 workshop (colocated with ICASSP 2020)

  5. arXiv:1910.10279  [pdf, ps, other]

    cs.SD eess.AS

    WHAMR!: Noisy and Reverberant Single-Channel Speech Separation

    Authors: Matthew Maciejewski, Gordon Wichern, Emmett McQuinn, Jonathan Le Roux

    Abstract: While significant advances have been made with respect to the separation of overlapping speech signals, studies have been largely constrained to mixtures of clean, near anechoic speech, not representative of many real-world scenarios. Although the WHAM! dataset introduced noise to the ubiquitous wsj0-2mix dataset, it did not include reverberation, which is generally present in indoor recordings ou…

    Submitted 14 February, 2020; v1 submitted 22 October, 2019; originally announced October 2019.

    Comments: Accepted for publication at ICASSP 2020