-
The Third DIHARD Diarization Challenge
Authors:
Neville Ryant,
Prachi Singh,
Venkat Krishnamohan,
Rajat Varma,
Kenneth Church,
Christopher Cieri,
Jun Du,
Sriram Ganapathy,
Mark Liberman
Abstract:
DIHARD III was the third in a series of speaker diarization challenges intended to improve the robustness of diarization systems to variability in recording equipment, noise conditions, and conversational domain. Speaker diarization was evaluated under two speech activity conditions (diarization from a reference speech activity vs. diarization from scratch) and 11 diverse domains. The domains span…
▽ More
DIHARD III was the third in a series of speaker diarization challenges intended to improve the robustness of diarization systems to variability in recording equipment, noise conditions, and conversational domain. Speaker diarization was evaluated under two speech activity conditions (diarization from a reference speech activity vs. diarization from scratch) and 11 diverse domains. The domains span a range of recording conditions and interaction types, including read audio-books, meeting speech, clinical interviews, web videos, and, for the first time, conversational telephone speech. A total of 30 organizations (forming 21teams) from industry and academia submitted 499 valid system outputs. The evaluation results indicate that speaker diarization has improved markedly since DIHARD I, particularly for two-party interactions, but that for many domains (e.g., web video) the problem remains far from solved.
△ Less
Submitted 5 April, 2021; v1 submitted 2 December, 2020;
originally announced December 2020.
-
Third DIHARD Challenge Evaluation Plan
Authors:
Neville Ryant,
Kenneth Church,
Christopher Cieri,
Jun Du,
Sriram Ganapathy,
Mark Liberman
Abstract:
This paper introduces the third DIHARD challenge, the third in a series of speaker diarization challenges intended to improve the robustness of diarization systems to variation in recording equipment, noise conditions, and conversational domain. The challenge comprises two tracks evaluating diarization performance when starting from a reference speech segmentation (track 1) and diarization from ra…
▽ More
This paper introduces the third DIHARD challenge, the third in a series of speaker diarization challenges intended to improve the robustness of diarization systems to variation in recording equipment, noise conditions, and conversational domain. The challenge comprises two tracks evaluating diarization performance when starting from a reference speech segmentation (track 1) and diarization from raw audio scratch (track 2). We describe the task, metrics, datasets, and evaluation protocol.
△ Less
Submitted 2 December, 2020; v1 submitted 4 June, 2020;
originally announced June 2020.
-
The Second DIHARD Diarization Challenge: Dataset, task, and baselines
Authors:
Neville Ryant,
Kenneth Church,
Christopher Cieri,
Alejandrina Cristia,
Jun Du,
Sriram Ganapathy,
Mark Liberman
Abstract:
This paper introduces the second DIHARD challenge, the second in a series of speaker diarization challenges intended to improve the robustness of diarization systems to variation in recording equipment, noise conditions, and conversational domain. The challenge comprises four tracks evaluating diarization performance under two input conditions (single channel vs. multi-channel) and two segmentatio…
▽ More
This paper introduces the second DIHARD challenge, the second in a series of speaker diarization challenges intended to improve the robustness of diarization systems to variation in recording equipment, noise conditions, and conversational domain. The challenge comprises four tracks evaluating diarization performance under two input conditions (single channel vs. multi-channel) and two segmentation conditions (diarization from a reference speech segmentation vs. diarization from scratch). In order to prevent participants from overtuning to a particular combination of recording conditions and conversational domain, recordings are drawn from a variety of sources ranging from read audiobooks to meeting speech, to child language acquisition recordings, to dinner parties, to web video. We describe the task and metrics, challenge design, datasets, and baseline systems for speech enhancement, speech activity detection, and diarization.
△ Less
Submitted 18 June, 2019;
originally announced June 2019.