-
BEA-Base: A Benchmark for ASR of Spontaneous Hungarian
Authors:
P. Mihajlik,
A. Balog,
T. E. Gráczi,
A. Kohári,
B. Tarján,
K. Mády
Abstract:
Hungarian is spoken by 15 million people, yet easily accessible Automatic Speech Recognition (ASR) benchmark datasets, especially for spontaneous speech, have been practically unavailable. In this paper, we introduce BEA-Base, a subset of the BEA spoken Hungarian database comprising mostly spontaneous speech from 140 speakers. It is built specifically to assess ASR, primarily for conversational AI applications. After defining the speech recognition subsets and task, we develop several baselines using open-source toolkits, including classic HMM-DNN hybrid and end-to-end approaches augmented by cross-language transfer learning. The best results are based on multilingual self-supervised pretraining, achieving a 45% reduction in recognition error rate compared to the classical approach, without an external language model or additional supervised data. The results show the feasibility of using BEA-Base for training and evaluating Hungarian speech recognition systems.
Submitted 1 February, 2022;
originally announced February 2022.
-
Prosodic entrainment in dialog acts
Authors:
Uwe D. Reichel,
Katalin Mády,
Jennifer Cole
Abstract:
We examined prosodic entrainment in spoken dialogs separately for several dialog acts in cooperative and competitive games. Entrainment was measured for intonation features derived from a superpositional intonation stylization as well as for rhythm features. The differences found can be related to the cooperative or competitive nature of the game, as well as to dialog act properties such as intrinsic authority, supportiveness, and distributional characteristics. In cooperative games, dialog acts with a high authority given by knowledge and with a high frequency showed the most entrainment. The results are discussed with respect to, among other things, the degree of active entrainment control in cooperative behavior.
Submitted 30 October, 2018;
originally announced October 2018.
-
Entrainment profiles: Comparison by gender, role, and feature set
Authors:
Uwe D. Reichel,
Štefan Beňuš,
Katalin Mády
Abstract:
We examine prosodic entrainment in cooperative game dialogs for new feature sets describing register, pitch accent shape, and rhythmic aspects of utterances. For these as well as for established features, we present entrainment profiles to detect within- and across-dialog entrainment by the speakers' gender and role in the game. It turned out that feature sets undergo entrainment in different quantitative and qualitative ways, which can partly be attributed to their different functions. Furthermore, interactions between speaker gender and role (describer vs. follower) suggest gender-dependent strategies in cooperative solution-oriented interactions: female describers entrain most, male describers least. Our data suggest a slight advantage of the latter strategy for task success.
Submitted 29 May, 2018;
originally announced May 2018.