The ACM Multimedia 2023 Computational Paralinguistics Challenge: Emotion Share & Requests
Authors:
Björn W. Schuller,
Anton Batliner,
Shahin Amiriparian,
Alexander Barnhill,
Maurice Gerczuk,
Andreas Triantafyllopoulos,
Alice Baird,
Panagiotis Tzirakis,
Chris Gagne,
Alan S. Cowen,
Nikola Lackovic,
Marie-José Caraty,
Claude Montacié
Abstract:
The ACM Multimedia 2023 Computational Paralinguistics Challenge addresses two different problems for the first time in a research competition under well-defined conditions: In the Emotion Share Sub-Challenge, a regression on speech has to be made; and in the Requests Sub-Challenge, requests and complaints need to be detected. We describe the Sub-Challenges, baseline feature extraction, and classifiers based on the usual ComParE features, the auDeep toolkit, and deep feature extraction from pre-trained CNNs using the DeepSpectrum toolkit; in addition, wav2vec2 models are used.
Submitted 1 May, 2023; v1 submitted 28 April, 2023;
originally announced April 2023.
EEV: A Large-Scale Dataset for Studying Evoked Expressions from Video
Authors:
Jennifer J. Sun,
Ting Liu,
Alan S. Cowen,
Florian Schroff,
Hartwig Adam,
Gautam Prasad
Abstract:
Videos can evoke a range of affective responses in viewers. The ability to predict evoked affect from a video, before viewers watch the video, can help in content creation and video recommendation. We introduce the Evoked Expressions from Videos (EEV) dataset, a large-scale dataset for studying viewer responses to videos. Each video is annotated at 6 Hz with 15 continuous evoked expression labels, corresponding to the facial expression of viewers who reacted to the video. We use an expression recognition model within our data collection framework to achieve scalability. In total, there are 36.7 million annotations of viewer facial reactions to 23,574 videos (1,700 hours). We use a publicly available video corpus to obtain a diverse set of video content. We establish baseline performance on the EEV dataset using an existing multimodal recurrent model. Transfer learning experiments show an improvement in performance on the LIRIS-ACCEDE video dataset when pre-trained on EEV. We hope that the size and diversity of the EEV dataset will encourage further explorations in video understanding and affective computing. A subset of EEV is released at https://github.com/google-research-datasets/eev.
Submitted 22 February, 2021; v1 submitted 15 January, 2020;
originally announced January 2020.