Showing 1–11 of 11 results for author: Burkhardt, F

  1. arXiv:2407.01143  [pdf, other]

    cs.SD cs.AI eess.AS

    Are you sure? Analysing Uncertainty Quantification Approaches for Real-world Speech Emotion Recognition

    Authors: Oliver Schrüfer, Manuel Milling, Felix Burkhardt, Florian Eyben, Björn Schuller

    Abstract: Uncertainty Quantification (UQ) is an important building block for the reliable use of neural networks in real-world scenarios, as it can be a useful tool in identifying faulty predictions. Speech emotion recognition (SER) models can suffer from particularly many sources of uncertainty, such as the ambiguity of emotions, Out-of-Distribution (OOD) data or, in general, poor recording conditions. Rel…

    Submitted 1 July, 2024; originally announced July 2024.

    Comments: accepted for Interspeech 2024, 5 pages

  2. arXiv:2312.06270  [pdf, other]

    eess.AS cs.SD

    Testing Speech Emotion Recognition Machine Learning Models

    Authors: Anna Derington, Hagen Wierstorf, Ali Özkil, Florian Eyben, Felix Burkhardt, Björn W. Schuller

    Abstract: Machine learning models for speech emotion recognition (SER) can be trained for different tasks and are usually evaluated on the basis of a few available datasets per task. Tasks could include arousal, valence, dominance, emotional categories, or tone of voice. Those models are mainly evaluated in terms of correlation or recall, and always show some errors in their predictions. The errors manifest…

    Submitted 11 December, 2023; originally announced December 2023.

  3. arXiv:2307.02132  [pdf, other]

    cs.SD eess.AS

    Going Retro: Astonishingly Simple Yet Effective Rule-based Prosody Modelling for Speech Synthesis Simulating Emotion Dimensions

    Authors: Felix Burkhardt, Uwe Reichel, Florian Eyben, Björn Schuller

    Abstract: We introduce two rule-based models to modify the prosody of speech synthesis in order to modulate the emotion to be expressed. The prosody modulation is based on speech synthesis markup language (SSML) and can be used with any commercial speech synthesizer. The models as well as the optimization result are evaluated against human emotion annotations. Results indicate that with a very simple method…

    Submitted 5 July, 2023; originally announced July 2023.

    Comments: accepted at 34th ESSV 2023, Munich 2023

  4. arXiv:2306.16962  [pdf, other]

    cs.SD eess.AS

    Speech-based Age and Gender Prediction with Transformers

    Authors: Felix Burkhardt, Johannes Wagner, Hagen Wierstorf, Florian Eyben, Björn Schuller

    Abstract: We report on the curation of several publicly available datasets for age and gender prediction. Furthermore, we present experiments to predict age and gender with models based on a pre-trained wav2vec 2.0. Depending on the dataset, we achieve an MAE between 7.1 years and 10.8 years for age, and at least 91.1% ACC for gender (female, male, child). Compared to a modelling approach built on handcraft…

    Submitted 29 June, 2023; originally announced June 2023.

    Comments: 5 pages, submitted to 15th ITG Conference on Speech Communication

  5. arXiv:2305.14023  [pdf, other]

    cs.SD eess.AS

    Happy or Evil Laughter? Analysing a Database of Natural Audio Samples

    Authors: Aljoscha Düsterhöft, Felix Burkhardt, Björn W. Schuller

    Abstract: We conducted a data collection on the basis of the Google AudioSet database by selecting a subset of the samples annotated with "laughter". The selection criterion was the presence of a communicative act with a clear connotation of being either positive (laughing with) or negative (being laughed at). On the basis of this annotated data, we performed two experiments: on the one hand, we manually…

    Submitted 23 May, 2023; originally announced May 2023.

  6. arXiv:2303.00645  [pdf, other]

    eess.AS cs.SD

    audb -- Sharing and Versioning of Audio and Annotation Data in Python

    Authors: Hagen Wierstorf, Johannes Wagner, Florian Eyben, Felix Burkhardt, Björn W. Schuller

    Abstract: Driven by the need for larger and more diverse datasets to pre-train and fine-tune increasingly complex machine learning models, the number of datasets is rapidly growing. audb is an open-source Python library that supports versioning and documentation of audio datasets. It aims to provide a standardized and simple user-interface to publish, maintain, and access the annotations and audio files of…

    Submitted 10 May, 2023; v1 submitted 1 March, 2023; originally announced March 2023.

  7. arXiv:2203.07378  [pdf, other]

    eess.AS cs.LG cs.SD

    Dawn of the transformer era in speech emotion recognition: closing the valence gap

    Authors: Johannes Wagner, Andreas Triantafyllopoulos, Hagen Wierstorf, Maximilian Schmitt, Felix Burkhardt, Florian Eyben, Björn W. Schuller

    Abstract: Recent advances in transformer-based architectures which are pre-trained in a self-supervised manner have shown great promise in several machine learning tasks. In the audio domain, such architectures have also been successfully utilised in the field of speech emotion recognition (SER). However, existing works have not evaluated the influence of model size and pre-training data on downstream perform…

    Submitted 7 September, 2023; v1 submitted 14 March, 2022; originally announced March 2022.

    Journal ref: in IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 45, no. 9, pp. 10745-10759, 1 Sept. 2023

  8. arXiv:2004.14146  [pdf, other]

    cs.NI eess.SP

    White Paper on Critical and Massive Machine Type Communication Towards 6G

    Authors: Nurul Huda Mahmood, Stefan Böcker, Andrea Munari, Federico Clazzer, Ingrid Moerman, Konstantin Mikhaylov, Onel Lopez, Ok-Sun Park, Eric Mercier, Hannes Bartz, Riku Jäntti, Ravikumar Pragada, Yihua Ma, Elina Annanperä, Christian Wietfeld, Martin Andraud, Gianluigi Liva, Yan Chen, Eduardo Garro, Frank Burkhardt, Hirley Alves, Chen-Feng Liu, Yalcin Sadi, Jean-Baptiste Dore, Eunah Kim, et al. (6 additional authors not shown)

    Abstract: Society as a whole, and many vertical sectors in particular, is becoming increasingly digitalized. Machine Type Communication (MTC), encompassing its massive and critical aspects, and ubiquitous wireless connectivity are among the main enablers of such digitization at large. The recently introduced 5G New Radio is natively designed to support both aspects of MTC to promote the digital transfor…

    Submitted 4 May, 2020; v1 submitted 29 April, 2020; originally announced April 2020.

    Comments: White paper by http://www.6GFlagship.com

  9. arXiv:1906.12149  [pdf, other]

    eess.SP

    A Spatially Consistent Geometric D2D Small-Scale Fading Model for Multiple Frequencies

    Authors: Stephan Jaeckel, Leszek Raschkowski, Frank Burkhardt, Lars Thiele

    Abstract: The 3GPP new radio (NR) channel model introduced spatial consistency and a correlation model for multiple frequencies. Future extensions of this model will incorporate mobility at both ends of the link. These features are essential for many emerging wireless technologies in the 5G era. However, the existing small-scale-fading (SSF) model does not integrate these features coherently. To solve this…

    Submitted 28 June, 2019; originally announced June 2019.

    Comments: 5 pages, 3 figures, accepted at IEEE VTC Fall '19

  10. arXiv:1906.12145  [pdf, other]

    eess.SP

    Industrial Indoor Measurements from 2-6 GHz for the 3GPP-NR and QuaDRiGa Channel Model

    Authors: Stephan Jaeckel, Nick Turay, Leszek Raschkowski, Lars Thiele, Risto Vuohtoniemi, Marko Sonkki, Veikko Hovinen, Frank Burkhardt, Prasanth Karunakaran, Thomas Heyn

    Abstract: Providing reliable low latency wireless links for advanced manufacturing and processing systems is a vision of Industry 4.0. Developing, testing and rating requires accurate models of the radio propagation channel. The current 3GPP-NR model as well as the QuaDRiGa model lack the propagation parameters for the industrial indoor scenario. To close this gap, measurements were conducted at 2.37 GHz an…

    Submitted 28 June, 2019; originally announced June 2019.

    Comments: 7 pages, 3 figures, 3 tables, submitted to IEEE VTC Fall '19

  11. Efficient Sum-of-Sinusoids based Spatial Consistency for the 3GPP New-Radio Channel Model

    Authors: Stephan Jaeckel, Leszek Raschkowski, Frank Burkhardt, Lars Thiele

    Abstract: Spatial consistency was proposed in the 3GPP TR 38.901 channel model to ensure that closely spaced mobile terminals have similar channels. Future extensions of this model might incorporate mobility at both ends of the link. This requires that all random variables in the model must be correlated in 3 (single-mobility) and up to 6 spatial dimensions (dual-mobility). Existing filtering methods cannot…

    Submitted 14 August, 2018; originally announced August 2018.

    Journal ref: 2018 IEEE Globecom Workshops (GC Wkshps), Abu Dhabi, United Arab Emirates, 2018, pp. 1-7