Skip to main content

Showing 1–16 of 16 results for author: Gerczuk, M

Searching in archive cs. Search in all archives.
.
  1. arXiv:2406.10275  [pdf, other

    cs.CL

    ExHuBERT: Enhancing HuBERT Through Block Extension and Fine-Tuning on 37 Emotion Datasets

    Authors: Shahin Amiriparian, Filip Packań, Maurice Gerczuk, Björn W. Schuller

    Abstract: Foundation models have shown great promise in speech emotion recognition (SER) by leveraging their pre-trained representations to capture emotion patterns in speech signals. To further enhance SER performance across various languages and domains, we propose a novel twofold approach. First, we gather EmoSet++, a comprehensive multi-lingual, multi-cultural speech emotion corpus with 37 datasets, 150… ▽ More

    Submitted 11 June, 2024; originally announced June 2024.

    Comments: accepted at INTERSPEECH 2024

    MSC Class: 68T10 ACM Class: I.2

  2. arXiv:2406.07753  [pdf, ps, other

    cs.AI cs.CL

    The MuSe 2024 Multimodal Sentiment Analysis Challenge: Social Perception and Humor Recognition

    Authors: Shahin Amiriparian, Lukas Christ, Alexander Kathan, Maurice Gerczuk, Niklas Müller, Steffen Klug, Lukas Stappen, Andreas König, Erik Cambria, Björn Schuller, Simone Eulitz

    Abstract: The Multimodal Sentiment Analysis Challenge (MuSe) 2024 addresses two contemporary multimodal affect and sentiment analysis problems: In the Social Perception Sub-Challenge (MuSe-Perception), participants will predict 16 different social attributes of individuals such as assertiveness, dominance, likability, and sincerity based on the provided audio-visual data. The Cross-Cultural Humor Detection… ▽ More

    Submitted 11 June, 2024; originally announced June 2024.

    MSC Class: 68T10 ACM Class: I.2

  3. arXiv:2406.06355  [pdf, other

    cs.CL

    Sustained Vowels for Pre- vs Post-Treatment COPD Classification

    Authors: Andreas Triantafyllopoulos, Anton Batliner, Wolfgang Mayr, Markus Fendler, Florian Pokorny, Maurice Gerczuk, Shahin Amiriparian, Thomas Berghaus, Björn Schuller

    Abstract: Chronic obstructive pulmonary disease (COPD) is a serious inflammatory lung disease affecting millions of people around the world. Due to an obstructed airflow from the lungs, it also becomes manifest in patients' vocal behaviour. Of particular importance is the detection of an exacerbation episode, which marks an acute phase and often requires hospitalisation and treatment. Previous work has show… ▽ More

    Submitted 10 June, 2024; originally announced June 2024.

    Comments: Accepted to INTERSPEECH 2024

  4. arXiv:2404.12132  [pdf, other

    cs.SD cs.CL eess.AS

    Enhancing Suicide Risk Assessment: A Speech-Based Automated Approach in Emergency Medicine

    Authors: Shahin Amiriparian, Maurice Gerczuk, Justina Lutz, Wolfgang Strube, Irina Papazova, Alkomiet Hasan, Alexander Kathan, Björn W. Schuller

    Abstract: The delayed access to specialized psychiatric assessments and care for patients at risk of suicidal tendencies in emergency departments creates a notable gap in timely intervention, hindering the provision of adequate mental health support during critical situations. To address this, we present a non-invasive, speech-based approach for automatic suicide risk assessment. For our study, we have coll… ▽ More

    Submitted 18 April, 2024; originally announced April 2024.

    ACM Class: I.2

  5. arXiv:2304.14882  [pdf, other

    cs.SD cs.LG eess.AS

    The ACM Multimedia 2023 Computational Paralinguistics Challenge: Emotion Share & Requests

    Authors: Björn W. Schuller, Anton Batliner, Shahin Amiriparian, Alexander Barnhill, Maurice Gerczuk, Andreas Triantafyllopoulos, Alice Baird, Panagiotis Tzirakis, Chris Gagne, Alan S. Cowen, Nikola Lackovic, Marie-José Caraty, Claude Montacié

    Abstract: The ACM Multimedia 2023 Computational Paralinguistics Challenge addresses two different problems for the first time in a research competition under well-defined conditions: In the Emotion Share Sub-Challenge, a regression on speech has to be made; and in the Requests Sub-Challenges, requests and complaints need to be detected. We describe the Sub-Challenges, baseline feature extraction, and classi… ▽ More

    Submitted 1 May, 2023; v1 submitted 28 April, 2023; originally announced April 2023.

    Comments: 5 pages, part of the ACM Multimedia 2023 Grand Challenge "The ACM Multimedia 2023 Computational Paralinguistics Challenge (ComParE 2023). arXiv admin note: text overlap with arXiv:2205.06799

    MSC Class: 68 ACM Class: I.2.7; I.5.0; J.3

  6. arXiv:2301.10477  [pdf, other

    cs.SD cs.CY eess.AS

    HEAR4Health: A blueprint for making computer audition a staple of modern healthcare

    Authors: Andreas Triantafyllopoulos, Alexander Kathan, Alice Baird, Lukas Christ, Alexander Gebhard, Maurice Gerczuk, Vincent Karas, Tobias Hübner, Xin **g, Shuo Liu, Adria Mallol-Ragolta, Manuel Milling, Sandra Ottl, Anastasia Semertzidou, Srividya Tirunellai Rajamani, Tianhao Yan, Zijiang Yang, Judith Dineley, Shahin Amiriparian, Katrin D. Bartl-Pokorny, Anton Batliner, Florian B. Pokorny, Björn W. Schuller

    Abstract: Recent years have seen a rapid increase in digital medicine research in an attempt to transform traditional healthcare systems to their modern, intelligent, and versatile equivalents that are adequately equipped to tackle contemporary challenges. This has led to a wave of applications that utilise AI technologies; first and foremost in the fields of medical imaging, but also in the use of wearable… ▽ More

    Submitted 25 January, 2023; originally announced January 2023.

  7. Distinguishing between pre- and post-treatment in the speech of patients with chronic obstructive pulmonary disease

    Authors: Andreas Triantafyllopoulos, Markus Fendler, Anton Batliner, Maurice Gerczuk, Shahin Amiriparian, Thomas M. Berghaus, Björn W. Schuller

    Abstract: Chronic obstructive pulmonary disease (COPD) causes lung inflammation and airflow blockage leading to a variety of respiratory symptoms; it is also a leading cause of death and affects millions of individuals around the world. Patients often require treatment and hospitalisation, while no cure is currently available. As COPD predominantly affects the respiratory system, speech and non-linguistic v… ▽ More

    Submitted 26 July, 2022; originally announced July 2022.

    Comments: Accepted in INTERSPEECH 2022

    Journal ref: Proc. Interspeech 2022, 3623-3627

  8. arXiv:2205.06799  [pdf, other

    cs.SD cs.LG eess.AS

    The ACM Multimedia 2022 Computational Paralinguistics Challenge: Vocalisations, Stuttering, Activity, & Mosquitoes

    Authors: Björn W. Schuller, Anton Batliner, Shahin Amiriparian, Christian Bergler, Maurice Gerczuk, Natalie Holz, Pauline Larrouy-Maestri, Sebastian P. Bayerl, Korbinian Riedhammer, Adria Mallol-Ragolta, Maria Pateraki, Harry Coppock, Ivan Kiskin, Marianne Sinka, Stephen Roberts

    Abstract: The ACM Multimedia 2022 Computational Paralinguistics Challenge addresses four different problems for the first time in a research competition under well-defined conditions: In the Vocalisations and Stuttering Sub-Challenges, a classification on human non-verbal vocalisations and speech has to be made; the Activity Sub-Challenge aims at beyond-audio human activity recognition from smartwatch senso… ▽ More

    Submitted 13 May, 2022; originally announced May 2022.

    Comments: 5 pages, part of the ACM Multimedia 2022 Grand Challenge "The ACM Multimedia 2022 Computational Paralinguistics Challenge (ComParE 2022)"

    MSC Class: 68 ACM Class: I.2.7; I.5.0; J.3

  9. arXiv:2205.04343  [pdf, other

    cs.SD cs.LG eess.AS

    Fatigue Prediction in Outdoor Running Conditions using Audio Data

    Authors: Andreas Triantafyllopoulos, Sandra Ottl, Alexander Gebhard, Esther Rituerto-González, Mirko Jaumann, Steffen Hüttner, Valerie Dieter, Patrick Schneeweiß, Inga Krauß, Maurice Gerczuk, Shahin Amiriparian, Björn W. Schuller

    Abstract: Although running is a common leisure activity and a core training regiment for several athletes, between $29\%$ and $79\%$ of runners sustain an overuse injury each year. These injuries are linked to excessive fatigue, which alters how someone runs. In this work, we explore the feasibility of modelling the Borg received perception of exertion (RPE) scale (range: $[6-20]$), a well-validated subject… ▽ More

    Submitted 9 May, 2022; originally announced May 2022.

    Comments: Paper accepted at IEEE EMBC 2022. Rights remain with IEEE

  10. arXiv:2202.08981  [pdf, other

    cs.SD cs.LG eess.AS

    A Summary of the ComParE COVID-19 Challenges

    Authors: Harry Coppock, Alican Akman, Christian Bergler, Maurice Gerczuk, Chloë Brown, Jagmohan Chauhan, Andreas Grammenos, Apinan Hasthanasombat, Dimitris Spathis, Tong Xia, Pietro Cicuta, **g Han, Shahin Amiriparian, Alice Baird, Lukas Stappen, Sandra Ottl, Panagiotis Tzirakis, Anton Batliner, Cecilia Mascolo, Björn W. Schuller

    Abstract: The COVID-19 pandemic has caused massive humanitarian and economic damage. Teams of scientists from a broad range of disciplines have searched for methods to help governments and communities combat the disease. One avenue from the machine learning field which has been explored is the prospect of a digital mass test which can detect COVID-19 from infected individuals' respiratory sounds. We present… ▽ More

    Submitted 17 February, 2022; originally announced February 2022.

    Comments: 18 pages, 13 figures

  11. arXiv:2109.08049  [pdf, other

    cs.LG cs.AI cs.CV

    A Machine Learning Framework for Automatic Prediction of Human Semen Motility

    Authors: Sandra Ottl, Shahin Amiriparian, Maurice Gerczuk, Björn Schuller

    Abstract: In this paper, human semen samples from the visem dataset collected by the Simula Research Laboratory are automatically assessed with machine learning methods for their quality in respect to sperm motility. Several regression models are trained to automatically predict the percentage (0 to 100) of progressive, non-progressive, and immotile spermatozoa in a given sample. The video samples are adopt… ▽ More

    Submitted 17 September, 2021; v1 submitted 16 September, 2021; originally announced September 2021.

    ACM Class: I.2.0; I.4.10

  12. arXiv:2104.11629  [pdf, other

    cs.SD cs.LG eess.AS

    DeepSpectrumLite: A Power-Efficient Transfer Learning Framework for Embedded Speech and Audio Processing from Decentralised Data

    Authors: Shahin Amiriparian, Tobias Hübner, Maurice Gerczuk, Sandra Ottl, Björn W. Schuller

    Abstract: Deep neural speech and audio processing systems have a large number of trainable parameters, a relatively complex architecture, and require a vast amount of training data and computational power. These constraints make it more challenging to integrate such systems into embedded devices and utilise them for real-time, real-world applications. We tackle these limitations by introducing DeepSpectrumL… ▽ More

    Submitted 23 April, 2021; originally announced April 2021.

    Comments: 5 pages, 1 figure

  13. arXiv:2104.10121  [pdf, other

    cs.SD cs.CL eess.AS

    On the Impact of Word Error Rate on Acoustic-Linguistic Speech Emotion Recognition: An Update for the Deep Learning Era

    Authors: Shahin Amiriparian, Artem Sokolov, Ilhan Aslan, Lukas Christ, Maurice Gerczuk, Tobias Hübner, Dmitry Lamanov, Manuel Milling, Sandra Ottl, Ilya Poduremennykh, Evgeniy Shuranov, Björn W. Schuller

    Abstract: Text encodings from automatic speech recognition (ASR) transcripts and audio representations have shown promise in speech emotion recognition (SER) ever since. Yet, it is challenging to explain the effect of each information stream on the SER systems. Further, more clarification is required for analysing the impact of ASR's word error rate (WER) on linguistic emotion recognition per se and in the… ▽ More

    Submitted 20 April, 2021; originally announced April 2021.

    Comments: 5 pages, 1 figure

    ACM Class: I.2.7; I.5.0

  14. arXiv:2103.08310  [pdf, other

    cs.SD cs.LG eess.AS

    EmoNet: A Transfer Learning Framework for Multi-Corpus Speech Emotion Recognition

    Authors: Maurice Gerczuk, Shahin Amiriparian, Sandra Ottl, Björn Schuller

    Abstract: In this manuscript, the topic of multi-corpus Speech Emotion Recognition (SER) is approached from a deep transfer learning perspective. A large corpus of emotional speech data, EmoSet, is assembled from a number of existing SER corpora. In total, EmoSet contains 84181 audio recordings from 26 SER corpora with a total duration of over 65 hours. The corpus is then utilised to create a novel framewor… ▽ More

    Submitted 10 March, 2021; originally announced March 2021.

    Comments: 18 pages, 7 figures

  15. arXiv:2102.13468  [pdf, other

    eess.AS cs.CL cs.LG cs.SD

    The INTERSPEECH 2021 Computational Paralinguistics Challenge: COVID-19 Cough, COVID-19 Speech, Escalation & Primates

    Authors: Björn W. Schuller, Anton Batliner, Christian Bergler, Cecilia Mascolo, **g Han, Iulia Lefter, Heysem Kaya, Shahin Amiriparian, Alice Baird, Lukas Stappen, Sandra Ottl, Maurice Gerczuk, Panagiotis Tzirakis, Chloë Brown, Jagmohan Chauhan, Andreas Grammenos, Apinan Hasthanasombat, Dimitris Spathis, Tong Xia, Pietro Cicuta, Leon J. M. Rothkrantz, Joeri Zwerts, Jelle Treep, Casper Kaandorp

    Abstract: The INTERSPEECH 2021 Computational Paralinguistics Challenge addresses four different problems for the first time in a research competition under well-defined conditions: In the COVID-19 Cough and COVID-19 Speech Sub-Challenges, a binary classification on COVID-19 infection has to be made based on coughing sounds and speech; in the Escalation SubChallenge, a three-way assessment of the level of es… ▽ More

    Submitted 24 February, 2021; originally announced February 2021.

    Comments: 5 pages

    MSC Class: 68 ACM Class: I.2.7; I.5.0; J.3

  16. arXiv:2005.08722  [pdf, other

    eess.AS cs.LG cs.SD stat.ML

    A Novel Fusion of Attention and Sequence to Sequence Autoencoders to Predict Sleepiness From Speech

    Authors: Shahin Amiriparian, Pawel Winokurow, Vincent Karas, Sandra Ottl, Maurice Gerczuk, Björn W. Schuller

    Abstract: Motivated by the attention mechanism of the human visual system and recent developments in the field of machine translation, we introduce our attention-based and recurrent sequence to sequence autoencoders for fully unsupervised representation learning from audio files. In particular, we test the efficacy of our novel approach on the task of speech-based sleepiness recognition. We evaluate the lea… ▽ More

    Submitted 19 May, 2020; v1 submitted 15 May, 2020; originally announced May 2020.

    Comments: 5 pages, 2 figures, submitted to INTERSPEECH 2020