Domain Adaptation for Inertial Measurement Unit-based Human Activity Recognition: A Survey
Authors:
Avijoy Chakma,
Abu Zaher Md Faridee,
Indrajeet Ghosh,
Nirmalya Roy
Abstract:
Machine learning-based wearable human activity recognition (WHAR) models enable the development of various smart and connected community applications such as sleep pattern monitoring, medication reminders, cognitive health assessment, and sports analytics. However, the widespread adoption of these WHAR models is impeded by their degraded performance in the presence of data distribution heterogeneities caused by sensor placement at different body positions, inherent biases and heterogeneities across devices, and personal and environmental diversities. Various traditional machine learning algorithms and transfer learning techniques have been proposed in the literature to address the underlying challenges of handling such data heterogeneities. Domain adaptation is one such transfer learning technique that has gained significant popularity in recent literature. In this paper, we survey the recent progress of domain adaptation techniques in the Inertial Measurement Unit (IMU)-based human activity recognition area and discuss potential future directions.
Submitted 6 April, 2023;
originally announced April 2023.
Predicting score distribution to improve non-intrusive speech quality estimation
Authors:
Abu Zaher Md Faridee,
Hannes Gamper
Abstract:
Deep noise suppressors (DNS) have become an attractive solution to remove background noise, reverberation, and distortions from speech and are widely used in telephony/voice applications. They are also occasionally prone to introducing artifacts and lowering the perceptual quality of the speech. Subjective listening tests that use multiple human judges to derive a mean opinion score (MOS) are a popular way to measure these models' performance. Deep neural network-based non-intrusive MOS estimation models have recently emerged as a popular cost-efficient alternative to these tests. These models are trained with only the MOS labels, often discarding the secondary statistics of the opinion scores. In this paper, we investigate several ways to integrate the distribution of opinion scores (e.g., variance, histogram information) to improve the MOS estimation performance. Our model is trained on a corpus of 419K samples denoised by 320 different DNS models and model variations and evaluated on 18K test samples from DNSMOS. We show that with a very minor modification of a single-task MOS estimation pipeline, these freely available labels can provide up to a 0.016 RMSE and 1% SRCC improvement.
Submitted 13 April, 2022;
originally announced April 2022.