-
Data Augmentation for Dementia Detection in Spoken Language
Authors:
Anna Hlédiková,
Dominika Woszczyk,
Alican Akman,
Soteris Demetriou,
Björn Schuller
Abstract:
Dementia is a growing problem as our society ages, and detection methods are often invasive and expensive. Recent deep-learning techniques can offer a faster diagnosis and have shown promising results. However, they require large amounts of labelled data, which are not readily available for the task of dementia detection. One effective solution to sparse-data problems is data augmentation, though the exact methods need to be selected carefully. To date, there has been no empirical study of data augmentation on Alzheimer's disease (AD) datasets for NLP and speech processing. In this work, we investigate data augmentation techniques for the task of AD detection and perform an empirical evaluation of the different approaches on two kinds of models for both the text and audio domains. We use a transformer-based model for both domains, and SVM and Random Forest models for the text and audio domains, respectively. We generate additional samples using traditional as well as deep-learning-based methods and show that data augmentation improves performance for both the text- and audio-based models, and that the results are comparable to state-of-the-art results on the popular ADReSS set that were obtained with carefully crafted architectures and features.
Submitted 16 July, 2022; v1 submitted 26 June, 2022;
originally announced June 2022.
-
Characterizing Improper Input Validation Vulnerabilities of Mobile Crowdsourcing Services
Authors:
Sojhal Ismail Khan,
Dominika Woszczyk,
Chengzeng You,
Soteris Demetriou,
Muhammad Naveed
Abstract:
Mobile crowdsourcing services (MCSs) enable fast and economical data acquisition at scale and find applications in a variety of domains. Prior work has shown that Foursquare and Waze (a location-based and a navigation MCS) are vulnerable to different kinds of data poisoning attacks. Such attacks can be upsetting and even dangerous, especially when they are used to inject improper inputs to mislead users. However, to date, there is no comprehensive study on the extent of improper input validation (IIV) vulnerabilities and the feasibility of their exploits in MCSs across domains. In this work, we leverage the fact that MCSs interface with their participants through mobile apps to design tools and new methodologies embodied in an end-to-end feedback-driven analysis framework, which we use to study 10 popular and previously unexplored services in five different domains. Using our framework, we send tens of thousands of API requests with automatically generated input values to characterize their IIV attack surface. Alarmingly, we found that most of them (8/10) suffer from grave IIV vulnerabilities which allow an adversary to launch data poisoning attacks at scale: 7400 spoofed API requests were successful in faking online posts for robberies, gunshots, and other dangerous incidents, and in faking fitness activities with supernatural speeds and distances, among many others. Lastly, we discuss mitigation strategies that are easy to implement and deploy and can greatly reduce the IIV attack surface, and we argue for their use as a necessary complementary measure working toward trustworthy mobile crowdsourcing services.
Submitted 18 October, 2021; v1 submitted 16 October, 2021;
originally announced October 2021.
-
Open, Sesame! Introducing Access Control to Voice Services
Authors:
Dominika Woszczyk,
Alvin Lee,
Soteris Demetriou
Abstract:
Personal voice assistants (VAs) have been shown to be vulnerable to record-and-replay and other acoustic attacks, which allow an adversary to gain unauthorized control of connected devices within a smart home. Existing defenses either lack detection and management capabilities or are too coarse-grained to enable flexible policies on par with other computing interfaces. In this work, we present Sesame, a lightweight framework for edge devices which is the first to enable fine-grained access control of smart-home voice commands. Sesame combines three components: Automatic Speech Recognition, Natural Language Understanding (NLU) and a Policy module. We implemented Sesame on Android devices and demonstrate that our system can enforce security policies for both Alexa and Google Home in real time (362ms end-to-end inference time), with a lightweight (<25MB) NLU model which exhibits minimal accuracy loss compared to its non-compact equivalent.
Submitted 27 June, 2021;
originally announced June 2021.
-
Domain Adversarial Neural Networks for Dysarthric Speech Recognition
Authors:
Dominika Woszczyk,
Stavros Petridis,
David Millard
Abstract:
Speech recognition systems have improved dramatically over the last few years; however, their performance degrades significantly on accented or impaired speech. This work explores domain adversarial neural networks (DANN) for speaker-independent speech recognition on the UAS dataset of dysarthric speech. The classification task on 10 spoken digits is performed using an end-to-end CNN taking raw audio as input. The results are compared to a speaker-adaptive (SA) model as well as speaker-dependent (SD) and multi-task learning (MTL) models. The experiments conducted in this paper show that DANN achieves an absolute recognition rate of 74.91% and outperforms the baseline by 12.18%. Additionally, the DANN model achieves results comparable to the SA model's recognition rate of 77.65%. We also observe that when labelled dysarthric speech data is available, DANN and MTL perform similarly, but when it is not, DANN performs better than MTL.
Submitted 7 October, 2020;
originally announced October 2020.
-
MaaSim: A Liveability Simulation for Improving the Quality of Life in Cities
Authors:
Dominika Woszczyk,
Gerasimos Spanakis
Abstract:
Urbanism is no longer planned on paper thanks to powerful models and 3D simulation platforms. However, current work is not open to the public and lacks an optimisation agent that could help in decision making. This paper describes the creation of an open-source simulation based on an existing Dutch liveability score with a built-in AI module. Features are selected using feature engineering and Random Forests. Then, a modified scoring function is built based on the former liveability classes. The score is predicted using Random Forest for regression and achieves a recall of 0.83 with 10-fold cross-validation. Afterwards, Exploratory Factor Analysis is applied to select the actions present in the model. The resulting indicators are divided into 5 groups, and 12 actions are generated. The performance of four optimisation algorithms, namely NSGA-II, PAES, SPEA2 and eps-MOEA, is compared on three established quality criteria (cardinality, spread of the solutions, and spacing) as well as on the resulting score and number of turns. Although all four algorithms show different strengths, eps-MOEA is selected as the most suitable for this problem. Ultimately, the simulation incorporates the model and the selected AI module in a GUI written in the Kivy framework for Python. User tests show positive responses and encourage further initiatives towards joining technology and public applications.
Submitted 13 October, 2018;
originally announced October 2018.