-
Simulating Realistic Post-Stroke Reaching Kinematics with Generative Adversarial Networks
Authors:
Aaron J. Hadley,
Christopher L. Pulliam
Abstract:
The generalizability of machine learning (ML) models for wearable monitoring in stroke rehabilitation is often constrained by the limited scale and heterogeneity of available data. Data augmentation addresses this challenge by adding computationally derived data to real data to enrich the variability represented in the training set. Traditional augmentation methods, such as rotation, permutation, and time-warping, have shown some benefits in improving classifier performance, but often fail to produce realistic training examples. This study employs Conditional Generative Adversarial Networks (cGANs) to create synthetic kinematic data from a publicly available dataset, closely mimicking the experimentally measured reaching movements of stroke survivors. This approach not only captures the complex temporal dynamics and common movement patterns after stroke, but also significantly enhances the training dataset. By training deep learning models on both synthetic and experimental data, we achieved a substantial enhancement in task classification accuracy: models incorporating synthetic data attained an overall accuracy of 80.2%, significantly higher than the 63.1% seen in models trained solely with real data. These improvements allow for more precise task classification, offering clinicians the potential to monitor patient progress more accurately and tailor rehabilitation interventions more effectively.
Submitted 12 June, 2024;
originally announced June 2024.
-
Impressions of the GDMC AI Settlement Generation Challenge in Minecraft
Authors:
Christoph Salge,
Claus Aranha,
Adrian Brightmoore,
Sean Butler,
Rodrigo Canaan,
Michael Cook,
Michael Cerny Green,
Hagen Fischer,
Christian Guckelsberger,
Jupiter Hadley,
Jean-Baptiste Hervé,
Mark R Johnson,
Quinn Kybartas,
David Mason,
Mike Preuss,
Tristan Smith,
Ruck Thawonmas,
Julian Togelius
Abstract:
The GDMC AI settlement generation challenge is a PCG competition about producing an algorithm that can create an "interesting" Minecraft settlement for a given map. This paper contains a collection of written experiences with this competition, by participants, judges, organizers and advisors. We asked people to reflect both on the artifacts themselves, and on the competition in general. The aim of this paper is to offer a shareable and edited collection of experiences and qualitative feedback - which seem to contain a lot of insights on PCG and computational creativity, but would otherwise be lost once the output of the competition is reduced to scalar performance values. We reflect upon some organizational issues for AI competitions, and discuss the future of the GDMC competition.
Submitted 6 August, 2021;
originally announced August 2021.
-
Using Multiple Subwords to Improve English-Esperanto Automated Literary Translation Quality
Authors:
Alberto Poncelas,
Jan Buts,
James Hadley,
Andy Way
Abstract:
Building Machine Translation (MT) systems for low-resource languages remains challenging. For many language pairs, parallel data are not widely available, and in such cases MT models do not achieve results comparable to those seen with high-resource languages.
When data are scarce, it is of paramount importance to make optimal use of the limited material available. To that end, in this paper we propose employing the same parallel sentences multiple times, only changing the way the words are split each time. For this purpose we use several Byte Pair Encoding models, with various merge operations used in their configuration.
In our experiments, we use this technique to expand the available data and improve an MT system involving a low-resource language pair, namely English-Esperanto.
As an additional contribution, we made available a set of English-Esperanto parallel data in the literary domain.
Submitted 28 November, 2020;
originally announced November 2020.
-
The Impact of Indirect Machine Translation on Sentiment Classification
Authors:
Alberto Poncelas,
Pintu Lohar,
Andy Way,
James Hadley
Abstract:
Sentiment classification has been crucial for many natural language processing (NLP) applications, such as the analysis of movie reviews, tweets, or customer feedback. A sufficiently large amount of data is required to build a robust sentiment classification system. However, such resources are not always available for all domains or for all languages.
In this work, we propose employing a machine translation (MT) system to translate customer feedback into another language to investigate in which cases translated sentences can have a positive or negative impact on an automatic sentiment classifier. Furthermore, as performing a direct translation is not always possible, we explore the performance of automatic classifiers on sentences that have been translated using a pivot MT system.
We conduct several experiments using the above approaches to analyse the performance of our proposed sentiment classification system and discuss the advantages and drawbacks of classifying translated sentences.
Submitted 25 August, 2020;
originally announced August 2020.
-
Multiple Segmentations of Thai Sentences for Neural Machine Translation
Authors:
Alberto Poncelas,
Wichaya Pidchamook,
Chao-Hong Liu,
James Hadley,
Andy Way
Abstract:
Thai is a low-resource language, so it is often the case that data is not available in sufficient quantities to train a Neural Machine Translation (NMT) model that performs to a high level of quality. In addition, the Thai script does not use white spaces to delimit the boundaries between words, which adds more complexity when building sequence-to-sequence models. In this work, we explore how to augment a set of English-Thai parallel data by replicating sentence pairs with different word segmentation methods on Thai, as training data for NMT model training. Using different merge operations of Byte Pair Encoding, different segmentations of Thai sentences can be obtained. The experiments show that, by combining these datasets, performance is improved for NMT models trained with a dataset that has been split using a supervised splitting tool.
Submitted 23 April, 2020;
originally announced April 2020.
-
A Tool for Facilitating OCR Postediting in Historical Documents
Authors:
Alberto Poncelas,
Mohammad Aboomar,
Jan Buts,
James Hadley,
Andy Way
Abstract:
Optical character recognition (OCR) for historical documents is a complex procedure subject to a unique set of material issues, including inconsistencies in typefaces and low-quality scanning. Consequently, even the most sophisticated OCR engines produce errors. This paper reports on a tool built for postediting the output of Tesseract, more specifically for correcting common errors in digitized historical documents. The proposed tool suggests alternatives for word forms not found in a specified vocabulary. The assumed error is replaced by a presumably correct alternative in the post-edition based on the scores of a Language Model (LM). The tool is tested on a chapter of the book An Essay Towards Regulating the Trade and Employing the Poor of this Kingdom (Cary, 1719). As demonstrated below, the tool is successful in correcting a number of common errors. If sometimes unreliable, it is also transparent and subject to human intervention.
Submitted 23 April, 2020;
originally announced April 2020.
-
Cloud Brokerage: A Systematic Survey
Authors:
Abdessalam Elhabbash,
Faiza Samreen,
James Hadley,
Yehia Elkhatib
Abstract:
Background: The proliferation of cloud providers and provisioning levels has opened a space for cloud brokerage services. Brokers intermediate between cloud customers and providers to assist the customer in selecting the most suitable cloud service, helping to manage the dimensionality, heterogeneity, and uncertainty associated with cloud services. Objective: This paper identifies and classifies approaches to realise cloud brokerage. By doing so, this paper presents an understanding of the state of the art and a novel taxonomy to characterise cloud brokers. Method: We conducted a systematic literature survey to compile studies related to cloud brokerage and explore how cloud brokers are engineered. We analysed the studies from multiple perspectives, such as motivation, functionality, engineering approach, and evaluation methodology. Results: The survey resulted in a knowledge base of current proposals for realising cloud brokers. The survey identified surprising differences between the studies' implementations, with engineering efforts directed at combinations of market-based solutions, middlewares, toolkits, algorithms, semantic frameworks, and conceptual frameworks. Conclusion: Our comprehensive meta-analysis shows that cloud brokerage is still a formative field. There is no doubt that progress has been achieved in the field but considerable challenges remain to be addressed. This survey identifies such challenges and directions for future research.
Submitted 23 May, 2018;
originally announced May 2018.