Simulating Realistic Post-Stroke Reaching Kinematics with Generative Adversarial Networks

Authors: Aaron J. Hadley, Christopher L. Pulliam

Abstract: The generalizability of machine learning (ML) models for wearable monitoring in stroke rehabilitation is often constrained by the limited scale and heterogeneity of available data. Data augmentation addresses this challenge by adding computationally derived data to real data to enrich the variability represented in the training set. Traditional augmentation methods, such as rotation, permutation, and time-warping, have shown some benefits in improving classifier performance, but often fail to produce realistic training examples. This study employs Conditional Generative Adversarial Networks (cGANs) to create synthetic kinematic data from a publicly available dataset, closely mimicking the experimentally measured reaching movements of stroke survivors. This approach not only captures the complex temporal dynamics and common movement patterns after stroke, but also significantly enhances the training dataset. By training deep learning models on both synthetic and experimental data, we achieved a substantial enhancement in task classification accuracy: models incorporating synthetic data attained an overall accuracy of 80.2%, significantly higher than the 63.1% seen in models trained solely with real data. These improvements allow for more precise task classification, offering clinicians the potential to monitor patient progress more accurately and tailor rehabilitation interventions more effectively.

Submitted 12 June, 2024; originally announced June 2024.

Comments: 8 pages, 6 figures, 2 tables; submitted to IEEE BHI'24

arXiv:2108.02955 [pdf, other]

Impressions of the GDMC AI Settlement Generation Challenge in Minecraft

Authors: Christoph Salge, Claus Aranha, Adrian Brightmoore, Sean Butler, Rodrigo Canaan, Michael Cook, Michael Cerny Green, Hagen Fischer, Christian Guckelsberger, Jupiter Hadley, Jean-Baptiste Hervé, Mark R Johnson, Quinn Kybartas, David Mason, Mike Preuss, Tristan Smith, Ruck Thawonmas, Julian Togelius

Abstract: The GDMC AI settlement generation challenge is a PCG competition about producing an algorithm that can create an "interesting" Minecraft settlement for a given map. This paper contains a collection of written experiences with this competition, by participants, judges, organizers and advisors. We asked people to reflect both on the artifacts themselves, and on the competition in general. The aim of this paper is to offer a shareable and edited collection of experiences and qualitative feedback - which seem to contain a lot of insights on PCG and computational creativity, but would otherwise be lost once the output of the competition is reduced to scalar performance values. We reflect upon some organizational issues for AI competitions, and discuss the future of the GDMC competition.

Submitted 6 August, 2021; originally announced August 2021.

Comments: 28 pages, 5 figures

arXiv:2011.14190 [pdf, other]

Using Multiple Subwords to Improve English-Esperanto Automated Literary Translation Quality

Authors: Alberto Poncelas, Jan Buts, James Hadley, Andy Way

Abstract: Building Machine Translation (MT) systems for low-resource languages remains challenging. For many language pairs, parallel data are not widely available, and in such cases MT models do not achieve results comparable to those seen with high-resource languages. When data are scarce, it is of paramount importance to make optimal use of the limited material available. To that end, in this paper we propose employing the same parallel sentences multiple times, only changing the way the words are split each time. For this purpose we use several Byte Pair Encoding models, with various merge operations used in their configuration. In our experiments, we use this technique to expand the available data and improve an MT system involving a low-resource language pair, namely English-Esperanto. As an additional contribution, we made available a set of English-Esperanto parallel data in the literary domain.

Submitted 28 November, 2020; originally announced November 2020.

Journal ref: The 3rd Workshop on Technologies for MT of Low Resource Languages (LoResMT 2020)

arXiv:2008.11257 [pdf, ps, other]

The Impact of Indirect Machine Translation on Sentiment Classification

Authors: Alberto Poncelas, Pintu Lohar, Andy Way, James Hadley

Abstract: Sentiment classification has been crucial for many natural language processing (NLP) applications, such as the analysis of movie reviews, tweets, or customer feedback. A sufficiently large amount of data is required to build a robust sentiment classification system. However, such resources are not always available for all domains or for all languages. In this work, we propose employing a machine translation (MT) system to translate customer feedback into another language to investigate in which cases translated sentences can have a positive or negative impact on an automatic sentiment classifier. Furthermore, as performing a direct translation is not always possible, we explore the performance of automatic classifiers on sentences that have been translated using a pivot MT system. We conduct several experiments using the above approaches to analyse the performance of our proposed sentiment classification system and discuss the advantages and drawbacks of classifying translated sentences.

Submitted 25 August, 2020; originally announced August 2020.

Journal ref: Proceedings of Association for Machine Translation in the Americas, AMTA (2020)

arXiv:2004.11472 [pdf, ps, other]

Multiple Segmentations of Thai Sentences for Neural Machine Translation

Authors: Alberto Poncelas, Wichaya Pidchamook, Chao-Hong Liu, James Hadley, Andy Way

Abstract: Thai is a low-resource language, so it is often the case that data is not available in sufficient quantities to train a Neural Machine Translation (NMT) model that performs at a high level of quality. In addition, the Thai script does not use white spaces to delimit the boundaries between words, which adds more complexity when building sequence-to-sequence models. In this work, we explore how to augment a set of English--Thai parallel data by replicating sentence-pairs with different word segmentation methods on Thai, as training data for NMT model training. Using different merge operations of Byte Pair Encoding, different segmentations of Thai sentences can be obtained. The experiments show that, by combining these datasets, performance is improved for NMT models trained with a dataset that has been split using a supervised splitting tool.

Submitted 23 April, 2020; originally announced April 2020.

Journal ref: Spoken Language Technologies for Under-resourced languages and CCURL Collaboration and Computing for Under-Resourced Languages Workshop, SLTU-CCURL (2020)

arXiv:2004.11471 [pdf, other]

A Tool for Facilitating OCR Postediting in Historical Documents

Authors: Alberto Poncelas, Mohammad Aboomar, Jan Buts, James Hadley, Andy Way

Abstract: Optical character recognition (OCR) for historical documents is a complex procedure subject to a unique set of material issues, including inconsistencies in typefaces and low-quality scanning. Consequently, even the most sophisticated OCR engines produce errors. This paper reports on a tool built for postediting the output of Tesseract, more specifically for correcting common errors in digitized historical documents. The proposed tool suggests alternatives for word forms not found in a specified vocabulary. The assumed error is replaced by a presumably correct alternative in the post-edition based on the scores of a Language Model (LM). The tool is tested on a chapter of the book An Essay Towards Regulating the Trade and Employing the Poor of this Kingdom (Cary, 1719). As demonstrated below, the tool is successful in correcting a number of common errors. If sometimes unreliable, it is also transparent and subject to human intervention.

Submitted 23 April, 2020; originally announced April 2020.

Journal ref: Workshop on Language Technologies for Historical and Ancient Languages, LT4HALA (2020)

arXiv:1805.09018 [pdf, other]

Cloud Brokerage: A Systematic Survey

Authors: Abdessalam Elhabbash, Faiza Samreen, James Hadley, Yehia Elkhatib

Abstract: Background: The proliferation of cloud providers and provisioning levels has opened a space for cloud brokerage services. Brokers intermediate between cloud customers and providers to assist the customer in selecting the most suitable cloud service, helping to manage the dimensionality, heterogeneity, and uncertainty associated with cloud services. Objective: This paper identifies and classifies approaches to realise cloud brokerage. By doing so, this paper presents an understanding of the state of the art and a novel taxonomy to characterise cloud brokers. Method: We conducted a systematic literature survey to compile studies related to cloud brokerage and explore how cloud brokers are engineered. We analysed the studies from multiple perspectives, such as motivation, functionality, engineering approach, and evaluation methodology. Results: The survey resulted in a knowledge base of current proposals for realising cloud brokers. The survey identified surprising differences between the studies' implementations, with engineering efforts directed at combinations of market-based solutions, middlewares, toolkits, algorithms, semantic frameworks, and conceptual frameworks. Conclusion: Our comprehensive meta-analysis shows that cloud brokerage is still a formative field. There is no doubt that progress has been achieved in the field but considerable challenges remain to be addressed. This survey identifies such challenges and directions for future research.

Submitted 23 May, 2018; originally announced May 2018.

Showing 1–7 of 7 results for author: Hadley, J