
Energy-Aware Decentralized Learning with Intermittent Model Training

Authors: Akash Dhasade, Paolo Dini, Elia Guerra, Anne-Marie Kermarrec, Marco Miozzo, Rafael Pires, Rishi Sharma, Martijn de Vos

Abstract: Decentralized learning (DL) offers a powerful framework where nodes collaboratively train models without sharing raw data and without the coordination of a central server. In the iterative rounds of DL, models are trained locally, shared with neighbors in the topology, and aggregated with other models received from neighbors. Sharing and merging models contribute to convergence towards a consensus model that generalizes better across the collective data captured at training time. In addition, the energy consumption while sharing and merging model parameters is negligible compared to the energy spent during the training phase. Leveraging this fact, we present SkipTrain, a novel DL algorithm, which minimizes energy consumption in decentralized learning by strategically skipping some training rounds and substituting them with synchronization rounds. These training-silent periods, besides saving energy, also allow models to better mix and finally produce models with superior accuracy than typical DL algorithms that train at every round. Our empirical evaluations with 256 nodes demonstrate that SkipTrain reduces energy consumption by 50% and increases model accuracy by up to 12% compared to D-PSGD, the conventional DL algorithm.

Submitted 1 July, 2024; originally announced July 2024.
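
The round structure described in the SkipTrain abstract (local training, then sharing and aggregating with neighbors, with some training rounds replaced by synchronization-only rounds) can be sketched as follows. This is a toy illustration under stated assumptions, not the authors' implementation: models are single floats, the "train every k-th round" schedule (`train_every`) is hypothetical, and the helper names are invented for this example.

```python
def average_with_neighbors(models, neighbors):
    """Synchronization step: each node averages its model with its neighbors' models."""
    return [
        (models[i] + sum(models[j] for j in neighbors[i])) / (1 + len(neighbors[i]))
        for i in range(len(models))
    ]


def local_train(model, target, lr=0.5):
    """Toy local training: one gradient step on the loss (model - target)^2 / 2."""
    return model - lr * (model - target)


def run_rounds(num_rounds, train_every, targets, neighbors):
    """Run DL rounds; only every `train_every`-th round trains (assumed schedule).

    Every round ends with a synchronization step, so skipped training rounds
    become synchronization-only rounds.
    """
    models = [0.0] * len(targets)
    training_rounds = 0
    for r in range(num_rounds):
        if r % train_every == 0:
            models = [local_train(m, t) for m, t in zip(models, targets)]
            training_rounds += 1
        models = average_with_neighbors(models, neighbors)
    return models, training_rounds


# Four nodes on a ring, each with its own (hypothetical) local optimum.
ring = {i: [(i - 1) % 4, (i + 1) % 4] for i in range(4)}
targets = [1.0, 2.0, 3.0, 4.0]

_, trained_every_round = run_rounds(10, 1, targets, ring)
_, trained_with_skips = run_rounds(10, 2, targets, ring)
print(trained_every_round, trained_with_skips)  # 10 5
```

In this sketch the number of training rounds is only a stand-in for training energy; the energy and accuracy figures quoted in the abstract come from the paper's own evaluation and are not reproduced by this toy.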

arXiv:2405.15644 [pdf, other]

Harnessing Increased Client Participation with Cohort-Parallel Federated Learning

Authors: Akash Dhasade, Anne-Marie Kermarrec, Tuan-Anh Nguyen, Rafael Pires, Martijn de Vos

Abstract: Federated Learning (FL) is a machine learning approach where nodes collaboratively train a global model. As more nodes participate in a round of FL, the effectiveness of individual model updates by nodes also diminishes. In this study, we increase the effectiveness of client updates by dividing the network into smaller partitions, or cohorts. We introduce Cohort-Parallel Federated Learning (CPFL): a novel learning approach where each cohort independently trains a global model using FL, until convergence, and the models produced by each cohort are then unified using one-shot Knowledge Distillation (KD) and a cross-domain, unlabeled dataset. The insight behind CPFL is that smaller, isolated networks converge quicker than in a one-network setting where all nodes participate. Through exhaustive experiments involving realistic traces and non-IID data distributions on the CIFAR-10 and FEMNIST image classification tasks, we investigate the balance between the number of cohorts, model accuracy, training time, and compute and communication resources. Compared to traditional FL, CPFL with four cohorts, non-IID data distribution, and CIFAR-10 yields a 1.9$\times$ reduction in train time and a 1.3$\times$ reduction in resource usage, with a minimal drop in test accuracy.

Submitted 24 May, 2024; originally announced May 2024.

arXiv:2405.07930 [pdf, other]

Improving Multimodal Learning with Multi-Loss Gradient Modulation

Authors: Konstantinos Kontras, Christos Chatzichristos, Matthew Blaschko, Maarten De Vos

Abstract: Learning from multiple modalities, such as audio and video, offers opportunities for leveraging complementary information, enhancing robustness, and improving contextual understanding and performance. However, combining such modalities presents challenges, especially when modalities differ in data structure, predictive contribution, and the complexity of their learning processes. It has been observed that one modality can potentially dominate the learning process, hindering the effective utilization of information from other modalities and leading to sub-optimal model performance. To address this issue, the vast majority of previous works suggest assessing the unimodal contributions and dynamically adjusting the training to equalize them. We improve upon previous work by introducing a multi-loss objective and further refining the balancing process, allowing it to dynamically adjust the learning pace of each modality in both directions, acceleration and deceleration, with the ability to phase out balancing effects upon convergence. We achieve superior results across three audio-video datasets: on CREMA-D, models with ResNet backbone encoders surpass the previous best by 1.9% to 12.4%, and Conformer backbone models deliver improvements ranging from 2.8% to 14.1% across different fusion methods. On AVE, improvements range from 2.7% to 7.7%, while on UCF101, gains reach up to 6.1%.

Submitted 13 May, 2024; originally announced May 2024.

arXiv:2404.09536 [pdf, other]

Beyond Noise: Privacy-Preserving Decentralized Learning with Virtual Nodes

Authors: Sayan Biswas, Mathieu Even, Anne-Marie Kermarrec, Laurent Massoulie, Rafael Pires, Rishi Sharma, Martijn de Vos

Abstract: Decentralized learning (DL) enables collaborative learning without a server and without training data leaving the users' devices. However, the models shared in DL can still be used to infer training data. Conventional privacy defenses such as differential privacy and secure aggregation fall short in effectively safeguarding user privacy in DL. We introduce Shatter, a novel DL approach in which nodes create virtual nodes (VNs) to disseminate chunks of their full model on their behalf. This enhances privacy by (i) preventing attackers from collecting full models from other nodes, and (ii) hiding the identity of the original node that produced a given model chunk. We theoretically prove the convergence of Shatter and provide a formal analysis demonstrating how Shatter reduces the efficacy of attacks compared to when exchanging full models between participating nodes. We evaluate the convergence and attack resilience of Shatter with existing DL algorithms, with heterogeneous datasets, and against three standard privacy attacks, including gradient inversion. Our evaluation shows that Shatter not only renders these privacy attacks infeasible when each node operates 16 VNs but also exhibits a positive impact on model convergence compared to standard DL. This enhanced privacy comes with a manageable increase in communication volume.

Submitted 15 April, 2024; originally announced April 2024.

arXiv:2403.13066 [pdf]

Multimodal wearable EEG, EMG and accelerometry measurements improve the accuracy of tonic-clonic seizure detection in-hospital

Authors: Jingwei Zhang, Lauren Swinnen, Christos Chatzichristos, Victoria Broux, Renee Proost, Katrien Jansen, Benno Mahler, Nicolas Zabler, Nino Epitashvilli, Matthias Dümpelmann, Andreas Schulze-Bonhage, Elisabeth Schriewer, Ummahan Ermis, Stefan Wolking, Florian Linke, Yvonne Weber, Mkael Symmonds, Arjune Sen, Andrea Biondi, Mark P. Richardson, Abuhaiba Sulaiman I, Ana Isabel Silva, Francisco Sales, Gergely Vértes, Wim Van Paesschen, et al. (1 additional author not shown)

Abstract: Objective: Most current wearable tonic-clonic seizure (TCS) detection systems are based on extra-cerebral signals, such as electromyography (EMG) or accelerometry (ACC). Although many of these devices show good sensitivity in seizure detection, their false positive rates (FPR) are still relatively high. Wearable EEG may improve performance; however, studies investigating this remain scarce. This paper aims 1) to investigate the possibility of detecting TCSs with a behind-the-ear, two-channel wearable EEG, and 2) to evaluate the added value of wearable EEG to other non-EEG modalities in multimodal TCS detection. Method: We included 27 participants with a total of 44 TCSs from the European multicenter study SeizeIT2. The multimodal wearable detection system Sensor Dot (Byteflies) was used to measure two-channel, behind-the-ear EEG, EMG, electrocardiography (ECG), ACC and gyroscope (GYR). First, we evaluated automatic unimodal detection of TCSs, using performance metrics such as sensitivity, precision, FPR and F1-score. Secondly, we fused the different modalities and again assessed performance. Algorithm-labeled segments were then provided to a neurologist and a wearable data expert, who reviewed and annotated the true positive TCSs, and discarded false positives (FPs). Results: Wearable EEG outperformed the other modalities in unimodal TCS detection by achieving a sensitivity of 100.0% and an FPR of 10.3/24h (compared to 97.7% sensitivity and 30.9/24h FPR for EMG; 95.5% sensitivity and 13.9 FPR for ACC). The combination of wearable EEG and EMG achieved overall the most clinically useful performance in offline TCS detection with a sensitivity of 97.7%, an FPR of 0.4/24h, a precision of 43.0%, and an F1-score of 59.7%. Subsequent visual review of the automated detections resulted in maximal sensitivity and zero FPs.

Submitted 19 March, 2024; originally announced March 2024.
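
As a quick consistency check on the figures reported in the abstract above, the F1-score is the harmonic mean of precision and sensitivity (recall). The sketch below uses only the two values quoted for the combined EEG and EMG detector; the function name is chosen for this example.

```python
def f1_score(precision, sensitivity):
    """F1-score: harmonic mean of precision and sensitivity (recall)."""
    return 2 * precision * sensitivity / (precision + sensitivity)


# Values quoted in the abstract for wearable EEG + EMG.
f1 = f1_score(precision=0.430, sensitivity=0.977)
print(f"{f1:.1%}")  # 59.7%
```

The result agrees with the 59.7% F1-score stated in the abstract.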

arXiv:2402.08522 [pdf, other]

Fairness Auditing with Multi-Agent Collaboration

Authors: Martijn de Vos, Akash Dhasade, Jade Garcia Bourrée, Anne-Marie Kermarrec, Erwan Le Merrer, Benoit Rottembourg, Gilles Tredan

Abstract: Existing work in fairness auditing assumes that each audit is performed independently. In this paper, we consider multiple agents working together, each auditing the same platform for different tasks. Agents have two levers: their collaboration strategy, with or without coordination beforehand, and their strategy for sampling appropriate data points. We theoretically compare the interplay of these levers. Our main findings are that (i) collaboration is generally beneficial for accurate audits, (ii) basic sampling methods often prove to be effective, and (iii) counter-intuitively, extensive coordination on queries often deteriorates audit accuracy as the number of agents increases. Experiments on three large datasets confirm our theoretical results. Our findings motivate collaboration during fairness audits of platforms that use ML models for decision-making.

Submitted 26 April, 2024; v1 submitted 13 February, 2024; originally announced February 2024.

Comments: 13 pages, 6 figures

arXiv:2311.15603 [pdf, other]

QuickDrop: Efficient Federated Unlearning by Integrated Dataset Distillation

Authors: Akash Dhasade, Yaohong Ding, Song Guo, Anne-Marie Kermarrec, Martijn de Vos, Leijie Wu

Abstract: Federated Unlearning (FU) aims to delete specific training data from an ML model trained using Federated Learning (FL). We introduce QuickDrop, an efficient and original FU method that utilizes dataset distillation (DD) to accelerate unlearning and drastically reduces computational overhead compared to existing approaches. In QuickDrop, each client uses DD to generate a compact dataset representative of the original training dataset, called a distilled dataset, and uses this compact dataset during unlearning. To unlearn specific knowledge from the global model, QuickDrop has clients execute Stochastic Gradient Ascent with samples from the distilled datasets, thus significantly reducing computational overhead compared to conventional FU methods. We further increase the efficiency of QuickDrop by ingeniously integrating DD into the FL training process. By reusing the gradient updates produced during FL training for DD, the overhead of creating distilled datasets becomes close to negligible. Evaluations on three standard datasets show that, with comparable accuracy guarantees, QuickDrop reduces the duration of unlearning by 463.8x compared to model retraining from scratch and 65.1x compared to existing FU approaches. We also demonstrate the scalability of QuickDrop with 100 clients and show its effectiveness while handling multiple unlearning operations.

Submitted 27 November, 2023; originally announced November 2023.

arXiv:2310.01972 [pdf, other]

Epidemic Learning: Boosting Decentralized Learning with Randomized Communication

Authors: Martijn de Vos, Sadegh Farhadkhani, Rachid Guerraoui, Anne-Marie Kermarrec, Rafael Pires, Rishi Sharma

Abstract: We present Epidemic Learning (EL), a simple yet powerful decentralized learning (DL) algorithm that leverages changing communication topologies to achieve faster model convergence compared to conventional DL approaches. At each round of EL, each node sends its model updates to a random sample of $s$ other nodes (in a system of $n$ nodes). We provide an extensive theoretical analysis of EL, demonstrating that its changing topology culminates in superior convergence properties compared to the state-of-the-art (static and dynamic) topologies. Considering smooth non-convex loss functions, the number of transient iterations for EL, i.e., the rounds required to achieve asymptotic linear speedup, is in $O(n^3/s^2)$ which outperforms the best-known bound $O(n^3)$ by a factor of $s^2$, indicating the benefit of randomized communication for DL. We empirically evaluate EL in a 96-node network and compare its performance with state-of-the-art DL approaches. Our results illustrate that EL converges up to $1.7\times$ quicker than baseline DL algorithms and attains $2.2\%$ higher accuracy for the same communication volume.

Submitted 27 October, 2023; v1 submitted 3 October, 2023; originally announced October 2023.

Comments: Accepted paper at NeurIPS 2023
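
The communication step described in the Epidemic Learning abstract, where each of $n$ nodes sends its model to a random sample of $s$ other nodes, can be sketched as below. This is a minimal toy under stated assumptions rather than the paper's algorithm: models are single floats, local training is omitted, and aggregation by plain averaging of a node's own model with whatever it received is an assumption made for this example. The name `el_round` is hypothetical.

```python
import random


def el_round(models, s, rng):
    """One randomized communication round over n = len(models) nodes.

    Each node sends its current model to `s` distinct other nodes chosen
    uniformly at random, then every node averages its own model with the
    models it received (assumed aggregation rule).
    """
    n = len(models)
    inbox = [[] for _ in range(n)]
    for sender in range(n):
        others = [j for j in range(n) if j != sender]
        for receiver in rng.sample(others, s):
            inbox[receiver].append(models[sender])
    return [
        (models[i] + sum(inbox[i])) / (1 + len(inbox[i]))
        for i in range(n)
    ]


rng = random.Random(0)  # fixed seed so the sketch is repeatable
models = [float(i) for i in range(8)]
for _ in range(20):
    models = el_round(models, s=3, rng=rng)
print(max(models) - min(models))  # spread across nodes after 20 rounds
```

Because every new model is an average of existing ones, the spread between the largest and smallest model can never grow from one round to the next; the convergence rates quoted in the abstract are results of the paper's analysis and are not derived here.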

arXiv:2306.16431 [pdf, other]

Increasing Performance And Sample Efficiency With Model-agnostic Interactive Feature Attributions

Authors: Joran Michiels, Maarten De Vos, Johan Suykens

Abstract: Model-agnostic feature attributions can provide local insights in complex ML models. If the explanation is correct, a domain expert can validate and trust the model's decision. However, if it contradicts the expert's knowledge, related work only corrects irrelevant features to improve the model. To allow for unlimited interaction, in this paper we provide model-agnostic implementations for two popular explanation methods (Occlusion and Shapley values) to enforce entirely different attributions in the complex model. For a particular set of samples, we use the corrected feature attributions to generate extra local data, which is used to retrain the model to have the right explanation for the samples. Through simulated and real data experiments on a variety of models we show how our proposed approach can significantly improve the model's performance only by augmenting its training dataset based on corrected explanations. Adding our interactive explanations to active learning settings increases the sample efficiency significantly and outperforms existing explanatory interactive strategies. Additionally we explore how a domain expert can provide feature attributions which are sufficiently correct to improve the model.

Submitted 28 June, 2023; originally announced June 2023.

arXiv:2306.10880 [pdf, other]

Explaining the Model and Feature Dependencies by Decomposition of the Shapley Value

Authors: Joran Michiels, Maarten De Vos, Johan Suykens

Abstract: Shapley values have become one of the go-to methods to explain complex models to end-users. They provide a model agnostic post-hoc explanation with foundations in game theory: what is the worth of a player (in machine learning, a feature value) in the objective function (the output of the complex machine learning model). One downside is that they always require outputs of the model when some features are missing. These are usually computed by taking the expectation over the missing features. This however introduces a non-trivial choice: do we condition on the unknown features or not? In this paper we examine this question and claim that they represent two different explanations which are valid for different end-users: one that explains the model and one that explains the model combined with the feature dependencies in the data. We propose a new algorithmic approach to combine both explanations, removing the burden of choice and enhancing the explanatory power of Shapley values, and show that it achieves intuitive results on simple problems. We apply our method to two real-world datasets and discuss the explanations. Finally, we demonstrate how our method is either equivalent or superior to state-of-the-art Shapley value implementations while simultaneously allowing for increased insight into the model-data structure.

Submitted 19 June, 2023; originally announced June 2023.

arXiv:2306.04663 [pdf, ps, other]

U-PASS: an Uncertainty-guided deep learning Pipeline for Automated Sleep Staging

Authors: Elisabeth R. M. Heremans, Nabeel Seedat, Bertien Buyse, Dries Testelmans, Mihaela van der Schaar, Maarten De Vos

Abstract: As machine learning becomes increasingly prevalent in critical fields such as healthcare, ensuring the safety and reliability of machine learning systems becomes paramount. A key component of reliability is the ability to estimate uncertainty, which enables the identification of areas of high and low confidence and helps to minimize the risk of error. In this study, we propose a machine learning pipeline called U-PASS tailored for clinical applications that incorporates uncertainty estimation at every stage of the process, including data acquisition, training, and model deployment. The training process is divided into a supervised pre-training step and a semi-supervised finetuning step. We apply our uncertainty-guided deep learning pipeline to the challenging problem of sleep staging and demonstrate that it systematically improves performance at every stage. By optimizing the training dataset, actively seeking informative samples, and deferring the most uncertain samples to an expert, we achieve an expert-level accuracy of 85% on a challenging clinical dataset of elderly sleep apnea patients, representing a significant improvement over the baseline accuracy of 75%. U-PASS represents a promising approach to incorporating uncertainty estimation into machine learning pipelines, thereby improving their reliability and unlocking their potential in clinical settings.

Submitted 7 June, 2023; originally announced June 2023.

arXiv:2304.06678 [pdf]

From the digital twins in healthcare to the Virtual Human Twin: a moon-shot project for digital health research

Authors: Marco Viceconti, Maarten De Vos, Sabato Mellone, Liesbet Geris

Abstract: The idea of a systematic digital representation of the entire known human pathophysiology, which we could call the Virtual Human Twin, has been around for decades. To date, most research groups focused instead on developing highly specialised, highly focused patient-specific models able to predict specific quantities of clinical relevance. While it has facilitated harvesting the low-hanging fruits, this narrow focus is, in the long run, leaving some significant challenges that slow the adoption of digital twins in healthcare. This position paper lays the conceptual foundations for developing the Virtual Human Twin (VHT). The VHT is intended as a distributed and collaborative infrastructure, a collection of technologies and resources (data, models) that enable it, and a collection of Standard Operating Procedures (SOP) that regulate its use. The VHT infrastructure aims to facilitate academic researchers, public organisations, and the biomedical industry in developing and validating new digital twins in healthcare solutions with the possibility of integrating multiple resources if required by the specific context of use. Healthcare professionals and patients can also use the VHT infrastructure for clinical decision support or personalised health forecasting. As the European Commission launched the EDITH coordination and support action to develop a roadmap for the development of the Virtual Human Twin, this position paper is intended as a starting point for the consensus process and a call to arms for all stakeholders.

Submitted 12 August, 2023; v1 submitted 27 March, 2023; originally announced April 2023.

arXiv:2304.06485 [pdf, ps, other]

CoRe-Sleep: A Multimodal Fusion Framework for Time Series Robust to Imperfect Modalities

Authors: Konstantinos Kontras, Christos Chatzichristos, Huy Phan, Johan Suykens, Maarten De Vos

Abstract: Sleep abnormalities can have severe health consequences. Automated sleep staging, i.e. labelling the sequence of sleep stages from the patient's physiological recordings, could simplify the diagnostic process. Previous work on automated sleep staging has achieved great results, mainly relying on the EEG signal. However, often multiple sources of information are available beyond EEG. This can be particularly beneficial when the EEG recordings are noisy or even missing completely. In this paper, we propose CoRe-Sleep, a Coordinated Representation multimodal fusion network that is particularly focused on improving the robustness of signal analysis on imperfect data. We demonstrate how appropriately handling multimodal information can be the key to achieving such robustness. CoRe-Sleep tolerates noisy or missing modality segments, allowing training on incomplete data. Additionally, it shows state-of-the-art performance when testing on both multimodal and unimodal data using a single model on SHHS-1, the largest publicly available study that includes sleep stage labels. The results indicate that training the model on multimodal data does positively influence performance when tested on unimodal data. This work aims at bridging the gap between automated analysis tools and their clinical utility.

Submitted 27 March, 2023; originally announced April 2023.

Comments: 10 pages, 4 figures, 2 tables, journal

arXiv:2302.13837 [pdf, other]

Decentralized Learning Made Practical with Client Sampling

Authors: Martijn de Vos, Akash Dhasade, Anne-Marie Kermarrec, Erick Lavoie, Johan Pouwelse, Rishi Sharma

Abstract: Decentralized learning (DL) leverages edge devices for collaborative model training while avoiding coordination by a central server. Due to privacy concerns, DL has become an attractive alternative to centralized learning schemes since training data never leaves the device. In a round of DL, all nodes participate in model training and exchange their model with some other nodes. Performing DL in large-scale heterogeneous networks results in high communication costs and prolonged round durations due to slow nodes, effectively inflating the total training time. Furthermore, current DL algorithms also assume all nodes are available for training and aggregation at all times, diminishing the practicality of DL. This paper presents Plexus, an efficient, scalable, and practical DL system. Plexus (1) avoids network-wide participation by introducing a decentralized peer sampler that selects small subsets of available nodes that train the model each round, and (2) aggregates the trained models produced by nodes every round. Plexus is designed to handle joining and leaving nodes (churn). We extensively evaluate Plexus by incorporating realistic traces for compute speed, pairwise latency, network capacity, and availability of edge devices in our experiments. Our experiments on four common learning tasks empirically show that Plexus reduces time-to-accuracy by 1.2-8.3x, communication volume by 2.4-15.3x and training resources needed for convergence by 6.4-370x compared to baseline DL algorithms.

Submitted 7 May, 2024; v1 submitted 27 February, 2023; originally announced February 2023.

arXiv:2302.07509 [pdf, other]

doi 10.23919/FUSION49751.2022.9841235

Automated Movement Detection with Dirichlet Process Mixture Models and Electromyography

Authors: Navin Cooray, Zhenglin Li, Jinzhuo Wang, Christine Lo, Mahnaz Arvaneh, Mkael Symmonds, Michele Hu, Maarten De Vos, Lyudmila S Mihaylova

Abstract: Numerous sleep disorders are characterised by movement during sleep; these include rapid-eye movement sleep behaviour disorder (RBD) and periodic limb movement disorder. The process of diagnosing movement related sleep disorders requires laborious and time-consuming visual analysis of sleep recordings. This process involves sleep clinicians visually inspecting electromyogram (EMG) signals to identify abnormal movements. The distribution of characteristics that represent movement can be diverse and varied, ranging from brief moments of tensing to violent outbursts. This study proposes a framework for automated limb-movement detection by fusing data from two EMG sensors (from the left and right limb) through a Dirichlet process mixture model. Several features are extracted from 10-second mini-epochs, where each mini-epoch has been classified as 'leg-movement' or 'no leg-movement' based on annotations of movement from sleep clinicians. The distributions of the features from each category can be estimated accurately using Gaussian mixture models with the Dirichlet process as a prior. The available dataset includes 36 participants that have all been diagnosed with RBD. The performance of this framework was evaluated by a 10-fold cross validation scheme (participant independent). The study was compared to a random forest model and outperformed it with a mean accuracy, sensitivity, and specificity of 94%, 48%, and 95%, respectively. These results demonstrate the ability of this framework to automate the detection of limb movement for the potential application of assisting clinical diagnosis and decision-making.

Submitted 15 February, 2023; originally announced February 2023.

Journal ref: 2022 25th International Conference on Information Fusion (FUSION), Linköping, Sweden, 2022, pp. 01-08

arXiv:2301.04508 [pdf, other]

A Deployment-First Methodology to Mechanism Design and Refinement in Distributed Systems

Authors: Martijn de Vos, Georgy Ishmaev, Johan Pouwelse, Stefanie Roos

Abstract: Catalyzed by the popularity of blockchain technology, there has recently been a renewed interest in the design, implementation and evaluation of decentralized systems. Most of these systems are intended to be deployed at scale and in heterogeneous environments with real users and unpredictable workloads. Nevertheless, most research in this field evaluates such systems in controlled environments that poorly reflect the complex conditions of real-world environments. In this work, we argue that deployment is crucial to understanding decentralized mechanisms in a real-world environment and an enabler to building more robust and sustainable systems. We highlight the merits of deployment by comparing this approach with other experimental setups and show how our lab applied a deployment-first methodology. We then outline how we use Tribler, our peer-to-peer file-sharing application, to deploy and monitor decentralized mechanisms at scale. We illustrate the application of our methodology by describing a deployment trial in experimental tokenomics. Finally, we summarize four lessons learned from multiple deployment trials where we applied our methodology.

Submitted 11 January, 2023; originally announced January 2023.

Comments: Accepted for publication at the PerFail'23 workshop

arXiv:2301.03441 [pdf, ps, other]

L-SeqSleepNet: Whole-cycle Long Sequence Modelling for Automatic Sleep Staging

Authors: Huy Phan, Kristian P. Lorenzen, Elisabeth Heremans, Oliver Y. Chén, Minh C. Tran, Philipp Koch, Alfred Mertins, Mathias Baumert, Kaare Mikkelsen, Maarten De Vos

Abstract: Human sleep is cyclical with a period of approximately 90 minutes, implying long temporal dependency in the sleep data. Yet, exploring this long-term dependency when developing sleep staging models has remained untouched. In this work, we show that while encoding the logic of a whole sleep cycle is crucial to improve sleep staging performance, the sequential modelling approach in existing state-of-the-art deep learning models is inefficient for that purpose. We thus introduce a method for efficient long sequence modelling and propose a new deep learning model, L-SeqSleepNet, which takes into account whole-cycle sleep information for sleep staging. Evaluating L-SeqSleepNet on four distinct databases of various sizes, we demonstrate state-of-the-art performance obtained by the model over three different EEG setups, including scalp EEG in conventional Polysomnography (PSG), in-ear EEG, and around-the-ear EEG (cEEGrid), even with a single EEG channel input. Our analyses also show that L-SeqSleepNet is able to alleviate the predominance of N2 sleep (the major class in terms of classification) to bring down errors in other sleep stages. Moreover, the network becomes much more robust, meaning that for all subjects where the baseline method had exceptionally poor performance, their performance is improved significantly. Finally, the computation time only grows at a sub-linear rate when the sequence length increases.

Submitted 4 August, 2023; v1 submitted 9 January, 2023; originally announced January 2023.

Comments: This article has been published in IEEE Journal of Biomedical and Health Informatics (JBHI). Source code is available at http://github.com/pquochuy/l-seqsleepnet

arXiv:2209.11007 [pdf, other]

Avoiding Post-Processing with Event-Based Detection in Biomedical Signals

Authors: Nick Seeuws, Maarten De Vos, Alexander Bertrand

Abstract: Objective: Finding events of interest is a common task in biomedical signal processing. The detection of epileptic seizures and signal artefacts are two key examples. Epoch-based classification is the typical machine learning framework to detect such signal events because of the straightforward application of classical machine learning techniques. Usually, post-processing is required to achieve good performance and enforce temporal dependencies. Designing the right post-processing scheme to convert these classification outputs into events is a tedious and labor-intensive element of this framework. Methods: We propose an event-based modeling framework that directly works with events as learning targets, stepping away from ad-hoc post-processing schemes to turn model outputs into events. We illustrate the practical power of this framework on simulated data and real-world data, comparing it to epoch-based modeling approaches. Results: We show that event-based modeling (without post-processing) performs on par with or better than epoch-based modeling with extensive post-processing. Conclusion: These results show the power of treating events as direct learning targets, instead of using ad-hoc post-processing to obtain them, severely reducing design effort. Significance: The event-based modeling framework can easily be applied to other event detection problems in signal processing, removing the need for intensive task-specific post-processing.

Submitted 7 July, 2023; v1 submitted 22 September, 2022; originally announced September 2022.

Comments: This work has been submitted to the IEEE for possible publication

arXiv:2209.09692 [pdf, other]

Personalized Longitudinal Assessment of Multiple Sclerosis Using Smartphones

Authors: Oliver Y. Chén, Florian Lipsmeier, Huy Phan, Frank Dondelinger, Andrew Creagh, Christian Gossens, Michael Lindemann, Maarten de Vos

Abstract: Personalized longitudinal disease assessment is central to quickly diagnosing, appropriately managing, and optimally adapting the therapeutic strategy of multiple sclerosis (MS). It is also important for identifying the idiosyncratic subject-specific disease profiles. Here, we design a novel longitudinal model to map individual disease trajectories in an automated way using sensor data that may contain missing values. First, we collect digital measurements related to gait and balance, and upper extremity functions using sensor-based assessments administered on a smartphone. Next, we treat missing data via imputation. We then discover potential markers of MS by employing a generalized estimation equation. Subsequently, parameters learned from multiple training datasets are ensembled to form a simple, unified longitudinal predictive model to forecast MS over time in previously unseen people with MS. To mitigate potential underestimation for individuals with severe disease scores, the final model incorporates additional subject-specific fine-tuning using data from the first day. The results show that the proposed model is promising to achieve personalized longitudinal MS assessment; they also suggest that features related to gait and balance as well as upper extremity function, remotely collected from sensor-based assessments, may be useful digital markers for predicting MS over time.

Submitted 20 September, 2022; originally announced September 2022.

MSC Class: 62P10; 62P30; 62H12; 62J02; 62D10

arXiv:2208.11254 [pdf, other]

Gromit: Benchmarking the Performance and Scalability of Blockchain Systems

Authors: Bulat Nasrulin, Martijn De Vos, Georgy Ishmaev, Johan Pouwelse

Abstract: The growing number of implementations of blockchain systems stands in stark contrast with still limited research on a systematic comparison of performance characteristics of these solutions. Such research is crucial for evaluating fundamental trade-offs introduced by novel consensus protocols and their implementations. These performance limitations are commonly analyzed with ad-hoc benchmarking frameworks focused on the consensus algorithm of blockchain systems. However, comparative evaluations of design choices require macro-benchmarks for uniform and comprehensive performance evaluations of blockchains at the system level rather than performance metrics of isolated components. To address this research gap, we implement Gromit, a generic framework for analyzing blockchain systems. Gromit treats each system under test as a transaction fabric where clients issue transactions to validators. We use Gromit to conduct the largest blockchain study to date, involving seven representative systems with varying consensus models. We determine the peak performance of these systems with a synthetic workload in terms of transaction throughput and scalability and show that transaction throughput does not scale with the number of validators. We explore how robust the subjected systems are against network delays and reveal that the performance of permissioned blockchains is highly sensitive to network conditions.

Submitted 23 August, 2022; originally announced August 2022.

arXiv:2207.13734 [pdf, other]

Electric Vehicle Scheduling with Capacitated Charging Stations and Partial Charging

Authors: Marelot de Vos, Rolf Nelson van Lieshout, Twan Dollevoet

Abstract: This paper considers the scheduling of electric vehicles in a public transit system. Our main innovation is that we take into account that charging stations have limited capacity, while also considering partial charging. To solve the problem, we expand a connection-based network in order to track the state of charge of vehicles and model recharging actions. We then formulate the electric vehicle scheduling problem as a path-based binary program, whose linear relaxation we solve using column generation. We find integer feasible solutions using two heuristics: price-and-branch and truncated column generation, including acceleration strategies. We test the approach using data of the concession Gooi en Vechtstreek in the Netherlands, containing up to 816 trips. The truncated column generation outperforms the other heuristic, and solves the entire concession within 28 hours of computation time with an optimality gap less than 3.5 percent.

Submitted 29 July, 2022; v1 submitted 27 July, 2022; originally announced July 2022.

arXiv:2203.14996 [pdf, other]

Comparing in context: Improving cosine similarity measures with a metric tensor

Authors: Isa M. Apallius de Vos, Ghislaine L. van den Boogerd, Mara D. Fennema, Adriana D. Correia

Abstract: Cosine similarity is a widely used measure of the relatedness of pre-trained word embeddings, trained on a language modeling goal. Datasets such as WordSim-353 and SimLex-999 rate how similar words are according to human annotators, and as such are often used to evaluate the performance of language models. Thus, any improvement on the word similarity task requires an improved word representation. In this paper, we propose instead the use of an extended cosine similarity measure to improve performance on that task, with gains in interpretability. We explore the hypothesis that this approach is particularly useful if the word-similarity pairs share the same context, for which distinct contextualized similarity measures can be learned. We first use the dataset of Richie et al. (2020) to learn contextualized metrics and compare the results with the baseline values obtained using the standard cosine similarity measure, which consistently shows improvement. We also train a contextualized similarity measure for both SimLex-999 and WordSim-353, comparing the results with the corresponding baselines, and using these datasets as independent test sets for the all-context similarity measure learned on the contextualized dataset, obtaining positive results for a number of tests.

Submitted 28 March, 2022; originally announced March 2022.

Comments: Presented at the 18th International Conference in Natural Language Processing (ICON '21). 11 pages, 3 figures, 6 tables

arXiv:2201.00644 [pdf, ps, other]

Feature matching as improved transfer learning technique for wearable EEG

Authors: Elisabeth R. M. Heremans, Huy Phan, Amir H. Ansari, Pascal Borzée, Bertien Buyse, Dries Testelmans, Maarten De Vos

Abstract: Objective: With the rapid rise of wearable sleep monitoring devices with non-conventional electrode configurations, there is a need for automated algorithms that can perform sleep staging on configurations with small amounts of labeled data. Transfer learning has the ability to adapt neural network weights from a source modality (e.g. standard electrode configuration) to a new target modality (e.g. non-conventional electrode configuration). Methods: We propose feature matching, a new transfer learning strategy as an alternative to the commonly used finetuning approach. This method consists of training a model with larger amounts of data from the source modality and few paired samples of source and target modality. For those paired samples, the model extracts features of the target modality, matching these to the features from the corresponding samples of the source modality. Results: We compare feature matching to finetuning for three different target domains, with two different neural network architectures, and with varying amounts of training data. Particularly on small cohorts (i.e. 2 - 5 labeled recordings in the non-conventional recording setting), feature matching systematically outperforms finetuning with mean relative differences in accuracy ranging from 0.4% to 4.7% for the different scenarios and datasets. Conclusion: Our findings suggest that feature matching outperforms finetuning as a transfer learning approach, especially in very low data regimes. Significance: As such, we conclude that feature matching is a promising new method for wearable sleep staging with novel devices.

Submitted 29 December, 2021; originally announced January 2022.

Comments: 14 pages, 6 figures

arXiv:2110.11006 [pdf, other]

Bristle: Decentralized Federated Learning in Byzantine, Non-i.i.d. Environments

Authors: Joost Verbraeken, Martijn de Vos, Johan Pouwelse

Abstract: Federated learning (FL) is a privacy-friendly type of machine learning where devices locally train a model on their private data and typically communicate model updates with a server. In decentralized FL (DFL), peers communicate model updates with each other instead. However, DFL is challenging since (1) the training data possessed by different peers is often non-i.i.d. (i.e., distributed differently between the peers) and (2) malicious, or Byzantine, attackers can share arbitrary model updates with other peers to subvert the training process. We address these two challenges and present Bristle, middleware between the learning application and the decentralized network layer. Bristle leverages transfer learning to predetermine and freeze the non-output layers of a neural network, significantly speeding up model training and lowering communication costs. To securely update the output layer with model updates from other peers, we design a fast distance-based prioritizer and a novel performance-based integrator. Their combined effect results in high resilience to Byzantine attackers and the ability to handle non-i.i.d. classes. We empirically show that Bristle converges to a consistent 95% accuracy in Byzantine environments, outperforming all evaluated baselines. In non-Byzantine environments, Bristle requires 83% fewer iterations to achieve 90% accuracy compared to state-of-the-art methods. We show that when the training classes are non-i.i.d., Bristle significantly outperforms the accuracy of the most Byzantine-resilient baselines by 2.3x while reducing communication costs by 90%.

Submitted 21 October, 2021; originally announced October 2021.

arXiv:2105.11043 [pdf, ps, other]

doi 10.1109/TBME.2022.3147187

SleepTransformer: Automatic Sleep Staging with Interpretability and Uncertainty Quantification

Authors: Huy Phan, Kaare Mikkelsen, Oliver Y. Chén, Philipp Koch, Alfred Mertins, Maarten De Vos

Abstract: Background: Black-box skepticism is one of the main hindrances impeding deep-learning-based automatic sleep scoring from being used in clinical environments. Methods: Towards interpretability, this work proposes a sequence-to-sequence sleep-staging model, namely SleepTransformer. It is based on the transformer backbone and offers interpretability of the model's decisions at both the epoch and sequence level. We further propose a simple yet efficient method to quantify uncertainty in the model's decisions. The method, which is based on entropy, can serve as a metric for deferring low-confidence epochs to a human expert for further inspection. Results: Making sense of the transformer's self-attention scores for interpretability, at the epoch level, the attention scores are encoded as a heat map to highlight sleep-relevant features captured from the input EEG signal. At the sequence level, the attention scores are visualized as the influence of different neighboring epochs in an input sequence (i.e. the context) to recognition of a target epoch, mimicking the way manual scoring is done by human experts. Conclusion: Additionally, we demonstrate that SleepTransformer performs on par with existing methods on two databases of different sizes. Significance: Equipped with interpretability and the ability of uncertainty quantification, SleepTransformer holds promise for being integrated into clinical settings.

Submitted 26 January, 2022; v1 submitted 23 May, 2021; originally announced May 2021.

Comments: This article has been published in IEEE Transactions on Biomedical Engineering

arXiv:2105.02175 [pdf, other]

Automatic de-identification of Data Download Packages

Authors: Laura Boeschoten, Roos Voorvaart, Casper Kaandorp, Ruben van den Goorbergh, Martine de Vos

Abstract: The General Data Protection Regulation (GDPR) grants all natural persons the right of access to their personal data if this is being processed by data controllers. The data controllers are obliged to share the data in an electronic format and often provide the data in a so-called Data Download Package (DDP). These DDPs contain all data collected by public and private entities during the course of citizens' digital life and form a treasure trove for social scientists. However, the data can be deeply private. To protect the privacy of research participants while using their DDPs for scientific research, we developed de-identification software that is able to handle typical characteristics of DDPs such as regularly changing file structures, visual and textual content, different file formats, different file structures, and usernames. We investigate the performance of the software and illustrate how the software can be tailored towards specific DDP structures.

Submitted 4 May, 2021; originally announced May 2021.

arXiv:2104.04567 [pdf, other]

Light-weight sleep monitoring: electrode distance matters more than placement for automatic scoring

Authors: Kaare B. Mikkelsen, Huy Phan, Mike L. Rank, Martin C. Hemmsen, Maarten de Vos, Preben Kidmose

Abstract: Modern sleep monitoring development is shifting towards the use of unobtrusive sensors combined with algorithms for automatic sleep scoring. Many different combinations of wet and dry electrodes, ear-centered, forehead-mounted or headband-inspired designs have been proposed, alongside an ever-growing variety of machine learning algorithms for automatic sleep scoring. In this paper, we compare 13 different, realistic sensor setups derived from the same data set and analysed with the same pipeline. We find that all setups which include both a lateral and an EOG derivation show similar, state-of-the-art performance, with average Cohen's kappa values of at least 0.80. This indicates that electrode distance, rather than position, is important for accurate sleep scoring. Finally, based on the results presented, we argue that with the current competitive performance of automated staging approaches, there is an urgent need for establishing an improved benchmark beyond current single human rater scoring.

Submitted 13 April, 2021; v1 submitted 9 April, 2021; originally announced April 2021.

Comments: 8 pages, 8 figures

arXiv:2104.02612 [pdf, other]

ASTANA: Practical String Deobfuscation for Android Applications Using Program Slicing

Authors: Martijn de Vos, Johan Pouwelse

Abstract: Software obfuscation is widely used by Android developers to protect the source code of their applications against adversarial reverse-engineering efforts. A specific type of obfuscation, string obfuscation, transforms the content of all string literals in the source code to non-interpretable text and inserts logic to deobfuscate these string literals at runtime. In this work, we demonstrate that string obfuscation is easily reversible. We present ASTANA, a practical tool for Android applications that recovers the human-readable content from obfuscated string literals. ASTANA makes minimal assumptions about the obfuscation logic or application structure. The key idea is to execute the deobfuscation logic for a specific (obfuscated) string literal, which yields the original string value. To obtain the relevant deobfuscation logic, we present a lightweight and optimistic algorithm, based on program slicing techniques. By an experimental evaluation with 100 popular real-world financial applications, we demonstrate the practicality of ASTANA. We verify the correctness of our deobfuscation tool and provide insights into the behaviour of string obfuscators applied by the developers of the evaluated Android applications.

Submitted 6 April, 2021; originally announced April 2021.

arXiv:2103.09171 [pdf, other]

Interpretable Deep Learning for the Remote Characterisation of Ambulation in Multiple Sclerosis using Smartphones

Authors: Andrew P. Creagh, Florian Lipsmeier, Michael Lindemann, Maarten De Vos

Abstract: The emergence of digital technologies such as smartphones in healthcare applications has demonstrated the possibility of developing rich, continuous, and objective measures of multiple sclerosis (MS) disability that can be administered remotely and out-of-clinic. In this work, deep convolutional neural networks (DCNN) applied to smartphone inertial sensor data were shown to better distinguish healthy from MS participant ambulation, compared to standard Support Vector Machine (SVM) feature-based methodologies. To overcome the typical limitations associated with remotely generated health data, such as low subject numbers, sparsity, and heterogeneous data, a transfer learning (TL) model from similar large open-source datasets was proposed. Our TL framework utilised the ambulatory information learned on Human Activity Recognition (HAR) tasks collected from similar smartphone-based sensor data. A lack of transparency of "black-box" deep networks remains one of the largest stumbling blocks to the wider acceptance of deep learning for clinical applications. Ensuing work therefore aimed to visualise DCNN decisions attributed by relevance heatmaps using Layer-Wise Relevance Propagation (LRP). Through the LRP framework, the patterns captured from smartphone-based inertial sensor data that were reflective of those who are healthy versus persons with MS (PwMS) could begin to be established and understood. Interpretations suggested that cadence-based measures, gait speed, and ambulation-related signal perturbations were distinct characteristics that distinguished MS disability from healthy participants. Robust and interpretable outcomes, generated from high-frequency out-of-clinic assessments, could greatly augment the current in-clinic assessment picture for PwMS, to inform better disease management techniques, and enable the development of better therapeutic interventions.

Submitted 22 June, 2021; v1 submitted 16 March, 2021; originally announced March 2021.

arXiv:2102.12245 [pdf, other]

Estimation of Continuous Blood Pressure from PPG via a Federated Learning Approach

Authors: Eoin Brophy, Maarten De Vos, Geraldine Boylan, Tomas Ward

Abstract: Ischemic heart disease is the highest cause of mortality globally each year. This not only puts a massive strain on the lives of those affected but also on the public healthcare systems. To understand the dynamics of the healthy and unhealthy heart, doctors commonly use electrocardiogram (ECG) and blood pressure (BP) readings. These methods are often quite invasive, in particular when continuous arterial blood pressure (ABP) readings are taken, not to mention very costly. Using machine learning methods, we seek to develop a framework that is capable of inferring ABP from a single optical photoplethysmogram (PPG) sensor alone. We train our framework across distributed models and data sources to mimic a large-scale distributed collaborative learning experiment that could be implemented across low-cost wearables. Our time series-to-time series generative adversarial network (T2TGAN) is capable of high-quality continuous ABP generation from a PPG signal with a mean error of 2.54 mmHg and a standard deviation of 23.7 mmHg when estimating mean arterial pressure on a previously unseen, noisy, independent dataset. To our knowledge, this framework is the first example of a GAN capable of continuous ABP generation from an input PPG signal that also uses a federated learning methodology.

Submitted 24 February, 2021; originally announced February 2021.

arXiv:2008.09524 [pdf, other]

doi 10.1109/TSP.2021.3087031

Change Point Detection in Time Series Data using Autoencoders with a Time-Invariant Representation

Authors: Tim De Ryck, Maarten De Vos, Alexander Bertrand

Abstract: Change point detection (CPD) aims to locate abrupt property changes in time series data. Recent CPD methods demonstrated the potential of using deep learning techniques, but often lack the ability to identify more subtle changes in the autocorrelation statistics of the signal and suffer from a high false alarm rate. To address these issues, we employ an autoencoder-based methodology with a novel loss function, through which the used autoencoders learn a partially time-invariant representation that is tailored for CPD. The result is a flexible method that allows the user to indicate whether change points should be sought in the time domain, frequency domain or both. Detectable change points include abrupt changes in the slope, mean, variance, autocorrelation function and frequency spectrum. We demonstrate that our proposed method is consistently highly competitive or superior to baseline methods on diverse simulated and real-life benchmark data sets. Finally, we mitigate the issue of false detection alarms through the use of a postprocessing procedure that combines a matched filter and a newly proposed change point score. We show that this combination drastically improves the performance of our method as well as all baseline methods.

Submitted 10 February, 2021; v1 submitted 21 August, 2020; originally announced August 2020.

arXiv:2007.05492 [pdf, other]

doi 10.1109/TPAMI.2021.3070057

XSleepNet: Multi-View Sequential Model for Automatic Sleep Staging

Authors: Huy Phan, Oliver Y. Chén, Minh C. Tran, Philipp Koch, Alfred Mertins, Maarten De Vos

Abstract: Automating sleep staging is vital to scale up sleep assessment and diagnosis to serve millions experiencing sleep deprivation and disorders and enable longitudinal sleep monitoring in home environments. Learning from raw polysomnography signals and their derived time-frequency image representations has been prevalent. However, learning from multi-view inputs (e.g., both the raw signals and the time-frequency images) for sleep staging is difficult and not well understood. This work proposes a sequence-to-sequence sleep staging model, XSleepNet, that is capable of learning a joint representation from both raw signals and time-frequency images. Since different views may generalize or overfit at different rates, the proposed network is trained such that the learning pace on each view is adapted based on their generalization/overfitting behavior. In simple terms, the learning on a particular view is speeded up when it is generalizing well and slowed down when it is overfitting. View-specific generalization/overfitting measures are computed on-the-fly during the training course and used to derive weights to blend the gradients from different views. As a result, the network is able to retain the representation power of different views in the joint features which represent the underlying distribution better than those learned by each individual view alone. Furthermore, the XSleepNet architecture is principally designed to gain robustness to the amount of training data and to increase the complementarity between the input views. Experimental results on five databases of different sizes show that XSleepNet consistently outperforms the single-view baselines and the multi-view baseline with a simple fusion strategy. Finally, XSleepNet also outperforms prior sleep staging methods and improves previous state-of-the-art results on the experimental databases.

Submitted 31 March, 2021; v1 submitted 8 July, 2020; originally announced July 2020.

Comments: This article has been published in IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI)

arXiv:2007.02064 [pdf]

Monitoring Depression in Bipolar Disorder using Circadian Measures from Smartphone Accelerometers

Authors: Oliver Carr, Fernando Andreotti, Kate E. A. Saunders, Niclas Palmius, Guy M. Goodwin, Maarten De Vos

Abstract: Current management of bipolar disorder relies on self-reported questionnaires and interviews with clinicians. The development of objective measures of deteriorating mood may also allow for early interventions to take place to avoid transitions into depressive states. The objective of this study was to use acceleration data recorded from smartphones to predict levels of depression in a population of participants diagnosed with bipolar disorder. Data were collected from 52 participants, with a mean of 37 weeks of acceleration data with a corresponding depression score recorded per participant. Time varying hidden Markov models were used to extract weekly features of activity, sleep and circadian rhythms. Personalised regression achieved mean absolute errors of 1.00(0.57) from a possible scale of 0 to 27 and was able to classify depression with an accuracy of 0.84(0.16). The results demonstrate features derived from smartphone accelerometers are able to provide objective markers of depression. Low barriers for uptake exist due to the widespread use of smartphones, with personalised models able to account for differences in the behaviour of individuals and provide accurate predictions of depression.

Submitted 4 July, 2020; originally announced July 2020.

Comments: 8 pages, 3 figures

arXiv:2004.11349 [pdf, ps, other]

doi 10.1088/1361-6579/ab921e

Personalized Automatic Sleep Staging with Single-Night Data: a Pilot Study with KL-Divergence Regularization

Authors: Huy Phan, Kaare Mikkelsen, Oliver Y. Chén, Philipp Koch, Alfred Mertins, Preben Kidmose, Maarten De Vos

Abstract: Brain waves vary between people. An obvious way to improve automatic sleep staging for longitudinal sleep monitoring is personalization of algorithms based on individual characteristics extracted from the first night of data. As a single night is a very small amount of data to train a sleep staging model, we propose a Kullback-Leibler (KL) divergence regularized transfer learning approach to address this problem. We employ the pretrained SeqSleepNet (i.e. the subject independent model) as a starting point and finetune it with the single-night personalization data to derive the personalized model. This is done by adding the KL divergence between the output of the subject independent model and the output of the personalized model to the loss function during finetuning. In effect, KL-divergence regularization prevents the personalized model from overfitting to the single-night data and straying too far away from the subject independent model. Experimental results on the Sleep-EDF Expanded database with 75 subjects show that sleep staging personalization with a single-night data is possible with help of the proposed KL-divergence regularization. On average, we achieve a personalized sleep staging accuracy of 79.6%, a Cohen's kappa of 0.706, a macro F1-score of 73.0%, a sensitivity of 71.8%, and a specificity of 94.2%. We find both that the approach is robust against overfitting and that it improves the accuracy by 4.5 percentage points compared to non-personalization and 2.2 percentage points compared to personalization without regularization.
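
The regularized loss described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the weight `lam`, the use of cross-entropy as the base loss, and the direction of the KL term (subject-independent output as the reference distribution) are all assumptions made for illustration.

```python
import math

def softmax(logits):
    """Convert raw scores into a probability distribution."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))

def regularized_loss(personal_logits, generic_logits, label, lam=0.5):
    """Cross-entropy on the true label plus a KL penalty (hypothetical weight lam).

    personal_logits: output of the model being finetuned
    generic_logits:  output of the frozen subject-independent model
    """
    p_personal = softmax(personal_logits)
    p_generic = softmax(generic_logits)
    cross_entropy = -math.log(p_personal[label] + 1e-12)
    # Assumed direction: the subject-independent output is the reference.
    penalty = kl_divergence(p_generic, p_personal)
    return cross_entropy + lam * penalty
```

With `lam=0` the loss reduces to plain finetuning; a larger `lam` pulls the personalized output back towards the subject-independent one, which is the overfitting guard the abstract describes.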

Submitted 11 May, 2020; v1 submitted 23 April, 2020; originally announced April 2020.

Comments: This article has been published in Physiological Measurement

arXiv:2004.05046 [pdf, other]

XChange: A Blockchain-based Mechanism for Generic Asset Trading In Resource-constrained Environments

Authors: Martijn de Vos, Can Umut Ileri, Johan Pouwelse

Abstract: An increasing number of industries rely on Internet-of-Things devices to track physical resources. Blockchain technology provides primitives to represent these resources as digital assets on a secure distributed ledger. Due to the proliferation of blockchain-based assets, there is an increasing need for a generic mechanism to trade assets between isolated platforms. To date, there is no such mechanism without reliance on a trusted third party. In this work, we address this shortcoming and present XChange. Unlike existing approaches for decentralized asset trading, we decouple trade management and the actual exchange of assets. XChange mediates trade of any digital asset between isolated blockchain platforms while limiting the fraud conducted by adversarial parties. We first describe a generic, five-phase trading protocol that establishes and executes trade between individuals. This protocol accounts full trade specifications on a separate blockchain. We then devise a lightweight system architecture, composed of all required components for a generic asset marketplace. We implement XChange and conduct real-world experimentation. We leverage an existing, lightweight blockchain, TrustChain, to account all orders and full trade specifications. By deploying XChange on multiple low-resource devices, we show that a full trade completes within half a second. To quantify the scalability of our mechanism, we conduct further experiments on our compute cluster. We conclude that the throughput of XChange, in terms of trades per second, scales linearly with the system load. Furthermore, we find that XChange exhibits superior throughput and order fulfil latency compared to related decentralized exchanges, BitShares and Waves.

Submitted 10 April, 2020; originally announced April 2020.

arXiv:2004.02575 [pdf, other]

A Norm Emergence Framework for Normative MAS -- Position Paper

Authors: Andreasa Morris-Martin, Marina De Vos, Julian Padget

Abstract: Norm emergence is typically studied in the context of multiagent systems (MAS) where norms are implicit, and participating agents use simplistic decision-making mechanisms. These implicit norms are usually unconsciously shared and adopted through agent interaction. A norm is deemed to have emerged when a threshold or predetermined percentage of agents follow the "norm". Conversely, in normative MAS, norms are typically explicit and agents deliberately share norms through communication or are informed about norms by an authority, following which an agent decides whether to adopt the norm or not. The decision to adopt a norm by the agent can happen immediately after recognition or when an applicable situation arises. In this paper, we make the case that, similarly, a norm has emerged in a normative MAS when a percentage of agents adopt the norm. Furthermore, we posit that agents themselves can and should be involved in norm synthesis, and hence influence the norms governing the MAS, in line with Ostrom's eight principles. Consequently, we put forward a framework for the emergence of norms within a normative MAS, that allows participating agents to propose/request changes to the normative system, while special-purpose synthesizer agents formulate new norms or revisions in response to these requests. Synthesizers must collectively agree that the new norm or norm revision should proceed, and then finally be approved by an "Oracle". The normative system is then modified to incorporate the norm.

Submitted 6 April, 2020; originally announced April 2020.

Comments: 16 pages, 2 figures, pre-print for International Workshop on Coordination, Organizations, Institutions, Norms and Ethics for Governance of Multi-Agent Systems (COINE), co-located with AAMAS 2020

arXiv:2004.01726 [pdf, other]

doi 10.1093/mnras/staa950

LOFAR 144-MHz follow-up observations of GW170817

Authors: J. W. Broderick, T. W. Shimwell, K. Gourdji, A. Rowlinson, S. Nissanke, K. Hotokezaka, P. G. Jonker, C. Tasse, M. J. Hardcastle, J. B. R. Oonk, R. P. Fender, R. A. M. J. Wijers, A. Shulevski, A. J. Stewart, S. ter Veen, V. A. Moss, M. H. D. van der Wiel, D. A. Nichols, A. Piette, M. E. Bell, D. Carbone, S. Corbel, J. Eislöffel, J. -M. Grießmeier, E. F. Keane, et al. (44 additional authors not shown)

Abstract: We present low-radio-frequency follow-up observations of AT 2017gfo, the electromagnetic counterpart of GW170817, which was the first binary neutron star merger to be detected by Advanced LIGO-Virgo. These data, with a central frequency of 144 MHz, were obtained with LOFAR, the Low-Frequency Array. The maximum elevation of the target is just 13.7 degrees when observed with LOFAR, making our observations particularly challenging to calibrate and significantly limiting the achievable sensitivity. On time-scales of 130-138 and 371-374 days after the merger event, we obtain $3\sigma$ upper limits for the afterglow component of 6.6 and 19.5 mJy beam$^{-1}$, respectively. Using our best upper limit and previously published, contemporaneous higher-frequency radio data, we place a limit on any potential steepening of the radio spectrum between 610 and 144 MHz: the two-point spectral index $\alpha^{610}_{144} \gtrsim -2.5$. We also show that LOFAR can detect the afterglows of future binary neutron star merger events occurring at more favourable elevations.
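
As a brief illustration of how a two-point spectral index relates flux densities at two frequencies, the sketch below assumes the common convention $S_\nu \propto \nu^{\alpha}$; the abstract does not spell out its convention, and the function name is hypothetical. Under that convention an upper limit on the low-frequency flux density translates into a lower limit on $\alpha$, which matches the form of the limit quoted above.

```python
import math

def two_point_spectral_index(freq_lo_mhz, flux_lo, freq_hi_mhz, flux_hi):
    """Spectral index alpha between two frequencies, assuming S ~ nu**alpha.

    Both flux densities must be in the same unit (e.g. mJy).
    """
    return math.log(flux_hi / flux_lo) / math.log(freq_hi_mhz / freq_lo_mhz)

# Illustrative only: the 610 MHz value below is a made-up placeholder,
# not a measurement taken from the paper.
upper_limit_144 = 6.6        # mJy, the best 3-sigma upper limit quoted in the abstract
hypothetical_flux_610 = 0.2  # mJy, assumed for the example
alpha_lower_limit = two_point_spectral_index(144.0, upper_limit_144,
                                             610.0, hypothetical_flux_610)
```

Because the 144 MHz value is an upper limit, any true flux density below it yields a larger (less negative) index, so the computed number acts as a lower bound.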

Submitted 3 April, 2020; originally announced April 2020.

Comments: 9 pages, 2 figures, accepted for publication in MNRAS

arXiv:2002.10431 [pdf, other]

doi 10.1051/0004-6361/201936844

Cassiopeia A, Cygnus A, Taurus A, and Virgo A at ultra-low radio frequencies

Authors: F. de Gasperin, J. Vink, J. P. McKean, A. Asgekar, M. J. Bentum, R. Blaauw, A. Bonafede, M. Bruggen, F. Breitling, W. N. Brouw, H. R. Butcher, B. Ciardi, V. Cuciti, M. de Vos, S. Duscha, J. Eisloffel, D. Engels, R. A. Fallows, T. M. O. Franzen, M. A. Garrett, A. W. Gunst, J. Horandel, G. Heald, L. V. E. Koopmans, A. Krankowski, et al. (27 additional authors not shown)

Abstract: The four persistent radio sources in the northern sky with the highest flux density at metre wavelengths are Cassiopeia A, Cygnus A, Taurus A, and Virgo A; collectively they are called the A-team. Their flux densities at ultra-low frequencies (<100 MHz) can reach several thousands of janskys, and they often contaminate observations of the low-frequency sky by interfering with image processing. Furthermore, these sources are foreground objects for all-sky observations hampering the study of faint signals, such as the cosmological 21 cm line from the epoch of reionisation. We aim to produce robust models for the surface brightness emission as a function of frequency for the A-team sources at ultra-low frequencies. These models are needed for the calibration and imaging of wide-area surveys of the sky with low-frequency interferometers. This requires obtaining images at an angular resolution better than 15 arcsec with a high dynamic range and good image fidelity. We observed the A-team with the Low Frequency Array (LOFAR) at frequencies between 30 MHz and 77 MHz using the Low Band Antenna (LBA) system. We reduced the datasets and obtained an image for each A-team source. The paper presents the best models to date for the sources Cassiopeia A, Cygnus A, Taurus A, and Virgo A between 30 MHz and 77 MHz. We were able to obtain the aimed resolution and dynamic range in all cases. Owing to its compactness and complexity, observations with the long baselines of the International LOFAR Telescope will be required to improve the source model for Cygnus A further.

Submitted 24 February, 2020; originally announced February 2020.

Comments: 7 pages, 2 figures, accepted A&A, online data on A&A website

arXiv:2002.07270 [pdf, other]

Thou Shalt Not Reject the P-value

Authors: Oliver Y. Chén, Raúl G. Saraiva, Guy Nagels, Huy Phan, Tom Schwantje, Hengyi Cao, Jiangtao Gou, Jenna M. Reinen, Bin Xiong, Bangdong Zhi, Xiaojun Wang, Maarten de Vos

Abstract: Since its debut in the 18th century, the P-value has been an important part of hypothesis testing-based scientific discoveries. As the statistical engine accelerates, questions are beginning to be raised, asking to what extent scientific discoveries based on P-values are reliable and reproducible, and the voice calling for adjusting the significance level or banning the P-value has been increasingly heard. Inspired by these questions and discussions, here we enquire into the useful roles and misuses of the P-value in scientific studies. For common misuses and misinterpretations, we provide modest recommendations for practitioners. Additionally, we compare statistical significance with clinical relevance. In parallel, we review the Bayesian alternatives for seeking evidence. Finally, we discuss the promises and risks of using meta-analysis to pool P-values from multiple studies to aggregate evidence. Taken together, the P-value underpins a useful probabilistic decision-making system and provides evidence at a continuous scale. But its interpretation must be contextual, considering the scientific question, experimental design (including the model specification, sample size, and significance level), statistical power, effect size, and reproducibility.

Submitted 28 July, 2022; v1 submitted 17 February, 2020; originally announced February 2020.

arXiv:2001.05532 [pdf, other]

doi 10.1109/LSP.2020.3025020

Improving GANs for Speech Enhancement

Authors: Huy Phan, Ian V. McLoughlin, Lam Pham, Oliver Y. Chén, Philipp Koch, Maarten De Vos, Alfred Mertins

Abstract: Generative adversarial networks (GAN) have recently been shown to be efficient for speech enhancement. However, most, if not all, existing speech enhancement GANs (SEGAN) make use of a single generator to perform one-stage enhancement mapping. In this work, we propose to use multiple generators that are chained to perform multi-stage enhancement mapping, which gradually refines the noisy input signals in a stage-wise fashion. Furthermore, we study two scenarios: (1) the generators share their parameters and (2) the generators' parameters are independent. The former constrains the generators to learn a common mapping that is iteratively applied at all enhancement stages and results in a small model footprint. On the contrary, the latter allows the generators to flexibly learn different enhancement mappings at different stages of the network at the cost of an increased model size. We demonstrate that the proposed multi-stage enhancement approach outperforms the one-stage SEGAN baseline, where the independent generators lead to more favorable results than the tied generators. The source code is available at http://github.com/pquochuy/idsegan.

Submitted 12 September, 2020; v1 submitted 15 January, 2020; originally announced January 2020.

Comments: This letter has been accepted for publication in IEEE Signal Processing Letters

arXiv:1910.11702 [pdf]

Screening for REM Sleep Behaviour Disorder with Minimal Sensors

Authors: Navin Cooray, Fernando Andreotti, Christine Lo, Mkael Symmonds, Michele T. M. Hu, Maarten De Vos

Abstract: Rapid-Eye-Movement (REM) sleep behaviour disorder (RBD) is an early predictor of Parkinson's disease, dementia with Lewy bodies, and multiple system atrophy. This study investigates a minimal set of sensors to achieve effective screening for RBD in the population, integrating automated sleep staging (three state) followed by RBD detection without the need for cumbersome electroencephalogram (EEG) sensors. Polysomnography signals from 50 participants with RBD and 50 age-matched healthy controls were used to evaluate this study. Three stage sleep classification was achieved using a Random Forest (RF) classifier and features derived from a combination of cost-effective and easy to use sensors, namely electrocardiogram (ECG), electrooculogram (EOG), and electromyogram (EMG) channels. Subsequently, RBD detection was achieved using established and new metrics derived from ECG and EMG metrics. The EOG and EMG combination provided the best minimalist fully automated performance, achieving $0.57\pm0.19$ kappa (3 stage) for sleep staging and an RBD detection accuracy of $0.90\pm0.11$, (sensitivity, and specificity $0.88\pm0.13$, and $0.92\pm0.098$). A single ECG sensor allowed three state sleep staging with $0.28\pm0.06$ kappa and RBD detection accuracy of $0.62\pm0.10$. This study demonstrated the feasibility of using signals from a single EOG and EMG sensor to detect RBD using fully-automated techniques. This study proposes a cost-effective, practical, and simple RBD identification support tool using only two sensors (EMG and EOG), ideal for screening purposes.

Submitted 24 October, 2019; originally announced October 2019.

Comments: 21 pages, 6 figures, and 6 tables. arXiv admin note: text overlap with arXiv:1811.04662

arXiv:1909.07646

doi 10.4204/EPTCS.306

Proceedings 35th International Conference on Logic Programming (Technical Communications)

Authors: Bart Bogaerts, Esra Erdem, Paul Fodor, Andrea Formisano, Giovambattista Ianni, Daniela Inclezan, German Vidal, Alicia Villanueva, Marina De Vos, Fangkai Yang

Abstract: Since the first conference held in Marseille in 1982, ICLP has been the premier international event for presenting research in logic programming. Contributions are sought in all areas of logic programming, including but not restricted to: Foundations: Semantics, Formalisms, Nonmonotonic reasoning, Knowledge representation. Languages: Concurrency, Objects, Coordination, Mobility, Higher Order, Types, Modes, Assertions, Modules, Meta-programming, Logic-based domain-specific languages, Programming Techniques. Declarative programming: Declarative program development, Analysis, Type and mode inference, Partial evaluation, Abstract interpretation, Transformation, Validation, Verification, Debugging, Profiling, Testing, Execution visualization. Implementation: Virtual machines, Compilation, Memory management, Parallel/distributed execution, Constraint handling rules, Tabling, Foreign interfaces, User interfaces. Related Paradigms and Synergies: Inductive and Co-inductive Logic Programming, Constraint Logic Programming, Answer Set Programming, Interaction with SAT, SMT and CSP solvers, Logic programming techniques for type inference and theorem proving, Argumentation, Probabilistic Logic Programming, Relations to object-oriented and Functional programming. Applications: Databases, Big Data, Data integration and federation, Software engineering, Natural language processing, Web and Semantic Web, Agents, Artificial intelligence, Computational life sciences, Education, Cybersecurity, and Robotics.

Submitted 17 September, 2019; originally announced September 2019.

Journal ref: EPTCS 306, 2019

arXiv:1907.13177 [pdf, ps, other]

doi 10.1109/TBME.2020.3020381

Towards More Accurate Automatic Sleep Staging via Deep Transfer Learning

Authors: Huy Phan, Oliver Y. Chén, Philipp Koch, Zongqing Lu, Ian McLoughlin, Alfred Mertins, Maarten De Vos

Abstract: Background: Despite recent significant progress in the development of automatic sleep staging methods, building a good model still remains a big challenge for sleep studies with a small cohort due to the data-variability and data-inefficiency issues. This work presents a deep transfer learning approach to overcome these issues and enable transferring knowledge from a large dataset to a small cohort for automatic sleep staging. Methods: We start from a generic end-to-end deep learning framework for sequence-to-sequence sleep staging and derive two networks as the means for transfer learning. The networks are first trained in the source domain (i.e. the large database). The pretrained networks are then finetuned in the target domain (i.e. the small cohort) to complete knowledge transfer. We employ the Montreal Archive of Sleep Studies (MASS) database consisting of 200 subjects as the source domain and study deep transfer learning on three different target domains: the Sleep Cassette subset and the Sleep Telemetry subset of the Sleep-EDF Expanded database, and the Surrey-cEEGrid database. The target domains are purposely adopted to cover different degrees of data mismatch to the source domains. Results: Our experimental results show significant performance improvement on automatic sleep staging on the target domains achieved with the proposed deep transfer learning approach. Conclusions: These results suggest the efficacy of the proposed approach in addressing the above-mentioned data-variability and data-inefficiency issues. Significance: As a consequence, it would enable one to improve the quality of automatic sleep staging models when the amount of data is relatively small. The source code and the pretrained models are available at http://github.com/pquochuy/sleep_transfer_learning.

Submitted 27 August, 2020; v1 submitted 30 July, 2019; originally announced July 2019.

Comments: This article has been published in IEEE Transactions on Biomedical Engineering

arXiv:1905.12382 [pdf]

NeoGuard: a public, online learning platform for neonatal seizures

Authors: Amir Hossein Ansari, Perumpillichira Joseph Cherian, Alexander Caicedo, Anneleen Dereymaeker, Katrien Jansen, Leen De Wispelaere, Charlotte Dielman, Jan Vervisch, Paul Govaert, Maarten De Vos, Gunnar Naulaers, Sabine Van Huffel

Abstract: Seizures occur in the neonatal period more frequently than other periods of life and usually denote the presence of serious brain dysfunction. The gold standard for detecting seizures is based on visual inspection of continuous electroencephalogram (cEEG) complemented by video analysis, performed by an expert clinical neurophysiologist. Previous studies have reported varying degrees of agreement between expert EEG readers, with kappa coefficients ranging from 0.4 to 0.85, calling into question the validity of visual scoring. This variability in visual scoring of neonatal seizures may be due to factors such as reader expertise and the nature of expressed patterns. One of the possible reasons for low inter-rater agreement is the absence of any benchmark for the EEG readers to be able to compare their opinions. One way to develop this is to use a shared multi-center neonatal seizure database and use the inputs from multiple experts. This will also improve the teaching of trainees, and help to avoid potential bias from a single expert's opinion. In this paper, we introduce and explain the NeoGuard public learning platform that can be used by trainees, tutors, and expert EEG readers who are interested in testing their knowledge and learning from neonatal EEG-polygraphic segments scored by several expert EEG readers. For this platform, 1919 clinically relevant segments, totaling 280h, recorded from 71 term neonates in two centers, including a wide variety of seizures and artifacts, were used. These segments were scored by 4 EEG readers from three different centers. Users of this platform can score an arbitrary number of segments and then test their scoring against the experts' opinions. The kappa and joint probability of agreement are then shown as inter-rater agreement metrics between the user and each of the experts. The platform is publicly available at the NeoGuard website (www.neoguard.net).
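
The two inter-rater agreement metrics named in this abstract can be sketched as follows. This is a generic illustration of Cohen's kappa and the joint probability of agreement for two raters, not the platform's own code; the function names and the example labels are hypothetical.

```python
from collections import Counter

def joint_agreement(rater_a, rater_b):
    """Fraction of segments on which the two raters give the same label."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must score the same non-empty set of segments")
    return sum(a == b for a, b in zip(rater_a, rater_b)) / len(rater_a)

def cohens_kappa(rater_a, rater_b):
    """Agreement corrected for the agreement expected by chance."""
    n = len(rater_a)
    observed = joint_agreement(rater_a, rater_b)
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    labels = set(rater_a) | set(rater_b)
    expected = sum(counts_a[k] * counts_b[k] for k in labels) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used one identical label throughout
    return (observed - expected) / (1.0 - expected)

# Hypothetical scores: "s" = seizure, "n" = non-seizure
user = ["s", "s", "n", "n"]
expert = ["s", "n", "n", "n"]
print(joint_agreement(user, expert))  # 0.75
print(cohens_kappa(user, expert))     # 0.5
```

Kappa is lower than the raw agreement here because half of the observed agreement would be expected by chance given each rater's label frequencies.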

Submitted 29 May, 2019; originally announced May 2019.

arXiv:1904.05945 [pdf, ps, other]

Deep Transfer Learning for Single-Channel Automatic Sleep Staging with Channel Mismatch

Authors: Huy Phan, Oliver Y. Chén, Philipp Koch, Alfred Mertins, Maarten De Vos

Abstract: Many sleep studies suffer from the problem of insufficient data to fully utilize deep neural networks as different labs use different recording setups, leading to the need to train automated algorithms on rather small databases, whereas large annotated databases are around but cannot be directly included into these studies for data compensation due to channel mismatch. This work presents a deep transfer learning approach to overcome the channel mismatch problem and transfer knowledge from a large dataset to a small cohort to study automatic sleep staging with single-channel input. We employ the state-of-the-art SeqSleepNet and train the network in the source domain, i.e. the large dataset. Afterwards, the pretrained network is finetuned in the target domain, i.e. the small cohort, to complete knowledge transfer. We study two transfer learning scenarios with slight and heavy channel mismatch between the source and target domains. We also investigate whether, and if so, how finetuning entirely or partially the pretrained network would affect the performance of sleep staging on the target domain. Using the Montreal Archive of Sleep Studies (MASS) database consisting of 200 subjects as the source domain and the Sleep-EDF Expanded database consisting of 20 subjects as the target domain in this study, our experimental results show significant performance improvement on sleep staging achieved with the proposed deep transfer learning approach. Furthermore, these results also reveal that finetuning the feature-learning parts of the pretrained network is essential to bypass the channel mismatch problem.

Submitted 18 June, 2019; v1 submitted 11 April, 2019; originally announced April 2019.

Comments: Accepted for 27th European Signal Processing Conference (EUSIPCO 2019)

arXiv:1904.03543 [pdf, ps, other]

Spatio-Temporal Attention Pooling for Audio Scene Classification

Authors: Huy Phan, Oliver Y. Chén, Lam Pham, Philipp Koch, Maarten De Vos, Ian McLoughlin, Alfred Mertins

Abstract: Acoustic scenes are rich and redundant in their content. In this work, we present a spatio-temporal attention pooling layer coupled with a convolutional recurrent neural network to learn from patterns that are discriminative while suppressing those that are irrelevant for acoustic scene classification. The convolutional layers in this network learn invariant features from time-frequency input. The bidirectional recurrent layers are then able to encode the temporal dynamics of the resulting convolutional features. Afterwards, a two-dimensional attention mask is formed via the outer product of the spatial and temporal attention vectors learned from two designated attention layers to weigh and pool the recurrent output into a final feature vector for classification. The network is trained with between-class examples generated from between-class data augmentation. Experiments demonstrate that the proposed method not only outperforms a strong convolutional neural network baseline but also sets new state-of-the-art performance on the LITIS Rouen dataset.
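
The outer-product attention mask described in this abstract can be sketched in a few lines. The sketch makes several assumptions that the abstract does not state: the recurrent output is a time-by-feature matrix, the two attention vectors are already given (in the paper they are learned by attention layers), and pooling is a weighted sum over time. All names are hypothetical.

```python
def attention_pool(recurrent_out, temporal_att, spatial_att):
    """Pool a T x F recurrent output into an F-dimensional feature vector.

    recurrent_out: list of T rows, each with F feature values
    temporal_att:  T attention weights, one per time step
    spatial_att:   F attention weights, one per feature dimension
    """
    n_steps, n_feats = len(recurrent_out), len(recurrent_out[0])
    # Two-dimensional mask: outer product of the two attention vectors.
    mask = [[temporal_att[t] * spatial_att[f] for f in range(n_feats)]
            for t in range(n_steps)]
    # Weigh every entry by the mask, then sum over time (assumed pooling rule).
    return [sum(mask[t][f] * recurrent_out[t][f] for t in range(n_steps))
            for f in range(n_feats)]

pooled = attention_pool([[1.0, 2.0], [3.0, 4.0]],
                        temporal_att=[0.5, 0.5],
                        spatial_att=[1.0, 0.0])
print(pooled)  # [2.0, 0.0]
```

In the example the spatial weights suppress the second feature entirely, while the temporal weights average the first feature over both time steps.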

Submitted 28 June, 2019; v1 submitted 6 April, 2019; originally announced April 2019.

Comments: To appear at the 20th Annual Conference of the International Speech Communication Association (INTERSPEECH 2019)

arXiv:1811.04662 [pdf]

Detection of REM Sleep Behaviour Disorder by Automated Polysomnography Analysis

Authors: Navin Cooray, Fernando Andreotti, Christine Lo, Mkael Symmonds, Michele T. M. Hu, Maarten De Vos

Abstract: Evidence suggests Rapid-Eye-Movement (REM) Sleep Behaviour Disorder (RBD) is an early predictor of Parkinson's disease. This study proposes a fully-automated framework for RBD detection consisting of automated sleep staging followed by RBD identification. Analysis was assessed using a limited polysomnography montage from 53 participants with RBD and 53 age-matched healthy controls. Sleep stage classification was achieved using a Random Forest (RF) classifier and 156 features extracted from electroencephalogram (EEG), electrooculogram (EOG) and electromyogram (EMG) channels. For RBD detection, an RF classifier was trained combining established techniques to quantify muscle atonia with additional features that incorporate sleep architecture and the EMG fractal exponent. Automated multi-state sleep staging achieved a 0.62 Cohen's Kappa score. RBD detection accuracy improved by 10% to 96% (compared to individual established metrics) when using manually annotated sleep staging. Accuracy remained high (92%) when using automated sleep staging. This study outperforms established metrics and demonstrates that incorporating sleep architecture and sleep stage transitions can benefit RBD detection. This study also achieved automated sleep staging with a level of accuracy comparable to manual annotation. This study validates a tractable, fully-automated, and sensitive pipeline for RBD identification that could be translated to wearable take-home technology.

Submitted 12 November, 2018; originally announced November 2018.

Comments: 20 pages, 3 figures

arXiv:1811.01095 [pdf, ps, other]

Beyond Equal-Length Snippets: How Long is Sufficient to Recognize an Audio Scene?

Authors: Huy Phan, Oliver Y. Chén, Philipp Koch, Lam Pham, Ian McLoughlin, Alfred Mertins, Maarten De Vos

Abstract: Due to the variability in characteristics of audio scenes, some scenes can naturally be recognized earlier than others. In this work, rather than using equal-length snippets for all scene categories, as is common in the literature, we study to which temporal extent an audio scene can be reliably recognized given state-of-the-art models. Moreover, as model fusion with deep network ensembles is prevalent in audio scene classification, we further study whether, and if so, when model fusion is necessary for this task. To achieve these goals, we employ two single-network systems relying on a convolutional neural network and a recurrent neural network for classification as well as early fusion and late fusion of these networks. Experimental results on the LITIS-Rouen dataset show that some scenes can be reliably recognized within a few seconds while other scenes require significantly longer durations. In addition, model fusion is shown to be the most beneficial when the signal length is short.

Submitted 8 May, 2019; v1 submitted 2 November, 2018; originally announced November 2018.

Comments: Accepted to 2019 AES Conference on Audio Forensics

arXiv:1811.01092 [pdf, ps, other]

Unifying Isolated and Overlapping Audio Event Detection with Multi-Label Multi-Task Convolutional Recurrent Neural Networks

Authors: Huy Phan, Oliver Y. Chén, Philipp Koch, Lam Pham, Ian McLoughlin, Alfred Mertins, Maarten De Vos

Abstract: We propose a multi-label multi-task framework based on a convolutional recurrent neural network to unify detection of isolated and overlapping audio events. The framework leverages the power of convolutional recurrent neural network architectures; convolutional layers learn effective features over which higher recurrent layers perform sequential modelling. Furthermore, the output layer is designed to handle arbitrary degrees of event overlap. At each time step in the recurrent output sequence, an output triple is dedicated to each event category of interest to jointly model event occurrence and temporal boundaries. That is, the network jointly determines whether an event of this category occurs, and when it occurs, by estimating onset and offset positions at each recurrent time step. We then introduce three sequential losses for network training: multi-label classification loss, distance estimation loss, and confidence loss. We demonstrate good generalization on two datasets: ITC-Irst for isolated audio event detection, and TUT-SED-Synthetic-2016 for overlapping audio event detection.

Submitted 18 February, 2019; v1 submitted 2 November, 2018; originally announced November 2018.

Comments: Accepted for the 44th International Conference on Acoustics, Speech, and Signal Processing (ICASSP 2019)

arXiv:1809.10932 [pdf, ps, other]

doi 10.1109/TNSRE.2019.2896659

SeqSleepNet: End-to-End Hierarchical Recurrent Neural Network for Sequence-to-Sequence Automatic Sleep Staging

Authors: Huy Phan, Fernando Andreotti, Navin Cooray, Oliver Y. Chén, Maarten De Vos

Abstract: Automatic sleep staging has often been treated as a simple classification problem that aims at determining the label of individual target polysomnography (PSG) epochs one at a time. In this work, we tackle the task as a sequence-to-sequence classification problem that receives a sequence of multiple epochs as input and classifies all of their labels at once. For this purpose, we propose a hierarchical recurrent neural network named SeqSleepNet. At the epoch processing level, the network consists of a filterbank layer tailored to learn frequency-domain filters for preprocessing and an attention-based recurrent layer designed for short-term sequential modelling. At the sequence processing level, a recurrent layer is placed on top of the learned epoch-wise features for long-term modelling of sequential epochs. The classification is then carried out on the output vectors at every time step of the top recurrent layer to produce the sequence of output labels. Although the network is hierarchical, we present a strategy to train it in an end-to-end fashion. We show that the proposed network outperforms state-of-the-art approaches, achieving an overall accuracy, macro F1-score, and Cohen's kappa of 87.1%, 83.3%, and 0.815, respectively, on a publicly available dataset with 200 subjects.

Submitted 1 February, 2019; v1 submitted 28 September, 2018; originally announced September 2018.

Comments: This article has been published in IEEE Transactions on Neural Systems and Rehabilitation Engineering

Showing 1–50 of 82 results for author: de Vos, M