BIOT: Cross-data Biosignal Learning in the Wild

Authors: Chaoqi Yang, M. Brandon Westover, Jimeng Sun

Abstract: Biological signals, such as electroencephalograms (EEG), play a crucial role in numerous clinical applications, exhibiting diverse data formats and quality profiles. Current deep learning models for biosignals are typically specialized for specific datasets and clinical settings, limiting their broader applicability. Motivated by the success of large language models in text processing, we explore the development of foundational models that are trained from multiple data sources and can be fine-tuned on different downstream biosignal tasks. To overcome the unique challenges associated with biosignals of various formats, such as mismatched channels, variable sample lengths, and prevalent missing values, we propose a Biosignal Transformer (BIOT). The proposed BIOT model can enable cross-data learning with mismatched channels, variable lengths, and missing values by tokenizing diverse biosignals into unified "biosignal sentences". Specifically, we tokenize each channel into fixed-length segments containing local signal features, flattening them to form consistent "sentences". Channel embeddings and relative position embeddings are added to preserve spatio-temporal features. The BIOT model is versatile and applicable to various biosignal learning settings across different datasets, including joint pre-training for larger models. Comprehensive evaluations on EEG, electrocardiogram (ECG), and human activity sensory signals demonstrate that BIOT outperforms robust baselines in common settings and facilitates learning across multiple datasets with different formats. Taking the CHB-MIT seizure detection task as an example, our vanilla BIOT model shows a 3% improvement over baselines in balanced accuracy, and the pre-trained BIOT models (optimized from other data sources) can further bring up to 4% improvements.

Submitted 10 May, 2023; originally announced May 2023.

Comments: expect the codebases and pre-trained models to be released in https://github.com/ycq091044/BIOT

arXiv:2110.15278 [pdf]

Self-supervised EEG Representation Learning for Automatic Sleep Staging

Authors: Chaoqi Yang, Danica Xiao, M. Brandon Westover, Jimeng Sun

Abstract: Background: Deep learning models have shown great success in automating tasks in sleep medicine by learning from carefully annotated electroencephalogram (EEG) data. However, effectively utilizing a large amount of raw EEG remains a challenge. Objective: In this paper, we aim to learn robust vector representations from massive unlabeled EEG signals, such that the learned vectorized features (1) are expressive enough to replace the raw signals in the sleep staging task; and (2) provide better predictive performance than supervised models in scenarios with fewer labels and noisy samples. Methods: We propose a self-supervised model, named Contrast with the World Representation (ContraWR), for EEG signal representation learning, which uses global statistics from the dataset to distinguish signals associated with different sleep stages. The ContraWR model is evaluated on three real-world EEG datasets that include both at-home and in-lab EEG recording settings. Results: ContraWR outperforms 4 recent self-supervised learning methods on the sleep staging task across 3 large EEG datasets. ContraWR also beats supervised learning when fewer training labels are available (e.g., a 4% accuracy improvement when less than 2% of the data is labeled). Moreover, the model provides informative representative feature structures in 2D projection. Conclusions: We show that ContraWR is robust to noise and can provide high-quality EEG representations for downstream prediction tasks. The proposed model can be generalized to other unsupervised physiological signal learning tasks. Future directions include exploring task-specific data augmentations and combining self-supervised with supervised methods, building upon the initial success of self-supervised learning in this paper.

Submitted 12 February, 2023; v1 submitted 27 October, 2021; originally announced October 2021.

Comments: Preprocessing and Code in Github: https://github.com/ycq091044/ContraWR

arXiv:2102.13473 [pdf]

Sleep Apnea and Respiratory Anomaly Detection from a Wearable Band and Oxygen Saturation

Authors: Wolfgang Ganglberger, Abigail A. Bucklin, Ryan A. Tesh, Madalena Da Silva Cardoso, Haoqi Sun, Michael J. Leone, Luis Paixao, Ezhil Panneerselvam, Elissa M. Ye, B. Taylor Thompson, Oluwaseun Akeju, David Kuller, Robert J. Thomas, M. Brandon Westover

Abstract: Objective: Sleep-related respiratory abnormalities are typically detected using polysomnography. There is a need in general medicine and critical care for a more convenient method to automatically detect sleep apnea from a simple, easy-to-wear device. The objective is to automatically detect abnormal respiration and estimate the Apnea-Hypopnea-Index (AHI) with a wearable respiratory device, compared to an SpO2 signal or polysomnography using a large (n = 412) dataset serving as ground truth. Methods: Simultaneously recorded polysomnographic (PSG) and wearable respiratory effort data were used to train and evaluate models in a cross-validation fashion. Time domain and complexity features were extracted, important features were identified, and a random forest model was employed to detect events and predict AHI. Four models were trained: one each using the respiratory features only, a feature from the SpO2 (%)-signal only, and two additional models that use the respiratory features and the SpO2 (%)-feature, one allowing a time lag of 30 seconds between the two signals. Results: Event-based classification resulted in areas under the receiver operating characteristic curves of 0.94, 0.86, 0.82, and areas under the precision-recall curves of 0.48, 0.32, 0.51 for the models using respiration and SpO2, respiration-only, and SpO2-only respectively. Correlation between expert-labelled and predicted AHI was 0.96, 0.78, and 0.93, respectively. Conclusions: A wearable respiratory effort signal with or without SpO2 predicted AHI accurately. Given the large dataset and rigorous testing design, we expect our models to generalize to evaluating respiration in a variety of environments, such as at home and in critical care.

Submitted 23 February, 2021; originally announced February 2021.

Comments: Co-First Authors: Wolfgang Ganglberger, Abigail A. Bucklin Co-Senior Authors: Robert J. Thomas, M. Brandon Westover

arXiv:1910.06100 [pdf, other]

SLEEPER: interpretable Sleep staging via Prototypes from Expert Rules

Authors: Irfan Al-Hussaini, Cao Xiao, M. Brandon Westover, Jimeng Sun

Abstract: Sleep staging is a crucial task for diagnosing sleep disorders. It is tedious and complex, as it can take a trained expert several hours to annotate just one patient's polysomnogram (PSG) from a single night. Although deep learning models have demonstrated state-of-the-art performance in automating sleep staging, interpretability, which defines other desiderata, has largely remained unexplored. In this study, we propose Sleep staging via Prototypes from Expert Rules (SLEEPER), which combines deep learning models with expert-defined rules using a prototype learning framework to generate simple interpretable models. In particular, SLEEPER utilizes sleep scoring rules and expert-defined features to derive prototypes, which are embeddings of PSG data fragments via convolutional neural networks. The final models are simple interpretable models like a shallow decision tree defined over those phenotypes. We evaluated SLEEPER using two PSG datasets collected from sleep studies and demonstrated that SLEEPER could provide accurate sleep stage classification comparable to human experts and deep neural networks, with about 85% ROC-AUC and 0.7 kappa.

Submitted 14 October, 2019; originally announced October 2019.

Comments: Machine Learning for Healthcare Conference (MLHC) 2019. Proceedings of Machine Learning Research 106

Journal ref: PMLR 106:721-739, 2019

arXiv:1908.11463 [pdf]

doi 10.1093/sleep/zsz306

Sleep Staging from Electrocardiography and Respiration with Deep Learning

Authors: Haoqi Sun, Wolfgang Ganglberger, Ezhil Panneerselvam, Michael J. Leone, Syed A. Quadri, Balaji Goparaju, Ryan A. Tesh, Oluwaseun Akeju, Robert J. Thomas, M. Brandon Westover

Abstract: Study Objective: Sleep is reflected not only in the electroencephalogram but also in heart rhythms and breathing patterns. Therefore, we hypothesize that it is possible to accurately stage sleep based on the electrocardiogram (ECG) and respiratory signals. Methods: Using a dataset including 8,682 polysomnographs, we develop deep neural networks to stage sleep from ECG and respiratory signals. Five deep neural networks consisting of convolutional networks and long short-term memory networks are trained to stage sleep using heart and breathing, including the timing of R peaks from ECG, abdominal and chest respiratory effort, and the combinations of these signals. Results: ECG in combination with the abdominal respiratory effort achieves the best performance for staging all five sleep stages, with a Cohen's kappa of 0.600 (95% confidence interval 0.599–0.602), and 0.762 (0.760–0.763) for discriminating awake vs. rapid eye movement vs. non-rapid eye movement sleep. The performance is better for young participants and for those with a low apnea-hypopnea index, while it is robust for commonly used outpatient medications. Conclusions: Our results validate that ECG and respiratory effort provide substantial information about sleep stages in a large population. This opens new possibilities in sleep research and applications where electroencephalography is not readily available or may be infeasible, such as in critically ill patients.

Submitted 15 September, 2019; v1 submitted 29 August, 2019; originally announced August 2019.

Comments: Contains supplementary material at the end. Sleep 2019

Showing 1–5 of 5 results for author: Westover, M B