
Safe and Interpretable Estimation of Optimal Treatment Regimes

Authors: Harsh Parikh, Quinn Lanners, Zade Akras, Sahar F. Zafar, M. Brandon Westover, Cynthia Rudin, Alexander Volfovsky

Abstract: Recent statistical and reinforcement learning methods have significantly advanced patient care strategies. However, these approaches face substantial challenges in high-stakes contexts, including missing data, inherent stochasticity, and the critical requirements for interpretability and patient safety. Our work operationalizes a safe and interpretable framework to identify optimal treatment regimes. This approach involves matching patients with similar medical and pharmacological characteristics, allowing us to construct an optimal policy via interpolation. We perform a comprehensive simulation study to demonstrate the framework's ability to identify optimal policies even in complex settings. Ultimately, we operationalize our approach to study regimes for treating seizures in critically ill patients. Our findings strongly support personalized treatment strategies based on a patient's medical history and pharmacological features. Notably, we identify that reducing medication doses for patients with mild and brief seizure episodes while adopting aggressive treatment for patients in the intensive care unit experiencing intense seizures leads to more favorable outcomes.

Submitted 1 April, 2024; v1 submitted 23 October, 2023; originally announced October 2023.

Comments: Accepted for publication in the proceedings of AISTATS 2025

arXiv:2305.10351 [pdf, other]

BIOT: Cross-data Biosignal Learning in the Wild

Authors: Chaoqi Yang, M. Brandon Westover, Jimeng Sun

Abstract: Biological signals, such as electroencephalograms (EEG), play a crucial role in numerous clinical applications, exhibiting diverse data formats and quality profiles. Current deep learning models for biosignals are typically specialized for specific datasets and clinical settings, limiting their broader applicability. Motivated by the success of large language models in text processing, we explore the development of foundational models that are trained from multiple data sources and can be fine-tuned on different downstream biosignal tasks. To overcome the unique challenges associated with biosignals of various formats, such as mismatched channels, variable sample lengths, and prevalent missing values, we propose a Biosignal Transformer (BIOT). The proposed BIOT model can enable cross-data learning with mismatched channels, variable lengths, and missing values by tokenizing diverse biosignals into unified "biosignal sentences". Specifically, we tokenize each channel into fixed-length segments containing local signal features, flattening them to form consistent "sentences". Channel embeddings and relative position embeddings are added to preserve spatio-temporal features. The BIOT model is versatile and applicable to various biosignal learning settings across different datasets, including joint pre-training for larger models. Comprehensive evaluations on EEG, electrocardiogram (ECG), and human activity sensory signals demonstrate that BIOT outperforms robust baselines in common settings and facilitates learning across multiple datasets with different formats. Using the CHB-MIT seizure detection task as an example, our vanilla BIOT model shows a 3% improvement over baselines in balanced accuracy, and the pre-trained BIOT models (optimized from other data sources) can further bring up to 4% improvements.

Submitted 10 May, 2023; originally announced May 2023.

Comments: expect the codebases and pre-trained models to be released in https://github.com/ycq091044/BIOT
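
The "biosignal sentence" idea in the BIOT abstract above can be illustrated with a minimal sketch. This is not the authors' code: the function name `tokenize_biosignal`, the use of raw sample segments instead of learned local features, and the non-overlapping segmentation are all assumptions made for illustration. Each token carries its channel name and its position within the channel, which is the information the abstract says channel and position embeddings preserve.

```python
from typing import Dict, List, Tuple

# Hypothetical token type: (channel name, position within channel, raw segment).
Token = Tuple[str, int, List[float]]


def tokenize_biosignal(channels: Dict[str, List[float]], token_len: int) -> List[Token]:
    """Cut every channel into non-overlapping fixed-length segments and
    flatten all segments into a single list (the "sentence").

    Channels may differ in name, count, and length; trailing samples that do
    not fill a whole segment are dropped in this simplified sketch.
    """
    if token_len <= 0:
        raise ValueError("token_len must be positive")
    sentence: List[Token] = []
    for name, samples in channels.items():
        position = 0
        for start in range(0, len(samples) - token_len + 1, token_len):
            sentence.append((name, position, samples[start:start + token_len]))
            position += 1
    return sentence


# Two channels with different lengths still produce one flat sentence.
recording = {
    "C3": [float(i) for i in range(10)],
    "O1": [float(i) for i in range(6)],
}
sentence = tokenize_biosignal(recording, token_len=4)
for channel, position, segment in sentence:
    print(channel, position, segment)
```

Because the sentence is just a list of tagged tokens, recordings with missing or extra channels yield shorter or longer sentences rather than a shape mismatch, which is the property the abstract highlights.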

arXiv:2301.08834 [pdf, other]

ManyDG: Many-domain Generalization for Healthcare Applications

Authors: Chaoqi Yang, M. Brandon Westover, Jimeng Sun

Abstract: The vast amount of health data has been continuously collected for each patient, providing opportunities to support diverse healthcare predictive tasks such as seizure detection and hospitalization prediction. Existing models are mostly trained on other patients' data and evaluated on new patients. Many of them might suffer from poor generalizability. One key reason can be overfitting due to the unique information related to patient identities and their data collection environments, referred to as patient covariates in the paper. These patient covariates usually do not contribute to predicting the targets but are often difficult to remove. As a result, they can bias the model training process and impede generalization. In healthcare applications, most existing domain generalization methods assume a small number of domains. In this paper, considering the diversity of patient covariates, we propose a new setting by treating each patient as a separate domain (leading to many domains). We develop a new domain generalization method, ManyDG, that can scale to such many-domain problems. Our method identifies the patient domain covariates by mutual reconstruction and removes them via an orthogonal projection step. Extensive experiments show that ManyDG can boost the generalization performance on multiple real-world healthcare tasks (e.g., 3.7% Jaccard improvements on MIMIC drug recommendation) and support realistic but challenging settings such as insufficient data and continuous learning.

Submitted 14 February, 2023; v1 submitted 20 January, 2023; originally announced January 2023.

Comments: The paper has been accepted by ICLR 2023, refer to https://openreview.net/forum?id=lcSfirnflpW. We will release the data and source codes here https://github.com/ycq091044/ManyDG
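
The orthogonal projection step mentioned in the ManyDG abstract above can be sketched as follows. This is a generic vector projection, not the paper's implementation: the function name `remove_covariate_direction` is hypothetical, the covariate is assumed to be a single direction vector that is already known, and the mutual reconstruction step that estimates it is not shown.

```python
from typing import List


def remove_covariate_direction(feature: List[float], covariate: List[float]) -> List[float]:
    """Return the part of `feature` that is orthogonal to `covariate`.

    Computes feature - (<feature, covariate> / <covariate, covariate>) * covariate.
    If the covariate vector is all zeros there is nothing to remove.
    """
    if len(feature) != len(covariate):
        raise ValueError("feature and covariate must have the same length")
    norm_sq = sum(c * c for c in covariate)
    if norm_sq == 0.0:
        return list(feature)
    scale = sum(f * c for f, c in zip(feature, covariate)) / norm_sq
    return [f - scale * c for f, c in zip(feature, covariate)]


# Toy example: the first axis plays the role of the patient-specific direction.
cleaned = remove_covariate_direction([3.0, 4.0], [1.0, 0.0])
print(cleaned)
```

After the projection, the cleaned feature has zero inner product with the covariate direction, so a downstream classifier cannot use that direction.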

arXiv:2211.05207 [pdf, other]

Interpretable Machine Learning System to EEG Patterns on the Ictal-Interictal-Injury Continuum

Authors: Alina Jade Barnett, Zhicheng Guo, ** **g, Wendong Ge, Cynthia Rudin, M. Brandon Westover

Abstract: In intensive care units (ICUs), critically ill patients are monitored with electroencephalograms (EEGs) to prevent serious brain injury. The number of patients who can be monitored is constrained by the availability of trained physicians to read EEGs, and EEG interpretation can be subjective and prone to inter-observer variability. Automated deep learning systems for EEG could reduce human bias and accelerate the diagnostic process. However, black box deep learning models are untrustworthy, difficult to troubleshoot, and lack accountability in real-world applications, leading to a lack of trust and adoption by clinicians. To address these challenges, we propose a novel interpretable deep learning model that not only predicts the presence of harmful brainwave patterns but also provides high-quality case-based explanations of its decisions. Our model performs better than the corresponding black box model, despite being constrained to be interpretable. The learned 2D embedded space provides the first global overview of the structure of ictal-interictal-injury continuum brainwave patterns. The ability to understand how our model arrived at its decisions will not only help clinicians to diagnose and treat harmful brain activities more accurately but also increase their trust and adoption of machine learning models in clinical practice; this could be an integral component of the ICU neurologists' standard workflow.

Submitted 11 April, 2023; v1 submitted 9 November, 2022; originally announced November 2022.

Comments: 20 pages including appendices, 7 figures, submitted for peer review

ACM Class: I.2.6; I.4.9; I.5.4

arXiv:2203.04920 [pdf]

doi 10.1016/S2589-7500(23)00088-2

Effects of Epileptiform Activity on Discharge Outcome in Critically Ill Patients

Authors: Harsh Parikh, Kentaro Hoffman, Haoqi Sun, Wendong Ge, ** **g, Rajesh Amerineni, Lin Liu, Jimeng Sun, Sahar Zafar, Aaron Struck, Alexander Volfovsky, Cynthia Rudin, M. Brandon Westover

Abstract: Epileptiform activity (EA) is associated with worse outcomes including increased risk of disability and death. However, the effect of EA on the neurologic outcome is confounded by the feedback between treatment with anti-seizure medications (ASM) and EA burden. A randomized clinical trial is challenging due to the sequential nature of EA-ASM feedback, as well as ethical reasons. However, some mechanistic knowledge is available, e.g., how drugs are absorbed. This knowledge together with observational data could provide a more accurate effect estimate using causal inference. We performed a retrospective cross-sectional study with 995 patients with the modified Rankin Scale (mRS) at discharge as the outcome and the EA burden defined as the mean or maximum proportion of time spent with EA in six-hour windows in the first 24 hours of electroencephalography as the exposure. We estimated the change in discharge mRS if everyone in the dataset had experienced a certain EA burden and were untreated. We combined pharmacological modeling with an interpretable matching method to account for confounding and EA-ASM feedback. Our matched groups' quality was validated by the neurologists. Having a maximum EA burden greater than 75% when untreated had a 22% increased chance of a poor outcome (severe disability or death), and mild but long-lasting EA increased the risk of a poor outcome by 14%. The effect sizes were heterogeneous depending on pre-admission profile, e.g., patients with hypoxic-ischemic encephalopathy (HIE) or acquired brain injury (ABI) were more affected. Interventions should put a higher priority on patients with an average EA burden higher than 10%, while treatment should be more conservative when the maximum EA burden is low.

Submitted 11 March, 2023; v1 submitted 9 March, 2022; originally announced March 2022.

Comments: 4 Figures
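
The exposure definition in the abstract above (mean or maximum proportion of time spent with EA in six-hour windows over the first 24 hours) can be made concrete with a short sketch. The following is illustrative only and rests on assumptions not stated in the abstract: EA is given as one binary flag per fixed-length epoch, windows are non-overlapping, and the function name `ea_burden` and its parameters are hypothetical.

```python
from typing import List, Tuple


def ea_burden(
    ea_flags: List[int],
    epochs_per_hour: int,
    window_hours: int = 6,
    total_hours: int = 24,
) -> Tuple[float, float]:
    """Return (mean, max) of the per-window proportion of epochs flagged as EA.

    `ea_flags` holds 1 for an epoch with epileptiform activity and 0 otherwise.
    Only the first `total_hours` of the recording are used; a final partial
    window is scored on the epochs it actually contains.
    """
    window = window_hours * epochs_per_hour
    limit = min(len(ea_flags), total_hours * epochs_per_hour)
    if window <= 0 or limit == 0:
        raise ValueError("need a positive window size and at least one epoch")
    proportions = []
    for start in range(0, limit, window):
        chunk = ea_flags[start:min(start + window, limit)]
        proportions.append(sum(chunk) / len(chunk))
    return sum(proportions) / len(proportions), max(proportions)


# Toy recording with one flag per hour: EA throughout the first window,
# none in the second, half of the third, none in the fourth.
flags = [1] * 6 + [0] * 6 + [1, 1, 1, 0, 0, 0] + [0] * 6
mean_burden, max_burden = ea_burden(flags, epochs_per_hour=1)
print(mean_burden, max_burden)
```

The two summaries can diverge sharply: a brief but intense episode raises the maximum burden while leaving the mean low, which is why the abstract treats them as separate exposures.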

arXiv:2202.00478 [pdf]

NeuraHealth: An Automated Screening Pipeline to Detect Undiagnosed Cognitive Impairment in Electronic Health Records with Deep Learning and Natural Language Processing

Authors: Tanish Tyagi, Colin G. Magdamo, Ayush Noori, Zhaozhi Li, Xiao Liu, Mayuresh Deodhar, Zhuoqiao Hong, Wendong Ge, Elissa M. Ye, Yi-han Sheu, Haitham Alabsi, Laura Brenner, Gregory K. Robbins, Sahar Zafar, Nicole Benson, Lidia Moura, John Hsu, Alberto Serrano-Pozo, Dimitry Prokopenko, Rudolph E. Tanzi, Bradley T. Hyman, Deborah Blacker, Shibani S. Mukerji, M. Brandon Westover, Sudeshna Das

Abstract: Dementia-related cognitive impairment (CI) is a neurodegenerative disorder, affecting over 55 million people worldwide and growing rapidly at the rate of one new case every 3 seconds. 75% of cases go undiagnosed globally, with up to 90% in low-and-middle-income countries, leading to an estimated annual worldwide cost of USD 1.3 trillion, forecasted to reach 2.8 trillion by 2030. With no cure, a recurring failure of clinical trials, and a lack of early diagnosis, the mortality rate is 100%. Information in electronic health records (EHR) can provide vital clues for early detection of CI, but a manual review by experts is tedious and error prone. Several computational methods have been proposed; however, they lack an enhanced understanding of the linguistic context in complex language structures of EHR. Therefore, I propose a novel and more accurate framework, NeuraHealth, to identify patients who had no earlier diagnosis. In NeuraHealth, using patient EHR from Mass General Brigham BioBank, I fine-tuned a bi-directional attention-based deep learning natural language processing model to classify sequences. The sequence predictions were used to generate structured features as input for a patient level regularized logistic regression model. This two-step framework creates high dimensionality, outperforming all existing state-of-the-art computational methods as well as clinical methods. Further, I integrate the models into a real-world product, a web app, to create an automated EHR screening pipeline for scalable and high-speed discovery of undetected CI in EHR, making early diagnosis viable in medical facilities and in regions with scarce health services.

Submitted 20 June, 2022; v1 submitted 12 January, 2022; originally announced February 2022.

arXiv:2111.09115 [pdf, other]

Using Deep Learning to Identify Patients with Cognitive Impairment in Electronic Health Records

Authors: Tanish Tyagi, Colin G. Magdamo, Ayush Noori, Zhaozhi Li, Xiao Liu, Mayuresh Deodhar, Zhuoqiao Hong, Wendong Ge, Elissa M. Ye, Yi-han Sheu, Haitham Alabsi, Laura Brenner, Gregory K. Robbins, Sahar Zafar, Nicole Benson, Lidia Moura, John Hsu, Alberto Serrano-Pozo, Dimitry Prokopenko, Rudolph E. Tanzi, Bradley T. Hyman, Deborah Blacker, Shibani S. Mukerji, M. Brandon Westover, Sudeshna Das

Abstract: Dementia is a neurodegenerative disorder that causes cognitive decline and affects more than 50 million people worldwide. Dementia is under-diagnosed by healthcare professionals - only one in four people who suffer from dementia are diagnosed. Even when a diagnosis is made, it may not be entered as a structured International Classification of Diseases (ICD) diagnosis code in a patient's charts. Information relevant to cognitive impairment (CI) is often found within electronic health records (EHR), but manual review of clinician notes by experts is both time consuming and often prone to errors. Automated mining of these notes presents an opportunity to label patients with cognitive impairment in EHR data. We developed natural language processing (NLP) tools to identify patients with cognitive impairment and demonstrate that linguistic context enhances performance for the cognitive impairment classification task. We fine-tuned our attention based deep learning model, which can learn from complex language structures, and substantially improved accuracy (0.93) relative to a baseline NLP model (0.84). Further, we show that deep learning NLP can successfully identify dementia patients without dementia-related ICD codes or medications.

Submitted 12 November, 2021; originally announced November 2021.

Comments: Machine Learning for Health (ML4H) - Extended Abstract

arXiv:2110.15278 [pdf]

Self-supervised EEG Representation Learning for Automatic Sleep Staging

Authors: Chaoqi Yang, Danica Xiao, M. Brandon Westover, Jimeng Sun

Abstract: Background: Deep learning models have shown great success in automating tasks in sleep medicine by learning from carefully annotated Electroencephalogram (EEG) data. However, effectively utilizing a large amount of raw EEG remains a challenge. Objective: In this paper, we aim to learn robust vector representations from massive unlabeled EEG signals, such that the learned vectorized features (1) are expressive enough to replace the raw signals in the sleep staging task; and (2) provide better predictive performance than supervised models in scenarios of fewer labels and noisy samples. Methods: We propose a self-supervised model, named Contrast with the World Representation (ContraWR), for EEG signal representation learning, which uses global statistics from the dataset to distinguish signals associated with different sleep stages. The ContraWR model is evaluated on three real-world EEG datasets that include both at-home and in-lab EEG recording settings. Results: ContraWR outperforms 4 recent self-supervised learning methods on the sleep staging task across 3 large EEG datasets. ContraWR also beats supervised learning when fewer training labels are available (e.g., 4% accuracy improvement when less than 2% data is labeled). Moreover, the model provides informative representative feature structures in 2D projection. Conclusions: We show that ContraWR is robust to noise and can provide high-quality EEG representations for downstream prediction tasks. The proposed model can be generalized to other unsupervised physiological signal learning tasks. Future directions include exploring task-specific data augmentations and combining self-supervised with supervised methods, building upon the initial success of self-supervised learning in this paper.

Submitted 12 February, 2023; v1 submitted 27 October, 2021; originally announced October 2021.

Comments: Preprocessing and Code in Github: https://github.com/ycq091044/ContraWR

arXiv:2106.07900 [pdf, other]

ATD: Augmenting CP Tensor Decomposition by Self Supervision

Authors: Chaoqi Yang, Cheng Qian, Navjot Singh, Cao Xiao, M Brandon Westover, Edgar Solomonik, Jimeng Sun

Abstract: Tensor decompositions are powerful tools for dimensionality reduction and feature interpretation of multidimensional data such as signals. Existing tensor decomposition objectives (e.g., Frobenius norm) are designed for fitting raw data under statistical assumptions, which may not align with downstream classification tasks. In practice, raw input tensors can contain irrelevant information while data augmentation techniques may be used to smooth out class-irrelevant noise in samples. This paper addresses the above challenges by proposing augmented tensor decomposition (ATD), which effectively incorporates data augmentations and self-supervised learning (SSL) to boost downstream classification. To address the non-convexity of the new augmented objective, we develop an iterative method that enables the optimization to follow an alternating least squares (ALS) fashion. We evaluate our proposed ATD on multiple datasets. It can achieve 0.8% - 2.5% accuracy gain over tensor-based baselines. Also, our ATD model shows comparable or better performance (e.g., up to 15% in accuracy) over self-supervised and autoencoder baselines while using less than 5% of learnable parameters of these baseline models.

Submitted 18 September, 2022; v1 submitted 15 June, 2021; originally announced June 2021.

Comments: Improve the nested ALS algorithms in the last version. Accepted into NeurIPS 2022. Code available soon in https://github.com/ycq091044/ATD

arXiv:2103.03945 [pdf, other]

SCRIB: Set-classifier with Class-specific Risk Bounds for Blackbox Models

Authors: Zhen Lin, Cao Xiao, Lucas Glass, M. Brandon Westover, Jimeng Sun

Abstract: Despite deep learning (DL) success in classification problems, DL classifiers do not provide a sound mechanism to decide when to refrain from predicting. Recent works tried to control the overall prediction risk with classification with rejection options. However, existing works overlook the different significance of different classes. We introduce Set-classifier with Class-specific RIsk Bounds (SCRIB) to tackle this problem, assigning multiple labels to each example. Given the output of a black-box model on the validation set, SCRIB constructs a set-classifier that controls the class-specific prediction risks with a theoretical guarantee. The key idea is to reject when the set classifier returns more than one label. We validated SCRIB on several medical applications, including sleep staging on electroencephalogram (EEG) data, X-ray COVID image classification, and atrial fibrillation detection based on electrocardiogram (ECG) data. SCRIB obtained desirable class-specific risks, which are 35%-88% closer to the target risks than baseline methods.

Submitted 5 March, 2021; originally announced March 2021.
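
The rejection rule described in the SCRIB abstract above can be sketched in a few lines. This is a simplified illustration, not the published algorithm: the per-class thresholds are taken as given (the abstract does not say how they are found, and the values below are made up), the function names are hypothetical, and treating an empty prediction set as a rejection is an added assumption.

```python
from typing import List, Optional


def prediction_set(probs: List[float], thresholds: List[float]) -> List[int]:
    """Return every class whose black-box score reaches its own threshold."""
    if len(probs) != len(thresholds):
        raise ValueError("probs and thresholds must have the same length")
    return [k for k, (p, t) in enumerate(zip(probs, thresholds)) if p >= t]


def predict_or_reject(probs: List[float], thresholds: List[float]) -> Optional[int]:
    """Predict only when the set holds exactly one label; otherwise abstain.

    None signals a rejection. Rejecting on an empty set is an assumption of
    this sketch; the abstract only mentions sets with more than one label.
    """
    labels = prediction_set(probs, thresholds)
    return labels[0] if len(labels) == 1 else None


# Hypothetical thresholds: a lower value makes a class easier to include.
thresholds = [0.4, 0.4, 0.5]
print(predict_or_reject([0.70, 0.20, 0.10], thresholds))  # single confident label
print(predict_or_reject([0.45, 0.45, 0.10], thresholds))  # ambiguous, so rejected
```

Giving each class its own threshold is what allows the error rate to be tuned separately per class, for instance by abstaining more readily on classes where a mistake is costlier.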

arXiv:2102.13473 [pdf]

Sleep Apnea and Respiratory Anomaly Detection from a Wearable Band and Oxygen Saturation

Authors: Wolfgang Ganglberger, Abigail A. Bucklin, Ryan A. Tesh, Madalena Da Silva Cardoso, Haoqi Sun, Michael J. Leone, Luis Paixao, Ezhil Panneerselvam, Elissa M. Ye, B. Taylor Thompson, Oluwaseun Akeju, David Kuller, Robert J. Thomas, M. Brandon Westover

Abstract: Objective: Sleep related respiratory abnormalities are typically detected using polysomnography. There is a need in general medicine and critical care for a more convenient method to automatically detect sleep apnea from a simple, easy-to-wear device. The objective is to automatically detect abnormal respiration and estimate the Apnea-Hypopnea-Index (AHI) with a wearable respiratory device, compared to an SpO2 signal or polysomnography using a large (n = 412) dataset serving as ground truth. Methods: Simultaneously recorded polysomnographic (PSG) and wearable respiratory effort data were used to train and evaluate models in a cross-validation fashion. Time domain and complexity features were extracted, important features were identified, and a random forest model employed to detect events and predict AHI. Four models were trained: one each using the respiratory features only, a feature from the SpO2 (%)-signal only, and two additional models that use the respiratory features and the SpO2 (%)-feature, one allowing a time lag of 30 seconds between the two signals. Results: Event-based classification resulted in areas under the receiver operating characteristic curves of 0.94, 0.86, 0.82, and areas under the precision-recall curves of 0.48, 0.32, 0.51 for the models using respiration and SpO2, respiration-only, and SpO2-only respectively. Correlation between expert-labelled and predicted AHI was 0.96, 0.78, and 0.93, respectively. Conclusions: A wearable respiratory effort signal with or without SpO2 predicted AHI accurately. Given the large dataset and rigorous testing design, we expect our models are generalizable to evaluating respiration in a variety of environments, such as at home and in critical care.

Submitted 23 February, 2021; originally announced February 2021.

Comments: Co-First Authors: Wolfgang Ganglberger, Abigail A. Bucklin Co-Senior Authors: Robert J. Thomas, M. Brandon Westover

arXiv:2101.04635 [pdf, other]

Automated Respiratory Event Detection Using Deep Neural Networks

Authors: Thijs E Nassi, Wolfgang Ganglberger, Haoqi Sun, Abigail A Bucklin, Siddharth Biswal, Michel J A M van Putten, Robert J Thomas, M Brandon Westover

Abstract: The gold standard to assess respiration during sleep is polysomnography; a technique that is burdensome, expensive (both in analysis time and measurement costs), and difficult to repeat. Automation of respiratory analysis can improve test efficiency and enable accessible implementation opportunities worldwide. Using 9,656 polysomnography recordings from the Massachusetts General Hospital (MGH), we trained a neural network (WaveNet) based on a single respiratory effort belt to detect obstructive apnea, central apnea, hypopnea and respiratory-effort related arousals. Performance evaluation included event-based and recording-based metrics - using an apnea-hypopnea index analysis. The model was further evaluated on a public dataset, the Sleep-Heart-Health-Study-1, containing 8,455 polysomnographic recordings. For binary apnea event detection in the MGH dataset, the neural network obtained an accuracy of 95%, an apnea-hypopnea index $r^2$ of 0.89 and area under the curve for the receiver operating characteristics curve and precision-recall curve of 0.93 and 0.74, respectively. For the multiclass task, we obtained varying performances: 81% of all labeled central apneas were correctly classified, whereas this metric was 46% for obstructive apneas, 29% for respiratory effort related arousals and 16% for hypopneas. The majority of false predictions were misclassifications as another type of respiratory event. Our fully automated method can detect respiratory events and assess the apnea-hypopnea index with sufficient accuracy for clinical utilization. Differentiation of event types is more difficult and may reflect in part the complexity of human respiratory output and some degree of arbitrariness in the clinical thresholds and criteria used during manual annotation.

Submitted 12 January, 2021; originally announced January 2021.

Comments: 11 pages, 6 figures, 6 tables, ©2020 IEEE

arXiv:2011.06489 [pdf, other]

Natural Language Processing to Detect Cognitive Concerns in Electronic Health Records Using Deep Learning

Authors: Zhuoqiao Hong, Colin G. Magdamo, Yi-han Sheu, Prathamesh Mohite, Ayush Noori, Elissa M. Ye, Wendong Ge, Haoqi Sun, Laura Brenner, Gregory Robbins, Shibani Mukerji, Sahar Zafar, Nicole Benson, Lidia Moura, John Hsu, Bradley T. Hyman, Michael B. Westover, Deborah Blacker, Sudeshna Das

Abstract: Dementia is under-recognized in the community, under-diagnosed by healthcare professionals, and under-coded in claims data. Information on cognitive dysfunction, however, is often found in unstructured clinician notes within medical records but manual review by experts is time consuming and often prone to errors. Automated mining of these notes presents a potential opportunity to label patients with cognitive concerns who could benefit from an evaluation or be referred to specialist care. In order to identify patients with cognitive concerns in electronic medical records, we applied natural language processing (NLP) algorithms and compared model performance to a baseline model that used structured diagnosis codes and medication data only. An attention-based deep learning model outperformed the baseline model and other simpler models.

Submitted 12 November, 2020; originally announced November 2020.

Comments: Machine Learning for Health (ML4H) at NeurIPS 2020 - Extended Abstract

MSC Class: I.2.7

arXiv:2006.11689 [pdf, other]

Clinically Relevant Mediation Analysis using Controlled Indirect Effect

Authors: Haoqi Sun, Michael J. Leone, Lin Liu, Shabani S. Mukerji, Gregory K. Robbins, M. Brandon Westover

Abstract: Mediation analysis allows one to use observational data to estimate the importance of each potential mediating pathway involved in the causal effect of an exposure on an outcome. However, current approaches to mediation analysis with multiple mediators either involve assumptions not verifiable by experiments, or estimate the effect when mediators are manipulated jointly which precludes the practical design of experiments due to curse of dimensionality, or are difficult to interpret when arbitrary causal dependencies are present. We propose a method for mediation analysis for multiple manipulable mediators with arbitrary causal dependencies. The proposed method is clinically relevant because the decomposition of the total effect does not involve effects under cross-world assumptions and focuses on the effects after manipulating (i.e. treating) one single mediator, which is more relevant in a clinical scenario. We illustrate the approach using simulated data, the "framing" dataset from political science, and the HIV-Brain Age dataset from a clinical retrospective cohort study. Our results provide potential guidance for clinical practitioners to make justified choices to manipulate one of the mediators to optimize the outcome.

Submitted 20 June, 2020; originally announced June 2020.

Comments: 15 pages, 2 figures in main text, 1 figure in supplemental, 4 tables in main text

arXiv:2003.01248 [pdf]

Night-to-Night Variability of Sleep Electroencephalography-Based Brain Age Measurements

Authors: Jacob Hogan, Haoqi Sun, Luis Paixao, Mike Westmeijer, Pooja Sikka, Jing Jin, Ryan Tesh, Madalena Cardoso, Sydney S. Cash, Oluwaseun Akeju, Robert Thomas, M. Brandon Westover

Abstract: Objective: Brain Age Index (BAI), calculated from sleep electroencephalography (EEG), has been proposed as a biomarker of brain health. This study quantifies night-to-night variability of BAI and establishes probability thresholds for inferring underlying brain pathology based on a patient's BAI. Methods: 86 patients with multiple nights of consecutive EEG recordings were selected from Epilepsy Monitoring Unit patients whose EEGs were reported as being within normal limits. BAI was calculated for each 12-hour segment of patient data using a previously described algorithm, and night-to-night variability in BAI was measured. Results: The within-patient night-to-night standard deviation in BAI was 7.5 years. Estimates of BAI derived by averaging over 2, 3, and 4 nights had standard deviations of 4.7, 3.7, and 3.0 years, respectively. Conclusions: Averaging BAI over n nights reduces night-to-night variability of BAI by a factor of the square root of n, rendering BAI more suitable as a biomarker of brain health at the individual level. Significance: With increasing ease of EEG acquisition including wearable technology, BAI has the potential to track brain health and detect deviations from normal physiologic function. In a clinical setting, BAI could be used to identify patients who should undergo further investigation or monitoring.

Submitted 2 March, 2020; originally announced March 2020.

Comments: 18 pages, 6 figures, 2 tables
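
The square-root-of-n scaling stated in the abstract above can be illustrated with a minimal sketch. It assumes an idealized model in which each night's BAI is an independent draw with the same standard deviation; the function name `sd_of_nightly_mean` is hypothetical and not taken from the paper. The values reported in the abstract are empirical estimates from the study's data, so they need not coincide with this idealized calculation.

```python
import math


def sd_of_nightly_mean(single_night_sd: float, n_nights: int) -> float:
    """Standard deviation of the mean of n independent, equal-variance nights.

    Assumption (not from the paper): nights are independent and identically
    distributed, so Var(mean) = sd**2 / n and SD(mean) = sd / sqrt(n).
    """
    if n_nights < 1:
        raise ValueError("n_nights must be at least 1")
    return single_night_sd / math.sqrt(n_nights)


if __name__ == "__main__":
    # 7.5 years is the single-night SD quoted in the abstract.
    for n in (1, 2, 3, 4):
        print(n, round(sd_of_nightly_mean(7.5, n), 2))
```

Under this model, halving the standard deviation requires four nights of recording rather than two, which is the practical cost of the square-root law.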

arXiv:2002.11701 [pdf, other]

CLARA: Clinical Report Auto-completion

Authors: Siddharth Biswal, Cao Xiao, Lucas M. Glass, M. Brandon Westover, Jimeng Sun

Abstract: Generating clinical reports from raw recordings such as X-rays and electroencephalogram (EEG) is an essential and routine task for doctors. However, it is often time-consuming to write accurate and detailed reports. Most existing methods try to generate the whole reports from the raw input with limited success because 1) generated reports often contain errors that need manual review and correction, 2) it does not save time when doctors want to write additional information into the report, and 3) the generated reports are not customized based on individual doctors' preference. We propose CLinicAl Report Auto-completion (CLARA), an interactive method that generates reports in a sentence by sentence fashion based on doctors' anchor words and partially completed sentences. CLARA searches for the most relevant sentences from existing reports as the template for the current report. The retrieved sentences are sequentially modified by combining with the input feature representations to create the final report. In our experimental evaluation, CLARA achieved 0.393 CIDEr and 0.248 BLEU-4 on X-ray reports and 0.482 CIDEr and 0.491 BLEU-4 for EEG reports for sentence-level generation, which is up to 35% improvement over the best baseline. Also via our qualitative evaluation, CLARA is shown to produce reports which have a significantly higher level of approval by doctors in a user study (3.74 out of 5 for CLARA vs 2.52 out of 5 for the baseline).

Submitted 4 March, 2020; v1 submitted 26 February, 2020; originally announced February 2020.

arXiv:1910.06100 [pdf, other]

SLEEPER: interpretable Sleep staging via Prototypes from Expert Rules

Authors: Irfan Al-Hussaini, Cao Xiao, M. Brandon Westover, Jimeng Sun

Abstract: Sleep staging is a crucial task for diagnosing sleep disorders. It is tedious and complex as it can take a trained expert several hours to annotate just one patient's polysomnogram (PSG) from a single night. Although deep learning models have demonstrated state-of-the-art performance in automating sleep staging, interpretability, which defines other desiderata, has largely remained unexplored. In this study, we propose Sleep staging via Prototypes from Expert Rules (SLEEPER), which combines deep learning models with expert defined rules using a prototype learning framework to generate simple interpretable models. In particular, SLEEPER utilizes sleep scoring rules and expert defined features to derive prototypes which are embeddings of PSG data fragments via convolutional neural networks. The final models are simple interpretable models like a shallow decision tree defined over those phenotypes. We evaluated SLEEPER using two PSG datasets collected from sleep studies and demonstrated that SLEEPER could provide accurate sleep stage classification comparable to human experts and deep neural networks with about 85% ROC-AUC and 0.7 kappa.

Submitted 14 October, 2019; originally announced October 2019.

Comments: Machine Learning for Healthcare Conference (MLHC) 2019. Proceedings of Machine Learning Research 106

Journal ref: PMLR 106:721-739, 2019

arXiv:1908.11463 [pdf]

doi 10.1093/sleep/zsz306

Sleep Staging from Electrocardiography and Respiration with Deep Learning

Authors: Haoqi Sun, Wolfgang Ganglberger, Ezhil Panneerselvam, Michael J. Leone, Syed A. Quadri, Balaji Goparaju, Ryan A. Tesh, Oluwaseun Akeju, Robert J. Thomas, M. Brandon Westover

Abstract: Study Objective: Sleep is reflected not only in the electroencephalogram but also in heart rhythms and breathing patterns. Therefore, we hypothesize that it is possible to accurately stage sleep based on the electrocardiogram (ECG) and respiratory signals. Methods: Using a dataset including 8,682 polysomnographs, we develop deep neural networks to stage sleep from ECG and respiratory signals. Five deep neural networks consisting of convolutional networks and long short-term memory networks are trained to stage sleep using heart and breathing, including the timing of R peaks from ECG, abdominal and chest respiratory effort, and the combinations of these signals. Results: ECG in combination with the abdominal respiratory effort achieves the best performance for staging all five sleep stages with a Cohen's kappa of 0.600 (95% confidence interval 0.599–0.602); and 0.762 (0.760–0.763) for discriminating awake vs. rapid eye movement vs. non-rapid eye movement sleep. The performance is better for young participants and for those with a low apnea-hypopnea index, while it is robust for commonly used outpatient medications. Conclusions: Our results validate that ECG and respiratory effort provide substantial information about sleep stages in a large population. It opens new possibilities in sleep research and applications where electroencephalography is not readily available or may be infeasible, such as in critically ill patients.

Submitted 15 September, 2019; v1 submitted 29 August, 2019; originally announced August 2019.

Comments: Contains supplementary material at the end. Sleep 2019
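
Several abstracts in this listing report agreement as Cohen's kappa. As a reminder of what that statistic measures, here is a minimal sketch that computes it from two label sequences (for example, algorithm output and expert scoring per epoch). The function name `cohens_kappa` and the toy labels are illustrative assumptions and are not drawn from any of the papers.

```python
from collections import Counter
from typing import Hashable, Sequence


def cohens_kappa(labels_a: Sequence[Hashable], labels_b: Sequence[Hashable]) -> float:
    """Cohen's kappa: chance-corrected agreement between two raters.

    kappa = (p_o - p_e) / (1 - p_e), where p_o is the observed agreement
    and p_e is the agreement expected if the raters labeled independently
    with their own marginal label frequencies.
    """
    if len(labels_a) != len(labels_b) or len(labels_a) == 0:
        raise ValueError("label sequences must be non-empty and of equal length")
    n = len(labels_a)
    # Observed agreement: fraction of items on which both raters agree.
    p_o = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Expected agreement from the product of marginal frequencies.
    counts_a = Counter(labels_a)
    counts_b = Counter(labels_b)
    p_e = sum(counts_a[k] * counts_b[k] for k in counts_a) / (n * n)
    if p_e == 1.0:
        # Both raters used a single identical label; agreement is trivially perfect.
        return 1.0
    return (p_o - p_e) / (1.0 - p_e)


if __name__ == "__main__":
    # Toy example with hypothetical epoch labels (W = wake, N = non-REM).
    expert = ["W", "W", "N", "N"]
    model = ["W", "N", "N", "N"]
    print(cohens_kappa(expert, model))
```

Unlike raw accuracy, kappa discounts the agreement that two raters would reach by chance, which matters when one sleep stage dominates the recording.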

arXiv:1805.06391 [pdf]

doi 10.1016/j.neurobiolaging.2018.10.016

Brain Age from the Electroencephalogram of Sleep

Authors: Haoqi Sun, Luis Paixao, Jefferson T. Oliva, Balaji Goparaju, Diego Z. Carvalho, Kicky G. van Leeuwen, Oluwaseun Akeju, Robert Joseph Thomas, Sydney S. Cash, Matt T. Bianchi, M. Brandon Westover

Abstract: The human electroencephalogram (EEG) of sleep undergoes profound changes with age. These changes can be conceptualized as "brain age", which can be compared to an age norm to reflect the deviation from the normal aging process. Here, we develop an interpretable machine learning model to predict brain age based on two large sleep EEG datasets: the Massachusetts General Hospital sleep lab dataset (MGH, N = 2,621) covering age 18 to 80; and the Sleep Heart Health Study (SHHS, N = 3,520) covering age 40 to 80. The model obtains a mean absolute deviation of 8.1 years between brain age and chronological age in the healthy participants in the MGH dataset. As validation, we analyze a subset of SHHS containing longitudinal EEGs 5 years apart, which shows a 5.5 years difference in brain age. Participants with neurological and psychiatric diseases, as well as diabetes and hypertension medications show an older brain age compared to chronological age. The findings raise the prospect of using sleep EEG as a biomarker for healthy brain aging.

Submitted 16 May, 2018; originally announced May 2018.

Journal ref: Neurobiology of aging 74 (2019): 112-120

arXiv:1803.09702 [pdf, other]

HAMLET: Interpretable Human And Machine co-LEarning Technique

Authors: Olivier Deiss, Siddharth Biswal, Jing Jin, Haoqi Sun, M. Brandon Westover, Jimeng Sun

Abstract: Efficient label acquisition processes are key to obtaining robust classifiers. However, data labeling is often challenging and subject to high levels of label noise. This can arise even when classification targets are well defined, if instances to be labeled are more difficult than the prototypes used to define the class, leading to disagreements among the expert community. Here, we enable efficient training of deep neural networks. From low-confidence labels, we iteratively improve their quality by simultaneous learning of machines and experts. We call it Human And Machine co-LEarning Technique (HAMLET). Throughout the process, experts become more consistent, while the algorithm provides them with explainable feedback for confirmation. HAMLET uses a neural embedding function and a memory module filled with diverse reference embeddings from different classes. Its output includes classification labels and highly relevant reference embeddings as explanation. We took the study of brain monitoring at intensive care unit (ICU) as an application of HAMLET on continuous electroencephalography (cEEG) data. Although cEEG monitoring yields large volumes of data, labeling costs and difficulty make it hard to build a classifier. Additionally, while experts agree on the labels of clear-cut examples of cEEG patterns, labeling many real-world cEEG data can be extremely challenging. Thus, a large minority of sequences might be mislabeled. HAMLET has shown significant performance gain against deep learning and other baselines, increasing accuracy from 7.03% to 68.75% on challenging inputs. Besides improved performance, clinical experts confirmed the interpretability of those reference embeddings in helping explain the classification results by HAMLET.

Submitted 21 August, 2018; v1 submitted 26 March, 2018; originally announced March 2018.

Comments: Removed KDD template

arXiv:1707.08262 [pdf, other]

SLEEPNET: Automated Sleep Staging System via Deep Learning

Authors: Siddharth Biswal, Joshua Kulas, Haoqi Sun, Balaji Goparaju, M Brandon Westover, Matt T Bianchi, Jimeng Sun

Abstract: Sleep disorders, such as sleep apnea, parasomnias, and hypersomnia, affect 50-70 million adults in the United States (Hillman et al., 2006). Overnight polysomnography (PSG), including brain monitoring using electroencephalography (EEG), is a central component of the diagnostic evaluation for sleep disorders. While PSG is conventionally performed by trained technologists, the recent rise of powerful neural network learning algorithms combined with large physiological datasets offers the possibility of automation, potentially making expert-level sleep analysis more widely available. We propose SLEEPNET (Sleep EEG neural network), a deployed annotation tool for sleep staging. SLEEPNET uses a deep recurrent neural network trained on the largest sleep physiology database assembled to date, consisting of PSGs from over 10,000 patients from the Massachusetts General Hospital (MGH) Sleep Laboratory. SLEEPNET achieves human-level annotation performance on an independent test set of 1,000 EEGs, with an average accuracy of 85.76% and algorithm-expert inter-rater agreement (IRA) of kappa = 79.46%, comparable to expert-expert IRA.

Submitted 25 July, 2017; originally announced July 2017.

arXiv:cs/0509022 [pdf, ps, other]

Achievable Rates for Pattern Recognition

Authors: M. Brandon Westover, Joseph A. O'Sullivan

Abstract: Biological and machine pattern recognition systems face a common challenge: Given sensory data about an unknown object, classify the object by comparing the sensory data with a library of internal representations stored in memory. In many cases of interest, the number of patterns to be discriminated and the richness of the raw data force recognition systems to internally represent memory and sensory information in a compressed format. However, these representations must preserve enough information to accommodate the variability and complexity of the environment, or else recognition will be unreliable. Thus, there is an intrinsic tradeoff between the amount of resources devoted to data representation and the complexity of the environment in which a recognition system may reliably operate. In this paper we describe a general mathematical model for pattern recognition systems subject to resource constraints, and show how the aforementioned resource-complexity tradeoff can be characterized in terms of three rates related to number of bits available for representing memory and sensory data, and the number of patterns populating a given statistical environment. We prove single-letter information theoretic bounds governing the achievable rates, and illustrate the theory by analyzing the elementary cases where the pattern data is either binary or Gaussian.

Submitted 8 September, 2005; originally announced September 2005.

Showing 1–22 of 22 results for author: Westover, M B