-
SHDB-AF: a Japanese Holter ECG database of atrial fibrillation
Authors:
Kenta Tsutsui,
Shany Biton Brimer,
Noam Ben-Moshe,
Jean Marc Sellal,
Julien Oster,
Hitoshi Mori,
Yoshifumi Ikeda,
Takahide Arai,
Shintaro Nakano,
Ritsushi Kato,
Joachim A. Behar
Abstract:
Atrial fibrillation (AF) is a common atrial arrhythmia that impairs quality of life and causes embolic stroke, heart failure and other complications. Recent advancements in machine learning (ML) and deep learning (DL) have shown potential for enhancing diagnostic accuracy. It is essential for DL models to be robust and generalizable across variations in ethnicity, age, sex, and other factors. Although a number of ECG databases have been made available to the research community, none includes a Japanese population sample. Saitama Heart Database Atrial Fibrillation (SHDB-AF) is a novel open-source Holter ECG database from Japan, containing data from 100 unique patients with paroxysmal AF. Each record in SHDB-AF is 24 hours long and sampled at 200 Hz, totaling 24 million seconds of ECG data.
Submitted 22 June, 2024;
originally announced June 2024.
-
SleepPPG-Net2: Deep learning generalization for sleep staging from photoplethysmography
Authors:
Shirel Attia,
Revital Shani Hershkovich,
Alissa Tabakhov,
Angeleene Ang,
Sharon Haimov,
Riva Tauman,
Joachim A. Behar
Abstract:
Background: Sleep staging is a fundamental component in the diagnosis of sleep disorders and the management of sleep health. Traditionally, this analysis is conducted in clinical settings and involves a time-consuming scoring procedure. Recent data-driven algorithms for sleep staging, using the photoplethysmogram (PPG) time series, have shown high performance on local test sets but lower performance on external datasets due to data drift. Methods: This study aimed to develop a generalizable deep learning model for the task of four-class (wake, light, deep, and rapid eye movement (REM)) sleep staging from raw PPG physiological time series. Six sleep datasets, totaling 2,574 patient recordings, were used. In order to create a more generalizable representation, we developed and evaluated a deep learning model called SleepPPG-Net2, which employs a multi-source domain training approach. SleepPPG-Net2 was benchmarked against two state-of-the-art models. Results: SleepPPG-Net2 showed consistently higher performance than the benchmark approaches, with generalization performance (Cohen's kappa) improving by up to 19%. Performance disparities were observed in relation to age, sex, and sleep apnea severity. Conclusion: SleepPPG-Net2 sets a new standard for staging sleep from raw PPG time series.
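For readers unfamiliar with the metric, the sketch below shows how Cohen's kappa can be computed for a four-class hypnogram. It is a minimal illustration, not the authors' evaluation code: the function name `cohen_kappa` and the toy epoch labels are assumptions made here, and the abstract does not say how epochs were pooled across recordings.

```python
from collections import Counter


def cohen_kappa(reference, predicted):
    """Cohen's kappa between two equal-length sequences of class labels.

    Hypothetical helper for illustration; not taken from SleepPPG-Net2.
    """
    if len(reference) != len(predicted) or not reference:
        raise ValueError("sequences must be non-empty and of equal length")
    n = len(reference)
    # Observed agreement: fraction of epochs where both labels match.
    observed = sum(r == p for r, p in zip(reference, predicted)) / n
    # Chance agreement: product of the marginal class frequencies, summed.
    ref_counts = Counter(reference)
    pred_counts = Counter(predicted)
    expected = sum(ref_counts[c] * pred_counts[c] for c in ref_counts) / (n * n)
    if expected == 1.0:
        # Both sequences contain one identical class only; kappa is undefined,
        # so perfect agreement is reported by convention.
        return 1.0
    return (observed - expected) / (1.0 - expected)


# Toy example with the four stages named in the abstract (made-up labels).
reference = ["wake", "light", "light", "deep", "rem", "light", "wake", "rem"]
predicted = ["wake", "light", "deep", "deep", "rem", "light", "light", "rem"]
kappa = cohen_kappa(reference, predicted)
print(f"kappa = {kappa:.3f}")  # 31/47, about 0.660
```

Unlike raw accuracy, kappa discounts the agreement expected by chance, which matters here because light sleep typically dominates a night's recording.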
Submitted 10 April, 2024;
originally announced April 2024.
-
RawECGNet: Deep Learning Generalization for Atrial Fibrillation Detection from the Raw ECG
Authors:
Noam Ben-Moshe,
Kenta Tsutsui,
Shany Biton,
Leif Sörnmo,
Joachim A. Behar
Abstract:
Introduction: Deep learning models for detecting episodes of atrial fibrillation (AF) using rhythm information in long-term, ambulatory ECG recordings have shown high performance. However, the rhythm-based approach does not take advantage of the morphological information conveyed by the different ECG waveforms, particularly the f-waves. As a result, the performance of such models may be inherently limited. Methods: To address this limitation, we have developed a deep learning model, named RawECGNet, to detect episodes of AF and atrial flutter (AFl) using the raw, single-lead ECG. We compare the generalization performance of RawECGNet on two external data sets that account for distribution shifts in geography, ethnicity, and lead position. RawECGNet is further benchmarked against a state-of-the-art deep learning model, named ArNet2, which utilizes rhythm information as input. Results: Using RawECGNet, the results for the different leads in the external test sets in terms of the F1 score were 0.91-0.94 in RBDB and 0.93 in SHDB, compared to 0.89-0.91 in RBDB and 0.91 in SHDB for ArNet2. The results highlight RawECGNet as a high-performance, generalizable algorithm for detection of AF and AFl episodes, exploiting information on both rhythm and morphology.
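The F1 scores quoted above can be read with the following sketch in mind. It assumes binary labels per analysis window (1 for AF or AFl, 0 otherwise); the abstract does not state the window length or the exact scoring protocol, so `f1_score` and the toy labels are illustrative assumptions only.

```python
def f1_score(reference, predicted, positive=1):
    """F1 score for one positive class over paired label sequences.

    Hypothetical helper for illustration; not taken from RawECGNet.
    """
    if len(reference) != len(predicted):
        raise ValueError("sequences must be of equal length")
    pairs = list(zip(reference, predicted))
    tp = sum(r == positive and p == positive for r, p in pairs)
    fp = sum(r != positive and p == positive for r, p in pairs)
    fn = sum(r == positive and p != positive for r, p in pairs)
    denominator = 2 * tp + fp + fn
    # With no positive windows in either sequence, F1 is undefined;
    # 0.0 is returned here as a conservative convention.
    return 2 * tp / denominator if denominator else 0.0


# Toy example: eight windows, made-up labels.
reference = [1, 1, 1, 0, 0, 0, 1, 0]
predicted = [1, 1, 0, 0, 1, 0, 1, 0]
score = f1_score(reference, predicted)
print(score)  # 3 TP, 1 FP, 1 FN -> 6 / 8 = 0.75
```

Because F1 ignores true negatives, it is less flattered than accuracy by the long non-AF stretches found in Holter recordings.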
Submitted 26 December, 2023;
originally announced January 2024.
-
DRStageNet: Deep Learning for Diabetic Retinopathy Staging from Fundus Images
Authors:
Yevgeniy Men,
Jonathan Fhima,
Leo Anthony Celi,
Lucas Zago Ribeiro,
Luis Filipe Nakayama,
Joachim A. Behar
Abstract:
Diabetic retinopathy (DR) is a prevalent complication of diabetes associated with a significant risk of vision loss. Timely identification is critical to curb vision impairment. Algorithms for DR staging from digital fundus images (DFIs) have been recently proposed. However, models often fail to generalize due to distribution shifts between the source domain on which the model was trained and the target domain where it is deployed. A common and particularly challenging shift is often encountered when the source- and target-domain supports do not fully overlap. In this research, we introduce DRStageNet, a deep learning model designed to mitigate this challenge. We used seven publicly available datasets, comprising a total of 93,534 DFIs that cover a variety of patient demographics, ethnicities, geographic origins and comorbidities. We fine-tune DINOv2, a pretrained self-supervised vision transformer, and implement a multi-source domain fine-tuning strategy to enhance generalization performance. We benchmark our method against two state-of-the-art models, including a recently published foundation model, and demonstrate its superiority. We adapted the grad-rollout method to our regression task in order to provide high-resolution explainability heatmaps. The error analysis showed that 59% of the main errors had incorrect reference labels. DRStageNet is accessible at URL [upon acceptance of the manuscript].
Submitted 22 December, 2023;
originally announced December 2023.
-
Generalization in medical AI: a perspective on developing scalable models
Authors:
Joachim A. Behar,
Jeremy Levy,
Leo Anthony Celi
Abstract:
Over the past few years, research has witnessed the advancement of deep learning models trained on large datasets, some even encompassing millions of examples. While these models achieve impressive performance on their hidden test sets, they often underperform when assessed on external datasets. Recognizing the critical role of generalization in medical AI development, many prestigious journals now require reporting results both on the local hidden test set and on external datasets before considering a study for publication. Effectively, the field of medical AI has transitioned from the traditional usage of a single dataset that is split into train and test to a more comprehensive framework using multiple datasets, some of which are used for model development (source domain) and others for testing (target domains). However, this new experimental setting does not necessarily resolve the challenge of generalization. This is because of the variability encountered in intended use and specificities across hospital cultures, which makes the idea of universally generalizable systems a myth. On the other hand, the systematic, and a fortiori recurrent, re-calibration of models at the individual hospital level, although ideal, may be overoptimistic given the legal, regulatory and technical challenges that are involved. Re-calibration using transfer learning may not even be possible in some instances where reference labels of target domains are not available. In this perspective we establish a hierarchical three-level scale system reflecting the generalization level of a medical AI algorithm. This scale better reflects the diversity of real-world medical scenarios in which target domain data for re-calibration of models may or may not be available and, if it is, may or may not have reference labels systematically available.
Submitted 9 November, 2023;
originally announced November 2023.
-
End-to-end Risk Prediction of Atrial Fibrillation from the 12-Lead ECG by Deep Neural Networks
Authors:
Theogene Habineza,
Antônio H. Ribeiro,
Daniel Gedon,
Joachim A. Behar,
Antonio Luiz P. Ribeiro,
Thomas B. Schön
Abstract:
Background: Atrial fibrillation (AF) is one of the most common cardiac arrhythmias, affecting millions of people each year worldwide, and it is closely linked to increased risk of cardiovascular diseases such as stroke and heart failure. Machine learning methods have shown promising results in evaluating the risk of developing atrial fibrillation from the electrocardiogram. We aim to develop and evaluate one such algorithm on a large CODE dataset collected in Brazil.
Results: The deep neural network model identified patients without indication of AF in the presented ECG but who will develop AF in the future with an AUC score of 0.845. From our survival model, we obtain that patients in the high-risk group (i.e. with the probability of a future AF case being greater than 0.7) are 50% more likely to develop AF within 40 weeks, while patients belonging to the minimal-risk group (i.e. with the probability of a future AF case being less than or equal to 0.1) have more than 85% chance of remaining AF free up until after seven years.
Conclusion: We developed and validated a model for AF risk prediction. If applied in clinical practice, the model possesses the potential of providing valuable and useful information in decision-making and patient management processes.
Submitted 28 September, 2023;
originally announced September 2023.
-
pyPPG: A Python toolbox for comprehensive photoplethysmography signal analysis
Authors:
Marton A. Goda,
Peter H. Charlton,
Joachim A. Behar
Abstract:
Photoplethysmography is a non-invasive optical technique that measures changes in blood volume within tissues. It is commonly and increasingly used in a variety of research and clinical applications to assess vascular dynamics and physiological parameters. Yet, contrary to heart rate variability measures, a field which has seen the development of stable standards and advanced toolboxes and software, no such standards and open tools exist for continuous photoplethysmogram (PPG) analysis. Consequently, the primary objective of this research was to identify, standardize, implement and validate key digital PPG biomarkers. This work describes the creation of a standard Python toolbox, denoted pyPPG, for long-term continuous PPG time series analysis recorded using a standard finger-based transmission pulse oximeter. The improved PPG peak detector had an F1-score of 88.19% for the state-of-the-art benchmark when evaluated on 2,054 adult polysomnography recordings totaling over 91 million reference beats. This algorithm outperformed the open-source original Matlab implementation by ~5% when benchmarked on a subset of 100 randomly selected MESA recordings. More than 3,000 fiducial points were manually annotated by two annotators in order to validate the fiducial points detector. The detector consistently demonstrated high performance, with a mean absolute error of less than 10 ms for all fiducial points. Based on these fiducial points, pyPPG engineers a set of 74 PPG biomarkers. Studying the PPG time series variability using pyPPG can enhance our understanding of the manifestations and etiology of diseases. This toolbox can also be used for biomarker engineering in training data-driven models. pyPPG is available on physiozoo.org.
Submitted 24 September, 2023;
originally announced September 2023.
-
LUNet: Deep Learning for the Segmentation of Arterioles and Venules in High Resolution Fundus Images
Authors:
Jonathan Fhima,
Jan Van Eijgen,
Hana Kulenovic,
Valérie Debeuf,
Marie Vangilbergen,
Marie-Isaline Billen,
Heloïse Brackenier,
Moti Freiman,
Ingeborg Stalmans,
Joachim A. Behar
Abstract:
The retina is the only part of the human body in which blood vessels can be accessed non-invasively using imaging techniques such as digital fundus images (DFI). The spatial distribution of the retinal microvasculature may change with cardiovascular diseases and thus the eyes may be regarded as a window to our hearts. Computerized segmentation of the retinal arterioles and venules (A/V) is essential for automated microvasculature analysis. Using active learning, we created a new DFI dataset containing 240 crowd-sourced manual A/V segmentations performed by fifteen medical students and reviewed by an ophthalmologist, and developed LUNet, a novel deep learning architecture for high-resolution A/V segmentation. The LUNet architecture includes a double dilated convolutional block that aims to enhance the receptive field of the model and reduce its parameter count. Furthermore, LUNet has a long tail that operates at high resolution to refine the segmentation. The custom loss function emphasizes the continuity of the blood vessels. LUNet is shown to significantly outperform two state-of-the-art segmentation algorithms on the local test set as well as on four external test sets simulating distribution shifts across ethnicity, comorbidities, and annotators. We make the newly created dataset open access (upon publication).
Submitted 11 September, 2023;
originally announced September 2023.
-
PhysioZoo: The Open Digital Physiological Biomarkers Resource
Authors:
Joachim A. Behar,
Jeremy Levy,
Eran Zvuloni,
Sheina Gendelman,
Aviv Rosenberg,
Shany Biton,
Raphael Derman,
Jonathan A. Sobel,
Alexandra Alexandrovich,
Peter Charlton,
Márton Á Goda
Abstract:
PhysioZoo is a collaborative platform designed for the analysis of continuous physiological time series. The platform currently comprises four modules, each consisting of a library, a user interface, and a set of tutorials: (1) PhysioZoo HRV, dedicated to studying heart rate variability (HRV) in humans and other mammals; (2) PhysioZoo SPO2, which focuses on the analysis of digital oximetry biomarkers (OBM) using continuous oximetry (SpO2) measurements from humans; (3) PhysioZoo ECG, dedicated to the analysis of electrocardiogram (ECG) time series; (4) PhysioZoo PPG, designed to study photoplethysmography (PPG) time series. In this proceeding, we introduce the PhysioZoo platform as an open resource for digital physiological biomarkers engineering, facilitating streamlined analysis and data visualization of physiological time series while ensuring the reproducibility of published experiments. We welcome researchers to contribute new libraries for the analysis of various physiological time series, such as electroencephalography, blood pressure, and phonocardiography. You can access the resource at physiozoo.com. We encourage researchers to explore and utilize this platform to advance their studies in the field of continuous physiological time-series analysis.
Submitted 7 September, 2023;
originally announced September 2023.
-
Robust peak detection for photoplethysmography signal analysis
Authors:
Márton Á. Goda,
Peter H. Charlton,
Joachim A. Behar
Abstract:
Efficient and accurate evaluation of long-term photoplethysmography (PPG) recordings is essential for both clinical assessments and consumer products. In 2021, the top open-source peak detectors were benchmarked on the Multi-Ethnic Study of Atherosclerosis (MESA) database consisting of polysomnography (PSG) recordings and continuous sleep PPG data, where the Automatic Beat Detector (Aboy) had the best accuracy. This work presents Aboy++, an improved version of the original Aboy beat detector. The algorithm was evaluated on 100 adult PPG recordings from the MESA database, which contains more than 4.25 million reference beats. Aboy++ achieved an F1-score of 85.5%, compared to 80.99% for the original Aboy peak detector. On average, Aboy++ processed a 1-hour-long recording in less than 2 seconds. This is compared to 115 seconds (i.e., over 57 times longer) for the open-source implementation of the original Aboy peak detector. This study demonstrated the importance of developing robust algorithms like Aboy++ to improve PPG data analysis and clinical outcomes. Overall, Aboy++ is a reliable tool for evaluating long-term wearable PPG measurements in clinical and consumer contexts.
Submitted 18 July, 2023;
originally announced July 2023.
-
Case Study: Fetal Breathing Movements as a Proxy for Fetal Lung Maturity Estimation
Authors:
Márton Á. Goda,
Ron Beloosesky,
Chen Ben David,
Zeev Weiner,
Joachim A. Behar
Abstract:
Premature births can lead to complications, with fetal lung immaturity being a primary concern. Currently, fetal lung maturity (FLM) assessment requires an invasive surfactant extraction procedure between the 32nd and 39th weeks of pregnancy. Unfortunately, there is no non-invasive method for FLM assessment. This work hypothesized that fetal breathing movement (FBM) and surfactant levels are inversely coupled and that FBM can serve as a proxy for FLM estimation. To investigate the correlation between FBM and FLM, antenatal corticosteroid (ACS) was administered to increase fetal pulmonary surfactant levels in a high-risk 35th-week pregnant woman showing intrauterine growth restriction. Synchronous sonographic and phonographic measurements were continuously recorded for 25 minutes before and after the ACS treatments. Before the ACS injection, 268 continuous FBM episodes were recorded. The number of continuous FBM episodes significantly decreased to 3, 43, and 79 within 24, 48, and 72 hours, respectively, of the first injection of ACS, suggesting an inversely coupled connection between FBM and surfactant levels. Therefore, FBM may serve as a proxy for FLM estimation. Quantitative confirmation of these findings would suggest that FBM measurements could be used as a non-invasive and widely accessible FLM-assessment tool for high-risk pregnancies and routine examinations.
Submitted 18 July, 2023;
originally announced July 2023.
-
Machine Learning for Ranking f-wave Extraction Methods in Single-Lead ECGs
Authors:
Noam Ben-Moshe,
Shany Biton,
Kenta Tsutsui,
Mahmoud Suleiman,
Leif Sörnmo,
Joachim A. Behar
Abstract:
Introduction: The presence of fibrillatory waves (f-waves) is important in the diagnosis of atrial fibrillation (AF), which has motivated the development of methods for f-wave extraction. We propose a novel approach to benchmarking methods designed for single-lead ECG analysis, building on the hypothesis that better-performing AF classification using features computed from the extracted f-waves implies better-performing extraction. The approach is well-suited for processing large Holter data sets annotated with respect to the presence of AF. Methods: Three data sets with a total of 300 two- or three-lead Holter recordings, performed in the USA, Israel and Japan, were used as well as a simulated single-lead data set. Four existing extraction methods based on either average beat subtraction or principal component analysis (PCA) were evaluated. A random forest classifier was used for window-based AF classification. Performance was measured by the area under the receiver operating characteristic curve (AUROC). Results: The best performance was found for PCA-based extraction, resulting in AUROCs in the ranges 0.77-0.83, 0.62-0.78, and 0.87-0.89 for the data sets from the USA, Israel, and Japan, respectively, when analyzed across leads; the AUROC of the simulated single-lead, noisy data set was 0.98. Conclusions: This study provides a novel approach to evaluating the performance of f-wave extraction methods, offering the advantage of not using ground truth f-waves for evaluation, thus being able to leverage real data sets for evaluation. The code is open source (following publication).
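As a reminder of what the AUROC figures measure, the sketch below computes it from classifier scores using the pairwise (Mann-Whitney) formulation. This is an illustrative assumption, not the authors' released code: `auroc`, the toy labels and the toy scores are made up here, and the quadratic loop is chosen for clarity rather than speed.

```python
def auroc(labels, scores):
    """AUROC as the probability that a random positive window is scored
    above a random negative one (ties count as one half).

    Hypothetical helper for illustration; O(P * N) pairwise comparison.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in positives:
        for q in negatives:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Toy example: 1 = AF window, 0 = non-AF window; scores are made up.
labels = [1, 1, 0, 0, 1, 0]
scores = [0.9, 0.6, 0.4, 0.7, 0.8, 0.2]
value = auroc(labels, scores)
print(f"AUROC = {value:.3f}")  # 8 of 9 pairs ranked correctly, about 0.889
```

Read this way, an AUROC of 0.5 corresponds to features that carry no ranking information, which is the baseline against which the extraction methods are being compared.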
Submitted 17 July, 2023;
originally announced July 2023.
-
Estimation of f-wave Dominant Frequency Using a Voting Scheme
Authors:
Shany Biton,
Mahmoud Suleiman,
Noam Ben Moshe,
Leif Sörnmo,
Joachim A. Behar
Abstract:
Introduction: Atrial fibrillation (AF) is the most common heart arrhythmia, characterized by the presence of fibrillatory waves (f-waves) in the ECG. We introduce a voting scheme to estimate the dominant atrial frequency (DAF) of f-waves. Methods: We analysed a subset of Holter recordings obtained from the University of Virginia AF Database. The subset comprised 100 Holter recordings with manually annotated AF events, resulting in a total of 363 AF events lasting more than 1 min. The f-waves were extracted using four different template subtraction (TS) algorithms and the DAF was estimated from the first 1-min window of each AF event. A random forest (RF) classifier was used. We hypothesized that better extraction of the f-wave meant better AF/non-AF classification using the DAF as the single input feature of the RF model. Results: Performance on the test set, expressed in terms of AF/non-AF classification, was the best when the DAF was computed using the three best-performing extraction methods. Using these three algorithms in a voting scheme, the classifier obtained AUC=0.60 and the DAFs were mostly spread around 6 Hz, 5.66 (4.83-7.47). Conclusions: This study has two novel contributions: (1) a method for assessing the performance of f-wave extraction algorithms, and (2) a voting scheme for improved DAF estimation.
Submitted 23 August, 2022;
originally announced September 2022.
-
Atrial Fibrillation Recurrence Risk Prediction from 12-lead ECG Recorded Pre- and Post-Ablation Procedure
Authors:
Eran Zvuloni,
Sheina Gendelman,
Sanghamitra Mohanty,
Jason Lewen,
Andrea Natale,
Joachim A. Behar
Abstract:
Introduction: The 12-lead electrocardiogram (ECG) is recorded during the atrial fibrillation (AF) catheter ablation procedure (CAP). It is not easy to determine if CAP was successful without a long follow-up assessing for AF recurrence (AFR). Therefore, an AFR risk prediction algorithm could enable better management of CAP patients. In this research, we extracted features from 12-lead ECG recorded before and after CAP and trained an AFR risk prediction machine learning model. Methods: Pre- and post-CAP segments were extracted from 112 patients. The analysis included a signal quality criterion, heart rate variability and morphological biomarkers engineered from the 12-lead ECG (804 features overall). Of the 112 patients, 43 had an AFR clinical endpoint available. These were utilized to assess the feasibility of AFR risk prediction, using either pre- or post-CAP features. A random forest classifier was trained within a nested cross-validation framework. Results: 36 features were found statistically significant for distinguishing between the pre- and post-surgery states (n=112). For the classification, an area under the receiver operating characteristic (AUROC) curve was reported with AUROC_pre=0.64 and AUROC_post=0.74 (n=43). Discussion and conclusions: This preliminary analysis showed the feasibility of AFR risk prediction. Such a model could be used to improve CAP management.
Submitted 22 August, 2022;
originally announced August 2022.
-
ArNet-ECG: Deep Learning for the Detection of Atrial Fibrillation from the Raw Electrocardiogram
Authors:
Noam Ben-Moshe,
Shany Biton,
Joachim A. Behar
Abstract:
Atrial fibrillation (AF) is the most prevalent heart arrhythmia. AF manifests on the electrocardiogram (ECG) through irregular beat-to-beat time interval variation, the absence of P-waves and the presence of fibrillatory waves (f-waves). We hypothesize that a deep learning (DL) approach trained on the raw ECG will enable robust detection of AF events and the estimation of the AF burden (AFB). We further hypothesize that the performance reached leveraging the raw ECG will be superior to previously developed methods using the beat-to-beat interval variation time series. Consequently, we develop a new DL algorithm, denoted ArNet-ECG, to robustly detect AF events and estimate the AFB from the raw ECG and benchmark this algorithm against previous work. Methods: A dataset including 2,247 adult patients and totaling over 53,753 hours of continuous ECG from the University of Virginia (UVAF) was used. Results: ArNet-ECG obtained an F1 of 0.96 and ArNet2 obtained an F1 of 0.94. Discussion and conclusion: ArNet-ECG outperformed ArNet2, thus demonstrating that using the raw ECG provides added performance over the beat-to-beat interval time series. The main reason found for the higher performance of ArNet-ECG was its high performance on atrial flutter examples, versus the poor performance of ArNet2 on these recordings.
Submitted 22 August, 2022;
originally announced August 2022.
-
Lirot.ai: A Novel Platform for Crowd-Sourcing Retinal Image Segmentations
Authors:
Jonathan Fhima,
Jan Van Eijgen,
Moti Freiman,
Ingeborg Stalmans,
Joachim A. Behar
Abstract:
Introduction: For supervised deep learning (DL) tasks, researchers need a large annotated dataset. In medical data science, one of the major limitations in developing DL models is the lack of annotated examples in large quantity. This is most often due to the time and expertise required to annotate. We introduce Lirot.ai, a novel platform for facilitating and crowd-sourcing image segmentations. Methods: Lirot.ai is composed of three components: an iPadOS client application named Lirot.ai-app, a backend server named Lirot.ai-server and a Python API named Lirot.ai-API. Lirot.ai-app was developed in Swift 5.6 and Lirot.ai-server is a Firebase backend. Lirot.ai-API allows the management of the database. Lirot.ai-app can be installed on as many iPadOS devices as needed so that annotators may be able to perform their segmentation simultaneously and remotely. We incorporate Apple Pencil compatibility, making the segmentation faster, more accurate, and more intuitive for the expert than any other computer-based alternative. Results: We demonstrate the usage of Lirot.ai for the creation of a retinal fundus dataset with reference vasculature segmentations. Discussion and future work: We will use active learning strategies to continue enlarging our retinal fundus dataset by including a more efficient process to select the images to be annotated and distribute them to annotators.
Submitted 14 January, 2024; v1 submitted 22 August, 2022;
originally announced August 2022.
-
PVBM: A Python Vasculature Biomarker Toolbox Based On Retinal Blood Vessel Segmentation
Authors:
Jonathan Fhima,
Jan Van Eijgen,
Ingeborg Stalmans,
Yevgeniy Men,
Moti Freiman,
Joachim A. Behar
Abstract:
Introduction: Blood vessels can be non-invasively visualized from a digital fundus image (DFI). Several studies have shown an association between cardiovascular risk and vascular features obtained from DFI. Recent advances in computer vision and image segmentation enable automated DFI blood vessel segmentation. There is a need for a resource that can automatically compute digital vasculature biomarkers (VBM) from these segmented DFI. Methods: In this paper, we introduce a Python Vasculature BioMarker toolbox, denoted PVBM. A total of 11 VBMs were implemented. In particular, we introduce new algorithmic methods to estimate tortuosity and branching angles. Using PVBM, and as a proof of usability, we analyze geometric vascular differences between glaucomatous patients and healthy controls. Results: We built a fully automated vasculature biomarker toolbox based on DFI segmentations and provided a proof of usability to characterize the vascular changes in glaucoma. For arterioles and venules, all biomarkers were significantly lower in glaucoma patients than in healthy controls, except for tortuosity, venular singularity length and venular branching angles.
Conclusion: We have automated the computation of 11 VBMs from retinal blood vessel segmentation. The PVBM toolbox is made open source under a GNU GPL 3 license and is available on physiozoo.com (following publication).
Submitted 31 July, 2022;
originally announced August 2022.
-
Building Trust: Lessons from the Technion-Rambam Machine Learning in Healthcare Datathon Event
Authors:
Jonathan A. Sobel,
Ronit Almog,
Leo Anthony Celi,
Michal Gaziel-Yablowitz,
Danny Eytan,
Joachim A. Behar
Abstract:
A datathon is a time-constrained competition involving data science applied to a specific problem. In the past decade, datathons have been shown to be a valuable bridge between fields and expertise. Biomedical data analysis represents a challenging area requiring collaboration between engineers, biologists and physicians to gain a better understanding of patient physiology and to guide decision processes for diagnosis, prognosis and therapeutic interventions to improve care practice. Here, we reflect on the outcomes of an event that we organized in Israel at the end of March 2022 between the MIT Critical Data group, Rambam Health Care Campus (Rambam) and the Technion Israel Institute of Technology (Technion) in Haifa. Participants were asked to complete a survey about their skills and interests, which enabled us to identify current needs in machine learning training for medical problem applications. This work describes opportunities and limitations in medical data science in the Israeli context.
Submitted 2 August, 2022; v1 submitted 16 July, 2022;
originally announced July 2022.
-
Generalizable and Robust Deep Learning Algorithm for Atrial Fibrillation Diagnosis Across Ethnicities, Ages and Sexes
Authors:
Shany Biton,
Mohsin Aldhafeeri,
Erez Marcusohn,
Kenta Tsutsui,
Tom Szwagier,
Adi Elias,
Julien Oster,
Jean Marc Sellal,
Mahmoud Suleiman,
Joachim A. Behar
Abstract:
To drive health innovation that meets the needs of all and democratize healthcare, there is a need to assess the generalization performance of deep learning (DL) algorithms across various distribution shifts to ensure that these algorithms are robust. This retrospective study is, to the best of our knowledge, the first to develop and assess the generalization performance of a DL model for AF event detection from long-term beat-to-beat intervals across ethnicities, ages and sexes. The new recurrent DL model, denoted ArNet2, was developed on a large retrospective dataset of 2,147 patients totaling 51,386 hours of continuous electrocardiogram (ECG). The model's generalization was evaluated on manually annotated test sets from four centers (USA, Israel, Japan and China) totaling 402 patients. The model was further validated on a retrospective dataset of 1,730 consecutive Holter recordings from the Rambam Hospital Holter clinic, Haifa, Israel. The model outperformed benchmark state-of-the-art models and generalized well across ethnicities, ages and sexes. Performance was higher for females than males and for young adults (less than 60 years old), and showed some differences across ethnicities. The main finding explaining these variations was an impairment in performance in groups with a higher prevalence of atrial flutter (AFL). Our findings on the relative performance of ArNet2 across groups may have clinical implications for the choice of the preferred AF examination method to use relative to the group of interest.
Submitted 20 July, 2022;
originally announced July 2022.
-
On Merging Feature Engineering and Deep Learning for Diagnosis, Risk-Prediction and Age Estimation Based on the 12-Lead ECG
Authors:
Eran Zvuloni,
Jesse Read,
Antônio H. Ribeiro,
Antonio Luiz P. Ribeiro,
Joachim A. Behar
Abstract:
Objective: Machine learning techniques have been used extensively for 12-lead electrocardiogram (ECG) analysis. For physiological time series, the superiority of deep learning (DL) over feature engineering (FE) approaches based on domain knowledge is still an open question. Moreover, it remains unclear whether combining DL with FE may improve performance. Methods: We considered three tasks intending to address these research gaps: cardiac arrhythmia diagnosis (multiclass-multilabel classification), atrial fibrillation risk prediction (binary classification), and age estimation (regression). We used an overall dataset of 2.3M 12-lead ECG recordings to train the following models for each task: i) a random forest taking the FE as input, as a classical machine learning approach; ii) an end-to-end DL model; and iii) a merged model of FE+DL. Results: FE yielded comparable results to DL while necessitating significantly less data for the two classification tasks, and it was outperformed by DL for the regression task. For all tasks, merging FE with DL did not improve performance over DL alone. Conclusion: We found that for traditional 12-lead ECG based diagnosis tasks DL did not yield a meaningful improvement over FE, while it significantly improved the nontraditional regression task. We also found that combining FE with DL did not improve over DL alone, which suggests that the FE were redundant with the features learned by DL. Significance: Our findings provide important recommendations on what machine learning strategy and data regime to choose with respect to the task at hand for the development of new machine learning models based on the 12-lead ECG.
Submitted 16 July, 2022; v1 submitted 13 July, 2022;
originally announced July 2022.
-
Machine Learning to Support Triage of Children at Risk for Epileptic Seizures in the Pediatric Intensive Care Unit
Authors:
Raphael Azriel,
Cecil D. Hahn,
Thomas De Cooman,
Sabine Van Huffel,
Eric T. Payne,
Kristin L. McBain,
Danny Eytan,
Joachim A. Behar
Abstract:
Objective: Epileptic seizures are relatively common in critically ill children admitted to the pediatric intensive care unit (PICU) and thus serve as an important target for identification and treatment. Most of these seizures have no discernible clinical manifestation but still have a significant impact on morbidity and mortality. Children who are deemed at risk for seizures within the PICU are monitored using continuous electroencephalogram (cEEG). The cost of cEEG monitoring is considerable and, as the number of available machines is always limited, clinicians need to resort to triaging patients according to perceived risk in order to allocate resources. This research aims to develop a computer-aided tool to improve seizure risk assessment in critically ill children, using a ubiquitously recorded signal in the PICU, namely the electrocardiogram (ECG). Approach: A novel data-driven, patient-level model was developed, based on features extracted from the first hour of ECG recording and the clinical data of the patient. Main results: The most predictive features were the age of the patient, brain injury as coma etiology and the QRS area. For patients without any prior clinical data, using one hour of ECG recording, the classification performance of the random forest classifier reached an area under the receiver operating characteristic curve (AUROC) score of 0.84. When combining ECG features with the patient's clinical history, the AUROC reached 0.87. Significance: Taking a real clinical scenario, we estimated that our clinical decision support triage tool can improve the positive predictive value by more than 59% over the clinical standard.
Submitted 11 May, 2022;
originally announced May 2022.
-
FundusQ-Net: a Regression Quality Assessment Deep Learning Algorithm for Fundus Images Quality Grading
Authors:
Or Abramovich,
Hadas Pizem,
Jan Van Eijgen,
Ilan Oren,
Joshua Melamed,
Ingeborg Stalmans,
Eytan Z. Blumenthal,
Joachim A. Behar
Abstract:
Objective: Ophthalmological pathologies such as glaucoma, diabetic retinopathy and age-related macular degeneration are major causes of blindness and vision impairment. There is a need for novel decision support tools that can simplify and speed up the diagnosis of these pathologies. A key step in this process is to automatically estimate the quality of the fundus images to make sure these are interpretable by a human operator or a machine learning model. We present a novel fundus image quality scale and deep learning (DL) model that can estimate fundus image quality relative to this new scale.
Methods: A total of 1,245 images were graded for quality by two ophthalmologists within the range 1-10, with a resolution of 0.5. A DL regression model was trained for fundus image quality assessment. The architecture used was Inception-V3. The model was developed using a total of 89,947 images from 6 databases, of which 1,245 were labeled by the specialists and the remaining 88,702 images were used for pre-training and semi-supervised learning. The final DL model was evaluated on an internal test set (n=209) as well as an external test set (n=194).
Results: The final DL model, denoted FundusQ-Net, achieved a mean absolute error of 0.61 (0.54-0.68) on the internal test set. When evaluated as a binary classification model on the public DRIMDB database as an external test set, the model obtained an accuracy of 99%.
Significance: The proposed algorithm provides a new robust tool for automated quality grading of fundus images.
Submitted 6 June, 2023; v1 submitted 2 May, 2022;
originally announced May 2022.
-
SleepPPG-Net: a deep learning algorithm for robust sleep staging from continuous photoplethysmography
Authors:
Kevin Kotzen,
Peter H. Charlton,
Sharon Salabi,
Lea Amar,
Amir Landesberg,
Joachim A. Behar
Abstract:
Introduction: Sleep staging is an essential component in the diagnosis of sleep disorders and management of sleep health. It is traditionally measured in a clinical setting and requires a labor-intensive labeling process. We hypothesize that it is possible to perform robust 4-class sleep staging using the raw photoplethysmography (PPG) time series and modern advances in deep learning (DL). Methods: We used two publicly available sleep databases that included raw PPG recordings, totalling 2,374 patients and 23,055 hours. We developed SleepPPG-Net, a DL model for 4-class sleep staging from the raw PPG time series. SleepPPG-Net was trained end-to-end and consists of a residual convolutional network for automatic feature extraction and a temporal convolutional network to capture long-range contextual information. We benchmarked the performance of SleepPPG-Net against models based on the best-reported state-of-the-art (SOTA) algorithms. Results: When benchmarked on a held-out test set, SleepPPG-Net obtained a median Cohen's Kappa ($κ$) score of 0.75 against 0.69 for the best SOTA approach. SleepPPG-Net showed good generalization performance to an external database, obtaining a $κ$ score of 0.74 after transfer learning. Perspective: Overall, SleepPPG-Net provides new SOTA performance. In addition, performance is high enough to open the path to the development of wearables that meet the requirements for usage in clinical applications such as the diagnosis and monitoring of obstructive sleep apnea.
Submitted 29 April, 2022; v1 submitted 11 February, 2022;
originally announced February 2022.
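As a reading aid for the Cohen's Kappa scores quoted in the abstract above: kappa compares the observed agreement between predicted and reference labels with the agreement expected by chance. The following is a minimal Python sketch under stated assumptions; the function name `cohen_kappa` and the toy stage labels are invented for illustration and are not the authors' evaluation code.

```python
from collections import Counter

def cohen_kappa(reference, predicted):
    """Cohen's kappa between two equal-length label sequences."""
    if len(reference) != len(predicted) or not reference:
        raise ValueError("sequences must be non-empty and of equal length")
    n = len(reference)
    # Observed agreement: fraction of epochs where the two labels match.
    p_o = sum(r == p for r, p in zip(reference, predicted)) / n
    # Chance agreement: product of marginal label frequencies, summed over labels.
    ref_counts = Counter(reference)
    pred_counts = Counter(predicted)
    p_e = sum(ref_counts[k] * pred_counts[k] for k in ref_counts) / (n * n)
    if p_e == 1.0:
        return 1.0  # degenerate case: both sequences use one identical label
    return (p_o - p_e) / (1.0 - p_e)

# Toy 4-class example (wake, light, deep, REM); labels are hypothetical.
reference = ["wake", "light", "light", "deep", "rem", "rem", "light", "wake"]
predicted = ["wake", "light", "deep", "deep", "rem", "light", "light", "wake"]
kappa = cohen_kappa(reference, predicted)
print(round(kappa, 3))
```

In practice a library routine would normally be preferred over a hand-written one; the sketch only makes the definition concrete.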
-
From sleep medicine to medicine during sleep: A clinical perspective
Authors:
Nitai Bar,
Jonathan A. Sobel,
Thomas Penzel,
Yosi Shamay,
Joachim A. Behar
Abstract:
Sleep has a profound influence on the physiology of body systems and biological processes. Molecular studies have shown circadian-regulated shifts in protein expression patterns across human tissues, further emphasizing the unique functional, behavioral and pharmacokinetic landscape of sleep. Thus, many pathological processes are also expected to exhibit sleep-specific manifestations. Nevertheless, sleep is seldom utilized for the study, detection and treatment of non-sleep-specific pathologies. Modern advances in biosensor technologies have enabled remote, non-invasive recording of a growing number of physiologic parameters and biomarkers. Sleep is an ideal time frame for the collection of long and clean physiological time series data which can then be analyzed using data-driven algorithms such as deep learning. In this perspective paper, we aim to highlight the potential of sleep as an auspicious time for diagnosis, management and therapy of non-sleep-specific pathologies. We introduce key clinical studies in selected medical fields, which leveraged novel technologies and the advantageous period of sleep to diagnose, monitor and treat pathologies. We then discuss possible opportunities to further harness this new paradigm and modern technologies to explore human health and disease during sleep and to advance the development of novel clinical applications: From sleep medicine to medicine during sleep.
Submitted 9 February, 2021;
originally announced February 2021.
-
Machine learning for nocturnal diagnosis of chronic obstructive pulmonary disease using digital oximetry biomarkers
Authors:
Jeremy Levy,
Daniel Alvarez,
Felix del Campo,
Joachim A. Behar
Abstract:
Objective: Chronic obstructive pulmonary disease (COPD) is a highly prevalent chronic condition. COPD is a major source of morbidity, mortality and healthcare costs. Spirometry is the gold standard test for a definitive diagnosis and severity grading of COPD. However, a large proportion of individuals with COPD are undiagnosed and untreated. Given the high prevalence of COPD and its clinical importance, it is critical to develop new algorithms to identify undiagnosed COPD, especially in specific groups at risk, such as those with sleep-disordered breathing. To our knowledge, no research has looked at the feasibility of COPD diagnosis from the nocturnal oximetry time series. Approach: We hypothesize that patients with COPD will exhibit certain patterns and/or dynamics in their overnight oximetry time series that are unique to this condition. We introduce a novel approach to nocturnal COPD diagnosis using 44 oximetry digital biomarkers and 5 demographic features and assess its performance in a population sample at risk of sleep-disordered breathing. Polysomnography (PSG) recordings from a total of n=350 unique patients were used. A random forest (RF) classifier is trained using these features and evaluated using the nested cross-validation procedure. Significance: Our research makes a number of novel scientific contributions. First, we demonstrated, for the first time, the feasibility of COPD diagnosis from nocturnal oximetry time series in a population sample at risk of sleep-disordered breathing. We highlighted which digital oximetry biomarkers best reflect how COPD manifests overnight. The results suggest that overnight single-channel oximetry is a valuable pathway for COPD diagnosis.
Submitted 10 December, 2020;
originally announced December 2020.
-
Digital biomarkers and artificial intelligence for mass diagnosis of atrial fibrillation in a population sample at risk of sleep disordered breathing
Authors:
Armand Chocron,
Roi Efraim,
Franck Mandel,
Michael Rueschman,
Niclas Palmius,
Thomas Penzel,
Meyer Elbaz,
Joachim A. Behar
Abstract:
Atrial fibrillation (AF) is the most prevalent arrhythmia and is associated with a five-fold increase in stroke risk. Many individuals with AF go undetected. These individuals are often asymptomatic. There are ongoing debates on whether mass screening for AF is to be recommended. However, there is an incentive to screen specific at-risk groups, such as individuals suspected of sleep-disordered breathing, where an important association between AF and obstructive sleep apnea (OSA) has been demonstrated. We introduce a new methodology leveraging digital biomarkers and recent advances in artificial intelligence (AI) for the purpose of mass AF diagnosis. We demonstrate the value of such methodology in a large population sample at risk of sleep-disordered breathing. Four databases, totaling n=3,088 patients and p=26,913 hours of raw ECG data, were used. Three of the databases (n=125, p=2,513) were used for training a machine learning model to recognize AF events from beat-to-beat interval time series. Visit 1 of the Sleep Heart Health Study database (SHHS1, n=2,963, p=24,400) consists of overnight polysomnographic (PSG) recordings and was used as the test set. In SHHS1, expert inspection identified a total of 70 patients with a prominent AF rhythm. Model prediction on SHHS1 showed an overall Se=0.97, Sp=0.99, NPV=0.99, PPV=0.67 in classifying individuals with or without prominent AF. PPV was non-inferior (p=0.03) for individuals with an apnea-hypopnea index (AHI) > 15 versus AHI < 15. Over 22% of correctly identified prominent AF rhythm cases were not documented as AF in SHHS1. Individuals with prominent AF can be automatically diagnosed from an overnight single-channel ECG recording, with an accuracy unaffected by the presence of OSA. AF detection from overnight ECG recording revealed a large proportion of undiagnosed AF and may enhance the phenotyping of OSA.
Submitted 29 July, 2020;
originally announced July 2020.
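The gap between the high sensitivity and specificity and the lower PPV reported in the abstract above follows from the low prevalence of prominent AF in the test set (70 of 2,963 recordings). The Python sketch below illustrates this with Bayes' rule; the function name `predictive_values` is a hypothetical helper, and because it uses the rounded Se and Sp from the abstract, its output only approximates the reported PPV rather than reproducing it.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) from sensitivity, specificity and prevalence via Bayes' rule."""
    # Expected fractions of the population in each confusion-matrix cell.
    tp = sensitivity * prevalence
    fn = (1.0 - sensitivity) * prevalence
    tn = specificity * (1.0 - prevalence)
    fp = (1.0 - specificity) * (1.0 - prevalence)
    ppv = tp / (tp + fp)
    npv = tn / (tn + fn)
    return ppv, npv

# Rounded figures quoted in the abstract: 70 prominent-AF cases among 2,963 recordings.
prevalence = 70 / 2963
ppv, npv = predictive_values(0.97, 0.99, prevalence)
print(f"PPV ~ {ppv:.2f}, NPV ~ {npv:.3f}")

# Same test characteristics at a hypothetical 20% prevalence, for comparison.
ppv_high, _ = predictive_values(0.97, 0.99, 0.20)
print(f"PPV at 20% prevalence ~ {ppv_high:.2f}")
```

With the same Se and Sp, PPV rises sharply as prevalence increases, which is why PPV is the figure to watch when a detector is applied to a screening population.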
-
Remote health monitoring and diagnosis in the time of COVID-19
Authors:
Joachim A. Behar,
Chengyu Liu,
Kevin Kotzen,
Kenta Tsutsui,
Valentina D. A. Corino,
Janmajay Singh,
Marco A. F. Pimentel,
Philip Warrick,
Sebastian Zaunseder,
Fernando Andreotti,
David Sebag,
Georgy Popanitsa,
Patrick E. McSharry,
Walter Karlen,
Chandan Karmakar,
Gari D. Clifford
Abstract:
Coronavirus disease (COVID-19) is caused by the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) that is rapidly spreading across the globe. The clinical spectrum of SARS-CoV-2 pneumonia ranges from mild to critically ill cases and requires early detection and monitoring, within a clinical environment for critical cases and remotely for mild cases. The fear of contamination in clinical environments has led to a dramatic reduction in on-site referrals for routine care. There has also been a perceived need to continuously monitor non-severe COVID-19 patients, either from their quarantine site at home, or dedicated quarantine locations (e.g., hotels). Thus, the pandemic has driven incentives to innovate and enhance or create new routes for providing healthcare services at a distance. In particular, this has created a dramatic impetus to find innovative ways to remotely and effectively monitor patient health status. In this paper we present a short review of remote health monitoring initiatives taken in 19 states during the time of the pandemic. We emphasize in the discussion particular aspects that are common ground for the reviewed states, in particular the future impact of the pandemic on remote health monitoring and considerations on data privacy.
Submitted 15 October, 2020; v1 submitted 18 May, 2020;
originally announced May 2020.