-
Multimodal Variational Autoencoder for Low-cost Cardiac Hemodynamics Instability Detection
Authors:
Mohammod N. I. Suvon,
Prasun C. Tripathi,
Wenrui Fan,
Shuo Zhou,
Xianyuan Liu,
Samer Alabed,
Venet Osmani,
Andrew J. Swift,
Chen Chen,
Haiping Lu
Abstract:
Recent advancements in non-invasive detection of cardiac hemodynamic instability (CHDI) primarily focus on applying machine learning techniques to a single data modality, e.g. cardiac magnetic resonance imaging (MRI). Despite their potential, these approaches often fall short especially when the size of labeled patient data is limited, a common challenge in the medical domain. Furthermore, only a few studies have explored multimodal methods to study CHDI, which mostly rely on costly modalities such as cardiac MRI and echocardiogram. In response to these limitations, we propose a novel multimodal variational autoencoder ($\text{CardioVAE}_\text{X,G}$) to integrate low-cost chest X-ray (CXR) and electrocardiogram (ECG) modalities with pre-training on a large unlabeled dataset. Specifically, $\text{CardioVAE}_\text{X,G}$ introduces a novel tri-stream pre-training strategy to learn both shared and modality-specific features, thus enabling fine-tuning with both unimodal and multimodal datasets. We pre-train $\text{CardioVAE}_\text{X,G}$ on a large, unlabeled dataset of $50,982$ subjects from a subset of MIMIC database and then fine-tune the pre-trained model on a labeled dataset of $795$ subjects from the ASPIRE registry. Comprehensive evaluations against existing methods show that $\text{CardioVAE}_\text{X,G}$ offers promising performance (AUROC $=0.79$ and Accuracy $=0.77$), representing a significant step forward in non-invasive prediction of CHDI. Our model also excels in producing fine interpretations of predictions directly associated with clinical features, thereby supporting clinical decision-making.
Submitted 20 June, 2024; v1 submitted 20 March, 2024;
originally announced March 2024.
-
MeDSLIP: Medical Dual-Stream Language-Image Pre-training for Fine-grained Alignment
Authors:
Wenrui Fan,
Mohammod Naimul Islam Suvon,
Shuo Zhou,
Xianyuan Liu,
Samer Alabed,
Venet Osmani,
Andrew Swift,
Chen Chen,
Haiping Lu
Abstract:
Vision-language pre-training (VLP) models have shown significant advancements in the medical domain. Yet, most VLP models align raw reports to images at a very coarse level, without modeling fine-grained relationships between anatomical and pathological concepts outlined in reports and the corresponding semantic counterparts in images. To address this problem, we propose a Medical Dual-Stream Language-Image Pre-training (MeDSLIP) framework. Specifically, MeDSLIP establishes vision-language fine-grained alignments via disentangling visual and textual representations into anatomy-relevant and pathology-relevant streams. Moreover, a novel vision-language Prototypical Contrastive Learning (ProtoCL) method is adopted in MeDSLIP to enhance the alignment within the anatomical and pathological streams. MeDSLIP further employs cross-stream Intra-image Contrastive Learning (ICL) to ensure the consistent coexistence of paired anatomical and pathological concepts within the same image. Such a cross-stream regularization encourages the model to exploit the synchrony between two streams for a more comprehensive representation learning. MeDSLIP is evaluated under zero-shot and supervised fine-tuning settings on three public datasets: NIH CXR14, RSNA Pneumonia, and SIIM-ACR Pneumothorax. Under these settings, MeDSLIP outperforms six leading CNN-based models on classification, grounding, and segmentation tasks.
Submitted 15 March, 2024;
originally announced March 2024.
-
Mitigating Health Data Poverty: Generative Approaches versus Resampling for Time-series Clinical Data
Authors:
Raffaele Marchesi,
Nicolo Micheletti,
Giuseppe Jurman,
Venet Osmani
Abstract:
Several approaches have been developed to mitigate algorithmic bias stemming from health data poverty, where minority groups are underrepresented in training datasets. Augmenting the minority class using resampling (such as SMOTE) is a widely used approach due to the simplicity of the algorithms. However, these algorithms decrease data variability and may introduce correlations between samples, giving rise to the use of generative approaches based on GAN. Generation of high-dimensional, time-series, authentic data that provides a wide distribution coverage of the real data, remains a challenging task for both resampling and GAN-based approaches. In this work we propose CA-GAN architecture that addresses some of the shortcomings of the current approaches, where we provide a detailed comparison with both SMOTE and WGAN-GP*, using a high-dimensional, time-series, real dataset of 3343 hypotensive Caucasian and Black patients. We show that our approach is better at both generating authentic data of the minority class and remaining within the original distribution of the real data.
Submitted 26 October, 2022; v1 submitted 25 October, 2022;
originally announced October 2022.
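The abstract above notes that resampling methods such as SMOTE can reduce data variability and introduce correlations between samples. A minimal sketch of the interpolation step behind SMOTE-style oversampling makes the reason visible: each synthetic point lies on the line segment between an existing minority sample and one of its neighbours. The function name `smote_like_sample` is hypothetical, nearest-neighbour selection is omitted, and this is neither the CA-GAN method nor any library's implementation.

```python
import random


def smote_like_sample(x, neighbor, rng):
    """Return one synthetic point on the segment between x and neighbor.

    Assumption: `neighbor` was already chosen among the nearest minority-class
    neighbours of `x`; that selection step is not shown here.
    """
    lam = rng.random()  # interpolation weight in [0, 1)
    return [a + lam * (b - a) for a, b in zip(x, neighbor)]


rng = random.Random(0)  # fixed seed so the sketch is repeatable
synthetic = smote_like_sample([0.0, 10.0], [1.0, 20.0], rng)
print(synthetic)
```

Because every feature of the synthetic point uses the same weight, the new sample is a convex combination of two existing ones, which is why such samples stay correlated with their parents and never leave the span of the observed minority data.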
-
Prediction of Blood Lactate Values in Critically Ill Patients: A Retrospective Multi-center Cohort Study
Authors:
Behrooz Mamandipoor,
Wesley Yeung,
Louis Agha-Mir-Salim,
David J. Stone,
Venet Osmani,
Leo Anthony Celi
Abstract:
Purpose. Elevations in initially obtained serum lactate levels are strong predictors of mortality in critically ill patients. Identifying patients whose serum lactate levels are more likely to increase can alert physicians to intensify care and guide them in the frequency of tending the blood test. We investigate whether machine learning models can predict subsequent serum lactate changes.
Methods. We investigated serum lactate change prediction using the MIMIC-III and eICU-CRD datasets in internal as well as external validation of the eICU cohort on the MIMIC-III cohort. Three subgroups were defined based on the initial lactate levels: i) normal group (<2 mmol/L), ii) mild group (2-4 mmol/L), and iii) severe group (>4 mmol/L). Outcomes were defined based on increase or decrease of serum lactate levels between the groups. We also performed sensitivity analysis by defining the outcome as lactate change of >10% and furthermore investigated the influence of the time interval between subsequent lactate measurements on predictive performance.
Results. The LSTM models were able to predict deterioration of serum lactate values of MIMIC-III patients with an AUC of 0.77 (95% CI 0.762-0.771) for the normal group, 0.77 (95% CI 0.768-0.772) for the mild group, and 0.85 (95% CI 0.840-0.851) for the severe group, with a slightly lower performance in the external validation.
Conclusion. The LSTM demonstrated good discrimination of patients who had deterioration in serum lactate levels. Clinical studies are needed to evaluate whether utilization of a clinical decision support tool based on these results could positively impact decision-making and patient outcomes.
Submitted 7 July, 2021;
originally announced July 2021.
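The subgroup and outcome definitions in the Methods paragraph above can be sketched in a few lines. The thresholds (<2, 2-4, >4 mmol/L) come from the abstract; the function names are hypothetical, and treating both boundary values (exactly 2 and exactly 4 mmol/L) as "mild" is an assumption, since the abstract does not state how the study handled them.

```python
def lactate_group(mmol_per_l: float) -> str:
    """Map a serum lactate value (mmol/L) to the subgroup named in the abstract."""
    if mmol_per_l < 2.0:
        return "normal"
    if mmol_per_l <= 4.0:  # assumption: both boundaries count as "mild"
        return "mild"
    return "severe"


# Ordering of the groups, used to decide the direction of a change.
_ORDER = {"normal": 0, "mild": 1, "severe": 2}


def group_change(initial: float, subsequent: float) -> str:
    """Label the change between two measurements by group transition."""
    before = _ORDER[lactate_group(initial)]
    after = _ORDER[lactate_group(subsequent)]
    if after > before:
        return "increase"
    if after < before:
        return "decrease"
    return "unchanged"


print(group_change(1.8, 3.0))
```

Defining the outcome by group transition means a rise from 2.5 to 3.5 mmol/L counts as unchanged, which is presumably why the abstract also reports a sensitivity analysis based on a relative change of more than 10%.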
-
Deep ROC Analysis and AUC as Balanced Average Accuracy to Improve Model Selection, Understanding and Interpretation
Authors:
André M. Carrington,
Douglas G. Manuel,
Paul W. Fieguth,
Tim Ramsay,
Venet Osmani,
Bernhard Wernly,
Carol Bennett,
Steven Hawken,
Matthew McInnes,
Olivia Magwood,
Yusuf Sheikh,
Andreas Holzinger
Abstract:
Optimal performance is critical for decision-making tasks from medicine to autonomous driving, however common performance measures may be too general or too specific. For binary classifiers, diagnostic tests or prognosis at a timepoint, measures such as the area under the receiver operating characteristic curve, or the area under the precision recall curve, are too general because they include unrealistic decision thresholds. On the other hand, measures such as accuracy, sensitivity or the F1 score are measures at a single threshold that reflect an individual single probability or predicted risk, rather than a range of individuals or risk. We propose a method in between, deep ROC analysis, that examines groups of probabilities or predicted risks for more insightful analysis. We translate esoteric measures into familiar terms: AUC and the normalized concordant partial AUC are balanced average accuracy (a new finding); the normalized partial AUC is average sensitivity; and the normalized horizontal partial AUC is average specificity. Along with post-test measures, we provide a method that can improve model selection in some cases and provide interpretation and assurance for patients in each risk group. We demonstrate deep ROC analysis in two case studies and provide a toolkit in Python.
Submitted 21 March, 2021;
originally announced March 2021.
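As background for the abstract above, the full AUC can be read as a concordance probability: the chance that a randomly chosen positive case receives a higher score than a randomly chosen negative case, with ties counted as one half. The sketch below computes it by pair counting. It illustrates only this standard interpretation of the whole-curve AUC; it is not the authors' deep ROC toolkit, and the function name `auc_concordance` is hypothetical.

```python
from itertools import product


def auc_concordance(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    concordant = 0.0
    for p, n in product(pos, neg):
        if p > n:
            concordant += 1.0
        elif p == n:
            concordant += 0.5  # ties contribute half a pair
    return concordant / (len(pos) * len(neg))


print(auc_concordance([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))  # → 0.75
```

In this example three of the four positive-negative pairs are ordered correctly, giving 0.75. The partial measures described in the abstract restrict attention to a range of thresholds or risks rather than all pairs.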
-
Blood lactate concentration prediction in critical care patients: handling missing values
Authors:
Behrooz Mamandipoor,
Mahshid Majd,
Monica Moz,
Venet Osmani
Abstract:
Blood lactate concentration is a strong indicator of mortality risk in critically ill patients. While frequent lactate measurements are necessary to assess patient's health state, the measurement is an invasive procedure that can increase risk of hospital-acquired infections. For this reason we formally define the problem of lactate prediction as a clinically relevant benchmark problem for machine learning community so as to assist clinical decision making in blood lactate testing. Accordingly, we demonstrate the relevant challenges of the problem and its data in addition to the adopted solutions. Also, we evaluate the performance of different prediction algorithms on a large dataset of ICU patients from the multi-centre eICU database. More specifically, we focus on investigating the impact of missing value imputation methods in lactate prediction for each algorithm. The experimental analysis shows promising prediction results that encourages further investigation of this problem.
Submitted 3 October, 2019;
originally announced October 2019.
-
Benchmarking machine learning models on multi-centre eICU critical care dataset
Authors:
Seyedmostafa Sheikhalishahi,
Vevake Balaraman,
Venet Osmani
Abstract:
Progress of machine learning in critical care has been difficult to track, in part due to absence of public benchmarks. Other fields of research (such as computer vision and natural language processing) have established various competitions and public benchmarks. Recent availability of large clinical datasets has enabled the possibility of establishing public benchmarks. Taking advantage of this opportunity, we propose a public benchmark suite to address four areas of critical care, namely mortality prediction, estimation of length of stay, patient phenotyping and risk of decompensation. We define each task and compare the performance of both clinical models as well as baseline and deep learning models using eICU critical care dataset of around 73,000 patients. This is the first public benchmark on a multi-centre critical care dataset, comparing the performance of clinical gold standard with our predictive model. We also investigate the impact of numerical variables as well as handling of categorical variables on each of the defined tasks. The source code, detailing our methods and experiments is publicly available such that anyone can replicate our results and build upon our work.
Submitted 5 August, 2021; v1 submitted 2 October, 2019;
originally announced October 2019.
-
Natural Language Processing of Clinical Notes on Chronic Diseases: Systematic Review
Authors:
Seyedmostafa Sheikhalishahi,
Riccardo Miotto,
Joel T Dudley,
Alberto Lavelli,
Fabio Rinaldi,
Venet Osmani
Abstract:
Of the 2652 articles considered, 106 met the inclusion criteria. Review of the included papers resulted in identification of 43 chronic diseases, which were then further classified into 10 disease categories using ICD-10. The majority of studies focused on diseases of the circulatory system (n=38) while endocrine and metabolic diseases were fewest (n=14). This was due to the structure of clinical records related to metabolic diseases, which typically contain much more structured data, compared with medical records for diseases of the circulatory system, which focus more on unstructured data and consequently have seen a stronger focus of NLP. The review has shown that there is a significant increase in the use of machine learning methods compared to rule-based approaches; however, deep learning methods remain emergent (n=3). Consequently, the majority of works focus on classification of disease phenotype with only a handful of papers addressing extraction of comorbidities from the free text or integration of clinical notes with structured data. There is a notable use of relatively simple methods, such as shallow classifiers (or combination with rule-based methods), due to the interpretability of predictions, which still represents a significant issue for more complex methods. Finally, scarcity of publicly available data may also have contributed to insufficient development of more advanced methods, such as extraction of word embeddings from clinical notes. Further efforts are still required to improve (1) progression of clinical NLP methods from extraction toward understanding; (2) recognition of relations among entities rather than entities in isolation; (3) temporal extraction to understand past, current, and future clinical events; (4) exploitation of alternative sources of clinical knowledge; and (5) availability of large-scale, de-identified clinical corpora.
Submitted 15 August, 2019;
originally announced August 2019.
-
Processing of Electronic Health Records using Deep Learning: A review
Authors:
Venet Osmani,
Li Li,
Matteo Danieletto,
Benjamin Glicksberg,
Joel Dudley,
Oscar Mayora
Abstract:
Availability of large amount of clinical data is opening up new research avenues in a number of fields. An exciting field in this respect is healthcare, where secondary use of healthcare data is beginning to revolutionize healthcare. Except for availability of Big Data, both medical data from healthcare institutions (such as EMR data) and data generated from health and wellbeing devices (such as personal trackers), a significant contribution to this trend is also being made by recent advances on machine learning, specifically deep learning algorithms.
Submitted 5 April, 2018;
originally announced April 2018.
-
Smartphone apps usage patterns as a predictor of perceived stress levels at workplace
Authors:
Raihana Ferdous,
Venet Osmani,
Oscar Mayora
Abstract:
Explosion of number of smartphone apps and their diversity has created a fertile ground to study behaviour of smartphone users. Patterns of app usage, specifically types of apps and their duration are influenced by the state of the user and this information can be correlated with the self-reported state of the users. The work in this paper is along the line of understanding patterns of app usage and investigating relationship of these patterns with the perceived stress level within the workplace context. Our results show that using a subject-centric behaviour model we can predict stress levels based on smartphone app usage. The results we have achieved, of average accuracy of 75% and precision of 85.7%, can be used as an indicator of overall stress levels in work environments and in turn inform stress reduction organisational policies, especially when considering interrelation between stress and productivity of workers.
Submitted 10 March, 2018;
originally announced March 2018.
-
Enabling Prescription-based Health Apps
Authors:
Venet Osmani,
Stefano Forti,
Oscar Mayora,
Diego Conforti
Abstract:
We describe an innovative framework for prescription of personalised health apps by integrating Personal Health Records (PHR) with disease-specific mobile applications for managing medical conditions and the communication with clinical professionals. The prescribed apps record multiple variables including medical history enriched with innovative features such as integration with medical monitoring devices and wellbeing trackers to provide patients and clinicians with a personalised support on disease management. Our framework is based on an existing PHR ecosystem called TreC, uniquely positioned between healthcare provider and the patients, which is being used by over 70,000 patients in Trentino region in Northern Italy. We also describe three important aspects of health app prescription and how medical information is automatically encoded through the TreC framework and is prescribed as a personalised app, ready to be installed in the patients' smartphone.
Submitted 29 June, 2017;
originally announced June 2017.
-
Automatic Stress Detection in Working Environments from Smartphones' Accelerometer Data: A First Step
Authors:
Enrique Garcia-Ceja,
Venet Osmani,
Oscar Mayora
Abstract:
Increase in workload across many organisations and consequent increase in occupational stress is negatively affecting the health of the workforce. Measuring stress and other human psychological dynamics is difficult due to subjective nature of self-reporting and variability between and within individuals. With the advent of smartphones it is now possible to monitor diverse aspects of human behaviour, including objectively measured behaviour related to psychological state and consequently stress. We have used data from the smartphone's built-in accelerometer to detect behaviour that correlates with subjects' stress levels. Accelerometer sensor was chosen because it raises fewer privacy concerns (in comparison to location, video or audio recording, for example) and because its low power consumption makes it suitable to be embedded in smaller wearable devices, such as fitness trackers. 30 subjects from two different organizations were provided with smartphones. The study lasted for 8 weeks and was conducted in real working environments, with no constraints whatsoever placed upon smartphone usage. The subjects reported their perceived stress levels three times during their working hours. Using combination of statistical models to classify self-reported stress levels, we achieved a maximum overall accuracy of 71% for user-specific models and an accuracy of 60% for the use of similar-users models, relying solely on data from a single accelerometer.
Submitted 14 October, 2015;
originally announced October 2015.
-
Smartphones in Mental Health: Detecting Depressive and Manic Episodes
Authors:
Venet Osmani,
Agnes Gruenerbl,
Gernot Bahle,
Christian Haring,
Paul Lukowicz,
Oscar Mayora
Abstract:
An observational study with patients diagnosed with bipolar disorder investigates whether data from smartphone sensors can be used to recognize bipolar disorder episodes and detect behavior changes that can signal an onset of an episode using objective data.
Submitted 19 October, 2015; v1 submitted 6 October, 2015;
originally announced October 2015.
-
Investigation of indoor localization with ambient FM radio stations
Authors:
Andrei Popleteev,
Venet Osmani,
Oscar Mayora
Abstract:
Localization plays an essential role in many ubiquitous computing applications. While the outdoor location-aware services based on GPS are becoming increasingly popular, their proliferation to indoor environments is limited due to the lack of widely available indoor localization systems. The de-facto standard for indoor positioning is based on Wi-Fi and while other localization alternatives exist, they either require expensive hardware or provide a low accuracy. This paper presents an investigation into localization system that leverages signals of broadcasting FM radio stations. The FM stations provide a worldwide coverage, while FM tuners are readily available in many mobile devices. The experimental results show that FM radio can be used for indoor localization, while providing longer battery life than Wi-Fi, making FM an alternative to consider for positioning.
Submitted 6 August, 2013;
originally announced August 2013.