-
Variational Shapley Network: A Probabilistic Approach to Self-Explaining Shapley values with Uncertainty Quantification
Authors:
Mert Ketenci,
Iñigo Urteaga,
Victor Alfonso Rodriguez,
Noémie Elhadad,
Adler Perotte
Abstract:
Shapley values have emerged as a foundational tool in machine learning (ML) for elucidating model decision-making processes. Despite their widespread adoption and unique ability to satisfy essential explainability axioms, computational challenges persist in their estimation when ($i$) evaluating a model over all possible subsets of input feature combinations, ($ii$) estimating model marginals, and ($iii$) addressing variability in explanations. We introduce a novel, self-explaining method that simplifies the computation of Shapley values significantly, requiring only a single forward pass. Recognizing the deterministic treatment of Shapley values as a limitation, we explore incorporating a probabilistic framework to capture the inherent uncertainty in explanations. Unlike alternatives, our technique does not rely directly on the observed data space to estimate marginals; instead, it uses adaptable baseline values derived from a latent, feature-specific embedding space, generated by a novel masked neural network architecture. Evaluations on simulated and real datasets underscore our technique's robust predictive and explanatory performance.
Submitted 6 February, 2024;
originally announced February 2024.
-
Maximum Likelihood Estimation of Flexible Survival Densities with Importance Sampling
Authors:
Mert Ketenci,
Shreyas Bhave,
Noémie Elhadad,
Adler Perotte
Abstract:
Survival analysis is a widely used technique for analyzing time-to-event data in the presence of censoring. In recent years, numerous survival analysis methods have emerged which scale to large datasets and relax traditional assumptions such as proportional hazards. These models, while being performant, are very sensitive to model hyperparameters including: (1) number of bins and bin size for discrete models and (2) number of cluster assignments for mixture-based models. Each of these choices requires extensive tuning by practitioners to achieve optimal performance. In addition, we demonstrate in empirical studies that: (1) optimal bin size may drastically differ based on the metric of interest (e.g., concordance vs. Brier score), and (2) mixture models may suffer from mode collapse and numerical instability. We propose a survival analysis approach which eliminates the need to tune hyperparameters such as mixture assignments and bin sizes, reducing the burden on practitioners. We show that the proposed approach matches or outperforms baselines on several real-world datasets.
Submitted 2 November, 2023;
originally announced November 2023.
-
A Coreset-based, Tempered Variational Posterior for Accurate and Scalable Stochastic Gaussian Process Inference
Authors:
Mert Ketenci,
Adler Perotte,
Noémie Elhadad,
Iñigo Urteaga
Abstract:
We present a novel stochastic variational Gaussian process ($\mathcal{GP}$) inference method, based on a posterior over a learnable set of weighted pseudo input-output points (coresets). Instead of a free-form variational family, the proposed coreset-based, variational tempered family for $\mathcal{GP}$s (CVTGP) is defined in terms of the $\mathcal{GP}$ prior and the data-likelihood; hence, accommodating the modeling inductive biases. We derive CVTGP's lower bound for the log-marginal likelihood via marginalization of the proposed posterior over latent $\mathcal{GP}$ coreset variables, and show it is amenable to stochastic optimization. CVTGP reduces the learnable parameter size to $\mathcal{O}(M)$, enjoys numerical stability, and maintains $\mathcal{O}(M^3)$ time- and $\mathcal{O}(M^2)$ space-complexity, by leveraging a coreset-based tempered posterior that, in turn, provides sparse and explainable representations of the data. Results on simulated and real-world regression problems with Gaussian observation noise validate that CVTGP provides better evidence lower-bound estimates and predictive root mean squared error than alternative stochastic $\mathcal{GP}$ inference methods.
Submitted 2 November, 2023;
originally announced November 2023.
-
A generative, predictive model for menstrual cycle lengths that accounts for potential self-tracking artifacts in mobile health data
Authors:
Kathy Li,
Iñigo Urteaga,
Amanda Shea,
Virginia J. Vitzthum,
Chris H. Wiggins,
Noémie Elhadad
Abstract:
Mobile health (mHealth) apps such as menstrual trackers provide a rich source of self-tracked health observations that can be leveraged for health-relevant research. However, such data streams have questionable reliability since they hinge on user adherence to the app. Therefore, it is crucial for researchers to separate true behavior from self-tracking artifacts. By taking a machine learning approach to modeling self-tracked cycle lengths, we can both make more informed predictions and learn the underlying structure of the observed data. In this work, we propose and evaluate a hierarchical, generative model for predicting next cycle length based on previously tracked cycle lengths that accounts explicitly for the possibility of users skipping tracking their period. Our model offers several advantages: 1) accounting explicitly for self-tracking artifacts yields better prediction accuracy as likelihood of skipping increases; 2) because it is a generative model, predictions can be updated online as a given cycle evolves, and we can gain interpretable insight into how these predictions change over time; and 3) its hierarchical nature enables modeling of an individual's cycle length history while incorporating population-level information. Our experiments using mHealth cycle length data encompassing over 186,000 menstruators with over 2 million natural menstrual cycles show that our method yields state-of-the-art performance against neural network-based and summary statistic-based baselines, while providing insights on disentangling menstrual patterns from self-tracking artifacts. This work can benefit users, mHealth app developers, and researchers in better understanding cycle patterns and user adherence.
Submitted 16 March, 2021; v1 submitted 24 February, 2021;
originally announced February 2021.
-
Exploring Gender Disparities in Time to Diagnosis
Authors:
Tony Y. Sun,
Oliver J. Bear Don't Walk IV,
Jennifer L. Chen,
Harry Reyes Nieva,
Noémie Elhadad
Abstract:
Sex and gender-based healthcare disparities contribute to differences in health outcomes. We focus on time to diagnosis (TTD) by conducting two large-scale, complementary analyses among men and women across 29 phenotypes and 195K patients. We first find that women are consistently more likely to experience a longer TTD than men, even when presenting with the same conditions. We further explore how TTD disparities affect diagnostic performance between genders, both across and persistent to time, by evaluating gender-agnostic disease classifiers across increasing diagnostic information. In both fairness analyses, the diagnostic process favors men over women, contradicting the previous observation that women may demonstrate relevant symptoms earlier than men. These analyses suggest that TTD is an important yet complex aspect when studying gender disparities, and warrants further investigation.
Submitted 14 November, 2020; v1 submitted 11 November, 2020;
originally announced November 2020.
-
Towards Patient Record Summarization Through Joint Phenotype Learning in HIV Patients
Authors:
Gal Levy-Fix,
Jason Zucker,
Konstantin Stojanovic,
Noémie Elhadad
Abstract:
Identifying a patient's key problems over time is a common task for providers at the point of care, yet a complex and time-consuming activity given current electronic health records. To enable a problem-oriented summarizer to identify a patient's comprehensive list of problems and their salience, we propose an unsupervised phenotyping approach that jointly learns a large number of phenotypes/problems across structured and unstructured data. To identify the appropriate granularity of the learned phenotypes, the model is trained on a target patient population of the same clinic. To enable the content organization of a problem-oriented summarizer, the model identifies phenotype relatedness as well. The model leverages a correlated-mixed membership approach with variational inference applied to heterogeneous clinical data. In this paper, we focus our experiments on assessing the learned phenotypes and their relatedness as learned from a specific patient population. We ground our experiments in phenotyping patients from an HIV clinic in a large urban care institution (n=7,523), where patients have voluminous, longitudinal documentation, and where providers would benefit from summaries of these patients' medical histories, whether about their HIV or any comorbidities. We find that the learned phenotypes and their relatedness are clinically valid when assessed qualitatively by clinical experts, and that the model surpasses the baseline in inferring phenotype relatedness when compared to existing expert-curated condition groupings.
Submitted 9 March, 2020;
originally announced March 2020.
-
Indicators of retention in remote digital health studies: A cross-study evaluation of 100,000 participants
Authors:
Abhishek Pratap,
Elias Chaibub Neto,
Phil Snyder,
Carl Stepnowsky,
Noémie Elhadad,
Daniel Grant,
Matthew H. Mohebbi,
Sean Mooney,
Christine Suver,
John Wilbanks,
Lara Mangravite,
Patrick Heagerty,
Pat Arean,
Larsson Omberg
Abstract:
Digital technologies such as smartphones are transforming the way scientists conduct biomedical research using real-world data. Several remotely conducted studies have recruited thousands of participants over a span of a few months. Unfortunately, these studies are hampered by substantial participant attrition, calling into question the representativeness of the collected data, including the generalizability of findings from these studies. We report the challenges in retention and recruitment in eight remote digital health studies comprising over 100,000 participants who participated for more than 850,000 days, completing close to 3.5 million remote health evaluations. Survival modeling surfaced several factors significantly associated (P < 1e-16) with an increase in median retention time: i) clinician referral (increase of 40 days), ii) compensation (22 days), iii) clinical conditions of interest to the study (7 days), and iv) older adults (4 days). Additionally, four distinct patterns of daily app usage behavior that were also associated (P < 1e-10) with participant demographics were identified. Most studies were not able to recruit a representative sample, either demographically or regionally. Taken together, these findings can help inform recruitment and retention strategies to enable equitable participation of populations in future digital health research.
Submitted 2 October, 2019;
originally announced October 2019.
-
Characterizing physiological and symptomatic variation in menstrual cycles using self-tracked mobile health data
Authors:
Kathy Li,
Iñigo Urteaga,
Chris H. Wiggins,
Anna Druet,
Amanda Shea,
Virginia J. Vitzthum,
Noémie Elhadad
Abstract:
The menstrual cycle is a key indicator of overall health for women of reproductive age. Previously, menstruation was primarily studied through survey results; however, as menstrual tracking mobile apps become more widely adopted, they provide an increasingly large, content-rich source of menstrual health experiences and behaviors over time. By exploring a database of user-tracked observations from the Clue app by BioWink of over 378,000 users and 4.9 million natural cycles, we show that self-reported menstrual tracker data can reveal statistically significant relationships between per-person cycle length variability and self-reported qualitative symptoms. A concern for self-tracked data is that they reflect not only physiological behaviors, but also the engagement dynamics of app users. To mitigate such potential artifacts, we develop a procedure to exclude cycles lacking user engagement, thereby allowing us to better distinguish true menstrual patterns from tracking anomalies. We uncover that women located at different ends of the menstrual variability spectrum, based on the consistency of their cycle length statistics, exhibit statistically significant differences in their cycle characteristics and symptom tracking patterns. We also find that cycle and period length statistics are stationary over the app usage timeline across the variability spectrum. The symptoms that we identify as showing statistically significant association with timing data can be useful to clinicians and users for predicting cycle variability from symptoms or as potential health indicators for conditions like endometriosis. Our findings showcase the potential of longitudinal, high-resolution self-tracked data to improve understanding of menstruation and women's health as a whole.
Submitted 14 May, 2020; v1 submitted 24 September, 2019;
originally announced September 2019.
-
Multi-Task Gaussian Processes and Dilated Convolutional Networks for Reconstruction of Reproductive Hormonal Dynamics
Authors:
Iñigo Urteaga,
Tristan Bertin,
Theresa M. Hardy,
David J. Albers,
Noémie Elhadad
Abstract:
We present an end-to-end statistical framework for personalized, accurate, and minimally invasive modeling of female reproductive hormonal patterns. Reconstructing and forecasting the evolution of hormonal dynamics is a challenging task, but a critical one to improve general understanding of the menstrual cycle and personalized detection of potential health issues. Our goal is to infer and forecast individual hormone daily levels over time, while accommodating pragmatic and minimally invasive measurement settings. To that end, our approach combines the power of probabilistic generative models (i.e., multi-task Gaussian processes) with the flexibility of neural networks (i.e., a dilated convolutional architecture) to learn complex temporal mappings. To attain accurate hormone level reconstruction with as little data as possible, we propose a sampling mechanism for optimal reconstruction accuracy with limited sampling budget. Our results show the validity of our proposed hormonal dynamic modeling framework, as it provides accurate predictive performance across different realistic sampling budgets and outperforms baseline methods.
Submitted 27 August, 2019;
originally announced August 2019.
-
Machine Learning and Visualization in Clinical Decision Support: Current State and Future Directions
Authors:
Gal Levy-Fix,
Gilad J. Kuperman,
Noémie Elhadad
Abstract:
Deep learning, an area of machine learning, is set to revolutionize patient care. But it is not yet part of standard of care, especially when it comes to individual patient care. In fact, it is unclear to what extent data-driven techniques are being used to support clinical decision making (CDS). Heretofore, there has not been a review of ways in which research in machine learning and other types of data-driven techniques can contribute effectively to clinical care and the types of support they can bring to clinicians. In this paper, we consider ways in which two data-driven domains, machine learning and data visualization, can contribute to the next generation of clinical decision support systems. We review the literature regarding the ways heuristic knowledge, machine learning, and visualization are, and can be, applied to three types of CDS. There has been substantial research into the use of predictive modeling for alerts; however, current CDS systems are not utilizing these methods. Approaches that leverage interactive visualizations and machine-learning inferences to organize and review patient data are gaining popularity but are still at the prototype stage and are not yet in use. CDS systems that could benefit from prescriptive machine learning (e.g., treatment recommendations for specific patients) have not yet been developed. We discuss potential reasons for the lack of deployment of data-driven methods in CDS and directions for future research.
Submitted 6 June, 2019;
originally announced June 2019.
-
Phenotyping Endometriosis through Mixed Membership Models of Self-Tracking Data
Authors:
Iñigo Urteaga,
Mollie McKillop,
Sharon Lipsky-Gorman,
Noémie Elhadad
Abstract:
We investigate the use of self-tracking data and unsupervised mixed-membership models to phenotype endometriosis. Endometriosis is a systemic, chronic condition of women in reproductive age and, at the same time, a highly enigmatic condition with no known biomarkers to monitor its progression and no established staging. We leverage data collected through a self-tracking app in an observational research study of over 2,800 women with endometriosis tracking their condition over a year and a half (456,900 observations overall). We extend a classical mixed-membership model to accommodate the idiosyncrasies of the data at hand (i.e., the multimodality of the tracked variables). Our experiments show that our approach identifies potential subtypes that are robust in terms of biases of self-tracked data (e.g., wide variations in tracking frequency amongst participants), as well as to variations in hyperparameters of the model. Jointly modeling a wide range of observations about participants (symptoms, quality of life, treatments) yields clinically meaningful subtypes that both validate what is already known about endometriosis and suggest new findings.
Submitted 6 November, 2018;
originally announced November 2018.
-
Generative Adversarial Networks for Electronic Health Records: A Framework for Exploring and Evaluating Methods for Predicting Drug-Induced Laboratory Test Trajectories
Authors:
Alexandre Yahi,
Rami Vanguri,
Noémie Elhadad,
Nicholas P. Tatonetti
Abstract:
Generative Adversarial Networks (GANs) represent a promising class of generative networks that combine neural networks with game theory. From generating realistic images and videos to assisting musical creation, GANs are transforming many fields of arts and sciences. However, their application to healthcare has not been fully realized, more specifically in generating electronic health records (EHR) data. In this paper, we propose a framework for exploring the value of GANs in the context of continuous laboratory time series data. We devise an unsupervised evaluation method that measures the predictive power of synthetic laboratory test time series. Further, we show that when it comes to predicting the impact of drug exposure on laboratory test data, incorporating representation learning of the training cohorts prior to training GAN models is beneficial.
Submitted 30 November, 2017;
originally announced December 2017.
-
Towards Personalized Modeling of the Female Hormonal Cycle: Experiments with Mechanistic Models and Gaussian Processes
Authors:
Iñigo Urteaga,
David J. Albers,
Marija Vlajic Wheeler,
Anna Druet,
Hans Raffauf,
Noémie Elhadad
Abstract:
In this paper, we introduce a novel task for machine learning in healthcare, namely personalized modeling of the female hormonal cycle. The motivation for this work is to model the hormonal cycle and predict its phases in time, both for healthy individuals and for those with disorders of the reproductive system. Because there are individual differences in the menstrual cycle, we are particularly interested in personalized models that can account for individual idiosyncrasies, towards identifying phenotypes of menstrual cycles. As a first step, we consider the hormonal cycle as a set of observations through time. We use a previously validated mechanistic model to generate realistic hormonal patterns, and experiment with Gaussian process regression to estimate their values over time. Specifically, we are interested in the feasibility of predicting menstrual cycle phases under varying learning conditions: number of cycles used for training, hormonal measurement noise and sampling rates, and informed vs. agnostic sampling of hormonal measurements. Our results indicate that Gaussian processes can help model the female menstrual cycle. We discuss the implications of our experiments in the context of modeling the female menstrual cycle.
Submitted 30 November, 2017;
originally announced December 2017.
-
Deep Survival Analysis
Authors:
Rajesh Ranganath,
Adler Perotte,
Noémie Elhadad,
David Blei
Abstract:
The electronic health record (EHR) provides an unprecedented opportunity to build actionable tools to support physicians at the point of care. In this paper, we investigate survival analysis in the context of EHR data. We introduce deep survival analysis, a hierarchical generative approach to survival analysis. It departs from previous approaches in two primary ways: (1) all observations, including covariates, are modeled jointly conditioned on a rich latent structure; and (2) the observations are aligned by their failure time, rather than by an arbitrary time zero as in traditional survival analysis. Further, it (3) scalably handles heterogeneous (continuous and discrete) data types that occur in the EHR. We validate the deep survival analysis model by stratifying patients according to risk of developing coronary heart disease (CHD). Specifically, we study a dataset of 313,000 patients corresponding to 5.5 million months of observations. When compared to the clinically validated Framingham CHD risk score, deep survival analysis is significantly superior in stratifying patients according to their risk.
Submitted 18 September, 2016; v1 submitted 6 August, 2016;
originally announced August 2016.