Variational Shapley Network: A Probabilistic Approach to Self-Explaining Shapley values with Uncertainty Quantification

Authors: Mert Ketenci, Iñigo Urteaga, Victor Alfonso Rodriguez, Noémie Elhadad, Adler Perotte

Abstract: Shapley values have emerged as a foundational tool in machine learning (ML) for elucidating model decision-making processes. Despite their widespread adoption and unique ability to satisfy essential explainability axioms, computational challenges persist in their estimation when ($i$) evaluating a model over all possible subsets of input feature combinations, ($ii$) estimating model marginals, and ($iii$) addressing variability in explanations. We introduce a novel, self-explaining method that simplifies the computation of Shapley values significantly, requiring only a single forward pass. Recognizing the deterministic treatment of Shapley values as a limitation, we explore incorporating a probabilistic framework to capture the inherent uncertainty in explanations. Unlike alternatives, our technique does not rely directly on the observed data space to estimate marginals; instead, it uses adaptable baseline values derived from a latent, feature-specific embedding space, generated by a novel masked neural network architecture. Evaluations on simulated and real datasets underscore our technique's robust predictive and explanatory performance.

Submitted 6 February, 2024; originally announced February 2024.

arXiv:2311.01660 [pdf, other]

Maximum Likelihood Estimation of Flexible Survival Densities with Importance Sampling

Authors: Mert Ketenci, Shreyas Bhave, Noémie Elhadad, Adler Perotte

Abstract: Survival analysis is a widely used technique for analyzing time-to-event data in the presence of censoring. In recent years, numerous survival analysis methods have emerged which scale to large datasets and relax traditional assumptions such as proportional hazards. These models, while being performant, are very sensitive to model hyperparameters including: (1) number of bins and bin size for discrete models and (2) number of cluster assignments for mixture-based models. Each of these choices requires extensive tuning by practitioners to achieve optimal performance. In addition, we demonstrate in empirical studies that: (1) optimal bin size may drastically differ based on the metric of interest (e.g., concordance vs. Brier score), and (2) mixture models may suffer from mode collapse and numerical instability. We propose a survival analysis approach which eliminates the need to tune hyperparameters such as mixture assignments and bin sizes, reducing the burden on practitioners. We show that the proposed approach matches or outperforms baselines on several real-world datasets.

Submitted 2 November, 2023; originally announced November 2023.

arXiv:2311.01409 [pdf, other]

A Coreset-based, Tempered Variational Posterior for Accurate and Scalable Stochastic Gaussian Process Inference

Authors: Mert Ketenci, Adler Perotte, Noémie Elhadad, Iñigo Urteaga

Abstract: We present a novel stochastic variational Gaussian process ($\mathcal{GP}$) inference method, based on a posterior over a learnable set of weighted pseudo input-output points (coresets). Instead of a free-form variational family, the proposed coreset-based, variational tempered family for $\mathcal{GP}$s (CVTGP) is defined in terms of the $\mathcal{GP}$ prior and the data-likelihood; hence, accommodating the modeling inductive biases. We derive CVTGP's lower bound for the log-marginal likelihood via marginalization of the proposed posterior over latent $\mathcal{GP}$ coreset variables, and show it is amenable to stochastic optimization. CVTGP reduces the learnable parameter size to $\mathcal{O}(M)$, enjoys numerical stability, and maintains $\mathcal{O}(M^3)$ time- and $\mathcal{O}(M^2)$ space-complexity, by leveraging a coreset-based tempered posterior that, in turn, provides sparse and explainable representations of the data. Results on simulated and real-world regression problems with Gaussian observation noise validate that CVTGP provides better evidence lower-bound estimates and predictive root mean squared error than alternative stochastic $\mathcal{GP}$ inference methods.

Submitted 2 November, 2023; originally announced November 2023.

arXiv:2102.12439 [pdf, other]

A generative, predictive model for menstrual cycle lengths that accounts for potential self-tracking artifacts in mobile health data

Authors: Kathy Li, Iñigo Urteaga, Amanda Shea, Virginia J. Vitzthum, Chris H. Wiggins, Noémie Elhadad

Abstract: Mobile health (mHealth) apps such as menstrual trackers provide a rich source of self-tracked health observations that can be leveraged for health-relevant research. However, such data streams have questionable reliability since they hinge on user adherence to the app. Therefore, it is crucial for researchers to separate true behavior from self-tracking artifacts. By taking a machine learning approach to modeling self-tracked cycle lengths, we can both make more informed predictions and learn the underlying structure of the observed data. In this work, we propose and evaluate a hierarchical, generative model for predicting next cycle length based on previously tracked cycle lengths that accounts explicitly for the possibility of users skipping tracking their period. Our model offers several advantages: 1) accounting explicitly for self-tracking artifacts yields better prediction accuracy as likelihood of skipping increases; 2) because it is a generative model, predictions can be updated online as a given cycle evolves, and we can gain interpretable insight into how these predictions change over time; and 3) its hierarchical nature enables modeling of an individual's cycle length history while incorporating population-level information.
Our experiments using mHealth cycle length data encompassing over 186,000 menstruators with over 2 million natural menstrual cycles show that our method yields state-of-the-art performance against neural network-based and summary statistic-based baselines, while providing insights on disentangling menstrual patterns from self-tracking artifacts. This work can benefit users, mHealth app developers, and researchers in better understanding cycle patterns and user adherence.

Submitted 16 March, 2021; v1 submitted 24 February, 2021; originally announced February 2021.

Comments: Extended version of the work presented at the NeurIPS 2020 Machine Learning for Mobile Health Workshop (see https://sites.google.com/view/ml4mobilehealth-neurips-2020/home)

arXiv:2011.06100 [pdf, other]

Exploring Gender Disparities in Time to Diagnosis

Authors: Tony Y. Sun, Oliver J. Bear Don't Walk IV, Jennifer L. Chen, Harry Reyes Nieva, Noémie Elhadad

Abstract: Sex and gender-based healthcare disparities contribute to differences in health outcomes. We focus on time to diagnosis (TTD) by conducting two large-scale, complementary analyses among men and women across 29 phenotypes and 195K patients. We first find that women are consistently more likely to experience a longer TTD than men, even when presenting with the same conditions. We further explore how TTD disparities affect diagnostic performance between genders, both across and persistent to time, by evaluating gender-agnostic disease classifiers across increasing diagnostic information. In both fairness analyses, the diagnostic process favors men over women, contradicting the previous observation that women may demonstrate relevant symptoms earlier than men. These analyses suggest that TTD is an important yet complex aspect when studying gender disparities, and warrants further investigation.

Submitted 14 November, 2020; v1 submitted 11 November, 2020; originally announced November 2020.

Comments: Machine Learning for Health (ML4H) at NeurIPS 2020 - Extended Abstract

arXiv:2003.11474 [pdf, other]

Towards Patient Record Summarization Through Joint Phenotype Learning in HIV Patients

Authors: Gal Levy-Fix, Jason Zucker, Konstantin Stojanovic, Noémie Elhadad

Abstract: Identifying a patient's key problems over time is a common task for providers at the point of care, yet a complex and time-consuming activity given current electronic health records. To enable a problem-oriented summarizer to identify a patient's comprehensive list of problems and their salience, we propose an unsupervised phenotyping approach that jointly learns a large number of phenotypes/problems across structured and unstructured data. To identify the appropriate granularity of the learned phenotypes, the model is trained on a target patient population of the same clinic. To enable the content organization of a problem-oriented summarizer, the model identifies phenotype relatedness as well. The model leverages a correlated-mixed membership approach with variational inference applied to heterogeneous clinical data. In this paper, we focus our experiments on assessing the learned phenotypes and their relatedness as learned from a specific patient population. We ground our experiments in phenotyping patients from an HIV clinic in a large urban care institution (n=7,523), where patients have voluminous, longitudinal documentation, and where providers would benefit from summaries of these patients' medical histories, whether about their HIV or any comorbidities. We find that the learned phenotypes and their relatedness are clinically valid when assessed qualitatively by clinical experts, and that the model surpasses the baseline in inferring phenotype-relatedness when compared to existing expert-curated condition groupings.

Submitted 9 March, 2020; originally announced March 2020.

arXiv:1910.01165 [pdf]

Indicators of retention in remote digital health studies: A cross-study evaluation of 100,000 participants

Authors: Abhishek Pratap, Elias Chaibub Neto, Phil Snyder, Carl Stepnowsky, Noémie Elhadad, Daniel Grant, Matthew H. Mohebbi, Sean Mooney, Christine Suver, John Wilbanks, Lara Mangravite, Patrick Heagerty, Pat Arean, Larsson Omberg

Abstract: Digital technologies such as smartphones are transforming the way scientists conduct biomedical research using real-world data. Several remotely conducted studies have recruited thousands of participants over a span of a few months. Unfortunately, these studies are hampered by substantial participant attrition, calling into question the representativeness of the collected data, including the generalizability of findings from these studies. We report the challenges in retention and recruitment in eight remote digital health studies comprising over 100,000 participants who participated for more than 850,000 days, completing close to 3.5 million remote health evaluations. Survival modeling surfaced several factors significantly associated (P < 1e-16) with an increase in median retention time: i) clinician referral (increase of 40 days), ii) effect of compensation (22 days), iii) clinical conditions of interest to the study (7 days), and iv) older adults (4 days). Additionally, four distinct patterns of daily app usage behavior that were also associated (P < 1e-10) with participant demographics were identified. Most studies were not able to recruit a representative sample, either demographically or regionally. Together, these findings can help inform recruitment and retention strategies to enable equitable participation of populations in future digital health research.

Submitted 2 October, 2019; originally announced October 2019.

arXiv:1909.11211 [pdf, other]

Characterizing physiological and symptomatic variation in menstrual cycles using self-tracked mobile health data

Authors: Kathy Li, Iñigo Urteaga, Chris H. Wiggins, Anna Druet, Amanda Shea, Virginia J. Vitzthum, Noémie Elhadad

Abstract: The menstrual cycle is a key indicator of overall health for women of reproductive age. Previously, menstruation was primarily studied through survey results; however, as menstrual tracking mobile apps become more widely adopted, they provide an increasingly large, content-rich source of menstrual health experiences and behaviors over time. By exploring a database of user-tracked observations from the Clue app by BioWink of over 378,000 users and 4.9 million natural cycles, we show that self-reported menstrual tracker data can reveal statistically significant relationships between per-person cycle length variability and self-reported qualitative symptoms. A concern for self-tracked data is that they reflect not only physiological behaviors, but also the engagement dynamics of app users. To mitigate such potential artifacts, we develop a procedure to exclude cycles lacking user engagement, thereby allowing us to better distinguish true menstrual patterns from tracking anomalies. We uncover that women located at different ends of the menstrual variability spectrum, based on the consistency of their cycle length statistics, exhibit statistically significant differences in their cycle characteristics and symptom tracking patterns. We also find that cycle and period length statistics are stationary over the app usage timeline across the variability spectrum.
The symptoms that we identify as showing statistically significant association with timing data can be useful to clinicians and users for predicting cycle variability from symptoms or as potential health indicators for conditions like endometriosis. Our findings showcase the potential of longitudinal, high-resolution self-tracked data to improve understanding of menstruation and women's health as a whole.

Submitted 14 May, 2020; v1 submitted 24 September, 2019; originally announced September 2019.

Comments: The Supplementary Information for this work, as well as the code required for data pre-processing and producing results is available in https://github.com/iurteaga/menstrual_cycle_analysis

arXiv:1908.10226 [pdf, other]

Multi-Task Gaussian Processes and Dilated Convolutional Networks for Reconstruction of Reproductive Hormonal Dynamics

Authors: Iñigo Urteaga, Tristan Bertin, Theresa M. Hardy, David J. Albers, Noémie Elhadad

Abstract: We present an end-to-end statistical framework for personalized, accurate, and minimally invasive modeling of female reproductive hormonal patterns. Reconstructing and forecasting the evolution of hormonal dynamics is a challenging task, but a critical one to improve general understanding of the menstrual cycle and personalized detection of potential health issues. Our goal is to infer and forecast individual hormone daily levels over time, while accommodating pragmatic and minimally invasive measurement settings. To that end, our approach combines the power of probabilistic generative models (i.e., multi-task Gaussian processes) with the flexibility of neural networks (i.e., a dilated convolutional architecture) to learn complex temporal mappings. To attain accurate hormone level reconstruction with as little data as possible, we propose a sampling mechanism for optimal reconstruction accuracy with limited sampling budget. Our results show the validity of our proposed hormonal dynamic modeling framework, as it provides accurate predictive performance across different realistic sampling budgets and outperforms baseline methods.

Submitted 27 August, 2019; originally announced August 2019.

Comments: Accepted and presented in Machine Learning for Healthcare 2019

arXiv:1906.02664 [pdf]

Machine Learning and Visualization in Clinical Decision Support: Current State and Future Directions

Authors: Gal Levy-Fix, Gilad J. Kuperman, Noémie Elhadad

Abstract: Deep learning, an area of machine learning, is set to revolutionize patient care. But it is not yet part of standard of care, especially when it comes to individual patient care. In fact, it is unclear to what extent data-driven techniques are being used to support clinical decision making (CDS). Heretofore, there has not been a review of ways in which research in machine learning and other types of data-driven techniques can contribute effectively to clinical care and the types of support they can bring to clinicians. In this paper, we consider ways in which two data-driven domains - machine learning and data visualizations - can contribute to the next generation of clinical decision support systems. We review the literature regarding the ways heuristic knowledge, machine learning, and visualization are - and can be - applied to three types of CDS. There has been substantial research into the use of predictive modeling for alerts; however, current CDS systems are not utilizing these methods. Approaches that leverage interactive visualizations and machine-learning inferences to organize and review patient data are gaining popularity but are still at the prototype stage and are not yet in use. CDS systems that could benefit from prescriptive machine learning (e.g., treatment recommendations for specific patients) have not yet been developed. We discuss potential reasons for the lack of deployment of data-driven methods in CDS and directions for future research.

Submitted 6 June, 2019; originally announced June 2019.

arXiv:1811.03431 [pdf, other]

Phenotyping Endometriosis through Mixed Membership Models of Self-Tracking Data

Authors: Iñigo Urteaga, Mollie McKillop, Sharon Lipsky-Gorman, Noémie Elhadad

Abstract: We investigate the use of self-tracking data and unsupervised mixed-membership models to phenotype endometriosis. Endometriosis is a systemic, chronic condition of women in reproductive age and, at the same time, a highly enigmatic condition with no known biomarkers to monitor its progression and no established staging. We leverage data collected through a self-tracking app in an observational research study of over 2,800 women with endometriosis tracking their condition over a year and a half (456,900 observations overall). We extend a classical mixed-membership model to accommodate the idiosyncrasies of the data at hand (i.e., the multimodality of the tracked variables). Our experiments show that our approach identifies potential subtypes that are robust in terms of biases of self-tracked data (e.g., wide variations in tracking frequency amongst participants), as well as to variations in hyperparameters of the model. Jointly modeling a wide range of observations about participants (symptoms, quality of life, treatments) yields clinically meaningful subtypes that both validate what is already known about endometriosis and suggest new findings.

Submitted 6 November, 2018; originally announced November 2018.

Comments: As presented in Machine Learning for Healthcare 2018, https://www.mlforhc.org/2018-conference/

arXiv:1712.00164 [pdf, other]

Generative Adversarial Networks for Electronic Health Records: A Framework for Exploring and Evaluating Methods for Predicting Drug-Induced Laboratory Test Trajectories

Authors: Alexandre Yahi, Rami Vanguri, Noémie Elhadad, Nicholas P. Tatonetti

Abstract: Generative Adversarial Networks (GANs) represent a promising class of generative networks that combine neural networks with game theory. From generating realistic images and videos to assisting musical creation, GANs are transforming many fields of arts and sciences. However, their application to healthcare has not been fully realized, more specifically in generating electronic health records (EHR) data. In this paper, we propose a framework for exploring the value of GANs in the context of continuous laboratory time series data. We devise an unsupervised evaluation method that measures the predictive power of synthetic laboratory test time series. Further, we show that when it comes to predicting the impact of drug exposure on laboratory test data, incorporating representation learning of the training cohorts prior to training GAN models is beneficial.

Submitted 30 November, 2017; originally announced December 2017.

Comments: NIPS ML4H 2017

arXiv:1712.00117 [pdf, other]

Towards Personalized Modeling of the Female Hormonal Cycle: Experiments with Mechanistic Models and Gaussian Processes

Authors: Iñigo Urteaga, David J. Albers, Marija Vlajic Wheeler, Anna Druet, Hans Raffauf, Noémie Elhadad

Abstract: In this paper, we introduce a novel task for machine learning in healthcare, namely personalized modeling of the female hormonal cycle. The motivation for this work is to model the hormonal cycle and predict its phases in time, both for healthy individuals and for those with disorders of the reproductive system. Because there are individual differences in the menstrual cycle, we are particularly interested in personalized models that can account for individual idiosyncrasies, towards identifying phenotypes of menstrual cycles. As a first step, we consider the hormonal cycle as a set of observations through time. We use a previously validated mechanistic model to generate realistic hormonal patterns, and experiment with Gaussian process regression to estimate their values over time. Specifically, we are interested in the feasibility of predicting menstrual cycle phases under varying learning conditions: number of cycles used for training, hormonal measurement noise and sampling rates, and informed vs. agnostic sampling of hormonal measurements. Our results indicate that Gaussian processes can help model the female menstrual cycle. We discuss the implications of our experiments in the context of modeling the female menstrual cycle.

Submitted 30 November, 2017; originally announced December 2017.

Comments: Accepted at NIPS 2017 Workshop on Machine Learning for Health (https://ml4health.github.io/2017/)

arXiv:1608.02158 [pdf, other]

Deep Survival Analysis

Authors: Rajesh Ranganath, Adler Perotte, Noémie Elhadad, David Blei

Abstract: The electronic health record (EHR) provides an unprecedented opportunity to build actionable tools to support physicians at the point of care. In this paper, we investigate survival analysis in the context of EHR data. We introduce deep survival analysis, a hierarchical generative approach to survival analysis. It departs from previous approaches in two primary ways: (1) all observations, including covariates, are modeled jointly conditioned on a rich latent structure; and (2) the observations are aligned by their failure time, rather than by an arbitrary time zero as in traditional survival analysis. Further, it (3) scalably handles heterogeneous (continuous and discrete) data types that occur in the EHR. We validate the deep survival analysis model by stratifying patients according to risk of developing coronary heart disease (CHD). Specifically, we study a dataset of 313,000 patients corresponding to 5.5 million months of observations. When compared to the clinically validated Framingham CHD risk score, deep survival analysis is significantly superior in stratifying patients according to their risk.

Submitted 18 September, 2016; v1 submitted 6 August, 2016; originally announced August 2016.

Comments: Presented at 2016 Machine Learning and Healthcare Conference (MLHC 2016), Los Angeles, CA

Showing 1–14 of 14 results for author: Elhadad, N