-
Handling Missing Data with Graph Representation Learning
Authors:
Jiaxuan You,
Xiaobai Ma,
Daisy Yi Ding,
Mykel Kochenderfer,
Jure Leskovec
Abstract:
Machine learning with missing data has been approached in two different ways, including feature imputation where missing feature values are estimated based on observed values, and label prediction where downstream labels are learned directly from incomplete data. However, existing imputation models tend to have strong prior assumptions and cannot learn from downstream tasks, while models targeting label prediction often involve heuristics and can encounter scalability issues. Here we propose GRAPE, a graph-based framework for feature imputation as well as label prediction. GRAPE tackles the missing data problem using a graph representation, where the observations and features are viewed as two types of nodes in a bipartite graph, and the observed feature values as edges. Under the GRAPE framework, the feature imputation is formulated as an edge-level prediction task and the label prediction as a node-level prediction task. These tasks are then solved with Graph Neural Networks. Experimental results on nine benchmark datasets show that GRAPE yields 20% lower mean absolute error for imputation tasks and 10% lower for label prediction tasks, compared with existing state-of-the-art methods.
Submitted 30 October, 2020;
originally announced October 2020.
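
The bipartite-graph view described in the GRAPE abstract above can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `table_to_bipartite_edges`, the toy table, and the use of `None`/NaN to mark missing entries are all assumptions made for illustration. It only shows how observations and features become two node types, with observed values as edges and missing entries as the edges an imputation model would have to predict.

```python
import math


def table_to_bipartite_edges(rows):
    """Split a table with missing entries into observed edges and missing pairs.

    Row index i is an observation node, column index j is a feature node.
    Every observed value becomes a weighted edge (i, j, value); every missing
    entry (None or NaN) is recorded as a pair (i, j) with no edge.
    """
    observed, missing = [], []
    for i, row in enumerate(rows):
        for j, value in enumerate(row):
            is_nan = isinstance(value, float) and math.isnan(value)
            if value is None or is_nan:
                missing.append((i, j))
            else:
                observed.append((i, j, value))
    return observed, missing


# Hypothetical toy table: 2 observations, 3 features, two missing entries.
table = [
    [0.5, None, 1.2],
    [None, 3.0, 0.7],
]
observed_edges, missing_pairs = table_to_bipartite_edges(table)
print(observed_edges)
print(missing_pairs)
```

Under this framing, feature imputation corresponds to predicting a value for each pair in `missing_pairs` (edge-level prediction), while label prediction attaches a target to each observation node (node-level prediction).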
-
NGBoost: Natural Gradient Boosting for Probabilistic Prediction
Authors:
Tony Duan,
Anand Avati,
Daisy Yi Ding,
Khanh K. Thai,
Sanjay Basu,
Andrew Y. Ng,
Alejandro Schuler
Abstract:
We present Natural Gradient Boosting (NGBoost), an algorithm for generic probabilistic prediction via gradient boosting. Typical regression models return a point estimate, conditional on covariates, but probabilistic regression models output a full probability distribution over the outcome space, conditional on the covariates. This allows for predictive uncertainty estimation -- crucial in applications like healthcare and weather forecasting. NGBoost generalizes gradient boosting to probabilistic regression by treating the parameters of the conditional distribution as targets for a multiparameter boosting algorithm. Furthermore, we show how the Natural Gradient is required to correct the training dynamics of our multiparameter boosting approach. NGBoost can be used with any base learner, any family of distributions with continuous parameters, and any scoring rule. NGBoost matches or exceeds the performance of existing methods for probabilistic prediction while offering additional benefits in flexibility, scalability, and usability. An open-source implementation is available at github.com/stanfordmlgroup/ngboost.
Submitted 9 June, 2020; v1 submitted 8 October, 2019;
originally announced October 2019.
-
Counterfactual Reasoning for Fair Clinical Risk Prediction
Authors:
Stephen Pfohl,
Tony Duan,
Daisy Yi Ding,
Nigam H. Shah
Abstract:
The use of machine learning systems to support decision making in healthcare raises questions as to what extent these systems may introduce or exacerbate disparities in care for historically underrepresented and mistreated groups, due to biases implicitly embedded in observational data in electronic health records. To address this problem in the context of clinical risk prediction models, we develop an augmented counterfactual fairness criteria to extend the group fairness criteria of equalized odds to an individual level. We do so by requiring that the same prediction be made for a patient, and a counterfactual patient resulting from changing a sensitive attribute, if the factual and counterfactual outcomes do not differ. We investigate the extent to which the augmented counterfactual fairness criteria may be applied to develop fair models for prolonged inpatient length of stay and mortality with observational electronic health records data. As the fairness criteria is ill-defined without knowledge of the data generating process, we use a variational autoencoder to perform counterfactual inference in the context of an assumed causal graph. While our technique provides a means to trade off maintenance of fairness with reduction in predictive performance in the context of a learned generative model, further work is needed to assess the generality of this approach.
Submitted 14 July, 2019;
originally announced July 2019.
-
Learning to Summarize Radiology Findings
Authors:
Yuhao Zhang,
Daisy Yi Ding,
Tianpei Qian,
Christopher D. Manning,
Curtis P. Langlotz
Abstract:
The Impression section of a radiology report summarizes crucial radiology findings in natural language and plays a central role in communicating these findings to physicians. However, the process of generating impressions by summarizing findings is time-consuming for radiologists and prone to errors. We propose to automate the generation of radiology impressions with neural sequence-to-sequence learning. We further propose a customized neural model for this task which learns to encode the study background information and use this information to guide the decoding process. On a large dataset of radiology reports collected from actual hospital studies, our model outperforms existing non-neural and neural baselines under the ROUGE metrics. In a blind experiment, a board-certified radiologist indicated that 67% of sampled system summaries are at least as good as the corresponding human-written summaries, suggesting significant clinical validity. To our knowledge our work represents the first attempt in this direction.
Submitted 8 October, 2018; v1 submitted 12 September, 2018;
originally announced September 2018.
-
The Effectiveness of Multitask Learning for Phenotyping with Electronic Health Records Data
Authors:
Daisy Yi Ding,
Chloé Simpson,
Stephen Pfohl,
Dave C. Kale,
Kenneth Jung,
Nigam H. Shah
Abstract:
Electronic phenotyping is the task of ascertaining whether an individual has a medical condition of interest by analyzing their medical record and is foundational in clinical informatics. Increasingly, electronic phenotyping is performed via supervised learning. We investigate the effectiveness of multitask learning for phenotyping using electronic health records (EHR) data. Multitask learning aims to improve model performance on a target task by jointly learning additional auxiliary tasks and has been used in disparate areas of machine learning. However, its utility when applied to EHR data has not been established, and prior work suggests that its benefits are inconsistent. We present experiments that elucidate when multitask learning with neural nets improves performance for phenotyping using EHR data relative to neural nets trained for a single phenotype and to well-tuned logistic regression baselines. We find that multitask neural nets consistently outperform single-task neural nets for rare phenotypes but underperform for relatively more common phenotypes. The effect size increases as more auxiliary tasks are added. Moreover, multitask learning reduces the sensitivity of neural nets to hyperparameter settings for rare phenotypes. Last, we quantify phenotype complexity and find that neural nets trained with or without multitask learning do not improve on simple baselines unless the phenotypes are sufficiently complex.
Submitted 5 January, 2019; v1 submitted 9 August, 2018;
originally announced August 2018.
-
Flexible Multiple Base Station Association and Activation for Downlink Heterogeneous Networks
Authors:
Kaiming Shen,
Ya-Feng Liu,
David Yiwei Ding,
Wei Yu
Abstract:
This letter shows that the flexible association of possibly multiple base stations (BSs) to each user over multiple frequency bands, along with the joint optimization of BS transmit power that encourages the turning-off of the BSs at off-peak time, can significantly improve the performance of a downlink heterogeneous wireless cellular network. We propose a gradient projection algorithm for optimizing BS association and an iteratively reweighting scheme together with a novel proximal gradient method for optimizing power in order to find the optimal tradeoff between network utility and power consumption. Simulation results reveal significant performance improvement as compared to the conventional single-BS association.
Submitted 8 August, 2017;
originally announced August 2017.
-
Well Placement Optimization under Uncertainty with CMA-ES Using the Neighborhood
Authors:
Zyed Bouzarkouna,
Didier Yu Ding,
Anne Auger
Abstract:
In the well placement problem, as well as in other field development optimization problems, geological uncertainty is a key source of risk affecting the viability of field development projects. Well placement problems under geological uncertainty are formulated as optimization problems in which the objective function is evaluated using a reservoir simulator on a number of possible geological realizations. In this paper, we present a new approach to handle geological uncertainty for the well placement problem with a reduced number of reservoir simulations. The proposed approach uses already simulated well configurations in the neighborhood of each well configuration for the objective function evaluation. We use thus only one single reservoir simulation performed on a randomly chosen realization together with the neighborhood to estimate the objective function instead of using multiple simulations on multiple realizations. This approach is combined with the stochastic optimizer CMA-ES. The proposed approach is shown on the benchmark reservoir case PUNQ-S3 to be able to capture the geological uncertainty using a smaller number of reservoir simulations. This approach is compared to the reference approach using all the possible realizations for each well configuration, and shown to be able to reduce significantly the number of reservoir simulations (around 80%).
Submitted 4 September, 2012;
originally announced September 2012.
-
Using Evolution Strategy with Meta-models for Well Placement Optimization
Authors:
Zyed Bouzarkouna,
Didier Yu Ding,
Anne Auger
Abstract:
Optimum implementation of non-conventional wells allows us to increase considerably hydrocarbon recovery. By considering the high drilling cost and the potential improvement in well productivity, well placement decision is an important issue in field development. Considering complex reservoir geology and high reservoir heterogeneities, stochastic optimization methods are the most suitable approaches for optimum well placement. This paper proposes an optimization methodology to determine optimal well location and trajectory based upon the Covariance Matrix Adaptation - Evolution Strategy (CMA-ES) which is a variant of Evolution Strategies recognized as one of the most powerful derivative-free optimizers for continuous optimization. To improve the optimization procedure, two new techniques are investigated: (1). Adaptive penalization with rejection is developed to handle well placement constraints. (2). A meta-model, based on locally weighted regression, is incorporated into CMA-ES using an approximate ranking procedure. Therefore, we can reduce the number of reservoir simulations, which are computationally expensive. Several examples are presented. Our new approach is compared with a Genetic Algorithm incorporating the Genocop III technique. It is shown that our approach outperforms the genetic algorithm: it leads in general to both a higher NPV and a significant reduction of the number of reservoir simulations.
Submitted 24 November, 2010;
originally announced November 2010.