-
A Novel Two-level Causal Inference Framework for On-road Vehicle Quality Issues Diagnosis
Authors:
Qian Wang,
Huanyi Shui,
Thi Tu Trinh Tran,
Milad Zafar Nezhad,
Devesh Upadhyay,
Kamran Paynabar,
Anqi He
Abstract:
In the automotive industry, the full cycle of managing in-use vehicle quality issues can take weeks to investigate. The process involves isolating root causes, defining and implementing appropriate treatments, and refining treatments if needed. The main pain point is the lack of a systematic method to identify causal relationships, evaluate treatment effectiveness, and direct the next actionable treatment if the current treatment is deemed ineffective. This paper shows how we leverage causal Machine Learning (ML) to speed up such processes. A real-world data set collected from on-road vehicles is used to demonstrate the proposed framework. Open challenges for vehicle quality applications are also discussed.
Submitted 31 March, 2023;
originally announced April 2023.
-
Representation Learning with Autoencoders for Electronic Health Records: A Comparative Study
Authors:
Najibesadat Sadati,
Milad Zafar Nezhad,
Ratna Babu Chinnam,
Dongxiao Zhu
Abstract:
The increasing volume of Electronic Health Records (EHR) in recent years provides great opportunities for data scientists to collaborate on different aspects of healthcare research by applying advanced analytics to these EHR clinical data. A key requirement, however, is obtaining meaningful insights from high-dimensional, sparse and complex clinical data. Data science approaches typically address this challenge by performing feature learning in order to build more reliable and informative feature representations from clinical data, followed by supervised learning. In this paper, we propose a predictive modeling approach based on deep-learning-based feature representations and word embedding techniques. Our method uses different deep architectures (stacked sparse autoencoders, deep belief network, adversarial autoencoders and variational autoencoders) for feature representation in higher-level abstraction to obtain effective and robust features from EHRs, and then builds prediction models on top of them. Our approach is particularly useful when unlabeled data is abundant whereas labeled data is scarce. We investigate the performance of representation learning through a supervised learning approach. Our focus is to present a comparative study to evaluate the performance of different deep architectures through supervised learning and provide insights into the choice of deep feature representation techniques. Our experiments show that for small data sets, the stacked sparse autoencoder achieves superior generalization performance in prediction due to sparsity regularization, whereas variational autoencoders outperform the competing approaches for large data sets due to their capability of learning the representation distribution.
Submitted 19 September, 2019; v1 submitted 24 August, 2019;
originally announced August 2019.
-
A Deep Active Survival Analysis Approach for Precision Treatment Recommendations: Application of Prostate Cancer
Authors:
Milad Zafar Nezhad,
Najibesadat Sadati,
Kai Yang,
Dongxiao Zhu
Abstract:
Survival analysis has been developed and applied in a number of areas including manufacturing, finance, economics and healthcare. In the healthcare domain, clinical data are usually high-dimensional, sparse and complex, and sometimes there are only a few time-to-event (labeled) instances. Therefore, building an accurate survival model from electronic health records is challenging. With this motivation, we address this issue and provide a new survival analysis framework using deep learning and active learning with a novel sampling strategy. First, our approach provides a better, lower-dimensional representation of clinical features using labeled (time-to-event) and unlabeled (censored) instances, and then actively trains the survival model by labeling the censored data using an oracle. As a clinical assistive tool, we introduce a simple and effective treatment recommendation approach based on our survival model. In the experimental study, we apply our approach to SEER-Medicare data related to prostate cancer among African-American and white patients. The results indicate that our approach significantly outperforms baseline models.
Submitted 9 April, 2018;
originally announced April 2018.
-
Representation Learning with Autoencoders for Electronic Health Records: A Comparative Study
Authors:
Najibesadat Sadati,
Milad Zafar Nezhad,
Ratna Babu Chinnam,
Dongxiao Zhu
Abstract:
The increasing volume of Electronic Health Records (EHR) in recent years provides great opportunities for data scientists to collaborate on different aspects of healthcare research by applying advanced analytics to these EHR clinical data. A key requirement, however, is obtaining meaningful insights from high-dimensional, sparse and complex clinical data. Data science approaches typically address this challenge by performing feature learning in order to build more reliable and informative feature representations from clinical data, followed by supervised learning. In this paper, we propose a predictive modeling approach based on deep-learning-based feature representations and word embedding techniques. Our method uses different deep architectures (stacked sparse autoencoders, deep belief network, adversarial autoencoders and variational autoencoders) for feature representation in higher-level abstraction to obtain effective and robust features from EHRs, and then builds prediction models on top of them. Our approach is particularly useful when unlabeled data is abundant whereas labeled data is scarce. We investigate the performance of representation learning through a supervised learning approach. Our focus is to present a comparative study to evaluate the performance of different deep architectures through supervised learning and provide insights into the choice of deep feature representation techniques. Our experiments show that for small data sets, the stacked sparse autoencoder achieves superior generalization performance in prediction due to sparsity regularization, whereas variational autoencoders outperform the competing approaches for large data sets due to their capability of learning the representation distribution.
Submitted 29 September, 2019; v1 submitted 6 January, 2018;
originally announced January 2018.
-
SUBIC: A Supervised Bi-Clustering Approach for Precision Medicine
Authors:
Milad Zafar Nezhad,
Dongxiao Zhu,
Najibesadat Sadati,
Kai Yang,
Phillip Levy
Abstract:
Traditional medicine typically applies one-size-fits-all treatment for the entire patient population, whereas precision medicine develops tailored treatment schemes for different patient subgroups. The fact that some factors may be more significant for a specific patient subgroup motivates clinicians and medical researchers to develop new approaches to subgroup detection and analysis, which is an effective strategy to personalize treatment. In this study, we propose a novel patient subgroup detection method, called Supervised Biclustering (SUBIC), using convex optimization, and apply our approach to detect patient subgroups and prioritize risk factors for hypertension (HTN) in a vulnerable demographic subgroup (African-American). Our approach not only finds patient subgroups with the guidance of a clinically relevant target variable but also identifies and prioritizes risk factors by pursuing sparsity of the input variables and encouraging similarity among the input variables and between the input and target variables.
Submitted 26 September, 2017;
originally announced September 2017.
-
Observational Data-Driven Modeling and Optimization of Manufacturing Processes
Authors:
Najibesadat Sadati,
Ratna Babu Chinnam,
Milad Zafar Nezhad
Abstract:
The dramatic increase of observational data across industries, including the manufacturing industry, provides unparalleled opportunities for data-driven decision making and management. In the context of production, data-driven approaches can exploit observational data to model, control and improve process performance. When supplied with observational data with adequate coverage to inform the true process performance dynamics, they can overcome the cost associated with intrusive controlled designed experiments and can be applied for both monitoring and improving process quality. We propose a novel integrated approach that uses observational data for process parameter design while simultaneously identifying the significant control variables. We evaluate our method using simulated experiments and also apply it to a real-world case setting from a tire manufacturing company.
Submitted 16 September, 2017; v1 submitted 17 May, 2017;
originally announced May 2017.
-
SAFS: A Deep Feature Selection Approach for Precision Medicine
Authors:
Milad Zafar Nezhad,
Dongxiao Zhu,
Xiangrui Li,
Kai Yang,
Phillip Levy
Abstract:
In this paper, we propose a new deep feature selection method based on deep architecture. Our method uses stacked auto-encoders for feature representation in higher-level abstraction. We developed and applied a novel feature learning approach to a specific precision medicine problem, which focuses on assessing and prioritizing risk factors for hypertension (HTN) in a vulnerable demographic subgroup (African-American). Our approach is to use deep learning to identify significant risk factors affecting left ventricular mass indexed to body surface area (LVMI) as an indicator of heart damage risk. The results show that our feature learning and representation approach leads to better results in comparison with others.
Submitted 19 April, 2017;
originally announced April 2017.