Search | arXiv e-print repository

Interpretable Machine Learning based on Functional ANOVA Framework: Algorithms and Comparisons

Authors: Linwei Hu, Vijayan N. Nair, Agus Sudjianto, Aijun Zhang, Jie Chen

Abstract: In the early days of machine learning (ML), the emphasis was on developing complex algorithms to achieve the best predictive performance. To understand and explain the model results, one had to rely on post hoc explainability techniques, which are known to have limitations. Recently, with the recognition that interpretability is just as important, researchers are compromising on small increases in predictive performance to develop algorithms that are inherently interpretable. While doing so, the ML community has rediscovered the use of low-order functional ANOVA (fANOVA) models that have been known in the statistical literature for some time. This paper starts with a description of challenges with post hoc explainability and reviews the fANOVA framework with a focus on main effects and second-order interactions. This is followed by an overview of two recently developed techniques: Explainable Boosting Machines or EBM (Lou et al., 2013) and GAMI-Net (Yang et al., 2021b). The paper proposes a new algorithm, called GAMI-Lin-T, that also uses trees like EBM, but it does linear fits instead of piecewise constants within the partitions. There are many other differences, including the development of a new interaction filtering algorithm. Finally, the paper uses simulated and real datasets to compare selected ML algorithms. The results show that GAMI-Lin-T and GAMI-Net have comparable performances, and both are generally better than EBM.

Submitted 24 May, 2023; originally announced May 2023.

Comments: 24 pages, 15 figures. arXiv admin note: substantial text overlap with arXiv:2207.06950
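
Illustration (not from the paper): the fANOVA idea of splitting a function into a mean, main effects, and a second-order interaction can be sketched on a small product grid with uniform weights. The function `fanova_2d` and the toy function below are hypothetical names chosen for this sketch; this is not the GAMI-Lin-T, GAMI-Net, or EBM fitting procedure.

```python
from itertools import product

def fanova_2d(f, grid1, grid2):
    """Decompose f on a product grid (uniform weights) into
    overall mean, two main effects, and a pairwise interaction."""
    values = {(a, b): f(a, b) for a, b in product(grid1, grid2)}
    mean = sum(values.values()) / len(values)
    # Main effect: average over the other variable, centered by the mean.
    main1 = {a: sum(values[a, b] for b in grid2) / len(grid2) - mean for a in grid1}
    main2 = {b: sum(values[a, b] for a in grid1) / len(grid1) - mean for b in grid2}
    # Interaction: whatever the mean and main effects do not explain.
    inter = {(a, b): values[a, b] - mean - main1[a] - main2[b] for a, b in values}
    return mean, main1, main2, inter

# Toy function (an assumption for illustration): f = 1 + 2a + b^2 + ab.
grid = [-1.0, 0.0, 1.0]
mean, main1, main2, inter = fanova_2d(lambda a, b: 1 + 2 * a + b * b + a * b, grid, grid)
```

On this symmetric grid the main effect of the first variable comes out as 2a and the interaction as ab, and each effect averages to zero over its own grid, which is the centering constraint that makes the components identifiable.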

arXiv:2304.13761 [pdf, other]

Enhancing Robustness of Gradient-Boosted Decision Trees through One-Hot Encoding and Regularization

Authors: Shijie Cui, Agus Sudjianto, Aijun Zhang, Runze Li

Abstract: Gradient-boosted decision trees (GBDT) are a widely used and highly effective machine learning approach for tabular data modeling. However, their complex structure may lead to low robustness against small covariate perturbation in unseen data. In this study, we apply one-hot encoding to convert a GBDT model into a linear framework, through encoding of each tree leaf to one dummy variable. This allows for the use of linear regression techniques, plus a novel risk decomposition for assessing the robustness of a GBDT model against covariate perturbations. We propose to enhance the robustness of GBDT models by refitting their linear regression forms with $L_1$ or $L_2$ regularization. Theoretical results are obtained about the effect of regularization on the model performance and robustness. It is demonstrated through numerical experiments that the proposed regularization approach can enhance the robustness of the one-hot-encoded GBDT models.

Submitted 11 May, 2023; v1 submitted 26 April, 2023; originally announced April 2023.
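
Illustration (not from the paper): the leaf one-hot encoding described in the abstract can be sketched with a hand-written ensemble of depth-one trees. The stumps, thresholds, and function names below are assumptions made for this sketch; the regularized refit and the risk decomposition proposed by the authors are not shown.

```python
# Each stump is (feature index, threshold, left leaf value, right leaf value).
# These three stumps are made-up values, not a fitted model.
stumps = [(0, 0.5, -1.0, 1.0), (1, 2.0, 0.25, -0.25), (0, 1.5, 0.1, 0.4)]

def gbdt_predict(x):
    """Usual additive prediction: sum of the leaf values reached in each tree."""
    return sum(left if x[j] <= t else right for j, t, left, right in stumps)

def one_hot_leaves(x):
    """Encode x as one dummy variable per leaf (two leaves per stump)."""
    row = []
    for j, t, _, _ in stumps:
        row += [1.0, 0.0] if x[j] <= t else [0.0, 1.0]
    return row

# Leaf values, in the same order as the dummy variables, act as linear coefficients.
coef = [v for _, _, left, right in stumps for v in (left, right)]

def linear_predict(x, beta=coef):
    """Same model written as a linear function of the leaf dummies."""
    return sum(z * b for z, b in zip(one_hot_leaves(x), beta))

x = [1.0, 3.0]
```

Here `gbdt_predict(x)` and `linear_predict(x)` agree, which is what lets linear regression tools (including penalized refits of `beta`) be applied to the encoded model.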

arXiv:2204.12365 [pdf]

Explaining Adverse Actions in Credit Decisions Using Shapley Decomposition

Authors: Vijayan N. Nair, Tianshu Feng, Linwei Hu, Zach Zhang, Jie Chen, Agus Sudjianto

Abstract: When a financial institution declines an application for credit, an adverse action (AA) is said to occur. The applicant is then entitled to an explanation for the negative decision. This paper focuses on credit decisions based on a predictive model for probability of default and proposes a methodology for AA explanation. The problem involves identifying the important predictors responsible for the negative decision and is straightforward when the underlying model is additive. However, it becomes non-trivial even for linear models with interactions. We consider models with low-order interactions and develop a simple and intuitive approach based on first principles. We then show how the methodology generalizes to the well-known Shapley decomposition and the recently proposed concept of Baseline Shapley (B-Shap). Unlike other Shapley techniques in the literature for local interpretability of machine learning results, B-Shap is computationally tractable since it involves just function evaluations. An illustrative case study is used to demonstrate the usefulness of the method. The paper also discusses situations with highly correlated predictors and desirable properties of fitted models in the credit-lending context, such as monotonicity and continuity.

Submitted 26 April, 2022; originally announced April 2022.

Comments: 20 pages, 8 figures
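
Illustration (not from the paper): Baseline Shapley attributes the difference between the model output at a point and at a reference point, using only evaluations of the model on points that mix coordinates from the two. The brute-force sketch below enumerates all subsets, so it is only practical for a handful of predictors; `baseline_shapley` and the toy score `pd_model` are hypothetical names, not the authors' implementation.

```python
from itertools import combinations
from math import factorial

def baseline_shapley(f, x, baseline):
    """Exact Baseline Shapley values of f at x relative to a baseline point."""
    n = len(x)

    def value(subset):
        # Features in `subset` take their value from x, the rest from the baseline.
        return f([x[i] if i in subset else baseline[i] for i in range(n)])

    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                phi[i] += weight * (value(s | {i}) - value(s))
    return phi

def pd_model(z):
    """Toy score with one interaction term (an assumption for illustration)."""
    return 2.0 * z[0] + 3.0 * z[1] + z[0] * z[1]

phi = baseline_shapley(pd_model, x=[1.0, 2.0], baseline=[0.0, 0.0])
```

For this toy score the interaction term is split equally between the two predictors, and the attributions sum to `pd_model(x) - pd_model(baseline)`.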

arXiv:2111.08922 [pdf, other]

Traversing the Local Polytopes of ReLU Neural Networks: A Unified Approach for Network Verification

Authors: Shaojie Xu, Joel Vaughan, Jie Chen, Aijun Zhang, Agus Sudjianto

Abstract: Although neural networks (NNs) with ReLU activation functions have found success in a wide range of applications, their adoption in risk-sensitive settings has been limited by concerns about robustness and interpretability. Previous works to examine robustness and to improve interpretability partially exploited the piecewise linear function form of ReLU NNs. In this paper, we explore the unique topological structure that ReLU NNs create in the input space, identifying the adjacency among the partitioned local polytopes and developing a traversing algorithm based on this adjacency. Our polytope traversing algorithm can be adapted to verify a wide range of network properties related to robustness and interpretability, providing a unified approach to examine the network behavior. As the traversing algorithm explicitly visits all local polytopes, it returns a clear and full picture of the network behavior within the traversed region. The time and space complexity of the traversing algorithm is determined by the number of a ReLU NN's partitioning hyperplanes passing through the traversing region.

Submitted 9 January, 2022; v1 submitted 17 November, 2021; originally announced November 2021.

arXiv:2111.01743 [pdf, other]

Designing Inherently Interpretable Machine Learning Models

Authors: Agus Sudjianto, Aijun Zhang

Abstract: Interpretable machine learning (IML) is becoming increasingly important in highly regulated industry sectors related to the health and safety or fundamental rights of human beings. In general, inherently interpretable models should be adopted because of their transparency and explainability, while black-box models with model-agnostic explainability can be more difficult to defend under regulatory scrutiny. For assessing inherent interpretability of a machine learning model, we propose a qualitative template based on feature effects and model architecture constraints. It provides the design principles for high-performance IML model development, with examples given by reviewing our recent works on ExNN, GAMI-Net, SIMTree, and the Aletheia toolkit for local linear interpretability of deep ReLU networks. We further demonstrate how to design an interpretable ReLU DNN model with evaluation of conceptual soundness for a real case study of predicting credit default in home lending. We hope that this work will provide a practical guide to developing inherently interpretable models in high-risk applications in the banking industry, as well as other sectors.

Submitted 2 November, 2021; originally announced November 2021.

Comments: arXiv admin note: text overlap with arXiv:2011.04041

arXiv:2109.04244 [pdf, other]

Supervised Linear Dimension-Reduction Methods: Review, Extensions, and Comparisons

Authors: Shaojie Xu, Joel Vaughan, Jie Chen, Agus Sudjianto, Vijayan Nair

Abstract: Principal component analysis (PCA) is a well-known linear dimension-reduction method that has been widely used in data analysis and modeling. It is an unsupervised learning technique that identifies a suitable linear subspace for the input variable that contains maximal variation and preserves as much information as possible. PCA has also been used in prediction models where the original, high-dimensional space of predictors is reduced to a smaller, more manageable, set before conducting regression analysis. However, this approach does not incorporate information in the response during the dimension-reduction stage and hence can have poor predictive performance. To address this concern, several supervised linear dimension-reduction techniques have been proposed in the literature. This paper reviews selected techniques, extends some of them, and compares their performance through simulations. Two of these techniques, partial least squares (PLS) and least-squares PCA (LSPCA), consistently outperform the others in this study.

Submitted 9 September, 2021; originally announced September 2021.

arXiv:2105.06558 [pdf]

Bias, Fairness, and Accountability with AI and ML Algorithms

Authors: Nengfeng Zhou, Zach Zhang, Vijayan N. Nair, Harsh Singhal, Jie Chen, Agus Sudjianto

Abstract: The advent of AI and ML algorithms has led to opportunities as well as challenges. In this paper, we provide an overview of bias and fairness issues that arise with the use of ML algorithms. We describe the types and sources of data bias, and discuss the nature of algorithmic unfairness. This is followed by a review of fairness metrics in the literature, discussion of their limitations, and a description of de-biasing (or mitigation) techniques in the model life cycle.

Submitted 13 May, 2021; originally announced May 2021.

Comments: 18 pages, 5 figures

MSC Class: 00-02

arXiv:2103.09983 [pdf, other]

Linear Iterative Feature Embedding: An Ensemble Framework for Interpretable Model

Authors: Agus Sudjianto, Jinwen Qiu, Miaoqi Li, Jie Chen

Abstract: A new ensemble framework for interpretable models, called Linear Iterative Feature Embedding (LIFE), has been developed to achieve high prediction accuracy, easy interpretation and efficient computation simultaneously. The LIFE algorithm is able to fit a wide single-hidden-layer neural network (NN) accurately with three steps: defining the subsets of a dataset by the linear projections of neural nodes, creating the features from multiple narrow single-hidden-layer NNs trained on the different subsets of the data, and combining the features with a linear model. The theoretical rationale behind LIFE is also provided by the connection to the loss ambiguity decomposition of stack ensemble methods. Both simulation and empirical experiments confirm that LIFE consistently outperforms directly trained single-hidden-layer NNs and also outperforms many other benchmark models, including multi-layer Feed Forward Neural Network (FFNN), XGBoost, and Random Forest (RF) in many experiments. As a wide single-hidden-layer NN, LIFE is intrinsically interpretable. Meanwhile, both variable importance and global main and interaction effects can be easily created and visualized. In addition, the parallel nature of the base learner building makes LIFE computationally efficient by leveraging parallel computing.

Submitted 17 March, 2021; originally announced March 2021.

arXiv:2011.04041 [pdf, other]

Unwrapping The Black Box of Deep ReLU Networks: Interpretability, Diagnostics, and Simplification

Authors: Agus Sudjianto, William Knauth, Rahul Singh, Zebin Yang, Aijun Zhang

Abstract: Deep neural networks (DNNs) have achieved great success in learning complex patterns with strong predictive power, but they are often thought of as "black box" models without a sufficient level of transparency and interpretability. It is important to demystify DNNs with rigorous mathematics and practical tools, especially when they are used for mission-critical applications. This paper aims to unwrap the black box of deep ReLU networks through local linear representation, which utilizes the activation pattern and disentangles the complex network into an equivalent set of local linear models (LLMs). We develop a convenient LLM-based toolkit for interpretability, diagnostics, and simplification of a pre-trained deep ReLU network. We propose the local linear profile plot and other visualization methods for interpretation and diagnostics, and an effective merging strategy for network simplification. The proposed methods are demonstrated by simulation examples, benchmark datasets, and a real case study in home lending credit risk assessment.

Submitted 8 November, 2020; originally announced November 2020.
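
Illustration (not from the paper): the local linear representation mentioned in the abstract follows from the fact that, once the activation pattern at an input is fixed, a ReLU network reduces to a single affine map. The sketch below extracts that map for a tiny hand-written network; the weights and the function names `forward` and `local_linear_model` are assumptions for this example and are not the authors' toolkit.

```python
def matvec(W, v):
    """Matrix-vector product for plain nested lists."""
    return [sum(w * a for w, a in zip(row, v)) for row in W]

def forward(layers, x):
    """Ordinary forward pass: ReLU on hidden layers, linear output layer."""
    h = x
    for W, b in layers[:-1]:
        h = [max(0.0, z + c) for z, c in zip(matvec(W, h), b)]
    W, b = layers[-1]
    return [z + c for z, c in zip(matvec(W, h), b)]

def local_linear_model(layers, x):
    """Return (A, c) such that the network equals A @ z + c for every z
    sharing the activation pattern of x."""
    n = len(x)
    A = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    c = [0.0] * n
    h = x
    for k, (W, b) in enumerate(layers):
        # Compose the running affine map with this layer: A <- W A, c <- W c + b.
        A = [[sum(W[r][m] * A[m][j] for m in range(len(A))) for j in range(n)]
             for r in range(len(W))]
        c = [z + bb for z, bb in zip(matvec(W, c), b)]
        if k < len(layers) - 1:
            pre = [z + bb for z, bb in zip(matvec(W, h), b)]
            mask = [1.0 if p > 0 else 0.0 for p in pre]  # activation pattern
            A = [[m * a for a in row] for m, row in zip(mask, A)]
            c = [m * v for m, v in zip(mask, c)]
            h = [m * p for m, p in zip(mask, pre)]
    return A, c

# Made-up 2-2-1 network.
layers = [([[1.0, -1.0], [0.5, 2.0]], [0.0, -1.0]), ([[1.0, 1.0]], [0.5])]
x = [0.0, 1.0]
A, c = local_linear_model(layers, x)
local_value = [z + cc for z, cc in zip(matvec(A, x), c)]
```

At `x` only the second hidden unit is active, so the extracted coefficients are those of that unit scaled by the output weight, and `local_value` matches `forward(layers, x)`.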

arXiv:2008.04059 [pdf]

Supervised Machine Learning Techniques: An Overview with Applications to Banking

Authors: Linwei Hu, Jie Chen, Joel Vaughan, Hanyu Yang, Kelly Wang, Agus Sudjianto, Vijayan N. Nair

Abstract: This article provides an overview of Supervised Machine Learning (SML) with a focus on applications to banking. The SML techniques covered include Bagging (Random Forest or RF), Boosting (Gradient Boosting Machine or GBM) and Neural Networks (NNs). We begin with an introduction to ML tasks and techniques. This is followed by a description of: i) tree-based ensemble algorithms including Bagging with RF and Boosting with GBMs, ii) Feedforward NNs, iii) a discussion of hyper-parameter optimization techniques, and iv) machine learning interpretability. The paper concludes with a comparison of the features of different ML algorithms. Examples taken from credit risk modeling in banking are used throughout the paper to illustrate the techniques and interpret the results of the algorithms.

Submitted 28 July, 2020; originally announced August 2020.

arXiv:2007.14528 [pdf]

Surrogate Locally-Interpretable Models with Supervised Machine Learning Algorithms

Authors: Linwei Hu, Jie Chen, Vijayan N. Nair, Agus Sudjianto

Abstract: Supervised Machine Learning (SML) algorithms, such as Gradient Boosting, Random Forest, and Neural Networks, have become popular in recent years due to their superior predictive performance over traditional statistical methods. However, their complexity makes the results hard to interpret without additional tools. There has been a lot of recent work in developing global and local diagnostics for interpreting SML models. In this paper, we propose a locally-interpretable model that takes the fitted ML response surface, partitions the predictor space using model-based regression trees, and fits interpretable main-effects models at each of the nodes. We adapt the algorithm to be efficient in dealing with high-dimensional predictors. While the main focus is on interpretability, the resulting surrogate model also has reasonably good predictive performance.

Submitted 28 July, 2020; originally announced July 2020.

arXiv:2005.08027 [pdf, other]

An Effective and Efficient Initialization Scheme for Training Multi-layer Feedforward Neural Networks

Authors: Zebin Yang, Hengtao Zhang, Agus Sudjianto, Aijun Zhang

Abstract: Network initialization is the first and critical step for training neural networks. In this paper, we propose a novel network initialization scheme based on the celebrated Stein's identity. By viewing multi-layer feedforward neural networks as cascades of multi-index models, the projection weights to the first hidden layer are initialized using eigenvectors of the cross-moment matrix between the input's second-order score function and the response. The input data is then forward propagated to the next layer and such a procedure can be repeated until all the hidden layers are initialized. Finally, the weights for the output layer are initialized by generalized linear modeling. Such a proposed SteinGLM method is shown through extensive numerical results to be much faster and more accurate than other popular methods commonly used for training neural networks.

Submitted 25 June, 2020; v1 submitted 16 May, 2020; originally announced May 2020.

arXiv:2004.02353 [pdf]

Adaptive Explainable Neural Networks (AxNNs)

Authors: Jie Chen, Joel Vaughan, Vijayan N. Nair, Agus Sudjianto

Abstract: While machine learning techniques have been successfully applied in several fields, the black-box nature of the models presents challenges for interpreting and explaining the results. We develop a new framework called Adaptive Explainable Neural Networks (AxNN) for achieving the dual goals of good predictive performance and model interpretability. For predictive performance, we build a structured neural network made up of ensembles of generalized additive model networks and additive index models (through explainable neural networks) using a two-stage process. This can be done using either a boosting or a stacking ensemble. For interpretability, we show how to decompose the results of AxNN into main effects and higher-order interaction effects. The computations are inherited from Google's open source tool AdaNet and can be efficiently accelerated by training with distributed computing. The results are illustrated on simulated and real datasets.

Submitted 2 June, 2020; v1 submitted 5 April, 2020; originally announced April 2020.

arXiv:2003.07132 [pdf, other]

GAMI-Net: An Explainable Neural Network based on Generalized Additive Models with Structured Interactions

Authors: Zebin Yang, Aijun Zhang, Agus Sudjianto

Abstract: The lack of interpretability is an inevitable problem when using neural network models in real applications. In this paper, an explainable neural network based on generalized additive models with structured interactions (GAMI-Net) is proposed to pursue a good balance between prediction accuracy and model interpretability. GAMI-Net is a disentangled feedforward network with multiple additive subnetworks; each subnetwork consists of multiple hidden layers and is designed for capturing one main effect or one pairwise interaction. Three interpretability aspects are further considered, including a) sparsity, to select the most significant effects for parsimonious representations; b) heredity, a pairwise interaction could only be included when at least one of its parent main effects exists; and c) marginal clarity, to make main effects and pairwise interactions mutually distinguishable. An adaptive training algorithm is developed, where main effects are first trained and then pairwise interactions are fitted to the residuals. Numerical experiments on both synthetic functions and real-world datasets show that the proposed model enjoys superior interpretability and it maintains competitive prediction accuracy in comparison to the explainable boosting machine and other classic machine learning models.

Submitted 2 June, 2021; v1 submitted 16 March, 2020; originally announced March 2020.

arXiv:1904.11419 [pdf]

Time Series Simulation by Conditional Generative Adversarial Net

Authors: Rao Fu, Jie Chen, Shutian Zeng, Yiping Zhuang, Agus Sudjianto

Abstract: Generative Adversarial Net (GAN) has been proven to be a powerful machine learning tool in image data analysis and generation. In this paper, we propose to use Conditional Generative Adversarial Net (CGAN) to learn and simulate time series data. The conditions can be both categorical and continuous variables containing different kinds of auxiliary information. Our simulation studies show that CGAN is able to learn different kinds of normal and heavy-tailed distributions, as well as dependent structures of different time series, and it can further generate conditional predictive distributions consistent with the training data distributions. We also provide an in-depth discussion on the rationale of GAN and the neural network as hierarchical splines to draw a clear connection with the existing statistical method for distribution generation. In practice, CGAN has a wide range of applications in market risk and counterparty risk analysis: it can be applied to learn the historical data and generate scenarios for the calculation of Value-at-Risk (VaR) and Expected Shortfall (ES) and predict the movement of the market risk factors. We present a real data analysis including a backtest to demonstrate that CGAN is able to outperform Historical Simulation, a popular method in market risk analysis for the calculation of VaR. CGAN can also be applied in economic time series modeling and forecasting, and an example of hypothetical shock analysis for economic models and the generation of potential CCAR scenarios by CGAN is given at the end of the paper.

Submitted 25 April, 2019; originally announced April 2019.

arXiv:1901.03838 [pdf, other]

Enhancing Explainability of Neural Networks through Architecture Constraints

Authors: Zebin Yang, Aijun Zhang, Agus Sudjianto

Abstract: Prediction accuracy and model explainability are the two most important objectives when developing machine learning algorithms to solve real-world problems. Neural networks are known to possess good prediction performance, but lack sufficient model interpretability. In this paper, we propose to enhance the explainability of neural networks through the following architecture constraints: a) sparse additive subnetworks; b) projection pursuit with orthogonality constraint; and c) smooth function approximation. It leads to an explainable neural network (xNN) with a superior balance between prediction performance and model interpretability. We derive the necessary and sufficient identifiability conditions for the proposed xNN model. The multiple parameters are simultaneously estimated by a modified mini-batch gradient descent method based on the backpropagation algorithm for calculating the derivatives and the Cayley transform for preserving the projection orthogonality. Through simulation study under six different scenarios, we compare the proposed method to several benchmarks including least absolute shrinkage and selection operator, support vector machine, random forest, extreme learning machine, and multi-layer perceptron. It is shown that the proposed xNN model keeps the flexibility of pursuing high prediction accuracy while attaining improved interpretability. Finally, a real data example is employed as a showcase application.

Submitted 2 September, 2019; v1 submitted 12 January, 2019; originally announced January 2019.

arXiv:1808.07216 [pdf]

Model Interpretation: A Unified Derivative-based Framework for Nonparametric Regression and Supervised Machine Learning

Authors: Xiaoyu Liu, Jie Chen, Joel Vaughan, Vijayan Nair, Agus Sudjianto

Abstract: Interpreting a nonparametric regression model with many predictors is known to be a challenging problem. There has been renewed interest in this topic due to the extensive use of machine learning algorithms and the difficulty in understanding and explaining their input-output relationships. This paper develops a unified framework using a derivative-based approach for existing tools in the literature, including the partial-dependence plots, marginal plots and accumulated effects plots. It proposes a new interpretation technique called the accumulated total derivative effects plot and demonstrates how its components can be used to develop extensive insights in complex regression models with correlated predictors. The techniques are illustrated through simulation results.

Submitted 8 September, 2018; v1 submitted 22 August, 2018; originally announced August 2018.

arXiv:1806.01933 [pdf, other]

Explainable Neural Networks based on Additive Index Models

Authors: Joel Vaughan, Agus Sudjianto, Erind Brahimi, Jie Chen, Vijayan N. Nair

Abstract: Machine Learning algorithms are increasingly being used in recent years due to their flexibility in model fitting and increased predictive performance. However, the complexity of the models makes it hard for the data analyst to interpret the results and explain them without additional tools. This has led to much research in developing various approaches to understand the model behavior. In this paper, we present the Explainable Neural Network (xNN), a structured neural network designed especially to learn interpretable features. Unlike fully connected neural networks, the features engineered by the xNN can be extracted from the network in a relatively straightforward manner and the results displayed. With appropriate regularization, the xNN provides a parsimonious explanation of the relationship between the features and the output. We illustrate this interpretable feature-engineering property on simulated examples.

Submitted 5 June, 2018; originally announced June 2018.

Comments: 10 pages, 8 figures

arXiv:1806.00663 [pdf]

Locally Interpretable Models and Effects based on Supervised Partitioning (LIME-SUP)

Authors: Linwei Hu, Jie Chen, Vijayan N. Nair, Agus Sudjianto

Abstract: Supervised Machine Learning (SML) algorithms such as Gradient Boosting, Random Forest, and Neural Networks have become popular in recent years due to their increased predictive performance over traditional statistical methods. This is especially true with large data sets (millions or more observations and hundreds to thousands of predictors). However, the complexity of the SML models makes them opaque and hard to interpret without additional tools. There has been a lot of interest recently in developing global and local diagnostics for interpreting and explaining SML models. In this paper, we propose locally interpretable models and effects based on supervised partitioning (trees) referred to as LIME-SUP. This is in contrast with the KLIME approach that is based on clustering the predictor space. We describe LIME-SUP based on fitting trees to the fitted response (LIME-SUP-R) as well as the derivatives of the fitted response (LIME-SUP-D). We compare the results with KLIME and describe its advantages using simulation and real data.

Submitted 2 June, 2018; originally announced June 2018.

Comments: 15 pages, 10 figures

arXiv:1305.2815 [pdf, other]

Modelling time and vintage variability in retail credit portfolios: the decomposition approach

Authors: Jonathan J. Forster, Agus Sudjianto

Abstract: In this paper, we consider the problem of modelling historical data on retail credit portfolio performance, with a view to forecasting future performance, and facilitating strategic decision making. We consider a situation, common in practice, where accounts with common origination date (typically month) are aggregated into a single vintage for analysis, and the data for analysis consists of a time series of a univariate portfolio performance variable (for example, the proportion of defaulting accounts) for each vintage over successive time periods since origination. An invaluable management tool for understanding portfolio behaviour can be obtained by decomposing the data series nonparametrically into components of exogenous variability (E), maturity (time since origination; M) and vintage (V), referred to as an EMV model. For example, identification of a good macroeconomic model is the key to effective forecasting, particularly in applications such as stress testing, and identification of this can be facilitated by investigation of the macroeconomic component of an EMV decomposition. We show that care needs to be taken with such a decomposition, drawing parallels with the Age-Period-Cohort approach, common in demography, epidemiology and sociology. We develop a practical decomposition strategy, and illustrate our approach using data extracted from a credit card portfolio.

Submitted 13 May, 2013; originally announced May 2013.

Showing 1–20 of 20 results for author: Sudjianto, A