Showing 1–20 of 20 results for author: Sudjianto, A

  1. arXiv:2305.15670  [pdf]

    stat.ML cs.LG

    Interpretable Machine Learning based on Functional ANOVA Framework: Algorithms and Comparisons

    Authors: Linwei Hu, Vijayan N. Nair, Agus Sudjianto, Aijun Zhang, Jie Chen

    Abstract: In the early days of machine learning (ML), the emphasis was on developing complex algorithms to achieve best predictive performance. To understand and explain the model results, one had to rely on post hoc explainability techniques, which are known to have limitations. Recently, with the recognition that interpretability is just as important, researchers are compromising on small increases in pre…

    Submitted 24 May, 2023; originally announced May 2023.

    Comments: 24 pages, 15 figures. arXiv admin note: substantial text overlap with arXiv:2207.06950

  2. arXiv:2304.13761  [pdf, other]

    stat.ML cs.LG

    Enhancing Robustness of Gradient-Boosted Decision Trees through One-Hot Encoding and Regularization

    Authors: Shijie Cui, Agus Sudjianto, Aijun Zhang, Runze Li

    Abstract: Gradient-boosted decision trees (GBDT) are widely used and highly effective machine learning approach for tabular data modeling. However, their complex structure may lead to low robustness against small covariate perturbation in unseen data. In this study, we apply one-hot encoding to convert a GBDT model into a linear framework, through encoding of each tree leaf to one dummy variable. This allow…

    Submitted 11 May, 2023; v1 submitted 26 April, 2023; originally announced April 2023.

  3. arXiv:2204.12365  [pdf]

    stat.ML cs.LG

    Explaining Adverse Actions in Credit Decisions Using Shapley Decomposition

    Authors: Vijayan N. Nair, Tianshu Feng, Linwei Hu, Zach Zhang, Jie Chen, Agus Sudjianto

    Abstract: When a financial institution declines an application for credit, an adverse action (AA) is said to occur. The applicant is then entitled to an explanation for the negative decision. This paper focuses on credit decisions based on a predictive model for probability of default and proposes a methodology for AA explanation. The problem involves identifying the important predictors responsible for the…

    Submitted 26 April, 2022; originally announced April 2022.

    Comments: 20 pages, 8 figures

  4. arXiv:2111.08922  [pdf, other]

    cs.LG math.OC stat.ML

    Traversing the Local Polytopes of ReLU Neural Networks: A Unified Approach for Network Verification

    Authors: Shaojie Xu, Joel Vaughan, Jie Chen, Aijun Zhang, Agus Sudjianto

    Abstract: Although neural networks (NNs) with ReLU activation functions have found success in a wide range of applications, their adoption in risk-sensitive settings has been limited by the concerns on robustness and interpretability. Previous works to examine robustness and to improve interpretability partially exploited the piecewise linear function form of ReLU NNs. In this paper, we explore the unique t…

    Submitted 9 January, 2022; v1 submitted 17 November, 2021; originally announced November 2021.

  5. arXiv:2111.01743  [pdf, other]

    cs.LG stat.ML

    Designing Inherently Interpretable Machine Learning Models

    Authors: Agus Sudjianto, Aijun Zhang

    Abstract: Interpretable machine learning (IML) becomes increasingly important in highly regulated industry sectors related to the health and safety or fundamental rights of human beings. In general, the inherently IML models should be adopted because of their transparency and explainability, while black-box models with model-agnostic explainability can be more difficult to defend under regulatory scrutiny.…

    Submitted 2 November, 2021; originally announced November 2021.

    Comments: arXiv admin note: text overlap with arXiv:2011.04041

  6. arXiv:2109.04244  [pdf, other]

    stat.ML cs.LG

    Supervised Linear Dimension-Reduction Methods: Review, Extensions, and Comparisons

    Authors: Shaojie Xu, Joel Vaughan, Jie Chen, Agus Sudjianto, Vijayan Nair

    Abstract: Principal component analysis (PCA) is a well-known linear dimension-reduction method that has been widely used in data analysis and modeling. It is an unsupervised learning technique that identifies a suitable linear subspace for the input variable that contains maximal variation and preserves as much information as possible. PCA has also been used in prediction models where the original, high-dim…

    Submitted 9 September, 2021; originally announced September 2021.

  7. arXiv:2105.06558  [pdf]

    stat.ML cs.LG

    Bias, Fairness, and Accountability with AI and ML Algorithms

    Authors: Nengfeng Zhou, Zach Zhang, Vijayan N. Nair, Harsh Singhal, Jie Chen, Agus Sudjianto

    Abstract: The advent of AI and ML algorithms has led to opportunities as well as challenges. In this paper, we provide an overview of bias and fairness issues that arise with the use of ML algorithms. We describe the types and sources of data bias, and discuss the nature of algorithmic unfairness. This is followed by a review of fairness metrics in the literature, discussion of their limitations, and a desc…

    Submitted 13 May, 2021; originally announced May 2021.

    Comments: 18 pages, 5 figures

    MSC Class: 00-02

  8. arXiv:2103.09983  [pdf, other]

    stat.ML cs.LG

    Linear Iterative Feature Embedding: An Ensemble Framework for Interpretable Model

    Authors: Agus Sudjianto, Jinwen Qiu, Miaoqi Li, Jie Chen

    Abstract: A new ensemble framework for interpretable model called Linear Iterative Feature Embedding (LIFE) has been developed to achieve high prediction accuracy, easy interpretation and efficient computation simultaneously. The LIFE algorithm is able to fit a wide single-hidden-layer neural network (NN) accurately with three steps: defining the subsets of a dataset by the linear projections of neural node…

    Submitted 17 March, 2021; originally announced March 2021.

  9. arXiv:2011.04041  [pdf, other]

    cs.LG cs.AI stat.ML

    Unwrapping The Black Box of Deep ReLU Networks: Interpretability, Diagnostics, and Simplification

    Authors: Agus Sudjianto, William Knauth, Rahul Singh, Zebin Yang, Aijun Zhang

    Abstract: The deep neural networks (DNNs) have achieved great success in learning complex patterns with strong predictive power, but they are often thought of as "black box" models without a sufficient level of transparency and interpretability. It is important to demystify the DNNs with rigorous mathematics and practical tools, especially when they are used for mission-critical applications. This paper aim…

    Submitted 8 November, 2020; originally announced November 2020.

  10. arXiv:2008.04059  [pdf]

    q-fin.GN cs.LG stat.ML

    Supervised Machine Learning Techniques: An Overview with Applications to Banking

    Authors: Linwei Hu, Jie Chen, Joel Vaughan, Hanyu Yang, Kelly Wang, Agus Sudjianto, Vijayan N. Nair

    Abstract: This article provides an overview of Supervised Machine Learning (SML) with a focus on applications to banking. The SML techniques covered include Bagging (Random Forest or RF), Boosting (Gradient Boosting Machine or GBM) and Neural Networks (NNs). We begin with an introduction to ML tasks and techniques. This is followed by a description of: i) tree-based ensemble algorithms including Bagging wit…

    Submitted 28 July, 2020; originally announced August 2020.

  11. arXiv:2007.14528  [pdf]

    stat.ML cs.LG

    Surrogate Locally-Interpretable Models with Supervised Machine Learning Algorithms

    Authors: Linwei Hu, Jie Chen, Vijayan N. Nair, Agus Sudjianto

    Abstract: Supervised Machine Learning (SML) algorithms, such as Gradient Boosting, Random Forest, and Neural Networks, have become popular in recent years due to their superior predictive performance over traditional statistical methods. However, their complexity makes the results hard to interpret without additional tools. There has been a lot of recent work in developing global and local diagnostics for i…

    Submitted 28 July, 2020; originally announced July 2020.

  12. arXiv:2005.08027  [pdf, other]

    cs.LG stat.CO stat.ML

    An Effective and Efficient Initialization Scheme for Training Multi-layer Feedforward Neural Networks

    Authors: Zebin Yang, Hengtao Zhang, Agus Sudjianto, Aijun Zhang

    Abstract: Network initialization is the first and critical step for training neural networks. In this paper, we propose a novel network initialization scheme based on the celebrated Stein's identity. By viewing multi-layer feedforward neural networks as cascades of multi-index models, the projection weights to the first hidden layer are initialized using eigenvectors of the cross-moment matrix between the i…

    Submitted 25 June, 2020; v1 submitted 16 May, 2020; originally announced May 2020.

  13. arXiv:2004.02353  [pdf]

    stat.ML cs.AI cs.LG

    Adaptive Explainable Neural Networks (AxNNs)

    Authors: Jie Chen, Joel Vaughan, Vijayan N. Nair, Agus Sudjianto

    Abstract: While machine learning techniques have been successfully applied in several fields, the black-box nature of the models presents challenges for interpreting and explaining the results. We develop a new framework called Adaptive Explainable Neural Networks (AxNN) for achieving the dual goals of good predictive performance and model interpretability. For predictive performance, we build a structured…

    Submitted 2 June, 2020; v1 submitted 5 April, 2020; originally announced April 2020.

  14. arXiv:2003.07132  [pdf, other]

    stat.ML cs.LG stat.CO

    GAMI-Net: An Explainable Neural Network based on Generalized Additive Models with Structured Interactions

    Authors: Zebin Yang, Aijun Zhang, Agus Sudjianto

    Abstract: The lack of interpretability is an inevitable problem when using neural network models in real applications. In this paper, an explainable neural network based on generalized additive models with structured interactions (GAMI-Net) is proposed to pursue a good balance between prediction accuracy and model interpretability. GAMI-Net is a disentangled feedforward network with multiple additive subnet…

    Submitted 2 June, 2021; v1 submitted 16 March, 2020; originally announced March 2020.

  15. arXiv:1904.11419  [pdf]

    stat.ML cs.LG eess.IV

    Time Series Simulation by Conditional Generative Adversarial Net

    Authors: Rao Fu, Jie Chen, Shutian Zeng, Yiping Zhuang, Agus Sudjianto

    Abstract: Generative Adversarial Net (GAN) has been proven to be a powerful machine learning tool in image data analysis and generation. In this paper, we propose to use Conditional Generative Adversarial Net (CGAN) to learn and simulate time series data. The conditions can be both categorical and continuous variables containing different kinds of auxiliary information. Our simulation studies show that CGAN…

    Submitted 25 April, 2019; originally announced April 2019.

  16. arXiv:1901.03838  [pdf, other]

    stat.ML cs.LG

    Enhancing Explainability of Neural Networks through Architecture Constraints

    Authors: Zebin Yang, Aijun Zhang, Agus Sudjianto

    Abstract: Prediction accuracy and model explainability are the two most important objectives when developing machine learning algorithms to solve real-world problems. The neural networks are known to possess good prediction performance, but lack of sufficient model interpretability. In this paper, we propose to enhance the explainability of neural networks through the following architecture constraints: a)…

    Submitted 2 September, 2019; v1 submitted 12 January, 2019; originally announced January 2019.

  17. arXiv:1808.07216  [pdf]

    stat.ML cs.LG

    Model Interpretation: A Unified Derivative-based Framework for Nonparametric Regression and Supervised Machine Learning

    Authors: Xiaoyu Liu, Jie Chen, Joel Vaughan, Vijayan Nair, Agus Sudjianto

    Abstract: Interpreting a nonparametric regression model with many predictors is known to be a challenging problem. There has been renewed interest in this topic due to the extensive use of machine learning algorithms and the difficulty in understanding and explaining their input-output relationships. This paper develops a unified framework using a derivative-based approach for existing tools in the literatu…

    Submitted 8 September, 2018; v1 submitted 22 August, 2018; originally announced August 2018.

  18. arXiv:1806.01933  [pdf, other]

    stat.ML cs.LG

    Explainable Neural Networks based on Additive Index Models

    Authors: Joel Vaughan, Agus Sudjianto, Erind Brahimi, Jie Chen, Vijayan N. Nair

    Abstract: Machine Learning algorithms are increasingly being used in recent years due to their flexibility in model fitting and increased predictive performance. However, the complexity of the models makes them hard for the data analyst to interpret the results and explain them without additional tools. This has led to much research in developing various approaches to understand the model behavior. In this…

    Submitted 5 June, 2018; originally announced June 2018.

    Comments: 10 pages, 8 figures

  19. arXiv:1806.00663  [pdf]

    stat.ML cs.LG

    Locally Interpretable Models and Effects based on Supervised Partitioning (LIME-SUP)

    Authors: Linwei Hu, Jie Chen, Vijayan N. Nair, Agus Sudjianto

    Abstract: Supervised Machine Learning (SML) algorithms such as Gradient Boosting, Random Forest, and Neural Networks have become popular in recent years due to their increased predictive performance over traditional statistical methods. This is especially true with large data sets (millions or more observations and hundreds to thousands of predictors). However, the complexity of the SML models makes them op…

    Submitted 2 June, 2018; originally announced June 2018.

    Comments: 15 pages, 10 figures

  20. arXiv:1305.2815  [pdf, other]

    stat.AP

    Modelling time and vintage variability in retail credit portfolios: the decomposition approach

    Authors: Jonathan J. Forster, Agus Sudjianto

    Abstract: In this paper, we consider the problem of modelling historical data on retail credit portfolio performance, with a view to forecasting future performance, and facilitating strategic decision making. We consider a situation, common in practice, where accounts with common origination date (typically month) are aggregated into a single vintage for analysis, and the data for analysis consists of a tim…

    Submitted 13 May, 2013; originally announced May 2013.