
Showing 1–9 of 9 results for author: Ni, S

  1. arXiv:2406.04421  [pdf, other]

    cs.LG stat.ML

    Enhancing Supervised Visualization through Autoencoder and Random Forest Proximities for Out-of-Sample Extension

    Authors: Shuang Ni, Adrien Aumon, Guy Wolf, Kevin R. Moon, Jake S. Rhodes

    Abstract: The value of supervised dimensionality reduction lies in its ability to uncover meaningful connections between data features and labels. Common dimensionality reduction methods embed a set of fixed, latent points, but are not capable of generalizing to an unseen test set. In this paper, we provide an out-of-sample extension method for the random forest-based supervised dimensionality reduction met…

    Submitted 6 June, 2024; originally announced June 2024.

    Comments: 7 pages, 3 figures

  2. arXiv:2205.00388  [pdf, other]

    cs.CY cs.LG stat.AP

    Abnormal-aware Multi-person Evaluation System with Improved Fuzzy Weighting

    Authors: Shutong Ni

    Abstract: There exists a phenomenon that subjectivity highly lies in the daily evaluation process. Our research primarily concentrates on a multi-person evaluation system with anomaly detection to minimize the possible inaccuracy that subjective assessment brings. We choose the two-stage screening method, which consists of rough screening and score-weighted Kendall-$τ$ Distance to winnow out abnormal data,…

    Submitted 30 April, 2022; originally announced May 2022.

    Comments: 13 pages, 5 figures

  3. arXiv:2107.11136  [pdf, other]

    cs.LG cs.CR stat.ML

    High Dimensional Differentially Private Stochastic Optimization with Heavy-tailed Data

    Authors: Lijie Hu, Shuo Ni, Hanshen Xiao, Di Wang

    Abstract: As one of the most fundamental problems in machine learning, statistics and differential privacy, Differentially Private Stochastic Convex Optimization (DP-SCO) has been extensively studied in recent years. However, most of the previous work can only handle either regular data distribution or irregular data in the low dimensional space case. To better understand the challenges arising from irregul…

    Submitted 9 August, 2021; v1 submitted 23 July, 2021; originally announced July 2021.

  4. Developing and Improving Risk Models using Machine-learning Based Algorithms

    Authors: Yan Wang, Xuelei Sherry Ni

    Abstract: The objective of this study is to develop a good risk model for classifying business delinquency by simultaneously exploring several machine learning based methods including regularization, hyper-parameter optimization, and model ensembling algorithms. The rationale under the analyses is firstly to obtain good base binary classifiers (include Logistic Regression ($LR$), K-Nearest Neighbors ($KNN$)…

    Submitted 9 September, 2020; originally announced September 2020.

  5. arXiv:1903.05535  [pdf]

    stat.ML cs.LG

    Predicting class-imbalanced business risk using resampling, regularization, and model ensembling algorithms

    Authors: Yan Wang, Xuelei Sherry Ni

    Abstract: We aim at developing and improving the imbalanced business risk modeling via jointly using proper evaluation criteria, resampling, cross-validation, classifier regularization, and ensembling techniques. Area Under the Receiver Operating Characteristic Curve (AUC of ROC) is used for model comparison based on 10-fold cross validation. Two undersampling strategies including random undersampling (RUS)…

    Submitted 13 March, 2019; originally announced March 2019.

    Journal ref: International Journal of Managing Information Technology (IJIMIT) Vol. 11, No. 1, February 2019

  6. arXiv:1902.04954  [pdf, other]

    cs.LG q-fin.GN stat.ML

    Risk Prediction of Peer-to-Peer Lending Market by a LSTM Model with Macroeconomic Factor

    Authors: Yan Wang, Xuelei Sherry Ni

    Abstract: In the peer to peer (P2P) lending platform, investors hope to maximize their return while minimizing the risk through a comprehensive understanding of the P2P market. A low and stable average default rate across all the borrowers denotes a healthy P2P market and provides investors more confidence in a promising investment. Therefore, having a powerful model to describe the trend of the default rat…

    Submitted 9 September, 2020; v1 submitted 13 February, 2019; originally announced February 2019.

  7. arXiv:1901.08433  [pdf]

    stat.ML cs.LG

    A XGBoost risk model via feature selection and Bayesian hyper-parameter optimization

    Authors: Yan Wang, Xuelei Sherry Ni

    Abstract: This paper aims to explore models based on the extreme gradient boosting (XGBoost) approach for business risk classification. Feature selection (FS) algorithms and hyper-parameter optimizations are simultaneously considered during model training. The five most commonly used FS methods including weight by Gini, weight by Chi-square, hierarchical variable clustering, weight by correlation, and weigh…

    Submitted 24 January, 2019; originally announced January 2019.

    Comments: Accepted by International Journal of Database Management Systems (IJDMS)

  8. arXiv:1901.00251  [pdf, other]

    stat.ML cs.LG

    An Automatic Interaction Detection Hybrid Model for Bankcard Response Classification

    Authors: Yan Wang, Xuelei Sherry Ni, Brian Stone

    Abstract: In this paper, we propose a hybrid bankcard response model, which integrates decision tree based chi-square automatic interaction detection (CHAID) into logistic regression. In the first stage of the hybrid model, CHAID analysis is used to detect the possibly potential variable interactions. Then in the second stage, these potential interactions are served as the additional input variables in logi…

    Submitted 1 January, 2019; originally announced January 2019.

    Journal ref: The 2018 5th International Conference on Systems and Informatics (ICSAI2018)

  9. arXiv:1812.02546  [pdf]

    stat.ML cs.LG

    A two-stage hybrid model by using artificial neural networks as feature construction algorithms

    Authors: Yan Wang, Xuelei Sherry Ni, Brian Stone

    Abstract: We propose a two-stage hybrid approach with neural networks as the new feature construction algorithms for bankcard response classifications. The hybrid model uses a very simple neural network structure as the new feature construction tool in the first stage, then the newly created features are used as the additional input variables in logistic regression in the second stage. The model is compared…

    Submitted 6 December, 2018; originally announced December 2018.