Showing 1–20 of 20 results for author: Zhong, C

  1. arXiv:2402.10456  [pdf, other]

    stat.ML cs.LG stat.AP stat.ME

    Generative Modeling for Tabular Data via Penalized Optimal Transport Network

    Authors: Wenhui Sophia Lu, Chenyang Zhong, Wing Hung Wong

    Abstract: The task of precisely learning the probability distribution of rows within tabular data and producing authentic synthetic samples is both crucial and non-trivial. Wasserstein generative adversarial network (WGAN) marks a notable improvement in generative modeling, addressing the challenges faced by its predecessor, generative adversarial network. However, due to the mixed data types and multimodal…

    Submitted 16 February, 2024; originally announced February 2024.

    Comments: 37 pages, 23 figures

  2. arXiv:2311.03967  [pdf, other]

    cs.CV stat.ML

    CeCNN: Copula-enhanced convolutional neural networks in joint prediction of refraction error and axial length based on ultra-widefield fundus images

    Authors: Chong Zhong, Yang Li, Danjuan Yang, Meiyan Li, Xingyao Zhou, Bo Fu, Catherine C. Liu, A. H. Welsh

    Abstract: Ultra-widefield (UWF) fundus images are replacing traditional fundus images in screening, detection, prediction, and treatment of complications related to myopia because their much broader visual range is advantageous for highly myopic eyes. Spherical equivalent (SE) is extensively used as the main myopia outcome measure, and axial length (AL) has drawn increasing interest as an important ocular c…

    Submitted 1 June, 2024; v1 submitted 7 November, 2023; originally announced November 2023.

  3. arXiv:2304.06686  [pdf, other]

    cs.LG stat.ML

    OKRidge: Scalable Optimal k-Sparse Ridge Regression

    Authors: Jiachang Liu, Sam Rosen, Chudi Zhong, Cynthia Rudin

    Abstract: We consider an important problem in scientific discovery, namely identifying sparse governing equations for nonlinear dynamical systems. This involves solving sparse ridge regression problems to provable optimality in order to determine which terms drive the underlying dynamics. We propose a fast algorithm, OKRidge, for sparse ridge regression, using a novel lower bound calculation involving, firs…

    Submitted 11 January, 2024; v1 submitted 13 April, 2023; originally announced April 2023.

    Comments: NeurIPS 2023 Spotlight

  4. arXiv:2303.16047  [pdf, other]

    cs.LG cs.AI stat.ML

    Exploring and Interacting with the Set of Good Sparse Generalized Additive Models

    Authors: Chudi Zhong, Zhi Chen, Jiachang Liu, Margo Seltzer, Cynthia Rudin

    Abstract: In real applications, interaction between machine learning models and domain experts is critical; however, the classical machine learning paradigm that usually produces only a single model does not facilitate such interaction. Approximating and exploring the Rashomon set, i.e., the set of all near-optimal models, addresses this practical challenge by providing the user with a searchable space cont…

    Submitted 17 November, 2023; v1 submitted 28 March, 2023; originally announced March 2023.

    Comments: NeurIPS 2023

  5. arXiv:2212.03973  [pdf, other]

    physics.soc-ph physics.data-an stat.AP

    Inferring urban polycentricity from the variability in human mobility patterns

    Authors: Carmen Cabrera-Arnau, Chen Zhong, Michael Batty, Ricardo Silva, Soong Moon Kang

    Abstract: The polycentric city model has gained popularity in spatial planning policy, since it is believed to overcome some of the problems often present in monocentric metropolises, ranging from congestion to difficult accessibility to jobs and services. However, the concept 'polycentric city' has a fuzzy definition and as a result, the extent to which a city is polycentric cannot be easily determined. He…

    Submitted 7 December, 2022; originally announced December 2022.

    Comments: 15 pages, 5 figures

    Journal ref: Sci. Rep. 13 (2023) 5751

  6. arXiv:2209.14995  [pdf, other]

    stat.ME stat.AP

    Non-segmental Bayesian Detection of Multiple Change-points

    Authors: Chong Zhong, Zhihua Ma, Xu Zhang, Catherine C. Liu

    Abstract: We propose an original and general NOn-SEgmental (NOSE) approach for the detection of multiple change-points. NOSE identifies change-points by the non-negligibility of posterior estimates of the jump heights. Alternatively, under the Bayesian paradigm, NOSE treats the step-wise signal as a global infinite dimensional parameter drawn from a proposed process of atomic representation, where the rando…

    Submitted 16 June, 2023; v1 submitted 29 September, 2022; originally announced September 2022.

  7. arXiv:2205.14504  [pdf, other]

    stat.ME

    Bayesian prediction via nonparametric transformation models

    Authors: Chong Zhong, ** Yang, Junshan Shen, Catherine Liu, Zhaohai Li

    Abstract: This article tackles the old problem of prediction via a nonparametric transformation model (NTM) in a new Bayesian way. Estimation of NTMs is known challenging due to model unidentifiability though appealing because of its robust prediction capability in survival analysis. Inspired by the uniqueness of the posterior predictive distribution, we achieve efficient prediction via the NTM aforemention…

    Submitted 7 February, 2023; v1 submitted 28 May, 2022; originally announced May 2022.

    Comments: The corresponding R package BuLTM is available on GitHub https://github.com/LazyLaker

  8. arXiv:2205.12004  [pdf, other]

    quant-ph cs.AI cs.LG stat.ML

    Quantum Kerr Learning

    Authors: Junyu Liu, Changchun Zhong, Matthew Otten, Anirban Chandra, Cristian L. Cortes, Chaoyang Ti, Stephen K Gray, Xu Han

    Abstract: Quantum machine learning is a rapidly evolving field of research that could facilitate important applications for quantum computing and also significantly impact data-driven sciences. In our work, based on various arguments from complexity theory and physics, we demonstrate that a single Kerr mode can provide some "quantum enhancements" when dealing with kernel-based methods. Using kernel properti…

    Submitted 30 November, 2022; v1 submitted 20 May, 2022; originally announced May 2022.

    Comments: 20 pages, many figures. v2: significant updates, author added

    Journal ref: Mach. Learn.: Sci. Technol. 4 025003, 2023

  9. arXiv:2202.11389  [pdf, other]

    cs.LG stat.ML

    Fast Sparse Classification for Generalized Linear and Additive Models

    Authors: Jiachang Liu, Chudi Zhong, Margo Seltzer, Cynthia Rudin

    Abstract: We present fast classification techniques for sparse generalized linear and additive models. These techniques can handle thousands of features and thousands of observations in minutes, even in the presence of many highly correlated features. For fast sparse logistic regression, our computational speed-up over other best-subset search techniques owes to linear and quadratic surrogate cuts for the l…

    Submitted 29 October, 2022; v1 submitted 23 February, 2022; originally announced February 2022.

    Comments: AISTATS 2022

  10. arXiv:2112.13456  [pdf, other]

    math.PR math.CO stat.CO

    Mallows permutation models with $L^1$ and $L^2$ distances I: hit and run algorithms and mixing times

    Authors: Chenyang Zhong

    Abstract: Mallows permutation model, introduced by Mallows in statistical ranking theory, is a class of non-uniform probability measures on the symmetric group $S_n$. The model depends on a distance metric $d(σ,τ)$ on $S_n$, which can be chosen from a host of metrics on permutations. In this paper, we focus on Mallows permutation models with $L^1$ and $L^2$ distances, respectively known in the statistics li…

    Submitted 26 December, 2021; originally announced December 2021.

    Comments: 54 pages, 11 figures

  11. arXiv:2112.04912  [pdf, ps, other]

    cs.LG eess.SP stat.ML

    Scalable and Decentralized Algorithms for Anomaly Detection via Learning-Based Controlled Sensing

    Authors: Geethu Joseph, Chen Zhong, M. Cenk Gursoy, Senem Velipasalar, Pramod K. Varshney

    Abstract: We address the problem of sequentially selecting and observing processes from a given set to find the anomalies among them. The decision-maker observes a subset of the processes at any given time instant and obtains a noisy binary indicator of whether or not the corresponding process is anomalous. In this setting, we develop an anomaly detection algorithm that chooses the processes to be observed…

    Submitted 8 December, 2021; originally announced December 2021.

    Comments: 13 pages, 4 figures. arXiv admin note: substantial text overlap with arXiv:2105.06289

  12. arXiv:2109.03713  [pdf, other]

    stat.ME stat.CO

    Dependent Dirichlet Processes for Analysis of a Generalized Shared Frailty Model

    Authors: Chong Zhong, Zhihua Ma, Junshan Shen, Catherine Liu

    Abstract: Bayesian paradigm takes advantage of well fitting complicated survival models and feasible computing in survival analysis owing to the superiority in tackling the complex censoring scheme, compared with the frequentist paradigm. In this chapter, we aim to display the latest tendency in Bayesian computing, in the sense of automating the posterior sampling, through Bayesian analysis of survival mode…

    Submitted 9 September, 2021; v1 submitted 8 September, 2021; originally announced September 2021.

  13. arXiv:2109.02849  [pdf, other]

    stat.CO math.ST stat.ME

    Convergence rate of a collapsed Gibbs sampler for crossed random effects models

    Authors: Swarnadip Ghosh, Chenyang Zhong

    Abstract: In this paper, we analyze the convergence rate of a collapsed Gibbs sampler for crossed random effects models. Our results apply to a substantially larger range of models than previous works, including models that incorporate missingness mechanism and unbalanced level data. The theoretical tools involved in our analysis include a connection between relaxation time and autoregression matrix, concen…

    Submitted 21 October, 2021; v1 submitted 7 September, 2021; originally announced September 2021.

    Comments: 42 pages, 12 figures

  14. arXiv:2108.06568  [pdf]

    stat.ME

    A Bayesian group sequential schema for ordinal endpoints

    Authors: Chengxue Zhong, Haitao Pan, Hongyu Miao

    Abstract: The ordinal endpoint is prevalent in clinical studies. For example, for the COVID-19, the most common endpoint used was 7-point ordinal scales. Another example is in phase II cancer studies, efficacy is often assessed as an ordinal variable based on a level of response of solid tumors with four categories: complete response, partial response, stable disease, and progression, though often a dichoto…

    Submitted 9 June, 2022; v1 submitted 14 August, 2021; originally announced August 2021.

    Comments: 29 pages, 4 figures

  15. arXiv:2107.13480  [pdf, other]

    stat.ME

    Survival stacking: casting survival analysis as a classification problem

    Authors: Erin Craig, Chenyang Zhong, Robert Tibshirani

    Abstract: While there are many well-developed data science methods for classification and regression, there are relatively few methods for working with right-censored data. Here, we present "survival stacking": a method for casting survival analysis problems as classification problems, thereby allowing the use of general classification methods and software in a survival setting. Inspired by the Cox partial…

    Submitted 28 July, 2021; originally announced July 2021.

  16. arXiv:2103.11251  [pdf, other]

    cs.LG stat.ML

    Interpretable Machine Learning: Fundamental Principles and 10 Grand Challenges

    Authors: Cynthia Rudin, Chaofan Chen, Zhi Chen, Haiyang Huang, Lesia Semenova, Chudi Zhong

    Abstract: Interpretability in machine learning (ML) is crucial for high stakes decisions and troubleshooting. In this work, we provide fundamental principles for interpretable ML, and dispel common misunderstandings that dilute the importance of this crucial topic. We also identify 10 technical challenge areas in interpretable machine learning and provide history and background on each problem. Some of thes…

    Submitted 9 July, 2021; v1 submitted 20 March, 2021; originally announced March 2021.

    MSC Class: 68T01; ACM Class: I.2.6

    Journal ref: Statistics Surveys, 2021

  17. arXiv:2009.13598  [pdf, ps, other]

    cs.LG cs.IT stat.ML

    Anomaly Detection and Sampling Cost Control via Hierarchical GANs

    Authors: Chen Zhong, M. Cenk Gursoy, Senem Velipasalar

    Abstract: Anomaly detection incurs certain sampling and sensing costs and therefore it is of great importance to strike a balance between the detection accuracy and these costs. In this work, we study anomaly detection by considering the detection of threshold crossings in a stochastic time series without the knowledge of its statistics. To reduce the sampling cost in this detection process, we propose the…

    Submitted 28 September, 2020; originally announced September 2020.

    Comments: 6 pages, 7 figures, has been accepted by Globecom 2020

  18. arXiv:2006.08690  [pdf, other]

    cs.LG stat.ML

    Generalized and Scalable Optimal Sparse Decision Trees

    Authors: Jimmy Lin, Chudi Zhong, Diane Hu, Cynthia Rudin, Margo Seltzer

    Abstract: Decision tree optimization is notoriously difficult from a computational perspective but essential for the field of interpretable machine learning. Despite efforts over the past 40 years, only recently have optimization breakthroughs been made that have allowed practical algorithms to find optimal decision trees. These new techniques have the potential to trigger a paradigm shift where it is possi…

    Submitted 22 November, 2022; v1 submitted 15 June, 2020; originally announced June 2020.

    Comments: This paper was published in ICML 2020

    ACM Class: I.2.6

  19. arXiv:1909.11171  [pdf, other]

    stat.ME

    Survival analysis as a classification problem

    Authors: Chenyang Zhong, Robert Tibshirani

    Abstract: In this paper, we explore a method for treating survival analysis as a classification problem. The method uses a "stacking" idea that collects the features and outcomes of the survival data in a large data frame, and then treats it as a classification problem. In this framework, various statistical learning algorithms (including logistic regression, random forests, gradient boosting machines and n…

    Submitted 26 September, 2019; v1 submitted 24 September, 2019; originally announced September 2019.

    Comments: 15 pages, 8 figures, 3 tables

    MSC Class: 62N86

  20. arXiv:1908.08401  [pdf, ps, other]

    cs.LG cs.IT stat.ML

    A Deep Actor-Critic Reinforcement Learning Framework for Dynamic Multichannel Access

    Authors: Chen Zhong, Ziyang Lu, M. Cenk Gursoy, Senem Velipasalar

    Abstract: To make efficient use of limited spectral resources, we in this work propose a deep actor-critic reinforcement learning based framework for dynamic multichannel access. We consider both a single-user case and a scenario in which multiple users attempt to access channels simultaneously. We employ the proposed framework as a single agent in the single-user case, and extend it to a decentralized mult…

    Submitted 20 August, 2019; originally announced August 2019.

    Comments: 14 figures. arXiv admin note: text overlap with arXiv:1810.03695