Showing 1–50 of 137 results for author: Mao, C

Searching in archive stat.
  1. arXiv:2406.13989  [pdf, other]

    stat.ML cs.IT cs.LG math.ST

    Random pairing MLE for estimation of item parameters in Rasch model

    Authors: Yuepeng Yang, Cong Ma

    Abstract: The Rasch model, a classical model in the item response theory, is widely used in psychometrics to model the relationship between individuals' latent traits and their binary responses on assessments or questionnaires. In this paper, we introduce a new likelihood-based estimator -- random pairing maximum likelihood estimator ($\mathsf{RP\text{-}MLE}$) and its bootstrapped variant multiple random pa…

    Submitted 20 June, 2024; originally announced June 2024.

  2. arXiv:2406.01378  [pdf, ps, other]

    cs.LG stat.ML

    A Theory of Learnability for Offline Decision Making

    Authors: Chenjie Mao, Qiaosheng Zhang

    Abstract: We study the problem of offline decision making, which focuses on learning decisions from datasets only partially correlated with the learning objective. While previous research has extensively studied specific offline decision making problems like offline reinforcement learning (RL) and off-policy evaluation (OPE), a unified framework and theory remain absent. To address this gap, we introduce a…

    Submitted 3 June, 2024; originally announced June 2024.

  3. arXiv:2404.06969  [pdf, other]

    cs.LG stat.ML

    FiP: a Fixed-Point Approach for Causal Generative Modeling

    Authors: Meyer Scetbon, Joel Jennings, Agrin Hilmkil, Cheng Zhang, Chao Ma

    Abstract: Modeling true world data-generating processes lies at the heart of empirical science. Structural Causal Models (SCMs) and their associated Directed Acyclic Graphs (DAGs) provide an increasingly popular answer to such problems by defining the causal generative process that transforms random noise into observations. However, learning them from observational data poses an ill-posed and NP-hard invers…

    Submitted 14 April, 2024; v1 submitted 10 April, 2024; originally announced April 2024.

  4. arXiv:2402.17732  [pdf, other]

    math.ST cs.LG stat.ML

    Batched Nonparametric Contextual Bandits

    Authors: Rong Jiang, Cong Ma

    Abstract: We study nonparametric contextual bandits under batch constraints, where the expected reward for each action is modeled as a smooth function of covariates, and the policy updates are made at the end of each batch of observations. We establish a minimax regret lower bound for this setting and propose a novel batch learning algorithm that achieves the optimal regret (up to logarithmic factors). In e…

    Submitted 10 June, 2024; v1 submitted 27 February, 2024; originally announced February 2024.

    Comments: Add lower bound when grid is adaptively chosen; add results on adaptivity to margin parameter

  5. arXiv:2402.07445  [pdf, other]

    stat.ML cs.IT cs.LG math.ST

    Top-$K$ ranking with a monotone adversary

    Authors: Yuepeng Yang, Antares Chen, Lorenzo Orecchia, Cong Ma

    Abstract: In this paper, we address the top-$K$ ranking problem with a monotone adversary. We consider the scenario where a comparison graph is randomly generated and the adversary is allowed to add arbitrary edges. The statistician's goal is then to accurately identify the top-$K$ preferred items based on pairwise comparisons derived from this semi-random comparison graph. The main contribution of this pap…

    Submitted 20 June, 2024; v1 submitted 12 February, 2024; originally announced February 2024.

    Comments: Accepted to Conference of Learning Theory, 2024

  6. arXiv:2402.00382  [pdf, other]

    math.ST stat.ML

    On the design-dependent suboptimality of the Lasso

    Authors: Reese Pathak, Cong Ma

    Abstract: This paper investigates the effect of the design matrix on the ability (or inability) to estimate a sparse parameter in linear regression. More specifically, we characterize the optimal rate of estimation when the smallest singular value of the design matrix is bounded away from zero. In addition to this information-theoretic result, we provide and analyze a procedure which is simultaneously stati…

    Submitted 1 February, 2024; originally announced February 2024.

    Comments: 19 pages, 1 figure

  7. arXiv:2402.00305  [pdf, ps, other]

    math.ST cs.IT cs.SI stat.ML

    Information-Theoretic Thresholds for Planted Dense Cycles

    Authors: Cheng Mao, Alexander S. Wein, Shenduo Zhang

    Abstract: We study a random graph model for small-world networks which are ubiquitous in social and biological sciences. In this model, a dense cycle of expected bandwidth $n τ$, representing the hidden one-dimensional geometry of vertices, is planted in an ambient random graph on $n$ vertices. For both detection and recovery of the planted dense cycle, we characterize the information-theoretic thresholds i…

    Submitted 31 January, 2024; originally announced February 2024.

    Comments: 31 pages, 1 figure

    MSC Class: 94A15; 62B10; 68Q87; 05C80; 05C60

  8. arXiv:2401.15913  [pdf, other]

    eess.IV cs.CV cs.LG physics.flu-dyn stat.AP

    Vision-Informed Flow Image Super-Resolution with Quaternion Spatial Modeling and Dynamic Flow Convolution

    Authors: Qinglong Cao, Zhengqin Xu, Chao Ma, Xiaokang Yang, Yuntian Chen

    Abstract: Flow image super-resolution (FISR) aims at recovering high-resolution turbulent velocity fields from low-resolution flow images. Existing FISR methods mainly process the flow images in natural image patterns, while the critical and distinct flow visual properties are rarely considered. This negligence would cause the significant domain gap between flow and natural images to severely hamper the acc…

    Submitted 29 January, 2024; originally announced January 2024.

  9. arXiv:2401.11600  [pdf, other]

    cs.LG stat.ML

    Understanding the Generalization Benefits of Late Learning Rate Decay

    Authors: Yinuo Ren, Chao Ma, Lexing Ying

    Abstract: Why do neural networks trained with large learning rates for a longer time often lead to better generalization? In this paper, we delve into this question by examining the relation between training and testing loss in neural networks. Through visualization of these losses, we note that the training trajectory with a large learning rate navigates through the minima manifold of the training loss, fi…

    Submitted 21 January, 2024; originally announced January 2024.

    Comments: Accepted by AISTATS 2024

  10. arXiv:2401.06557  [pdf, other]

    cs.LG cs.AI cs.SI stat.ME

    Treatment-Aware Hyperbolic Representation Learning for Causal Effect Estimation with Social Networks

    Authors: Ziqiang Cui, Xing Tang, Yang Qiao, Bowei He, Liang Chen, Xiuqiang He, Chen Ma

    Abstract: Estimating the individual treatment effect (ITE) from observational data is a crucial research topic that holds significant value across multiple domains. How to identify hidden confounders poses a key challenge in ITE estimation. Recent studies have incorporated the structural information of social networks to tackle this challenge, achieving notable advancements. However, these methods utilize g…

    Submitted 12 January, 2024; originally announced January 2024.

    Comments: Accepted by SIAM SDM'24

  11. arXiv:2312.08878  [pdf, other]

    cs.CV cs.LG stat.AP

    Domain Prompt Learning with Quaternion Networks

    Authors: Qinglong Cao, Zhengqin Xu, Yuntian Chen, Chao Ma, Xiaokang Yang

    Abstract: Prompt learning has emerged as an effective and data-efficient technique in large Vision-Language Models (VLMs). However, when adapting VLMs to specialized domains such as remote sensing and medical imaging, domain prompt learning remains underexplored. While large-scale domain-specific foundation models can help tackle this challenge, their concentration on a single vision level makes it challeng…

    Submitted 12 December, 2023; originally announced December 2023.

  12. arXiv:2311.15961  [pdf, ps, other]

    stat.ML cs.LG math.ST

    Maximum Likelihood Estimation is All You Need for Well-Specified Covariate Shift

    Authors: Jiawei Ge, Shange Tang, Jianqing Fan, Cong Ma, Chi Jin

    Abstract: A key challenge of modern machine learning systems is to achieve Out-of-Distribution (OOD) generalization -- generalizing to target data whose distribution differs from that of source data. Despite its significant importance, the fundamental question of ``what are the most effective algorithms for OOD generalization'' remains open even under the standard setting of covariate shift. This paper addr…

    Submitted 27 November, 2023; originally announced November 2023.

  13. arXiv:2311.11163  [pdf, other]

    cs.SI stat.AP stat.CO

    Hate speech and hate crimes: a data-driven study of evolving discourse around marginalized groups

    Authors: Malvina Bozhidarova, Jonathn Chang, Aaishah Ale-rasool, Yuxiang Liu, Chongyao Ma, Andrea L. Bertozzi, P. Jeffrey Brantingham, Junyuan Lin, Sanjukta Krishnagopal

    Abstract: This study explores the dynamic relationship between online discourse, as observed in tweets, and physical hate crimes, focusing on marginalized groups. Leveraging natural language processing techniques, including keyword extraction and topic modeling, we analyze the evolution of online discourse after events affecting these groups. Examining sentiment and polarizing tweets, we establish correlati…

    Submitted 18 November, 2023; originally announced November 2023.

  14. arXiv:2311.01902  [pdf, other]

    cs.LG stat.ME

    High Precision Causal Model Evaluation with Conditional Randomization

    Authors: Chao Ma, Cheng Zhang

    Abstract: The gold standard for causal model evaluation involves comparing model predictions with true effects estimated from randomized controlled trials (RCT). However, RCTs are not always feasible or ethical to perform. In contrast, conditionally randomized experiments based on inverse probability weighting (IPW) offer a more realistic approach but may suffer from high estimation variance. To tackle this…

    Submitted 3 November, 2023; originally announced November 2023.

    Comments: NeurIPS 2023 Camera Ready version

  15. arXiv:2310.12010  [pdf, other]

    stat.ME

    A Note on Improving Variational Estimation for Multidimensional Item Response Theory

    Authors: Chenchen Ma, Jing Ouyang, Chun Wang, Gongjun Xu

    Abstract: Survey instruments and assessments are frequently used in many domains of social science. When the constructs that these assessments try to measure become multifaceted, multidimensional item response theory (MIRT) provides a unified framework and convenient statistical tool for item analysis, calibration, and scoring. However, the computational challenge of estimating MIRT models prohibits its wid…

    Submitted 18 October, 2023; originally announced October 2023.

  16. arXiv:2310.07958  [pdf, other]

    cs.SE cs.CR cs.LG stat.ME

    Towards Causal Deep Learning for Vulnerability Detection

    Authors: Md Mahbubur Rahman, Ira Ceka, Chengzhi Mao, Saikat Chakraborty, Baishakhi Ray, Wei Le

    Abstract: Deep learning vulnerability detection has shown promising results in recent years. However, an important challenge that still blocks it from being very useful in practice is that the model is not robust under perturbation and it cannot generalize well over the out-of-distribution (OOD) data, e.g., applying a trained model to unseen projects in real world. We hypothesize that this is because the mo…

    Submitted 14 January, 2024; v1 submitted 11 October, 2023; originally announced October 2023.

    Comments: ICSE 2024, Camera Ready Version

  17. arXiv:2310.06159  [pdf, other]

    cs.LG math.OC stat.ML

    Provably Accelerating Ill-Conditioned Low-rank Estimation via Scaled Gradient Descent, Even with Overparameterization

    Authors: Cong Ma, Xingyu Xu, Tian Tong, Yuejie Chi

    Abstract: Many problems encountered in science and engineering can be formulated as estimating a low-rank object (e.g., matrices and tensors) from incomplete, and possibly corrupted, linear measurements. Through the lens of matrix and tensor factorization, one of the most popular approaches is to employ simple iterative algorithms such as gradient descent (GD) to recover the low-rank factors directly, which…

    Submitted 9 October, 2023; originally announced October 2023.

    Comments: Book chapter for "Explorations in the Mathematics of Data Science - The Inaugural Volume of the Center for Approximation and Mathematical Data Analytics". arXiv admin note: text overlap with arXiv:2104.14526

  18. arXiv:2310.00809  [pdf, other]

    cs.LG cs.AI stat.ME stat.ML

    Towards Causal Foundation Model: on Duality between Causal Inference and Attention

    Authors: Jiaqi Zhang, Joel Jennings, Agrin Hilmkil, Nick Pawlowski, Cheng Zhang, Chao Ma

    Abstract: Foundation models have brought changes to the landscape of machine learning, demonstrating sparks of human-level intelligence across a diverse array of tasks. However, a gap persists in complex tasks such as causal inference, primarily due to challenges associated with intricate reasoning steps and high numerical precision requirements. In this work, we take a first step towards building causally-…

    Submitted 3 June, 2024; v1 submitted 1 October, 2023; originally announced October 2023.

  19. arXiv:2307.13634  [pdf, other]

    stat.AP

    Exact Methods of Homogeneity Test of Proportions for Bilateral and Unilateral Correlated Data

    Authors: Shuyi Liang, Chang-Xing Ma

    Abstract: Subjects in clinical studies that investigate paired body parts can carry a disease on either both sides (bilateral) or a single side (unilateral) of the organs. Data in such studies may consist of both bilateral and unilateral records. However, the correlation between the paired organs is often ignored, which may lead to biased interpretations. Recent literatures have taken the correlation into a…

    Submitted 25 July, 2023; originally announced July 2023.

  20. arXiv:2306.06405  [pdf, other]

    stat.CO

    Effects of 3D Position Fluctuations on Air-to-Ground mmWave UAV Communications

    Authors: Cunyan Ma, Xiaoya Li, Chen He, Jinye Peng, Z. Jane Wang

    Abstract: Millimeter wave (mmWave)-based unmanned aerial vehicle (UAV) communication is a promising candidate for future communications due to its flexibility and sufficient bandwidth. However, random fluctuations in the position of hovering UAVs will lead to random variations in the blockage and signal-to-noise ratio (SNR) of the UAV-user link, thus affecting the quality of service (QoS) of the system. To…

    Submitted 10 June, 2023; originally announced June 2023.

  21. arXiv:2306.03335  [pdf, other]

    stat.ML cs.LG math.ST

    Unraveling Projection Heads in Contrastive Learning: Insights from Expansion and Shrinkage

    Authors: Yu Gui, Cong Ma, Yiqiao Zhong

    Abstract: We investigate the role of projection heads, also known as projectors, within the encoder-projector framework (e.g., SimCLR) used in contrastive learning. We aim to demystify the observed phenomenon where representations learned before projectors outperform those learned after -- measured using the downstream linear classification accuracy, even when the projectors themselves are linear. In this…

    Submitted 5 June, 2023; originally announced June 2023.

  22. arXiv:2305.19001  [pdf, other]

    stat.ML cs.IT cs.LG math.OC math.ST

    High-probability sample complexities for policy evaluation with linear function approximation

    Authors: Gen Li, Weichen Wu, Yuejie Chi, Cong Ma, Alessandro Rinaldo, Yuting Wei

    Abstract: This paper is concerned with the problem of policy evaluation with linear function approximation in discounted infinite horizon Markov decision processes. We investigate the sample complexities required to guarantee a predefined estimation error of the best linear coefficients for two widely-used policy evaluation algorithms: the temporal difference (TD) learning algorithm and the two-timescale li…

    Submitted 2 May, 2024; v1 submitted 30 May, 2023; originally announced May 2023.

    Comments: The first two authors contributed equally; paper accepted to IEEE Transactions on Information Theory

  23. arXiv:2305.12679  [pdf, other]

    cs.LG stat.ML

    Offline Reinforcement Learning with Additional Covering Distributions

    Authors: Chenjie Mao

    Abstract: We study learning optimal policies from a logged dataset, i.e., offline RL, with function approximation. Despite the efforts devoted, existing algorithms with theoretic finite-sample guarantees typically assume exploratory data coverage or strong realizable function classes, which is hard to be satisfied in reality. While there are recent works that successfully tackle these strong assumptions, th…

    Submitted 24 May, 2023; v1 submitted 21 May, 2023; originally announced May 2023.

  24. arXiv:2305.10637  [pdf, other]

    stat.ME

    Conformalized matrix completion

    Authors: Yu Gui, Rina Foygel Barber, Cong Ma

    Abstract: Matrix completion aims to estimate missing entries in a data matrix, using the assumption of a low-complexity structure (e.g., low rank) so that imputation is possible. While many effective estimation algorithms exist in the literature, uncertainty quantification for this problem has proved to be challenging, and existing methods are extremely sensitive to model misspecification. In this work, we…

    Submitted 22 October, 2023; v1 submitted 17 May, 2023; originally announced May 2023.

    Comments: accepted to 37th Conference on Neural Information Processing Systems (NeurIPS 2023)

  25. arXiv:2304.08135  [pdf, ps, other]

    cs.DS cs.CC math.ST stat.ML

    Detection of Dense Subhypergraphs by Low-Degree Polynomials

    Authors: Abhishek Dhawan, Cheng Mao, Alexander S. Wein

    Abstract: Detection of a planted dense subgraph in a random graph is a fundamental statistical and computational problem that has been extensively studied in recent years. We study a hypergraph version of the problem. Let $G^r(n,p)$ denote the $r$-uniform Erdős-Rényi hypergraph model with $n$ vertices and edge density $p$. We consider detecting the presence of a planted $G^r(n^γ, n^{-α})$ subhypergraph in a…

    Submitted 17 April, 2023; originally announced April 2023.

    Comments: 31 pages

  26. Homogeneity Tests and Interval Estimations of Risk Differences for Stratified Bilateral and Unilateral Correlated Data

    Authors: Shuyi Liang, Kai-Tai Fang, Xin-Wei Huang, Yijing Xin, Chang-Xing Ma

    Abstract: In clinical trials studying paired parts of a subject with binary outcomes, it is expected to collect measurements bilaterally. However, there are cases where subjects contribute measurements for only one part. By utilizing combined data, it is possible to gain additional information compared to using bilateral or unilateral data alone. With the combined data, this article investigates homogeneity…

    Submitted 20 November, 2023; v1 submitted 31 March, 2023; originally announced April 2023.

    Comments: 59 pages

  27. arXiv:2303.13557  [pdf, other]

    stat.ME

    Confidence Intervals for Ratios of Proportions in Stratified Bilateral Correlated Data

    Authors: Wanqing Tian, Chang-Xing Ma

    Abstract: Confidence interval (CI) methods for stratified bilateral studies use intraclass correlation to avoid misleading results. In this article, we propose four CI methods (sample-size weighted global MLE-based Wald-type CI, complete MLE-based Wald-type CI, profile likelihood CI, and complete MLE-based score CI) to investigate CIs of proportion ratios to clinical trial design with stratified bilateral d…

    Submitted 22 March, 2023; originally announced March 2023.

    Comments: arXiv admin note: text overlap with arXiv:2303.12943

  28. arXiv:2303.12943  [pdf, other]

    stat.ME

    Testing Homogeneity of Proportion Ratios for Stratified Bilateral Correlated Data

    Authors: Wanqing Tian, Chang-Xing Ma

    Abstract: Intraclass correlation in bilateral data has been investigated in recent decades with various statistical methods. In practice, stratifying bilateral data by some control variables will provide more sophisticated statistical results to satisfy different research proposed in random clinical trials. In this article, we propose three test statistics (likelihood ratio test, score test, and Wald-type t…

    Submitted 22 March, 2023; originally announced March 2023.

  29. arXiv:2303.12703  [pdf, other]

    cs.LG stat.ME

    Causal Reasoning in the Presence of Latent Confounders via Neural ADMG Learning

    Authors: Matthew Ashman, Chao Ma, Agrin Hilmkil, Joel Jennings, Cheng Zhang

    Abstract: Latent confounding has been a long-standing obstacle for causal reasoning from observational data. One popular approach is to model the data using acyclic directed mixed graphs (ADMGs), which describe ancestral relations between variables using directed and bidirected edges. However, existing methods using ADMGs are based on either linear functional assumptions or a discrete search that is complic…

    Submitted 22 March, 2023; originally announced March 2023.

    Comments: Camera ready version for ICLR 2023

  30. arXiv:2302.14483  [pdf, other]

    cs.LG cs.CV stat.ML

    RoPAWS: Robust Semi-supervised Representation Learning from Uncurated Data

    Authors: Sangwoo Mo, Jong-Chyi Su, Chih-Yao Ma, Mido Assran, Ishan Misra, Licheng Yu, Sean Bell

    Abstract: Semi-supervised learning aims to train a model using limited labels. State-of-the-art semi-supervised methods for image classification such as PAWS rely on self-supervised representations learned with large-scale unlabeled but curated data. However, PAWS is often less effective when using real-world unlabeled data that is uncurated, e.g., contains out-of-class data. We propose RoPAWS, a robust ext…

    Submitted 28 February, 2023; originally announced February 2023.

    Comments: ICLR 2023

  31. arXiv:2302.10066  [pdf, other]

    math.ST cs.LG stat.ML

    Sharp analysis of EM for learning mixtures of pairwise differences

    Authors: Abhishek Dhawan, Cheng Mao, Ashwin Pananjady

    Abstract: We consider a symmetric mixture of linear regressions with random samples from the pairwise comparison design, which can be seen as a noisy version of a type of Euclidean distance geometry problem. We analyze the expectation-maximization (EM) algorithm locally around the ground truth and establish that the sequence converges linearly, providing an $\ell_\infty$-norm guarantee on the estimation err…

    Submitted 22 June, 2023; v1 submitted 20 February, 2023; originally announced February 2023.

    Comments: 45 pages, 2 figures

  32. arXiv:2302.06737  [pdf, ps, other]

    math.ST cs.DS stat.ML

    Detection-Recovery Gap for Planted Dense Cycles

    Authors: Cheng Mao, Alexander S. Wein, Shenduo Zhang

    Abstract: Planted dense cycles are a type of latent structure that appears in many applications, such as small-world networks in social sciences and sequence assembly in computational biology. We consider a model where a dense cycle with expected bandwidth $n τ$ and edge density $p$ is planted in an Erdős-Rényi graph $G(n,q)$. We characterize the computational thresholds for the associated detection and rec…

    Submitted 20 June, 2023; v1 submitted 13 February, 2023; originally announced February 2023.

    Comments: 41 pages, 1 figure

  33. arXiv:2302.01186  [pdf, other]

    cs.LG eess.SP math.OC stat.ML

    The Power of Preconditioning in Overparameterized Low-Rank Matrix Sensing

    Authors: Xingyu Xu, Yandi Shen, Yuejie Chi, Cong Ma

    Abstract: We propose $\textsf{ScaledGD($λ$)}$, a preconditioned gradient descent method to tackle the low-rank matrix sensing problem when the true rank is unknown, and when the matrix is possibly ill-conditioned. Using overparametrized factor representations, $\textsf{ScaledGD($λ$)}$ starts from a small random initialization, and proceeds by gradient descent with a specific form of damped preconditioning t…

    Submitted 6 November, 2023; v1 submitted 2 February, 2023; originally announced February 2023.

    Comments: New analysis in the noisy and the approximately low-rank settings

  34. arXiv:2210.03820  [pdf, other]

    cs.LG stat.ML

    The Asymmetric Maximum Margin Bias of Quasi-Homogeneous Neural Networks

    Authors: Daniel Kunin, Atsushi Yamamura, Chao Ma, Surya Ganguli

    Abstract: In this work, we explore the maximum-margin bias of quasi-homogeneous neural networks trained with gradient flow on an exponential loss and past a point of separability. We introduce the class of quasi-homogeneous models, which is expressive enough to describe nearly all neural networks with homogeneous activations, even those with biases, residual connections, and normalization layers, while stru…

    Submitted 16 February, 2023; v1 submitted 7 October, 2022; originally announced October 2022.

    Comments: 41 pages, 5 figures, ICLR 2023

  35. arXiv:2210.03612  [pdf, ps, other]

    stat.ML cs.AI cs.CR cs.CV cs.LG

    1st ICLR International Workshop on Privacy, Accountability, Interpretability, Robustness, Reasoning on Structured Data (PAIR^2Struct)

    Authors: Hao Wang, Wanyu Lin, Hao He, Di Wang, Chengzhi Mao, Muhan Zhang

    Abstract: Recent years have seen advances on principles and guidance relating to accountable and ethical use of artificial intelligence (AI) spring up around the globe. Specifically, Data Privacy, Accountability, Interpretability, Robustness, and Reasoning have been broadly recognized as fundamental principles of using machine learning (ML) technologies on decision-critical and/or privacy-sensitive applicat…

    Submitted 7 October, 2022; originally announced October 2022.

  36. arXiv:2210.02624  [pdf]

    stat.AP

    Revealing Spatial-temporal Taxi Demand Patterns after Vaccination in COVID-19 Pandemic

    Authors: Zihao Li, Cheng Zhang, Xiaoqiang Kong, Yunlong Zhang, Chaolun Ma

    Abstract: The COVID-19 pandemic has had an unprecedented impact on our daily lives. With the increase in vaccination rate, normalcy gradually returns, so is the taxi demand. However, the changes in the spatial-temporal taxi demand pattern and factors impacting the recovery of this demand after COVID-19 vaccination started remain unclear. With the multisource time-series data from Chicago, including pandemic…

    Submitted 5 October, 2022; originally announced October 2022.

  37. arXiv:2209.12313  [pdf, other]

    cs.DS math.ST stat.ML

    Random graph matching at Otter's threshold via counting chandeliers

    Authors: Cheng Mao, Yihong Wu, Jiaming Xu, Sophie H. Yu

    Abstract: We propose an efficient algorithm for graph matching based on similarity scores constructed from counting a certain family of weighted trees rooted at each vertex. For two Erdős-Rényi graphs $\mathcal{G}(n,q)$ whose edges are correlated through a latent vertex correspondence, we show that this algorithm correctly matches all but a vanishing fraction of the vertices with high probability, provided…

    Submitted 13 February, 2023; v1 submitted 25 September, 2022; originally announced September 2022.

  38. arXiv:2208.12972  [pdf, other]

    stat.ME

    Generally-Altered, -Inflated, -Truncated and -Deflated Regression, With Application to Heaped and Seeped Data

    Authors: Thomas W. Yee, Chenchen Ma

    Abstract: Models such as the zero-inflated and zero-altered Poisson and zero-truncated binomial are well-established in modern regression analysis. We propose a super model that jointly and maximally unifies alteration, inflation, truncation and deflation for counts, given a 1- or 2-parameter parent (base) distribution. Seven disjoint sets of special value types are accommodated because all but truncation h…

    Submitted 27 August, 2022; originally announced August 2022.

    Comments: 25 pages, 9 figures, 2 tables

    MSC Class: 62E10; ACM Class: G.3

  39. arXiv:2208.07996  [pdf, other]

    stat.ME math.ST

    Correcting Convexity Bias in Function and Functional Estimate

    Authors: Chao Ma, Lexing Ying

    Abstract: A general framework with a series of different methods is proposed to improve the estimate of convex function (or functional) values when only noisy observations of the true input are available. Technically, our methods catch the bias introduced by the convexity and remove this bias from a baseline estimate. Theoretical analysis are conducted to show that the proposed methods can strictly reduce t…

    Submitted 14 September, 2022; v1 submitted 16 August, 2022; originally announced August 2022.

    MSC Class: 62G05; 65K99

  40. arXiv:2207.06559  [pdf, other]

    cs.LG cs.AI cs.MA math.OC stat.ML

    Scalable Model-based Policy Optimization for Decentralized Networked Systems

    Authors: Yali Du, Chengdong Ma, Yuchen Liu, Runji Lin, Hao Dong, Jun Wang, Yaodong Yang

    Abstract: Reinforcement learning algorithms require a large amount of samples; this often limits their real-world applications on even simple tasks. Such a challenge is more outstanding in multi-agent tasks, as each step of operation is more costly requiring communications or shifting or resources. This work aims to improve data efficiency of multi-agent control by model-based learning. We consider networke…

    Submitted 1 September, 2022; v1 submitted 13 July, 2022; originally announced July 2022.

    Comments: 8 pages, 7 figures, accepted by The 2022 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2022)

  41. arXiv:2206.09109  [pdf, other]

    stat.ML cs.LG eess.SP math.OC

    Fast and Provable Tensor Robust Principal Component Analysis via Scaled Gradient Descent

    Authors: Harry Dong, Tian Tong, Cong Ma, Yuejie Chi

    Abstract: An increasing number of data science and machine learning problems rely on computation with tensors, which better capture the multi-way relationships and interactions of data than matrices. When tapping into this critical advantage, a key challenge is to develop computationally efficient and provably correct algorithms for extracting useful information from tensor data that are simultaneously robu…

    Submitted 22 February, 2023; v1 submitted 18 June, 2022; originally announced June 2022.

  42. arXiv:2206.03299  [pdf, other]

    cs.LG stat.ML

    Generalization Error Bounds for Deep Neural Networks Trained by SGD

    Authors: Mingze Wang, Chao Ma

    Abstract: Generalization error bounds for deep neural networks trained by stochastic gradient descent (SGD) are derived by combining a dynamical control of an appropriate parameter norm and the Rademacher complexity estimate based on parameter norms. The bounds explicitly depend on the loss along the training trajectory, and work for a wide range of network architectures including multilayer perceptron (MLP…

    Submitted 29 May, 2023; v1 submitted 7 June, 2022; originally announced June 2022.

    Comments: 32 pages

  43. arXiv:2205.10671  [pdf, other]

    cs.LG stat.ML

    Pessimism for Offline Linear Contextual Bandits using $\ell_p$ Confidence Sets

    Authors: Gene Li, Cong Ma, Nathan Srebro

    Abstract: We present a family $\{\hatπ\}_{p\ge 1}$ of pessimistic learning rules for offline learning of linear contextual bandits, relying on confidence sets with respect to different $\ell_p$ norms, where $\hatπ_2$ corresponds to Bellman-consistent pessimism (BCP), while $\hatπ_\infty$ is a novel generalization of lower confidence bound (LCB) to the linear setting. We show that the novel $\hatπ_\infty$ le…

    Submitted 4 October, 2022; v1 submitted 21 May, 2022; originally announced May 2022.

    Comments: Accepted to NeurIPS 2022

  44. arXiv:2205.02986  [pdf, other]

    math.ST cs.LG stat.ML

    Optimally tackling covariate shift in RKHS-based nonparametric regression

    Authors: Cong Ma, Reese Pathak, Martin J. Wainwright

    Abstract: We study the covariate shift problem in the context of nonparametric regression over a reproducing kernel Hilbert space (RKHS). We focus on two natural families of covariate shift problems defined using the likelihood ratios between the source and target distributions. When the likelihood ratios are uniformly bounded, we prove that the kernel ridge regression (KRR) estimator with a carefully chose…

    Submitted 6 June, 2023; v1 submitted 5 May, 2022; originally announced May 2022.

    Comments: to appear in the Annals of Statistics

  45. arXiv:2202.04599  [pdf, other]

    cs.LG stat.ML

    Missing Data Imputation and Acquisition with Deep Hierarchical Models and Hamiltonian Monte Carlo

    Authors: Ignacio Peis, Chao Ma, José Miguel Hernández-Lobato

    Abstract: Variational Autoencoders (VAEs) have recently been highly successful at imputing and acquiring heterogeneous missing data. However, within this specific application domain, existing VAE methods are restricted by using only one layer of latent variables and strictly Gaussian posterior approximations. To address these limitations, we present HH-VAEM, a Hierarchical VAE model for mixed-type incomplet…

    Submitted 22 December, 2022; v1 submitted 9 February, 2022; originally announced February 2022.

    Comments: Published at NeurIPS 2022

  46. arXiv:2202.02837  [pdf, other]

    math.ST cs.LG stat.ML

    A new similarity measure for covariate shift with applications to nonparametric regression

    Authors: Reese Pathak, Cong Ma, Martin J. Wainwright

    Abstract: We study covariate shift in the context of nonparametric regression. We introduce a new measure of distribution mismatch between the source and target distributions that is based on the integrated ratio of probabilities of balls at a given radius. We use the scaling of this measure with respect to the radius to characterize the minimax rate of estimation over a family of Hölder continuous function…

    Submitted 6 February, 2022; originally announced February 2022.

    Comments: 22 pages, 2 figures, 1 table

  47. arXiv:2202.02195  [pdf, other]

    stat.ML cs.LG

    Deep End-to-end Causal Inference

    Authors: Tomas Geffner, Javier Antoran, Adam Foster, Wenbo Gong, Chao Ma, Emre Kiciman, Amit Sharma, Angus Lamb, Martin Kukla, Nick Pawlowski, Miltiadis Allamanis, Cheng Zhang

    Abstract: Causal inference is essential for data-driven decision making across domains such as business engagement, medical treatment and policy making. However, research on causal discovery has evolved separately from inference methods, preventing straight-forward combination of methods from both fields. In this work, we develop Deep End-to-end Causal Inference (DECI), a single flow-based non-linear additi…

    Submitted 20 June, 2022; v1 submitted 4 February, 2022; originally announced February 2022.

  48. arXiv:2110.15468  [pdf, other]

    stat.ME math.ST

    Interval Estimation of Relative Risks for Combined Unilateral and Bilateral Correlated Data

    Authors: Kejia Wang, Chang-Xing Ma

    Abstract: Measurements are generally collected as unilateral or bilateral data in clinical trials or observational studies. For example, in ophthalmology studies, the primary outcome is often obtained from one eye or both eyes of an individual. In medical studies, the relative risk is usually the parameter of interest and is commonly used. In this article, we develop three confidence intervals for the relat…

    Submitted 28 October, 2021; originally announced October 2021.

  49. arXiv:2110.14708  [pdf, other]

    cs.LG stat.ML

    Identifiable Generative Models for Missing Not at Random Data Imputation

    Authors: Chao Ma, Cheng Zhang

    Abstract: Real-world datasets often have missing values associated with complex generative processes, where the cause of the missingness may not be fully observed. This is known as missing not at random (MNAR) data. However, many imputation methods do not take into account the missingness mechanism, resulting in biased imputation values when MNAR data is present. Although there are a few methods that have c…

    Submitted 27 October, 2021; originally announced October 2021.

  50. arXiv:2110.11816  [pdf, other]

    math.ST stat.ML

    Testing network correlation efficiently via counting trees

    Authors: Cheng Mao, Yihong Wu, Jiaming Xu, Sophie H. Yu

    Abstract: We propose a new procedure for testing whether two networks are edge-correlated through some latent vertex correspondence. The test statistic is based on counting the co-occurrences of signed trees for a family of non-isomorphic trees. When the two networks are Erdős-Rényi random graphs $\mathcal{G}(n,q)$ that are either independent or correlated with correlation coefficient $\rho$, our test runs in…

    Submitted 1 April, 2022; v1 submitted 22 October, 2021; originally announced October 2021.