Showing 1–50 of 191 results for author: Mao, Y

Searching in archive stat.
  1. arXiv:2406.19049  [pdf, other]

    cs.LG cs.AI stat.ML

    Accuracy on the wrong line: On the pitfalls of noisy data for out-of-distribution generalisation

    Authors: Amartya Sanyal, Yaxi Hu, Yaodong Yu, Yian Ma, Yixin Wang, Bernhard Schölkopf

    Abstract: "Accuracy-on-the-line" is a widely observed phenomenon in machine learning, where a model's accuracy on in-distribution (ID) and out-of-distribution (OOD) data is positively correlated across different hyperparameters and data configurations. But when does this useful relationship break down? In this work, we explore its robustness. The key observation is that noisy data and the presence of nuisan…

    Submitted 27 June, 2024; originally announced June 2024.

  2. arXiv:2406.03296  [pdf, other]

    stat.ME

    Multi-relational Network Autoregression Model with Latent Group Structures

    Authors: Yimeng Ren, Xuening Zhu, Ganggang Xu, Yanyuan Ma

    Abstract: Multi-relational networks among entities are frequently observed in the era of big data. Quantifying the effects of multiple networks has attracted significant research interest recently. In this work, we model multiple network effects through an autoregressive framework for tensor-valued time series. To characterize the potential heterogeneity of the networks and handle the high dimensionality o…

    Submitted 5 June, 2024; originally announced June 2024.

    Comments: arXiv admin note: text overlap with arXiv:2212.02107

  3. arXiv:2406.00920  [pdf, ps, other]

    stat.ML cs.LG math.OC

    Demystifying SGD with Doubly Stochastic Gradients

    Authors: Kyurae Kim, Joohwan Ko, Yi-An Ma, Jacob R. Gardner

    Abstract: Optimization objectives in the form of a sum of intractable expectations are rising in importance (e.g., diffusion models, variational autoencoders, and many more), a setting also known as "finite sum with infinite data." For these problems, a popular strategy is to employ SGD with doubly stochastic gradients (doubly SGD): the expectations are estimated using the gradient estimator of each compone…

    Submitted 2 June, 2024; originally announced June 2024.

    Comments: Accepted to ICML'24

  4. arXiv:2405.16734  [pdf, other]

    stat.ML cs.LG

    Faster Sampling via Stochastic Gradient Proximal Sampler

    Authors: Xunpeng Huang, Difan Zou, Yi-An Ma, Hanze Dong, Tong Zhang

    Abstract: Stochastic gradients have been widely integrated into Langevin-based methods to improve their scalability and efficiency in solving large-scale sampling problems. However, the proximal sampler, which exhibits much faster convergence than Langevin-based algorithms in the deterministic setting Lee et al. (2021), has yet to be explored in its stochastic variants. In this paper, we study the Stochasti…

    Submitted 26 May, 2024; originally announced May 2024.

    Comments: 48 pages, 2 figures, 5 tables

  5. arXiv:2405.16387  [pdf, other]

    stat.ML cs.LG

    Reverse Transition Kernel: A Flexible Framework to Accelerate Diffusion Inference

    Authors: Xunpeng Huang, Difan Zou, Hanze Dong, Yi Zhang, Yi-An Ma, Tong Zhang

    Abstract: To generate data from trained diffusion models, most inference algorithms, such as DDPM, DDIM, and other variants, rely on discretizing the reverse SDEs or their equivalent ODEs. In this paper, we view such approaches as decomposing the entire denoising diffusion process into several segments, each corresponding to a reverse transition kernel (RTK) sampling subproblem. Specifically, DDPM uses a Ga…

    Submitted 25 May, 2024; originally announced May 2024.

    Comments: 68 pages, 2 figures

  6. arXiv:2405.13481  [pdf, other]

    stat.ML cs.CR cs.LG

    Locally Private Estimation with Public Features

    Authors: Yuheng Ma, Ke Jia, Hanfang Yang

    Abstract: We initiate the study of locally differentially private (LDP) learning with public features. We define semi-feature LDP, where some features are publicly available while the remaining ones, along with the label, require protection under local differential privacy. Under semi-feature LDP, we demonstrate that the mini-max convergence rate for non-parametric regression is significantly reduced compar…

    Submitted 22 May, 2024; originally announced May 2024.

  7. arXiv:2405.10461  [pdf, other]

    stat.ME

    Prediction in Measurement Error Models

    Authors: Fei Jiang, Yanyuan Ma

    Abstract: We study the well known difficult problem of prediction in measurement error models. By targeting directly at the prediction interval instead of the point prediction, we construct a prediction interval by providing estimators of both the center and the length of the interval which achieves a pre-determined prediction level. The constructing procedure requires a working model for the distribution o…

    Submitted 16 May, 2024; originally announced May 2024.

  8. arXiv:2405.06889  [pdf, other]

    stat.ME math.OC

    Tuning parameter selection for the adaptive nuclear norm regularized trace regression

    Authors: Pan Shang, Lingchen Kong, Yiting Ma

    Abstract: Regularized models have been applied in lots of areas, with high-dimensional data sets being popular. Because tuning parameter decides the theoretical performance and computational efficiency of the regularized models, tuning parameter selection is a basic and important issue. We consider the tuning parameter selection for adaptive nuclear norm regularized trace regression, which achieves by the B…

    Submitted 10 May, 2024; originally announced May 2024.

  9. arXiv:2404.08913  [pdf, ps, other]

    math.ST cs.IT cs.LG stat.ML

    On the best approximation by finite Gaussian mixtures

    Authors: Yun Ma, Yihong Wu, Pengkun Yang

    Abstract: We consider the problem of approximating a general Gaussian location mixture by finite mixtures. The minimum order of finite mixtures that achieve a prescribed accuracy (measured by various $f$-divergences) is determined within constant factors for the family of mixing distributions with compact support or appropriate assumptions on the tail probability including subgaussian and subexponential.…

    Submitted 13 April, 2024; originally announced April 2024.

  10. arXiv:2404.02446  [pdf, other]

    cs.LG stat.ML

    Masked Completion via Structured Diffusion with White-Box Transformers

    Authors: Druv Pai, Ziyang Wu, Sam Buchanan, Yaodong Yu, Yi Ma

    Abstract: Modern learning frameworks often train deep neural networks with massive amounts of unlabeled data to learn representations by solving simple pretext tasks, then use the representations as foundations for downstream tasks. These networks are empirically designed; as such, they are usually not interpretable, their representations are not structured, and their designs are potentially redundant. Whit…

    Submitted 3 April, 2024; originally announced April 2024.

    Comments: To be published at ICLR 2024; 44 pages. arXiv admin note: substantial text overlap with arXiv:2311.13110

  11. arXiv:2403.11163  [pdf, ps, other]

    stat.ME cs.LG math.ST stat.CO

    A Selective Review on Statistical Methods for Massive Data Computation: Distributed Computing, Subsampling, and Minibatch Techniques

    Authors: Xuetong Li, Yuan Gao, Hong Chang, Danyang Huang, Yingying Ma, Rui Pan, Haobo Qi, Feifei Wang, Shuyuan Wu, Ke Xu, Jing Zhou, Xuening Zhu, Yingqiu Zhu, Hansheng Wang

    Abstract: This paper presents a selective review of statistical computation methods for massive data analysis. A huge amount of statistical methods for massive data computation have been rapidly developed in the past decades. In this work, we focus on three categories of statistical computation methods: (1) distributed computing, (2) subsampling methods, and (3) minibatch gradient techniques. The first clas…

    Submitted 17 March, 2024; originally announced March 2024.

  12. arXiv:2402.18533  [pdf, other]

    stat.ME stat.CO

    Constructing Bayesian Optimal Designs for Discrete Choice Experiments by Simulated Annealing

    Authors: Yicheng Mao, Roselinde Kessels, Tom van der Zanden

    Abstract: Discrete Choice Experiments (DCEs) investigate the attributes that influence individuals' choices when selecting among various options. To enhance the quality of the estimated choice models, researchers opt for Bayesian optimal designs that utilize existing information about the attributes' preferences. Given the nonlinear nature of choice models, the construction of an appropriate design requires…

    Submitted 28 February, 2024; originally announced February 2024.

  13. arXiv:2402.15086  [pdf, other]

    stat.ME

    A modified debiased inverse-variance weighted estimator in two-sample summary-data Mendelian randomization

    Authors: Youpeng Su, Siqi Xu, Yilei Ma, ** Yin, Wing Kam Fung, Hongwei Jiang, Peng Wang

    Abstract: Mendelian randomization uses genetic variants as instrumental variables to make causal inferences about the effects of modifiable risk factors on diseases from observational data. One of the major challenges in Mendelian randomization is that many genetic variants are only modestly or even weakly associated with the risk factor of interest, a setting known as many weak instruments. Many existing m…

    Submitted 18 March, 2024; v1 submitted 22 February, 2024; originally announced February 2024.

    Comments: 33 pages, 6 figures

  14. arXiv:2402.03726  [pdf, other]

    cs.LG stat.ML

    Learning Granger Causality from Instance-wise Self-attentive Hawkes Processes

    Authors: Dongxia Wu, Tsuyoshi Idé, Aurélie Lozano, Georgios Kollias, Jiří Navrátil, Naoki Abe, Yi-An Ma, Rose Yu

    Abstract: We address the problem of learning Granger causality from asynchronous, interdependent, multi-type event sequences. In particular, we are interested in discovering instance-level causal structures in an unsupervised manner. Instance-level causality identifies causal relationships among individual events, providing more fine-grained information for decision-making. Existing work in the literature e…

    Submitted 29 February, 2024; v1 submitted 6 February, 2024; originally announced February 2024.

  15. arXiv:2402.01887  [pdf, other]

    stat.ML cs.CV cs.LG

    On f-Divergence Principled Domain Adaptation: An Improved Framework

    Authors: Ziqiao Wang, Yongyi Mao

    Abstract: Unsupervised domain adaptation (UDA) plays a crucial role in addressing distribution shifts in machine learning. In this work, we improve the theoretical foundations of UDA proposed by Acuna et al. (2021) by refining their f-divergence-based discrepancy and additionally introducing a new measure, f-domain discrepancy (f-DD). By removing the absolute value function and incorporating a scaling param…

    Submitted 2 February, 2024; originally announced February 2024.

  16. arXiv:2402.01710  [pdf]

    cs.CY cs.LG stat.AP

    Exploring Educational Equity: A Machine Learning Approach to Unravel Achievement Disparities in Georgia

    Authors: Yichen Ma, Dima Nazzal

    Abstract: The COVID-19 pandemic has significantly exacerbated existing educational disparities in Georgia's K-12 system, particularly in terms of racial and ethnic achievement gaps. Utilizing machine learning methods, the study conducts a comprehensive analysis of student achievement rates across different demographics, regions, and subjects. The findings highlight a significant decline in proficiency in En…

    Submitted 25 January, 2024; originally announced February 2024.

  17. arXiv:2401.11742  [pdf]

    cs.IR cs.DL stat.AP

    Knowledge Navigation: Inferring the Interlocking Map of Knowledge from Research Trajectories

    Authors: Shibing Xiang, Xin Jiang, Bing Liu, Yurui Huang, Chaolin Tian, Yifang Ma

    Abstract: "If I have seen further, it is by standing on the shoulders of giants," Isaac Newton's renowned statement hints that new knowledge builds upon existing foundations, which means there exists an interdependent relationship between knowledge, which, yet uncovered, is implied in the historical development of scientific systems for hundreds of years. By leveraging natural language processing techniques…

    Submitted 27 January, 2024; v1 submitted 22 January, 2024; originally announced January 2024.

    Comments: 28 pages, 9 figures, 5 tables

  18. arXiv:2401.06325  [pdf, other]

    stat.ML cs.LG math.OC stat.CO

    Faster Sampling without Isoperimetry via Diffusion-based Monte Carlo

    Authors: Xunpeng Huang, Difan Zou, Hanze Dong, Yian Ma, Tong Zhang

    Abstract: To sample from a general target distribution $p_*\propto e^{-f_*}$ beyond the isoperimetric condition, Huang et al. (2023) proposed to perform sampling through reverse diffusion, giving rise to Diffusion-based Monte Carlo (DMC). Specifically, DMC follows the reverse SDE of a diffusion process that transforms the target distribution to the standard Gaussian, utilizing a non-parametric score estimat…

    Submitted 11 January, 2024; originally announced January 2024.

    Comments: 54 pages

  19. arXiv:2312.02199  [pdf, other]

    cs.CV cs.AI cs.LG eess.IV stat.AP

    USat: A Unified Self-Supervised Encoder for Multi-Sensor Satellite Imagery

    Authors: Jeremy Irvin, Lucas Tao, Joanne Zhou, Yuntao Ma, Langston Nashold, Benjamin Liu, Andrew Y. Ng

    Abstract: Large, self-supervised vision models have led to substantial advancements for automatically interpreting natural images. Recent works have begun tailoring these methods to remote sensing data which has rich structure with multi-sensor, multi-spectral, and temporal information providing massive amounts of self-labeled data that can be used for self-supervised pre-training. In this work, we develop…

    Submitted 2 December, 2023; originally announced December 2023.

  20. arXiv:2312.01046  [pdf, other]

    stat.ML cs.LG math.ST

    Bagged Regularized $k$-Distances for Anomaly Detection

    Authors: Yuchao Cai, Yuheng Ma, Hanfang Yang, Hanyuan Hang

    Abstract: We consider the paradigm of unsupervised anomaly detection, which involves the identification of anomalies within a dataset in the absence of labeled examples. Though distance-based methods are top-performing for unsupervised anomaly detection, they suffer heavily from the sensitivity to the choice of the number of the nearest neighbors. In this paper, we propose a new distance-based algorithm cal…

    Submitted 13 February, 2024; v1 submitted 2 December, 2023; originally announced December 2023.

  21. arXiv:2311.11369  [pdf, other]

    stat.ML cs.CR cs.LG

    Optimal Locally Private Nonparametric Classification with Public Data

    Authors: Yuheng Ma, Hanfang Yang

    Abstract: In this work, we investigate the problem of public data assisted non-interactive Local Differentially Private (LDP) learning with a focus on non-parametric classification. Under the posterior drift assumption, we for the first time derive the mini-max optimal convergence rate with LDP constraint. Then, we present a novel approach, the locally differentially private classification tree, which attai…

    Submitted 2 June, 2024; v1 submitted 19 November, 2023; originally announced November 2023.

  22. arXiv:2310.20102  [pdf, ps, other]

    stat.ML cs.IT cs.LG

    Sample-Conditioned Hypothesis Stability Sharpens Information-Theoretic Generalization Bounds

    Authors: Ziqiao Wang, Yongyi Mao

    Abstract: We present new information-theoretic generalization guarantees through a novel construction of the "neighboring-hypothesis" matrix and a new family of stability notions termed sample-conditioned hypothesis (SCH) stability. Our approach yields sharper bounds that improve upon previous information-theoretic bounds in various learning scenarios. Notably, these bounds address the limitations of ex…

    Submitted 30 October, 2023; originally announced October 2023.

    Comments: Accepted at NeurIPS 2023

  23. arXiv:2310.18919  [pdf, other]

    cs.LG cs.AI stat.ML

    Posterior Sampling with Delayed Feedback for Reinforcement Learning with Linear Function Approximation

    Authors: Nikki Lijing Kuang, Ming Yin, Mengdi Wang, Yu-Xiang Wang, Yi-An Ma

    Abstract: Recent studies in reinforcement learning (RL) have made significant progress by leveraging function approximation to alleviate the sample complexity hurdle for better performance. Despite the success, existing provably efficient algorithms typically rely on the accessibility of immediate feedback upon taking actions. The failure to account for the impact of delay in observations can significantly…

    Submitted 3 November, 2023; v1 submitted 29 October, 2023; originally announced October 2023.

  24. arXiv:2310.14661  [pdf, other]

    cs.LG stat.ML

    Tractable MCMC for Private Learning with Pure and Gaussian Differential Privacy

    Authors: Yingyu Lin, Yi-An Ma, Yu-Xiang Wang, Rachel Redberg, Zhiqi Bu

    Abstract: Posterior sampling, i.e., exponential mechanism to sample from the posterior distribution, provides $\varepsilon$-pure differential privacy (DP) guarantees and does not suffer from potentially unbounded privacy breach introduced by $(\varepsilon,\delta)$-approximate DP. In practice, however, one needs to apply approximate sampling methods such as Markov chain Monte Carlo (MCMC), thus re-introducing the…

    Submitted 1 May, 2024; v1 submitted 23 October, 2023; originally announced October 2023.

  25. arXiv:2310.10048  [pdf, other]

    stat.ME

    Evaluation of transplant benefits with the U.S. Scientific Registry of Transplant Recipients by semiparametric regression of mean residual life

    Authors: Ge Zhao, Yanyuan Ma, Huazhen Lin, Yi Li

    Abstract: Kidney transplantation is the most effective renal replacement therapy for end stage renal disease patients. With the severe shortage of kidney supplies and for the clinical effectiveness of transplantation, patient's life expectancy post transplantation is used to prioritize patients for transplantation; however, severe comorbidity conditions and old age are the most dominant factors that negativ…

    Submitted 17 October, 2023; v1 submitted 16 October, 2023; originally announced October 2023.

    Comments: 68 pages, 13 figures. arXiv admin note: text overlap with arXiv:2011.04067

  26. arXiv:2310.06312  [pdf, other]

    cs.LG stat.ML

    Discovering Mixtures of Structural Causal Models from Time Series Data

    Authors: Sumanth Varambally, Yi-An Ma, Rose Yu

    Abstract: Discovering causal relationships from time series data is significant in fields such as finance, climate science, and neuroscience. However, contemporary techniques rely on the simplifying assumption that data originates from the same causal model, while in practice, data is heterogeneous and can stem from different causal models. In this work, we relax this assumption and perform causal discovery…

    Submitted 23 June, 2024; v1 submitted 10 October, 2023; originally announced October 2023.

  27. arXiv:2309.08543  [pdf, other]

    stat.ME

    Fisher's combined probability test for cross-sectional independence in panel data models with serial correlation

    Authors: Hongfei Wang, Binghui Liu, Long Feng, Yanyuan Ma

    Abstract: Testing cross-sectional independence in panel data models is of fundamental importance in econometric analysis with high-dimensional panels. Recently, econometricians began to turn their attention to the problem in the presence of serial dependence. The existing procedure for testing cross-sectional independence with serial correlation is based on the sum of the sample cross-sectional correlations…

    Submitted 15 September, 2023; originally announced September 2023.

  28. arXiv:2307.14642  [pdf, ps, other]

    stat.ML cs.LG stat.CO

    Linear Convergence of Black-Box Variational Inference: Should We Stick the Landing?

    Authors: Kyurae Kim, Yian Ma, Jacob R. Gardner

    Abstract: We prove that black-box variational inference (BBVI) with control variates, particularly the sticking-the-landing (STL) estimator, converges at a geometric (traditionally called "linear") rate under perfect variational family specification. In particular, we prove a quadratic bound on the gradient variance of the STL estimator, one which encompasses misspecified variational families. Combined with…

    Submitted 18 June, 2024; v1 submitted 27 July, 2023; originally announced July 2023.

    Comments: Accepted to AISTATS'24; v5: fixed missing expectations in iteration complexity statements; v6: changed to an indexing-friendly bibliography style

  29. arXiv:2307.13381  [pdf, other]

    cs.LG cs.DC math.OC stat.ML

    Scaff-PD: Communication Efficient Fair and Robust Federated Learning

    Authors: Yaodong Yu, Sai Praneeth Karimireddy, Yi Ma, Michael I. Jordan

    Abstract: We present Scaff-PD, a fast and communication-efficient algorithm for distributionally robust federated learning. Our approach improves fairness by optimizing a family of distributionally robust objectives tailored to heterogeneous clients. We leverage the special structure of these objectives, and design an accelerated primal dual (APD) algorithm which uses bias corrected local steps (as in Scaff…

    Submitted 25 July, 2023; originally announced July 2023.

    MSC Class: 68W40; 68W15; 90C25; 90C06 ACM Class: G.1.6; F.2.1; E.4

  30. arXiv:2307.04250  [pdf, ps, other]

    stat.ME

    Doubly Flexible Estimation under Label Shift

    Authors: Seong-ho Lee, Yanyuan Ma, Jiwei Zhao

    Abstract: In studies ranging from clinical medicine to policy research, complete data are usually available from a population $\mathscr{P}$, but the quantity of interest is often sought for a related but different population $\mathscr{Q}$ which only has partial data. In this paper, we consider the setting that both outcome $Y$ and covariate ${\bf X}$ are available from $\mathscr{P}$ whereas only ${\bf X}$ i…

    Submitted 9 July, 2023; originally announced July 2023.

  31. arXiv:2307.02037  [pdf, other]

    stat.ML cs.LG math.OC

    Reverse Diffusion Monte Carlo

    Authors: Xunpeng Huang, Hanze Dong, Yifan Hao, Yi-An Ma, Tong Zhang

    Abstract: We propose a Monte Carlo sampler from the reverse diffusion process. Unlike the practice of diffusion models, where the intermediary updates -- the score functions -- are learned with a neural network, we transform the score matching problem into a mean estimation one. By estimating the means of the regularized posterior distributions, we derive a novel Monte Carlo sampling algorithm called revers…

    Submitted 13 March, 2024; v1 submitted 5 July, 2023; originally announced July 2023.

    Comments: 44 pages, 16 figures, ICLR 2024

  32. arXiv:2306.08803  [pdf, other]

    cs.LG cs.AI stat.ML

    Langevin Thompson Sampling with Logarithmic Communication: Bandits and Reinforcement Learning

    Authors: Amin Karbasi, Nikki Lijing Kuang, Yi-An Ma, Siddharth Mitra

    Abstract: Thompson sampling (TS) is widely used in sequential decision making due to its ease of use and appealing empirical performance. However, many existing analytical and empirical results for TS rely on restrictive assumptions on reward distributions, such as belonging to conjugate families, which limits their applicability in realistic scenarios. Moreover, sequential decision making problems are ofte…

    Submitted 14 June, 2023; originally announced June 2023.

    Comments: ICML 2023

    ACM Class: G.3; I.2.0

  33. arXiv:2306.07549  [pdf, other]

    cs.LG stat.ML

    Fixed-Budget Best-Arm Identification with Heterogeneous Reward Variances

    Authors: Anusha Lalitha, Kousha Kalantari, Yifei Ma, Anoop Deoras, Branislav Kveton

    Abstract: We study the problem of best-arm identification (BAI) in the fixed-budget setting with heterogeneous reward variances. We propose two variance-adaptive BAI algorithms for this setting: SHVar for known reward variances and SHAdaVar for unknown reward variances. Our algorithms rely on non-uniform budget allocations among the arms where the arms with higher reward variances are pulled more often than…

    Submitted 13 June, 2023; originally announced June 2023.

  34. arXiv:2306.02601  [pdf, other]

    cs.LG math.OC stat.ML

    Aiming towards the minimizers: fast convergence of SGD for overparametrized problems

    Authors: Chaoyue Liu, Dmitriy Drusvyatskiy, Mikhail Belkin, Damek Davis, Yi-An Ma

    Abstract: Modern machine learning paradigms, such as deep learning, occur in or close to the interpolation regime, wherein the number of model parameters is much larger than the number of data samples. In this work, we propose a regularity condition within the interpolation regime which endows the stochastic gradient method with the same worst-case iteration complexity as the deterministic gradient method,…

    Submitted 5 June, 2023; originally announced June 2023.

  35. arXiv:2305.15349  [pdf, other]

    cs.LG eess.SP math.OC stat.CO stat.ML

    On the Convergence of Black-Box Variational Inference

    Authors: Kyurae Kim, Jisu Oh, Kaiwen Wu, Yi-An Ma, Jacob R. Gardner

    Abstract: We provide the first convergence guarantee for full black-box variational inference (BBVI), also known as Monte Carlo variational inference. While preliminary investigations worked on simplified versions of BBVI (e.g., bounded domain, bounded support, only optimizing for the scale, and such), our setup does not need any such algorithmic modifications. Our results hold for log-smooth posterior dens…

    Submitted 10 January, 2024; v1 submitted 24 May, 2023; originally announced May 2023.

    Comments: Accepted to NeurIPS'23; previous title: "Black-Box Variational Inference Converges"

  36. arXiv:2304.08974  [pdf, ps, other]

    econ.EM stat.ME

    Doubly Robust Estimators with Weak Overlap

    Authors: Yukun Ma, Pedro H. C. Sant'Anna, Yuya Sasaki, Takuya Ura

    Abstract: In this paper, we derive a new class of doubly robust estimators for treatment effect estimands that is also robust against weak covariate overlap. Our proposed estimator relies on trimming observations with extreme propensity scores and uses a bias correction device for trimming bias. Our framework accommodates many research designs, such as unconfoundedness, local treatment effects, and differen…

    Submitted 22 April, 2023; v1 submitted 18 April, 2023; originally announced April 2023.

  37. arXiv:2303.14900  [pdf, other]

    stat.AP

    Nonparametric approaches for analyzing carbon emission: from statistical and machine learning perspectives

    Authors: Yiming Ma, Hang Liu, Shanyong Wang

    Abstract: Linear regression models, especially the extended STIRPAT model, are routinely applied for analyzing carbon emissions data. However, since the relationship between carbon emissions and the influencing factors is complex, fitting a simple parametric model may not be an ideal solution. This paper investigated various nonparametric approaches in statistics and machine learning (ML) for modeling carbo…

    Submitted 26 March, 2023; originally announced March 2023.

  38. arXiv:2303.11054  [pdf, other]

    stat.AP

    Some novel aspects of quantile regression: local stationarity, random forests and optimal transportation

    Authors: Manon Felix, Davide La Vecchia, Hang Liu, Yiming Ma

    Abstract: This paper is written for a Festschrift in honour of Professor Marc Hallin and it proposes some developments on quantile regression. We connect our investigation to Marc's scientific production and we present some theoretical and methodological advances for quantiles estimation in non standard settings. We split our contributions in two parts. The first part is about conditional quantiles estimati…

    Submitted 9 September, 2023; v1 submitted 20 March, 2023; originally announced March 2023.

  39. arXiv:2302.07533  [pdf, ps, other]

    stat.ME

    Optimal Subsampling Bootstrap for Massive Data

    Authors: Yingying Ma, Chenlei Leng, Hansheng Wang

    Abstract: The bootstrap is a widely used procedure for statistical inference because of its simplicity and attractive statistical properties. However, the vanilla version of bootstrap is no longer feasible computationally for many modern massive datasets due to the need to repeatedly resample the entire data. Therefore, several improvements to the bootstrap method have been made in recent years, which asses…

    Submitted 15 February, 2023; originally announced February 2023.

  40. arXiv:2302.02768  [pdf, other]

    stat.ME

    Network Autoregression for Incomplete Matrix-Valued Time Series

    Authors: Xuening Zhu, Feifei Wang, Zeng Li, Yanyuan Ma

    Abstract: We study the dynamics of matrix-valued time series with observed network structures by proposing a matrix network autoregression model with row and column networks of the subjects. We incorporate covariate information and a low rank intercept matrix. We allow incomplete observations in the matrices and the missing mechanism can be covariate dependent. To estimate the model, a two-step estimation p…

    Submitted 6 February, 2023; originally announced February 2023.

  41. arXiv:2302.02432  [pdf, other]

    stat.ML cs.IT cs.LG

    Tighter Information-Theoretic Generalization Bounds from Supersamples

    Authors: Ziqiao Wang, Yongyi Mao

    Abstract: In this work, we present a variety of novel information-theoretic generalization bounds for learning algorithms, from the supersample setting of Steinke & Zakynthinou (2020), the setting of the "conditional mutual information" framework. Our development exploits projecting the loss pair (obtained from a training instance and a testing instance) down to a single number and correlating loss values wi…

    Submitted 15 June, 2023; v1 submitted 5 February, 2023; originally announced February 2023.

    Comments: Accepted to ICML 2023, fixed some typos in the camera-ready version

  42. arXiv:2301.06297  [pdf, other]

    math.ST stat.ML

    Inference via robust optimal transportation: theory and methods

    Authors: Yiming Ma, Hang Liu, Davide La Vecchia, Matthieu Lerasle

    Abstract: Optimal transportation theory and the related $p$-Wasserstein distance ($W_p$, $p\geq 1$) are widely applied in statistics and machine learning. In spite of their popularity, inference based on these tools has some issues. For instance, it is sensitive to outliers and it may not be even defined when the underlying model has infinite moments. To cope with these problems, first we consider a robust…

    Submitted 29 February, 2024; v1 submitted 16 January, 2023; originally announced January 2023.

  43. arXiv:2212.06338  [pdf, other]

    stat.ML cs.LG

    Minimax Optimal Estimation of Stability Under Distribution Shift

    Authors: Hongseok Namkoong, Yuanzhe Ma, Peter W. Glynn

    Abstract: The performance of decision policies and prediction models often deteriorates when applied to environments different from the ones seen during training. To ensure reliable operation, we analyze the stability of a system under distribution shift, which is defined as the smallest change in the underlying environment that causes the system's performance to deteriorate beyond a permissible threshold.…

    Submitted 24 June, 2024; v1 submitted 12 December, 2022; originally announced December 2022.

  44. arXiv:2212.02107  [pdf, other]

    stat.ME

    Matrix-valued Network Autoregression Model with Latent Group Structure

    Authors: Yimeng Ren, Xuening Zhu, Yanyuan Ma

    Abstract: Matrix-valued time series data are frequently observed in a broad range of areas and have attracted great attention recently. In this work, we model network effects for high dimensional matrix-valued time series data in a matrix autoregression framework. To characterize the potential heterogeneity of the subjects and handle the high dimensionality simultaneously, we assume that each subject has a…

    Submitted 5 December, 2022; originally announced December 2022.

  45. arXiv:2211.13549  [pdf, ps, other

    stat.ML cs.LG

    Online Regularized Learning Algorithm for Functional Data

    Authors: Yuan Mao, Zheng-Chu Guo

    Abstract: In recent years, functional linear models have attracted growing attention in statistics and machine learning, with the aim of recovering the slope function or its functional predictor. This paper considers online regularized learning algorithm for functional linear models in reproducing kernel Hilbert spaces. Convergence analysis of excess prediction error and estimation error are provided with p…

    Submitted 24 November, 2022; originally announced November 2022.

    Comments: 32 pages

  46. arXiv:2211.02964  [pdf, other

    stat.ME

    Testing for high-dimensional white noise

    Authors: Long Feng, Binghui Liu, Yanyuan Ma

    Abstract: Testing for multi-dimensional white noise is an important subject in statistical inference. Such test in the high-dimensional case becomes an open problem waiting to be solved, especially when the dimension of a time series is comparable to or even greater than the sample size. To detect an arbitrary form of departure from high-dimensional white noise, a few tests have been developed. Some of thes…

    Submitted 5 November, 2022; originally announced November 2022.

    Comments: 84 pages

    MSC Class: 62H15

  47. arXiv:2209.15261  [pdf, other

    cs.LG cs.CV stat.ML

    Minimalistic Unsupervised Learning with the Sparse Manifold Transform

    Authors: Yubei Chen, Zeyu Yun, Yi Ma, Bruno Olshausen, Yann LeCun

    Abstract: We describe a minimalistic and interpretable method for unsupervised learning, without resorting to data augmentation, hyperparameter tuning, or other engineering designs, that achieves performance close to the SOTA SSL methods. Our approach leverages the sparse manifold transform, which unifies sparse coding, manifold learning, and slow feature analysis. With a one-layer deterministic sparse mani…

    Submitted 27 April, 2023; v1 submitted 30 September, 2022; originally announced September 2022.

    Comments: This paper is published at ICLR 2023

    Journal ref: The Eleventh International Conference on Learning Representations (2023)

  48. arXiv:2208.12427  [pdf, ps, other

    stat.ML cs.LG

    Coefficient-based Regularized Distribution Regression

    Authors: Yuan Mao, Lei Shi, Zheng-Chu Guo

    Abstract: In this paper, we consider the coefficient-based regularized distribution regression which aims to regress from probability measures to real-valued responses over a reproducing kernel Hilbert space (RKHS), where the regularization is put on the coefficients and kernels are assumed to be indefinite. The algorithm involves two stages of sampling, the first stage sample consists of distributions and…

    Submitted 25 August, 2022; originally announced August 2022.

  49. arXiv:2207.11208  [pdf, other

    stat.ML cs.LG

    Statistical and Computational Trade-offs in Variational Inference: A Case Study in Inferential Model Selection

    Authors: Kush Bhatia, Nikki Lijing Kuang, Yi-An Ma, Yixin Wang

    Abstract: Variational inference has recently emerged as a popular alternative to the classical Markov chain Monte Carlo (MCMC) in large-scale Bayesian inference. The core idea is to trade statistical accuracy for computational efficiency. In this work, we study these statistical and computational trade-offs in variational inference via a case study in inferential model selection. Focusing on Gaussian infere…

    Submitted 6 August, 2023; v1 submitted 22 July, 2022; originally announced July 2022.

    Comments: 57 pages, 8 figures

  50. arXiv:2207.06343  [pdf, other

    cs.LG cs.DC math.OC stat.ML

    TCT: Convexifying Federated Learning using Bootstrapped Neural Tangent Kernels

    Authors: Yaodong Yu, Alexander Wei, Sai Praneeth Karimireddy, Yi Ma, Michael I. Jordan

    Abstract: State-of-the-art federated learning methods can perform far worse than their centralized counterparts when clients have dissimilar data distributions. For neural networks, even when centralized SGD easily finds a solution that is simultaneously performant for all clients, current federated optimization methods fail to converge to a comparable solution. We show that this performance disparity can l…

    Submitted 5 October, 2022; v1 submitted 13 July, 2022; originally announced July 2022.

    Comments: Accepted at Neural Information Processing Systems (NeurIPS) 2022. V2 releases code

    MSC Class: 68W40; 68W15; 90C25; 90C06

    ACM Class: G.1.6; F.2.1; E.4