Showing 1–50 of 205 results for author: Li, P

  1. arXiv:2404.14786  [pdf, other]

    cs.AI cs.LG stat.ME

    RealTCD: Temporal Causal Discovery from Interventional Data with Large Language Model

    Authors: Peiwen Li, Xin Wang, Zeyang Zhang, Yuan Meng, Fang Shen, Yue Li, Jialong Wang, Yang Li, Wenwu Zhu

    Abstract: In the field of Artificial Intelligence for Information Technology Operations, causal discovery is pivotal for operation and maintenance of graph construction, facilitating downstream industrial tasks such as root cause analysis. Temporal causal discovery, as an emerging method, aims to identify temporal causal relationships between variables directly from observations by utilizing interventional…

    Submitted 26 May, 2024; v1 submitted 23 April, 2024; originally announced April 2024.

  2. arXiv:2404.06391  [pdf, other]

    cs.LG stat.ML

    Exploring Neural Network Landscapes: Star-Shaped and Geodesic Connectivity

    Authors: Zhanran Lin, Puheng Li, Lei Wu

    Abstract: One of the most intriguing findings in the structure of neural network landscape is the phenomenon of mode connectivity: For two typical global minima, there exists a path connecting them without barrier. This concept of mode connectivity has played a crucial role in understanding important phenomena in deep learning. In this paper, we conduct a fine-grained analysis of this connectivity phenome…

    Submitted 9 April, 2024; originally announced April 2024.

    Comments: The first two authors contributed equally

  3. arXiv:2404.05976  [pdf, other]

    cs.LG eess.SY stat.ME

    A Cyber Manufacturing IoT System for Adaptive Machine Learning Model Deployment by Interactive Causality Enabled Self-Labeling

    Authors: Yutian Ren, Yuqi He, Xuyin Zhang, Aaron Yen, G. P. Li

    Abstract: Machine Learning (ML) has been demonstrated to improve productivity in many manufacturing applications. To host these ML applications, several software and Industrial Internet of Things (IIoT) systems have been proposed for manufacturing applications to deploy ML applications and provide real-time intelligence. Recently, an interactive causality enabled self-labeling method has been proposed to ad…

    Submitted 8 April, 2024; originally announced April 2024.

  4. arXiv:2404.05809  [pdf, other]

    cs.LG cs.AI stat.ME

    Self-Labeling in Multivariate Causality and Quantification for Adaptive Machine Learning

    Authors: Yutian Ren, Aaron Haohua Yen, G. P. Li

    Abstract: Adaptive machine learning (ML) aims to allow ML models to adapt to ever-changing environments with potential concept drift after model deployment. Traditionally, adaptive ML requires a new dataset to be manually labeled to tailor deployed models to altered data distributions. Recently, an interactive causality based self-labeling method was proposed to autonomously associate causally related data…

    Submitted 8 April, 2024; originally announced April 2024.

  5. arXiv:2401.14549  [pdf, other]

    stat.ME

    Privacy-preserving Quantile Treatment Effect Estimation for Randomized Controlled Trials

    Authors: Leon Yao, Paul Yiming Li, Jiannan Lu

    Abstract: In accordance with the principle of "data minimization", many internet companies are opting to record less data. However, this is often at odds with A/B testing efficacy. For experiments with units with multiple observations, one popular data minimizing technique is to aggregate data for each unit. However, exact quantile estimation requires the full observation-level data. In this paper, we devel…

    Submitted 25 January, 2024; originally announced January 2024.

    Comments: Accepted to 2023 CODE conference as a parallel presentation

  6. arXiv:2311.01806  [pdf, other]

    math.OC cs.LG math.ST stat.ML

    Sketching for Convex and Nonconvex Regularized Least Squares with Sharp Guarantees

    Authors: Yingzhen Yang, Ping Li

    Abstract: Randomized algorithms are important for solving large-scale optimization problems. In this paper, we propose a fast sketching algorithm for least square problems regularized by convex or nonconvex regularization functions, Sketching for Regularized Optimization (SRO). Our SRO algorithm first generates a sketch of the original data matrix, then solves the sketched problem. Different from existing r…

    Submitted 3 November, 2023; originally announced November 2023.

  7. arXiv:2311.01797  [pdf, other]

    cs.LG stat.ML

    On the Generalization Properties of Diffusion Models

    Authors: Puheng Li, Zhong Li, Huishuai Zhang, Jiang Bian

    Abstract: Diffusion models are a class of generative models that serve to establish a stochastic transport map between an empirically observed, yet unknown, target distribution and a known prior. Despite their remarkable success in real-world applications, a theoretical understanding of their generalization capabilities remains underdeveloped. This work embarks on a comprehensive theoretical exploration of…

    Submitted 12 January, 2024; v1 submitted 3 November, 2023; originally announced November 2023.

    Comments: 42 pages, 11 figures

  8. arXiv:2311.00674  [pdf, other]

    stat.ML cs.LG

    Recovering Linear Causal Models with Latent Variables via Cholesky Factorization of Covariance Matrix

    Authors: Yunfeng Cai, Xu Li, Mingming Sun, Ping Li

    Abstract: Discovering the causal relationship via recovering the directed acyclic graph (DAG) structure from the observed data is a well-known challenging combinatorial problem. When there are latent variables, the problem becomes even more difficult. In this paper, we first propose a DAG structure recovering algorithm, which is based on the Cholesky factorization of the covariance matrix of the observed da…

    Submitted 1 November, 2023; originally announced November 2023.

  9. arXiv:2310.20438  [pdf, ps, other]

    stat.ML cs.LG

    The Phase Transition Phenomenon of Shuffled Regression

    Authors: Hang Zhang, Ping Li

    Abstract: We study the phase transition phenomenon inherent in the shuffled (permuted) regression problem, which has found numerous applications in databases, privacy, data analysis, etc. In this study, we aim to precisely identify the locations of the phase transition points by leveraging techniques from message passing (MP). In our analysis, we first transform the permutation recovery problem into a proba…

    Submitted 31 October, 2023; originally announced October 2023.

  10. arXiv:2310.12667  [pdf, other]

    stat.ML cs.LG

    STANLEY: Stochastic Gradient Anisotropic Langevin Dynamics for Learning Energy-Based Models

    Authors: Belhal Karimi, Jianwen Xie, Ping Li

    Abstract: We propose in this paper, STANLEY, a STochastic gradient ANisotropic LangEvin dYnamics, for sampling high dimensional data. With the growing efficacy and potential of Energy-Based modeling, also known as non-normalized probabilistic modeling, for modeling a generative process of different natures of high dimensional data observations, we present an end-to-end learning algorithm for Energy-Based mo…

    Submitted 19 October, 2023; originally announced October 2023.

    Comments: arXiv admin note: text overlap with arXiv:1207.5938 by other authors

  11. arXiv:2310.11377  [pdf, other]

    cs.DS cs.LG stat.ML

    Faster Algorithms for Generalized Mean Densest Subgraph Problem

    Authors: Chenglin Fan, Ping Li, Hanyu Peng

    Abstract: The densest subgraph of a large graph usually refers to some subgraph with the highest average degree, which has been extended to the family of $p$-means dense subgraph objectives by Veldt et al. (2021). The $p$-mean densest subgraph problem seeks a subgraph with the highest average $p$-th-power degree, whereas the standard densest subgraph problem seeks a subgraph with a simple highest a…

    Submitted 17 October, 2023; originally announced October 2023.

    Comments: arXiv admin note: text overlap with arXiv:2106.00909 by other authors

  12. arXiv:2310.01326  [pdf, other]

    stat.ML cs.LG

    Optimal Estimator for Linear Regression with Shuffled Labels

    Authors: Hang Zhang, Ping Li

    Abstract: This paper considers the task of linear regression with shuffled labels, i.e., $\mathbf{Y} = \mathbf{\Pi} \mathbf{X} \mathbf{B} + \mathbf{W}$, where $\mathbf{Y} \in \mathbb{R}^{n\times m}$, $\mathbf{\Pi} \in \mathbb{R}^{n\times n}$, $\mathbf{X} \in \mathbb{R}^{n\times p}$, $\mathbf{B} \in \mathbb{R}^{p\times m}$, and $\mathbf{W} \in \mathbb{R}^{n\times m}$, respectively, represent the sensing results, (unknown or missing) c…

    Submitted 2 October, 2023; originally announced October 2023.

  13. arXiv:2309.03818  [pdf, other]

    stat.ML cs.LG

    Empirical Risk Minimization for Losses without Variance

    Authors: Guanhua Fang, Ping Li, Gennady Samorodnitsky

    Abstract: This paper considers an empirical risk minimization problem under heavy-tailed settings, where data does not have finite variance, but only has $p$-th moment with $p \in (1,2)$. Instead of using estimation procedure based on truncated observed data, we choose the optimizer by minimizing the risk value. Those risk values can be robustly estimated via using the remarkable Catoni's method (Catoni, 20…

    Submitted 7 September, 2023; originally announced September 2023.

  14. arXiv:2308.08165  [pdf, other]

    math.OC cs.DC cs.LG stat.ML

    Stochastic Controlled Averaging for Federated Learning with Communication Compression

    Authors: Xinmeng Huang, Ping Li, Xiaoyun Li

    Abstract: Communication compression, a technique aiming to reduce the information volume to be transmitted over the air, has gained great interests in Federated Learning (FL) for the potential of alleviating its communication overhead. However, communication compression brings forth new challenges in FL due to the interplay of compression-incurred information distortion and inherent characteristics of FL su…

    Submitted 9 April, 2024; v1 submitted 16 August, 2023; originally announced August 2023.

    Comments: 45 pages, 4 figures

  15. arXiv:2306.07674  [pdf, ps, other]

    stat.ML cs.CR cs.DS cs.LG

    Differentially Private One Permutation Hashing and Bin-wise Consistent Weighted Sampling

    Authors: Xiaoyun Li, Ping Li

    Abstract: Minwise hashing (MinHash) is a standard algorithm widely used in the industry, for large-scale search and learning applications with the binary (0/1) Jaccard similarity. One common use of MinHash is for processing massive n-gram text representations so that practitioners do not have to materialize the original data (which would be prohibitive). Another popular use of MinHash is for building hash t…

    Submitted 13 June, 2023; originally announced June 2023.

  16. arXiv:2306.07607  [pdf, other]

    cs.IR stat.ML

    Practice with Graph-based ANN Algorithms on Sparse Data: Chi-square Two-tower model, HNSW, Sign Cauchy Projections

    Authors: Ping Li, Weijie Zhao, Chao Wang, Qi Xia, Alice Wu, Lijun Peng

    Abstract: Sparse data are common. The traditional "handcrafted" features are often sparse. Embedding vectors from trained models can also be very sparse, for example, embeddings trained via the "ReLU" activation function. In this paper, we report our exploration of efficient search in sparse data with graph-based ANN algorithms (e.g., HNSW, or SONG which is the GPU version of HNSW), which are popular in…

    Submitted 13 June, 2023; originally announced June 2023.

  17. arXiv:2306.01751  [pdf, ps, other]

    cs.CR cs.LG stat.ML

    Differential Privacy with Random Projections and Sign Random Projections

    Authors: Ping Li, Xiaoyun Li

    Abstract: In this paper, we develop a series of differential privacy (DP) algorithms from a family of random projections (RP) for general applications in machine learning, data mining, and information retrieval. Among the presented algorithms, iDP-SignRP is remarkably effective under the setting of "individual differential privacy" (iDP), based on sign random projections (SignRP). Also, DP-SignOPORP consi…

    Submitted 13 June, 2023; v1 submitted 22 May, 2023; originally announced June 2023.

  18. arXiv:2306.01435  [pdf, other]

    cs.LG stat.ML

    Improving Adversarial Robustness of DEQs with Explicit Regulations Along the Neural Dynamics

    Authors: Zonghan Yang, Peng Li, Tianyu Pang, Yang Liu

    Abstract: Deep equilibrium (DEQ) models replace the multiple-layer stacking of conventional deep networks with a fixed-point iteration of a single-layer transformation. Having been demonstrated to be competitive in a variety of real-world scenarios, the adversarial robustness of general DEQs becomes increasingly crucial for their reliable deployment. Existing works improve the robustness of general DEQ mode…

    Submitted 2 June, 2023; originally announced June 2023.

    Comments: Accepted at ICML 2023. Our code is available at https://github.com/minicheshire/DEQ-Regulating-Neural-Dynamics

  19. arXiv:2304.10499  [pdf, other]

    math.OC stat.ML

    Projective Proximal Gradient Descent for A Class of Nonconvex Nonsmooth Optimization Problems: Fast Convergence Without Kurdyka-Lojasiewicz (KL) Property

    Authors: Yingzhen Yang, Ping Li

    Abstract: Nonconvex and nonsmooth optimization problems are important and challenging for statistics and machine learning. In this paper, we propose Projected Proximal Gradient Descent (PPGD) which solves a class of nonconvex and nonsmooth optimization problems, where the nonconvexity and nonsmoothness come from a nonsmooth regularization term which is nonconvex but piecewise convex. In contrast with existi…

    Submitted 20 April, 2023; originally announced April 2023.

    Comments: Accepted by ICLR2023

  20. arXiv:2304.07918  [pdf, other]

    cs.CV stat.ML

    Likelihood-Based Generative Radiance Field with Latent Space Energy-Based Model for 3D-Aware Disentangled Image Representation

    Authors: Yaxuan Zhu, Jianwen Xie, Ping Li

    Abstract: We propose the NeRF-LEBM, a likelihood-based top-down 3D-aware 2D image generative model that incorporates 3D representation via Neural Radiance Fields (NeRF) and 2D imaging process via differentiable volume rendering. The model represents an image as a rendering process from 3D object to 2D image and is conditioned on some latent variables that account for object characteristics and are assumed t…

    Submitted 16 April, 2023; originally announced April 2023.

  21. arXiv:2304.04359  [pdf, other]

    stat.ME

    Privacy-preserving Inference of Group Mean Difference in Zero-inflated Right Skewed Data with Partitioning and Censoring

    Authors: Fang Liu, Ruyu Zhou, Yiming Paul Li, James Honaker, Milan Shen

    Abstract: We examine privacy-preserving inferences of group mean differences in zero-inflated right-skewed (zirs) data. Zero inflation and right skewness are typical characteristics of ads clicks and purchases data collected from e-commerce and social media platforms, where we also want to preserve user privacy to ensure that individual data is protected. In this work, we develop likelihood-based and model-…

    Submitted 9 April, 2023; originally announced April 2023.

  22. arXiv:2303.18067  [pdf, other]

    physics.ao-ph stat.AP

    Rediscover Climate Change during Global Warming Slowdown via Wasserstein Stability Analysis

    Authors: Zhiang Xie, Dongwei Chen, Puxi Li

    Abstract: Climate change is one of the key topics in climate science. However, previous research has predominantly concentrated on changes in mean values, and few research examines changes in Probability Distribution Function (PDF). In this study, a novel method called Wasserstein Stability Analysis (WSA) is developed to identify PDF changes, especially the extreme event shift and non-linear physical value…

    Submitted 28 May, 2023; v1 submitted 29 March, 2023; originally announced March 2023.

    Comments: 14 pages, 4 figures, 1 Algorithm, and 3-page supplementary materials

  23. arXiv:2303.11233  [pdf, other]

    cs.IT stat.ML

    Sparse Recovery with Shuffled Labels: Statistical Limits and Practical Estimators

    Authors: Hang Zhang, Ping Li

    Abstract: This paper considers the sparse recovery with shuffled labels, i.e., $\mathbf{y} = \mathbf{\Pi}_{\mathrm{true}} \mathbf{X} \boldsymbol{\beta}_{\mathrm{true}} + \mathbf{w}$, where $\mathbf{y} \in \mathbb{R}^n$, $\mathbf{\Pi} \in \mathbb{R}^{n\times n}$, $\mathbf{X} \in \mathbb{R}^{n\times p}$, $\boldsymbol{\beta}_{\mathrm{true}} \in \mathbb{R}^p$, $\mathbf{w} \in \mathbb{R}^n$ denote the sensing result, the unknown permutation matrix, the design matrix, the sparse signal, and the additive noise, respectively. Our goal is to reconstruct both the permut…

    Submitted 20 March, 2023; originally announced March 2023.

  24. arXiv:2302.03505  [pdf, ps, other]

    stat.ML cs.LG

    OPORP: One Permutation + One Random Projection

    Authors: Ping Li, Xiaoyun Li

    Abstract: Consider two $D$-dimensional data vectors (e.g., embeddings): $u, v$. In many embedding-based retrieval (EBR) applications where the vectors are generated from trained models, $D=256\sim 1024$ are common. In this paper, OPORP (one permutation + one random projection) uses a variant of the "count-sketch" type of data structures for achieving data reduction/compression. With OPORP, we first apply…

    Submitted 23 May, 2023; v1 submitted 7 February, 2023; originally announced February 2023.

  25. arXiv:2301.09300  [pdf, other]

    stat.ML cs.LG

    A Tale of Two Latent Flows: Learning Latent Space Normalizing Flow with Short-run Langevin Flow for Approximate Inference

    Authors: Jianwen Xie, Yaxuan Zhu, Yifei Xu, Dingcheng Li, Ping Li

    Abstract: We study a normalizing flow in the latent space of a top-down generator model, in which the normalizing flow model plays the role of the informative prior model of the generator. We propose to jointly learn the latent space normalizing flow prior model and the top-down generator model by a Markov chain Monte Carlo (MCMC)-based maximum likelihood algorithm, where a short-run Langevin sampling from…

    Submitted 23 January, 2023; originally announced January 2023.

    Comments: The Thirty-Seventh AAAI Conference on Artificial Intelligence (AAAI) 2023

  26. arXiv:2301.03142  [pdf, ps, other]

    stat.ML cs.LG

    Exploration in Model-based Reinforcement Learning with Randomized Reward

    Authors: Lingxiao Wang, Ping Li

    Abstract: Model-based Reinforcement Learning (MBRL) has been widely adapted due to its sample efficiency. However, existing worst-case regret analysis typically requires optimistic planning, which is not realistic in general. In contrast, motivated by the theory, empirical study utilizes ensemble of models, which achieve state-of-the-art performance on various testing environments. Such deviation between th…

    Submitted 8 January, 2023; originally announced January 2023.

  27. arXiv:2301.03125  [pdf, ps, other]

    stat.ML cs.LG math.OC

    Sharper Analysis for Minibatch Stochastic Proximal Point Methods: Stability, Smoothness, and Deviation

    Authors: Xiao-Tong Yuan, Ping Li

    Abstract: The stochastic proximal point (SPP) methods have gained recent attention for stochastic optimization, with strong convergence guarantees and superior robustness to the classic stochastic gradient descent (SGD) methods showcased at little to no cost of computational overhead added. In this article, we study a minibatch variant of SPP, namely M-SPP, for solving convex composite risk minimization pro…

    Submitted 8 January, 2023; originally announced January 2023.

  28. arXiv:2212.03070  [pdf, ps, other]

    stat.ME stat.AP

    Hypothesis test on a mixture forward-incubation-time epidemic model with application to COVID-19 outbreak

    Authors: Chunlin Wang, Pengfei Li, Yukun Liu, Xiao-Hua Zhou, Jing Qin

    Abstract: The distribution of the incubation period of the novel coronavirus disease that emerged in 2019 (COVID-19) has crucial clinical implications for understanding this disease and devising effective disease-control measures. Qin et al. (2020) designed a cross-sectional and forward follow-up study to collect the duration times between a specific observation time and the onset of COVID-19 symptoms for a…

    Submitted 5 December, 2022; originally announced December 2022.

    Comments: 34 pages, 2 figures, 2 tables

    Journal ref: Statistica Sinica (2023)

  29. arXiv:2212.02083  [pdf, other]

    cs.LG stat.ML

    On the Overlooked Structure of Stochastic Gradients

    Authors: Zeke Xie, Qian-Yuan Tang, Mingming Sun, Ping Li

    Abstract: Stochastic gradients closely relate to both optimization and generalization of deep neural networks (DNNs). Some works attempted to explain the success of stochastic optimization for deep learning by the arguably heavy-tail properties of gradient noise, while other works presented theoretical and empirical evidence against the heavy-tail hypothesis on gradient noise. Unfortunately, formal statisti…

    Submitted 20 October, 2023; v1 submitted 5 December, 2022; originally announced December 2022.

    Comments: NeurIPS 2023. 20 pages, 16 figures, 17 Tables; Key Words: Deep Learning, Stochastic Gradient, Optimization. arXiv admin note: text overlap with arXiv:2201.13011

  30. arXiv:2211.15072  [pdf, other]

    stat.ML cs.LG

    FaiREE: Fair Classification with Finite-Sample and Distribution-Free Guarantee

    Authors: Puheng Li, James Zou, Linjun Zhang

    Abstract: Algorithmic fairness plays an increasingly critical role in machine learning research. Several group fairness notions and algorithms have been proposed. However, the fairness guarantee of existing fair classification methods mainly depends on specific data distributional assumptions, often requiring large sample sizes, and fairness could be violated when there is a modest number of samples, which…

    Submitted 9 October, 2023; v1 submitted 28 November, 2022; originally announced November 2022.

    Comments: 45 pages, 9 figures

  31. arXiv:2211.14292  [pdf, ps, other]

    stat.ML cs.IT cs.LG

    Analysis of Error Feedback in Federated Non-Convex Optimization with Biased Compression

    Authors: Xiaoyun Li, Ping Li

    Abstract: In federated learning (FL) systems, e.g., wireless networks, the communication cost between the clients and the central server can often be a bottleneck. To reduce the communication cost, the paradigm of communication compression has become a popular strategy in the literature. In this paper, we focus on biased gradient compression techniques in non-convex FL problems. In the classical setting of…

    Submitted 25 November, 2022; originally announced November 2022.

  32. arXiv:2211.08311  [pdf, ps, other]

    stat.ML cs.LG

    On Penalization in Stochastic Multi-armed Bandits

    Authors: Guanhua Fang, Ping Li, Gennady Samorodnitsky

    Abstract: We study an important variant of the stochastic multi-armed bandit (MAB) problem, which takes penalization into consideration. Instead of directly maximizing cumulative expected reward, we need to balance between the total reward and fairness level. In this paper, we present some new insights in MAB and formulate the problem in the penalization framework, where rigorous penalized regret can be wel…

    Submitted 15 November, 2022; originally announced November 2022.

  33. arXiv:2208.05635  [pdf, other]

    stat.ME

    Penalized empirical likelihood estimation and EM algorithms for closed-population capture-recapture models

    Authors: Yang Liu, Pengfei Li, Yukun Liu

    Abstract: Capture-recapture experiments are widely used to estimate the abundance of a finite population. Based on capture-recapture data, the empirical likelihood (EL) method has been shown to outperform the conventional conditional likelihood (CL) method. However, the current literature on EL abundance estimation ignores behavioral effects, and the EL estimates may not be stable, especially when the captu…

    Submitted 11 August, 2022; originally announced August 2022.

  34. arXiv:2208.03185  [pdf, ps, other]

    math.ST cs.LG stat.ML

    Catoni-style Confidence Sequences under Infinite Variance

    Authors: Sujay Bhatt, Guanhua Fang, Ping Li, Gennady Samorodnitsky

    Abstract: In this paper, we provide an extension of confidence sequences for settings where the variance of the data-generating distribution does not exist or is infinite. Confidence sequences furnish confidence intervals that are valid at arbitrary data-dependent stopping times, naturally having a wide range of applications. We first establish a lower bound for the width of the Catoni-style confidence sequ…

    Submitted 5 August, 2022; originally announced August 2022.

    Comments: 10 pages

  35. arXiv:2207.08770  [pdf, ps, other]

    stat.ML cs.LG

    Package for Fast ABC-Boost

    Authors: Ping Li, Weijie Zhao

    Abstract: This report presents the open-source package which implements the series of our boosting works in the past years. In particular, the package includes mainly three lines of techniques, among which the following two are already the standard implementations in popular boosted tree platforms: (i) The histogram-based (feature-binning) approach makes the tree implementation convenient and efficient. I…

    Submitted 18 July, 2022; originally announced July 2022.

  36. arXiv:2207.08667  [pdf, ps, other]

    stat.ML cs.LG

    pGMM Kernel Regression and Comparisons with Boosted Trees

    Authors: Ping Li, Weijie Zhao

    Abstract: In this work, we demonstrate the advantage of the pGMM ("powered generalized min-max") kernel in the context of (ridge) regression. In recent prior studies, the pGMM kernel has been extensively evaluated for classification tasks, for logistic regression, support vector machines, as well as deep neural networks. In this paper, we provide an experimental study on ridge regression, to compare the p…

    Submitted 18 July, 2022; originally announced July 2022.

  37. arXiv:2207.02722  [pdf, other]

    stat.ML cs.AI cs.LG

    Variational Flow Graphical Model

    Authors: Shaogang Ren, Belhal Karimi, Dingcheng Li, Ping Li

    Abstract: This paper introduces a novel approach to embed flow-based models with hierarchical structures. The proposed framework is named Variational Flow Graphical (VFG) Model. VFGs learn the representation of high dimensional data via a message-passing scheme by integrating flow-based functions through variational inference. By leveraging the expressive power of neural networks, VFGs produce a representat…

    Submitted 6 July, 2022; originally announced July 2022.

  38. arXiv:2207.02058  [pdf, ps, other]

    stat.ME stat.CO stat.ML

    Best Subset Selection with Efficient Primal-Dual Algorithm

    Authors: Shaogang Ren, Guanhua Fang, Ping Li

    Abstract: Best subset selection is considered the 'gold standard' for many sparse learning problems. A variety of optimization techniques have been proposed to attack this non-convex and NP-hard problem. In this paper, we investigate the dual forms of a family of $\ell_0$-regularized problems. An efficient primal-dual method has been developed based on the primal and dual problem structures. By leveraging t…

    Submitted 5 July, 2022; originally announced July 2022.

    Comments: arXiv admin note: text overlap with arXiv:1703.00119 by other authors

  39. arXiv:2206.12895  [pdf, other]

    cs.DS stat.ML

    $k$-Median Clustering via Metric Embedding: Towards Better Initialization with Differential Privacy

    Authors: Chenglin Fan, Ping Li, Xiaoyun Li

    Abstract: When designing clustering algorithms, the choice of initial centers is crucial for the quality of the learned clusters. In this paper, we develop a new initialization scheme, called HST initialization, for the $k$-median problem in the general metric space (e.g., discrete space induced by graphs), based on the construction of metric embedding tree structure of the data. From the tree, we propose a…

    Submitted 8 July, 2022; v1 submitted 26 June, 2022; originally announced June 2022.

  40. arXiv:2206.11775  [pdf, ps, other]

    stat.ME

    Regression with Label Permutation in Generalized Linear Model

    Authors: Guanhua Fang, Ping Li

    Abstract: The assumption that response and predictor belong to the same statistical unit may be violated in practice. Unbiased estimation and recovery of true label ordering based on unlabeled data are challenging tasks and have attracted increasing attentions in the recent literature. In this paper, we present a relatively complete analysis of label permutation problem for the generalized linear model with…

    Submitted 23 June, 2022; originally announced June 2022.

  41. arXiv:2206.11214  [pdf, other]

    stat.ME

    Offline Change Detection under Contamination

    Authors: Sujay Bhatt, Guanhua Fang, Ping Li

    Abstract: In this work, we propose a non-parametric and robust change detection algorithm to detect multiple change points in time series data under contamination. The contamination model is sufficiently general, in that, the most common model used in the context of change detection -- Huber contamination model -- is a special case. Also, the contamination model is oblivious and arbitrary. The change detect…

    Submitted 23 June, 2022; v1 submitted 22 June, 2022; originally announced June 2022.

  42. arXiv:2206.11079  [pdf, other]

    stat.ML cs.LG

    Noisy $\ell^{0}$-Sparse Subspace Clustering on Dimensionality Reduced Data

    Authors: Yingzhen Yang, Ping Li

    Abstract: Sparse subspace clustering methods with sparsity induced by $\ell^{0}$-norm, such as $\ell^{0}$-Sparse Subspace Clustering ($\ell^{0}$-SSC) (Yang et al., 2016), are demonstrated to be more effective than its $\ell^{1}$ counterpart such as Sparse Subspace Clustering (SSC) (Elhamifar and Vidal, 2013). However, the theoretical analysis of $\ell^{0}$-SSC is restricted to clean data that lie exactly…

    Submitted 22 June, 2022; originally announced June 2022.

    Comments: Appeared at UAI 2022

  43. arXiv:2206.05187  [pdf, other]

    stat.ML cs.LG

    On Convergence of FedProx: Local Dissimilarity Invariant Bounds, Non-smoothness and Beyond

    Authors: Xiao-Tong Yuan, Ping Li

    Abstract: The FedProx algorithm is a simple yet powerful distributed proximal point optimization method widely used for federated learning (FL) over heterogeneous data. Despite its popularity and remarkable success witnessed in practice, the theoretical understanding of FedProx is largely underinvestigated: the appealing convergence behavior of FedProx is so far characterized under certain non-standard and…

    Submitted 10 June, 2022; originally announced June 2022.

  44. arXiv:2206.03834  [pdf, ps, other]

    stat.ML cs.LG

    Boosting the Confidence of Generalization for $L_2$-Stable Randomized Learning Algorithms

    Authors: Xiao-Tong Yuan, Ping Li

    Abstract: Exponential generalization bounds with near-tight rates have recently been established for uniformly stable learning algorithms. The notion of uniform stability, however, is stringent in the sense that it is invariant to the data-generating distribution. Under the weaker and distribution dependent notions of stability such as hypothesis stability and $L_2$-stability, the literature suggests that o…

    Submitted 8 June, 2022; originally announced June 2022.

  45. arXiv:2205.10927  [pdf, ps, other]

    cs.LG stat.ML

    Fast ABC-Boost: A Unified Framework for Selecting the Base Class in Multi-Class Classification

    Authors: Ping Li, Weijie Zhao

    Abstract: The work in ICML'09 showed that the derivatives of the classical multi-class logistic regression loss function could be re-written in terms of a pre-chosen "base class" and applied the new derivatives in the popular boosting framework. In order to make use of the new derivatives, one must have a strategy to identify/choose the base class at each boosting iteration. The idea of "adaptive base class…

    Submitted 26 June, 2022; v1 submitted 22 May, 2022; originally announced May 2022.

  46. arXiv:2205.06924  [pdf, other]

    stat.ML cs.LG

    A Tale of Two Flows: Cooperative Learning of Langevin Flow and Normalizing Flow Toward Energy-Based Model

    Authors: Jianwen Xie, Yaxuan Zhu, Jun Li, ** Li

    Abstract: This paper studies the cooperative learning of two generative flow models, in which the two models are iteratively updated based on the jointly synthesized examples. The first flow model is a normalizing flow that transforms an initial simple density to a target density by applying a sequence of invertible transformations. The second flow model is a Langevin flow that runs finite steps of gradient…

    Submitted 13 May, 2022; originally announced May 2022.

    Comments: 23 pages

    Journal ref: ICLR 2022

  47. arXiv:2205.05632  [pdf, other]

    stat.ML cs.LG

    On Distributed Adaptive Optimization with Gradient Compression

    Authors: Xiaoyun Li, Belhal Karimi, ** Li

    Abstract: We study COMP-AMS, a distributed optimization framework based on gradient averaging and the adaptive AMSGrad algorithm. Gradient compression with error feedback is applied to reduce the communication cost in the gradient transmission process. Our convergence analysis of COMP-AMS shows that such a compressed gradient averaging strategy yields the same convergence rate as standard AMSGrad, and also exhibits t…

    Submitted 11 May, 2022; originally announced May 2022.

  48. arXiv:2205.00505  [pdf, ps, other]

    stat.ME

    Statistical inference for the two-sample problem under likelihood ratio ordering, with application to the ROC curve estimation

    Authors: Dingding Hu, Meng Yuan, Tao Yu, Pengfei Li

    Abstract: The receiver operating characteristic (ROC) curve is a powerful statistical tool and has been widely applied in medical research. In ROC curve estimation, a commonly used assumption is that the larger the biomarker value, the greater the severity of the disease. In this paper, we mathematically interpret "greater severity of the disease" as "larger probability of being diseased". This in turn is equivalent…

    Submitted 22 February, 2023; v1 submitted 1 May, 2022; originally announced May 2022.

    Comments: 35 pages, 2 figures

  49. arXiv:2204.04567  [pdf, other]

    cs.CV cs.LG stat.ML

    Joint Distribution Matters: Deep Brownian Distance Covariance for Few-Shot Classification

    Authors: Jiangtao Xie, Fei Long, Jiaming Lv, Qilong Wang, Peihua Li

    Abstract: Few-shot classification is a challenging problem as only very few training examples are given for each new task. One of the effective research lines to address this challenge focuses on learning deep representations driven by a similarity measure between a query image and a few support images of some class. Statistically, this amounts to measuring the dependency of image features, viewed as random vec…

    Submitted 9 April, 2022; originally announced April 2022.

    Comments: Accepted to CVPR 2022 as an oral presentation. Equal contribution from first two authors

  50. arXiv:2203.10186  [pdf, other]

    stat.ML cs.LG

    A Class of Two-Timescale Stochastic EM Algorithms for Nonconvex Latent Variable Models

    Authors: Belhal Karimi, ** Li

    Abstract: The Expectation-Maximization (EM) algorithm is a popular choice for learning latent variable models. Variants of the EM have been initially introduced, using incremental updates to scale to large datasets, and using Monte Carlo (MC) approximations to bypass the intractable conditional expectation of the latent data for most nonconvex models. In this paper, we propose a general class of methods cal…

    Submitted 18 March, 2022; originally announced March 2022.