
Showing 1–50 of 228 results for author: Li, B

Searching in archive stat.
  1. arXiv:2406.16306  [pdf, other]

    cs.CL cs.LG stat.ML

    Cascade Reward Sampling for Efficient Decoding-Time Alignment

    Authors: Bolian Li, Yifan Wang, Ananth Grama, Ruqi Zhang

    Abstract: Aligning large language models (LLMs) with human preferences is critical for their deployment. Recently, decoding-time alignment has emerged as an effective plug-and-play technique that requires no fine-tuning of model parameters. However, generating text that achieves both high reward and high likelihood remains a significant challenge. Existing methods often fail to generate high-reward text or…

    Submitted 24 June, 2024; originally announced June 2024.

  2. arXiv:2406.06920  [pdf, other]

    stat.AP

    Where to place a mosquito trap for West Nile Virus surveillance?

    Authors: Anwesha Chakravarti, Bo Li, Dan Bartlett, Patrick Irwin, Rebecca Smith

    Abstract: The rapid spread of West Nile Virus (WNV) is a growing concern. With no vaccines or specific medications available, prevention through mosquito control is the only solution to curb the spread. Mosquito traps, used to detect viral presence in mosquito populations, are essential tools for WNV surveillance. But how do we decide where to place a mosquito trap? And what makes a good trap location, anyw…

    Submitted 10 June, 2024; originally announced June 2024.

    Comments: 22 pages, 9 figures

  3. arXiv:2405.04254  [pdf, ps, other]

    stat.ME

    Distributed variable screening for generalized linear models

    Authors: Tianbo Diao, Lianqiang Qu, Bo Li, Liuquan Sun

    Abstract: In this article, we develop a distributed variable screening method for generalized linear models. This method is designed to handle situations where both the sample size and the number of covariates are large. Specifically, the proposed method selects relevant covariates by using a sparsity-restricted surrogate likelihood estimator. It takes into account the joint effects of the covariates rather…

    Submitted 7 May, 2024; v1 submitted 7 May, 2024; originally announced May 2024.

  4. arXiv:2404.06735  [pdf, other]

    stat.ML cs.LG math.ST stat.AP stat.ME

    A Copula Graphical Model for Multi-Attribute Data using Optimal Transport

    Authors: Qi Zhang, Bing Li, Lingzhou Xue

    Abstract: Motivated by modern data forms such as images and multi-view data, the multi-attribute graphical model aims to explore the conditional independence structure among vectors. Under the Gaussian assumption, the conditional independence between vectors is characterized by blockwise zeros in the precision matrix. To relax the restrictive Gaussian assumption, in this paper, we introduce a novel semipara…

    Submitted 10 April, 2024; originally announced April 2024.

    Comments: 37 pages

  5. arXiv:2403.11348  [pdf, other]

    cs.LG cs.AI stat.ML

    COLEP: Certifiably Robust Learning-Reasoning Conformal Prediction via Probabilistic Circuits

    Authors: Mintong Kang, Nezihe Merve Gürel, Linyi Li, Bo Li

    Abstract: Conformal prediction has shown spurring performance in constructing statistically rigorous prediction sets for arbitrary black-box machine learning models, assuming the data is exchangeable. However, even small adversarial perturbations during the inference can violate the exchangeability assumption, challenge the coverage guarantees, and result in a subsequent decline in empirical coverage. In th…

    Submitted 17 March, 2024; originally announced March 2024.

    Comments: Accepted to ICLR 2024

  6. arXiv:2402.14260  [pdf, other]

    stat.ME

    Linear Discriminant Regularized Regression

    Authors: Xin Bing, Bingqing Li, Marten Wegkamp

    Abstract: Linear Discriminant Analysis (LDA) is an important classification approach. Its simple linear form makes it easy to interpret and it is capable to handle multi-class responses. It is closely related to other classical multivariate statistical techniques, such as Fisher's discriminant analysis, canonical correlation analysis and linear regression. In this paper we strengthen its connection to multi…

    Submitted 18 May, 2024; v1 submitted 21 February, 2024; originally announced February 2024.

  7. arXiv:2402.07355  [pdf, ps, other]

    math.ST cs.LG stat.ML

    Sampling from the Mean-Field Stationary Distribution

    Authors: Yunbum Kook, Matthew S. Zhang, Sinho Chewi, Murat A. Erdogdu, Mufan Bill Li

    Abstract: We study the complexity of sampling from the stationary distribution of a mean-field SDE, or equivalently, the complexity of minimizing a functional over the space of probability measures which includes an interaction term. Our main insight is to decouple the two key aspects of this problem: (1) approximation of the mean-field SDE via a finite-particle system, via uniform-in-time propagation of ch…

    Submitted 18 February, 2024; v1 submitted 11 February, 2024; originally announced February 2024.

  8. arXiv:2401.14657  [pdf, other]

    stat.AP cs.LG physics.ao-ph stat.ML

    Validating Climate Models with Spherical Convolutional Wasserstein Distance

    Authors: Robert C. Garrett, Trevor Harris, Bo Li, Zhuo Wang

    Abstract: The validation of global climate models is crucial to ensure the accuracy and efficacy of model output. We introduce the spherical convolutional Wasserstein distance to more comprehensively measure differences between climate models and reanalysis data. This new similarity measure accounts for spatial variability using convolutional projections and quantifies local differences in the distribution…

    Submitted 26 January, 2024; originally announced January 2024.

  9. arXiv:2401.13314  [pdf, other]

    q-fin.RM math.NA q-fin.CP stat.ML

    An Explicit Scheme for Pathwise XVA Computations

    Authors: Lokman Abbas-Turki, Stéphane Crépey, Botao Li, Bouazza Saadeddine

    Abstract: Motivated by the equations of cross valuation adjustments (XVAs) in the realistic case where capital is deemed fungible as a source of funding for variation margin, we introduce a simulation/regression scheme for a class of anticipated BSDEs, where the coefficient entails a conditional expected shortfall of the martingale part of the solution. The scheme is explicit in time and uses neural network…

    Submitted 24 January, 2024; originally announced January 2024.

  10. arXiv:2311.14042  [pdf, other]

    stat.ME

    Optimized Covariance Design for AB Test on Social Network under Interference

    Authors: Qianyi Chen, Bo Li, Lu Deng, Yong Wang

    Abstract: Online A/B tests have become increasingly popular and important for social platforms. However, accurately estimating the global average treatment effect (GATE) has proven to be challenging due to network interference, which violates the Stable Unit Treatment Value Assumption (SUTVA) and poses a great challenge to experimental design. Existing network experimental design research was mostly based o…

    Submitted 23 November, 2023; originally announced November 2023.

  11. arXiv:2311.00966  [pdf, other]

    cs.LG stat.ML

    Invariant-Feature Subspace Recovery: A New Class of Provable Domain Generalization Algorithms

    Authors: Haoxiang Wang, Gargi Balasubramaniam, Haozhe Si, Bo Li, Han Zhao

    Abstract: Domain generalization asks for models trained over a set of training environments to generalize well in unseen test environments. Recently, a series of algorithms such as Invariant Risk Minimization (IRM) have been proposed for domain generalization. However, Rosenfeld et al. (2021) shows that in a simple linear data model, even if non-convexity issues are ignored, IRM and its extensions cannot ge…

    Submitted 1 November, 2023; originally announced November 2023.

    Comments: Submitted to JMLR. This journal version significantly extends our ICML 2022 paper, arXiv:2201.12919

  12. arXiv:2310.13852  [pdf, other]

    cs.LG stat.ML

    Gradual Domain Adaptation: Theory and Algorithms

    Authors: Yifei He, Haoxiang Wang, Bo Li, Han Zhao

    Abstract: Unsupervised domain adaptation (UDA) adapts a model from a labeled source domain to an unlabeled target domain in a one-off way. Though widely applied, UDA faces a great challenge whenever the distribution shift between the source and the target is large. Gradual domain adaptation (GDA) mitigates this limitation by using intermediate domains to gradually adapt from the source to the target domain.…

    Submitted 20 October, 2023; originally announced October 2023.

    Comments: arXiv admin note: substantial text overlap with arXiv:2204.08200

  13. arXiv:2310.12079  [pdf, other]

    stat.ML cs.LG

    Differential Equation Scaling Limits of Shaped and Unshaped Neural Networks

    Authors: Mufan Bill Li, Mihai Nica

    Abstract: Recent analyses of neural networks with shaped activations (i.e. the activation function is scaled as the network size grows) have led to scaling limits described by differential equations. However, these results do not a priori tell us anything about "ordinary" unshaped networks, where the activation is unchanged as the network size grows. In this article, we find similar differential equation ba…

    Submitted 18 April, 2024; v1 submitted 18 October, 2023; originally announced October 2023.

  14. arXiv:2310.09639  [pdf, other]

    cs.LG cs.CR math.OC stat.ML

    DPZero: Private Fine-Tuning of Language Models without Backpropagation

    Authors: Liang Zhang, Bingcong Li, Kiran Koshy Thekumparampil, Sewoong Oh, Niao He

    Abstract: The widespread practice of fine-tuning large language models (LLMs) on domain-specific data faces two major challenges in memory and privacy. First, as the size of LLMs continues to grow, the memory demands of gradient-based training methods via backpropagation become prohibitively high. Second, given the tendency of LLMs to memorize training data, it is important to protect potentially sensitive…

    Submitted 6 June, 2024; v1 submitted 14 October, 2023; originally announced October 2023.

    Comments: ICML 2024

  15. arXiv:2310.07817  [pdf, other]

    stat.ME math.ST

    Nonlinear global Fréchet regression for random objects via weak conditional expectation

    Authors: Satarupa Bhattacharjee, Bing Li, Lingzhou Xue

    Abstract: Random objects are complex non-Euclidean data taking value in general metric space, possibly devoid of any underlying vector space structure. Such data are getting increasingly abundant with the rapid advancement in technology. Examples include probability distributions, positive semi-definite matrices, and data on Riemannian manifolds. However, except for regression for object-valued response wit…

    Submitted 11 October, 2023; originally announced October 2023.

    MSC Class: 62G05; 62J02; 62G08; 62J99

  16. arXiv:2310.07801  [pdf, other]

    cs.CV cs.AI stat.ME

    Trajectory-aware Principal Manifold Framework for Data Augmentation and Image Generation

    Authors: Elvis Han Cui, Bingbin Li, Yanan Li, Weng Kee Wong, Donghui Wang

    Abstract: Data augmentation for deep learning benefits model training, image transformation, medical imaging analysis and many other fields. Many existing methods generate new samples from a parametric distribution, like the Gaussian, with little attention to generate samples along the data manifold in either the input or feature space. In this paper, we verify that there are theoretical and practical advan…

    Submitted 30 July, 2023; originally announced October 2023.

    Comments: 20 figures

  17. arXiv:2310.05401  [pdf, other]

    cs.LG stat.ML

    Entropy-MCMC: Sampling from Flat Basins with Ease

    Authors: Bolian Li, Ruqi Zhang

    Abstract: Bayesian deep learning counts on the quality of posterior distribution estimation. However, the posterior of deep neural networks is highly multi-modal in nature, with local modes exhibiting varying generalization performance. Given a practical budget, targeting at the original posterior can lead to suboptimal performance, as some samples may become trapped in "bad" modes and suffer from overfitti…

    Submitted 25 March, 2024; v1 submitted 9 October, 2023; originally announced October 2023.

    Journal ref: ICLR 2024

  18. arXiv:2309.16620  [pdf, other]

    stat.ML cond-mat.dis-nn cs.AI cs.LG

    Depthwise Hyperparameter Transfer in Residual Networks: Dynamics and Scaling Limit

    Authors: Blake Bordelon, Lorenzo Noci, Mufan Bill Li, Boris Hanin, Cengiz Pehlevan

    Abstract: The cost of hyperparameter tuning in deep learning has been rising with model sizes, prompting practitioners to find new tuning methods using a proxy of smaller networks. One such proposal uses $μ$P parameterized networks, where the optimal hyperparameters for small width networks transfer to networks with arbitrarily large width. However, in this scheme, hyperparameters do not transfer across dep…

    Submitted 8 December, 2023; v1 submitted 28 September, 2023; originally announced September 2023.

  19. arXiv:2308.11037  [pdf, ps, other]

    math.ST stat.ME stat.ML

    On Exact Bayesian Credible Sets for Classification and Pattern Recognition

    Authors: Chaegeun Song, Bing Li

    Abstract: The current definition of a Bayesian credible set cannot, in general, achieve an arbitrarily preassigned credible level. This drawback is particularly acute for classification problems, where there are only a finite number of achievable credible levels. As a result, there is as of today no general way to construct an exact credible set for classification. In this paper, we introduce a generalized…

    Submitted 21 August, 2023; originally announced August 2023.

    Comments: 16 pages, 6 figures

  20. arXiv:2308.04561  [pdf, other]

    math.ST stat.ML

    Spectral Regularized Kernel Goodness-of-Fit Tests

    Authors: Omar Hagrass, Bharath K. Sriperumbudur, Bing Li

    Abstract: Maximum mean discrepancy (MMD) has enjoyed a lot of success in many machine learning and statistical applications, including non-parametric hypothesis testing, because of its ability to handle non-Euclidean data. Recently, it has been demonstrated in Balasubramanian et al.(2021) that the goodness-of-fit test based on MMD is not minimax optimal while a Tikhonov regularized version of it is, for an…

    Submitted 8 August, 2023; originally announced August 2023.

    Comments: 44 pages. arXiv admin note: text overlap with arXiv:2212.09201

    MSC Class: 62G10 (Primary); 65J20; 65J22; 46E22; 47A52 (Secondary)

  21. arXiv:2307.08685   

    stat.ME stat.AP

    Evaluating Climate Models with Sliced Elastic Distance

    Authors: Robert C. Garrett, Trevor Harris, Bo Li

    Abstract: The validation of global climate models plays a crucial role in ensuring the accuracy of climatological predictions. However, existing statistical methods for evaluating differences between climate fields often overlook time misalignment and therefore fail to distinguish between sources of variability. To more comprehensively measure differences between climate fields, we introduce a new vector-va…

    Submitted 25 January, 2024; v1 submitted 17 July, 2023; originally announced July 2023.

    Comments: Issue with the method having limitations for the application area

  22. arXiv:2307.04353  [pdf, other]

    stat.ML cs.LG

    On Sufficient Graphical Models

    Authors: Bing Li, Kyongwon Kim

    Abstract: We introduce a sufficient graphical model by applying the recently developed nonlinear sufficient dimension reduction techniques to the evaluation of conditional independence. The graphical model is nonparametric in nature, as it does not make distributional assumptions such as the Gaussian or copula Gaussian assumptions. However, unlike a fully nonparametric graphical model, which relies on the h…

    Submitted 10 July, 2023; originally announced July 2023.

  23. arXiv:2307.01080  [pdf]

    stat.AP

    Mobility Behaviors Shift Disparity in Flood Exposure in U.S. Population Groups

    Authors: Bo Li, Chan Fan, Yu-Heng Chien, Chia-Wei Xsu, Ali Mostafavi

    Abstract: Current characterization of flood exposure is largely based on residential location of populations; however, location of residence only partially captures the extent to which populations are exposed to flood. An important, though yet under-recognized aspect of flood exposure is associated with human mobility patterns and population visitation to places located in flood prone areas. This study anal…

    Submitted 3 July, 2023; originally announced July 2023.

  24. arXiv:2306.17759  [pdf, other]

    stat.ML cs.LG

    The Shaped Transformer: Attention Models in the Infinite Depth-and-Width Limit

    Authors: Lorenzo Noci, Chuning Li, Mufan Bill Li, Bobby He, Thomas Hofmann, Chris Maddison, Daniel M. Roy

    Abstract: In deep learning theory, the covariance matrix of the representations serves as a proxy to examine the network's trainability. Motivated by the success of Transformers, we study the covariance matrix of a modified Softmax-based attention model with skip connections in the proportional limit of infinite-depth-and-width. We show that at initialization the limiting distribution can be described by a…

    Submitted 9 December, 2023; v1 submitted 30 June, 2023; originally announced June 2023.

  25. arXiv:2306.10594  [pdf, other]

    math.ST stat.ME

    A nonparametric test for elliptical distribution based on kernel embedding of probabilities

    Authors: Yin Tang, Bing Li

    Abstract: Elliptical distribution is a basic assumption underlying many multivariate statistical methods. For example, in sufficient dimension reduction and statistical graphical models, this assumption is routinely imposed to simplify the data dependence structure. Before applying such methods, we need to decide whether the data are elliptically distributed. Currently existing tests either focus exclusivel…

    Submitted 26 March, 2024; v1 submitted 18 June, 2023; originally announced June 2023.

    Comments: 25 pages, 3 figures

    MSC Class: Primary 62G10; 62G20; secondary 62H10

  26. arXiv:2306.09694  [pdf, ps, other]

    math.OC cs.LG math.NA stat.ML

    Linear convergence of forward-backward accelerated algorithms without knowledge of the modulus of strong convexity

    Authors: Bowen Li, Bin Shi, Ya-xiang Yuan

    Abstract: A significant milestone in modern gradient-based optimization was achieved with the development of Nesterov's accelerated gradient descent (NAG) method. This forward-backward technique has been further advanced with the introduction of its proximal generalization, commonly known as the fast iterative shrinkage-thresholding algorithm (FISTA), which enjoys widespread application in image science and…

    Submitted 8 April, 2024; v1 submitted 16 June, 2023; originally announced June 2023.

    Comments: 17 pages, 3 figures; To appear in SIAM Journal on Optimization

  27. arXiv:2306.01271  [pdf, other]

    cs.LG stat.ML

    Towards Understanding Clean Generalization and Robust Overfitting in Adversarial Training

    Authors: Binghui Li, Yuanzhi Li

    Abstract: Similar to surprising performance in the standard deep learning, deep nets trained by adversarial training also generalize well for $\textit{unseen clean data (natural data)}$. However, despite adversarial training can achieve low robust training error, there exists a significant $\textit{robust generalization gap}$. We call this phenomenon the…

    Submitted 5 February, 2024; v1 submitted 2 June, 2023; originally announced June 2023.

    Comments: 28 pages, comments welcome

  28. arXiv:2305.17583  [pdf, other]

    stat.ML cs.LG

    On Neural Networks as Infinite Tree-Structured Probabilistic Graphical Models

    Authors: Boyao Li, Alexandar J. Thomson, Matthew M. Engelhard, David Page

    Abstract: Deep neural networks (DNNs) lack the precise semantics and definitive probabilistic interpretation of probabilistic graphical models (PGMs). In this paper, we propose an innovative solution by constructing infinite tree-structured PGMs that correspond exactly to neural networks. Our research reveals that DNNs, during forward propagation, indeed perform approximations of PGM inference that are prec…

    Submitted 1 March, 2024; v1 submitted 27 May, 2023; originally announced May 2023.

  29. arXiv:2305.07642  [pdf, other]

    cs.CV cs.AI cs.LG stat.ML

    The ASNR-MICCAI Brain Tumor Segmentation (BraTS) Challenge 2023: Intracranial Meningioma

    Authors: Dominic LaBella, Maruf Adewole, Michelle Alonso-Basanta, Talissa Altes, Syed Muhammad Anwar, Ujjwal Baid, Timothy Bergquist, Radhika Bhalerao, Sully Chen, Verena Chung, Gian-Marco Conte, Farouk Dako, James Eddy, Ivan Ezhov, Devon Godfrey, Fathi Hilal, Ariana Familiar, Keyvan Farahani, Juan Eugenio Iglesias, Zhifan Jiang, Elaine Johanson, Anahita Fathi Kazerooni, Collin Kent, John Kirkpatrick, Florian Kofler , et al. (35 additional authors not shown)

    Abstract: Meningiomas are the most common primary intracranial tumor in adults and can be associated with significant morbidity and mortality. Radiologists, neurosurgeons, neuro-oncologists, and radiation oncologists rely on multiparametric MRI (mpMRI) for diagnosis, treatment planning, and longitudinal treatment monitoring; yet automated, objective, and quantitative tools for non-invasive assessment of men…

    Submitted 12 May, 2023; originally announced May 2023.

  30. arXiv:2303.11536  [pdf, other]

    cs.LG cs.AI cs.CV math.ST stat.ML

    Indeterminate Probability Neural Network

    Authors: Tao Yang, Chuang Liu, Xiaofeng Ma, Weijia Lu, Ning Wu, Bingyang Li, Zhifei Yang, Peng Liu, Lin Sun, Xiaodong Zhang, Can Zhang

    Abstract: We propose a new general model called IPNN - Indeterminate Probability Neural Network, which combines neural network and probability theory together. In the classical probability theory, the calculation of probability is based on the occurrence of events, which is hardly used in current neural networks. In this paper, we propose a new general probability theory, which is an extension of classical…

    Submitted 20 March, 2023; originally announced March 2023.

    Comments: 13 pages

  31. arXiv:2303.06075  [pdf, other]

    cs.LG cs.CV stat.ML

    Long-tailed Classification from a Bayesian-decision-theory Perspective

    Authors: Bolian Li, Ruqi Zhang

    Abstract: Long-tailed classification poses a challenge due to its heavy imbalance in class probabilities and tail-sensitivity risks with asymmetric misprediction costs. Recent attempts have used re-balancing loss and ensemble methods, but they are largely heuristic and depend heavily on empirical results, lacking theoretical explanation. Furthermore, existing methods overlook the decision loss, which charac…

    Submitted 20 March, 2023; v1 submitted 10 March, 2023; originally announced March 2023.

  32. arXiv:2302.10409  [pdf, other]

    stat.ML cs.LG

    Mean Parity Fair Regression in RKHS

    Authors: Shaokui Wei, Jiayin Liu, Bing Li, Hongyuan Zha

    Abstract: We study the fair regression problem under the notion of Mean Parity (MP) fairness, which requires the conditional mean of the learned function output to be constant with respect to the sensitive attributes. We address this problem by leveraging reproducing kernel Hilbert space (RKHS) to construct the functional space whose members are guaranteed to satisfy the fairness constraints. The proposed f…

    Submitted 20 February, 2023; originally announced February 2023.

    Comments: Accepted by AISTATS 2023. Code available at https://github.com/shawkui/MP_Fair_Regression

  33. arXiv:2302.08049  [pdf, ps, other]

    math.ST stat.ML

    Improved Discretization Analysis for Underdamped Langevin Monte Carlo

    Authors: Matthew Zhang, Sinho Chewi, Mufan Bill Li, Krishnakumar Balasubramanian, Murat A. Erdogdu

    Abstract: Underdamped Langevin Monte Carlo (ULMC) is an algorithm used to sample from unnormalized densities by leveraging the momentum of a particle moving in a potential well. We provide a novel analysis of ULMC, motivated by two central questions: (1) Can we obtain improved sampling guarantees beyond strong log-concavity? (2) Can we achieve acceleration for sampling? For (1), prior results for ULMC onl…

    Submitted 15 February, 2023; originally announced February 2023.

  34. arXiv:2302.02092  [pdf, other]

    cs.LG stat.ML

    Interpolation for Robust Learning: Data Augmentation on Wasserstein Geodesics

    Authors: Jiacheng Zhu, Jielin Qiu, Aritra Guha, Zhuolin Yang, Xuanlong Nguyen, Bo Li, Ding Zhao

    Abstract: We propose to study and promote the robustness of a model as per its performance through the interpolation of training data distributions. Specifically, (1) we augment the data by finding the worst-case Wasserstein barycenter on the geodesic connecting subpopulation distributions of different categories. (2) We regularize the model for smoother performance on the continuous geodesic path connectin…

    Submitted 28 August, 2023; v1 submitted 3 February, 2023; originally announced February 2023.

    Comments: 34 pages, 3 figures, 18 tables

    Journal ref: Proceedings of the 40th International Conference on Machine Learning, PMLR 202:43129-43157, 2023

  35. arXiv:2212.11367  [pdf, other]

    q-bio.PE cs.LG q-bio.QM stat.AP

    Forecasting West Nile Virus with Graph Neural Networks: Harnessing Spatial Dependence in Irregularly Sampled Geospatial Data

    Authors: Adam Tonks, Trevor Harris, Bo Li, William Brown, Rebecca Smith

    Abstract: Machine learning methods have seen increased application to geospatial environmental problems, such as precipitation nowcasting, haze forecasting, and crop yield prediction. However, many of the machine learning methods applied to mosquito population and disease forecasting do not inherently take into account the underlying spatial structure of the given data. In our work, we apply a spatially awa…

    Submitted 21 December, 2022; originally announced December 2022.

  36. arXiv:2212.09201  [pdf, other]

    math.ST cs.LG stat.ML

    Spectral Regularized Kernel Two-Sample Tests

    Authors: Omar Hagrass, Bharath K. Sriperumbudur, Bing Li

    Abstract: Over the last decade, an approach that has gained a lot of popularity to tackle nonparametric testing problems on general (i.e., non-Euclidean) domains is based on the notion of reproducing kernel Hilbert space (RKHS) embedding of probability distributions. The main goal of our work is to understand the optimality of two-sample tests constructed based on this approach. First, we show the popular M…

    Submitted 1 May, 2024; v1 submitted 18 December, 2022; originally announced December 2022.

    Comments: 75 pages, to be published in the Annals of Statistics

    MSC Class: Primary: 62G10; Secondary: 65J20; 65J22; 46E22; 47A52

  37. arXiv:2212.06319  [pdf, ps, other]

    math.OC cs.LG math.ST stat.ML

    Linear Convergence of ISTA and FISTA

    Authors: Bowen Li, Bin Shi, Ya-xiang Yuan

    Abstract: In this paper, we revisit the class of iterative shrinkage-thresholding algorithms (ISTA) for solving the linear inverse problem with sparse representation, which arises in signal and image processing. It is shown in the numerical experiment to deblur an image that the convergence behavior in the logarithmic-scale ordinate tends to be linear instead of logarithmic, approximating to be flat. Making…

    Submitted 14 January, 2023; v1 submitted 12 December, 2022; originally announced December 2022.

    Comments: 16 pages, 4 figures

  38. arXiv:2211.10008  [pdf, other]

    cs.AI stat.ME

    Confounder Balancing for Instrumental Variable Regression with Latent Variable

    Authors: Anpeng Wu, Kun Kuang, Ruoxuan Xiong, Bo Li, Fei Wu

    Abstract: This paper studies the confounding effects from the unmeasured confounders and the imbalance of observed confounders in IV regression and aims at unbiased causal effect estimation. Recently, nonlinear IV estimators were proposed to allow for nonlinear model in both stages. However, the observed confounders may be imbalanced in stage 2, which could still lead to biased treatment effect estimation i…

    Submitted 17 November, 2022; originally announced November 2022.

  39. arXiv:2211.01610  [pdf, ps, other]

    math.OC cs.LG math.ST stat.ML

    Proximal Subgradient Norm Minimization of ISTA and FISTA

    Authors: Bowen Li, Bin Shi, Ya-xiang Yuan

    Abstract: For first-order smooth optimization, the research on the acceleration phenomenon has a long-time history. Until recently, the mechanism leading to acceleration was not successfully uncovered by the gradient correction term and its equivalent implicit-velocity form. Furthermore, based on the high-resolution differential equation framework with the corresponding emerging techniques, phase-space repr…

    Submitted 3 November, 2022; originally announced November 2022.

    Comments: 17 pages, 4 figures

  40. arXiv:2210.11620  [pdf, other]

    cs.LG stat.ML

    LOT: Layer-wise Orthogonal Training on Improving $\ell_2$ Certified Robustness

    Authors: Xiaojun Xu, Linyi Li, Bo Li

    Abstract: Recent studies show that training deep neural networks (DNNs) with Lipschitz constraints are able to enhance adversarial robustness and other model properties such as stability. In this paper, we propose a layer-wise orthogonal training method (LOT) to effectively train 1-Lipschitz convolution layers via parametrizing an orthogonal matrix with an unconstrained matrix. We then efficiently compute t…

    Submitted 26 March, 2023; v1 submitted 20 October, 2022; originally announced October 2022.

    Comments: NeurIPS 2022

  41. arXiv:2210.01980  [pdf, other]

    stat.ME

    Robust Estimation of Loss-Based Measures of Model Performance under Covariate Shift

    Authors: Samantha Morrison, Constantine Gatsonis, Issa J. Dahabreh, Bing Li, Jon A. Steingrimsson

    Abstract: We present methods for estimating loss-based measures of the performance of a prediction model in a target population that differs from the source population in which the model was developed, in settings where outcome and covariate data are available from the source population but only covariate data are available on a simple random sample from the target population. Prior work adjusting for diffe…

    Submitted 4 October, 2022; originally announced October 2022.

  42. arXiv:2208.10912  [pdf, other]

    cs.AI stat.ME

    Learning Instrumental Variable from Data Fusion for Treatment Effect Estimation

    Authors: Anpeng Wu, Kun Kuang, Ruoxuan Xiong, Minqing Zhu, Yuxuan Liu, Bo Li, Furui Liu, Zhihua Wang, Fei Wu

    Abstract: The advent of the big data era brought new opportunities and challenges to draw treatment effect in data fusion, that is, a mixed dataset collected from multiple sources (each source with an independent treatment assignment mechanism). Due to possibly omitted source labels and unmeasured confounders, traditional methods cannot estimate individual treatment assignment probability and infer treatmen…

    Submitted 7 December, 2022; v1 submitted 23 August, 2022; originally announced August 2022.

  43. arXiv:2208.06245  [pdf, ps, other]

    cs.LG cond-mat.dis-nn physics.soc-ph stat.ML

    Understanding the stochastic dynamics of sequential decision-making processes: A path-integral analysis of multi-armed bandits

    Authors: Bo Li, Chi Ho Yeung

    Abstract: The multi-armed bandit (MAB) model is one of the most classical models to study decision-making in an uncertain environment. In this model, a player chooses one of $K$ possible arms of a bandit machine to play at each time step, where the corresponding arm returns a random reward to the player, potentially from a specific unknown distribution. The target of the player is to collect as many rewards…

    Submitted 10 June, 2023; v1 submitted 11 August, 2022; originally announced August 2022.

    Journal ref: Chaos 33, 063107 (2023)

  44. arXiv:2208.05740  [pdf, other]

    cs.LG cs.CR cs.CV math.OC stat.ML

    General Cutting Planes for Bound-Propagation-Based Neural Network Verification

    Authors: Huan Zhang, Shiqi Wang, Kaidi Xu, Linyi Li, Bo Li, Suman Jana, Cho-Jui Hsieh, J. Zico Kolter

    Abstract: Bound propagation methods, when combined with branch and bound, are among the most effective methods to formally verify properties of deep neural networks such as correctness, robustness, and safety. However, existing works cannot handle the general form of cutting plane constraints widely accepted in traditional solvers, which are crucial for strengthening verifiers with tightened convex relaxati…

    Submitted 4 December, 2022; v1 submitted 11 August, 2022; originally announced August 2022.

    Comments: Accepted by NeurIPS 2022. GCP-CROWN is part of the alpha-beta-CROWN verifier, the VNN-COMP 2022 winner

  45. arXiv:2208.01220  [pdf, other]

    stat.ML cs.LG eess.SP

    GeoECG: Data Augmentation via Wasserstein Geodesic Perturbation for Robust Electrocardiogram Prediction

    Authors: Jiacheng Zhu, Jielin Qiu, Zhuolin Yang, Douglas Weber, Michael A. Rosenberg, Emerson Liu, Bo Li, Ding Zhao

    Abstract: There has been an increased interest in applying deep neural networks to automatically interpret and analyze the 12-lead electrocardiogram (ECG). The current paradigms with machine learning methods are often limited by the amount of labeled data. This phenomenon is particularly problematic for clinically-relevant data, where labeling at scale can be time-consuming and costly in terms of the specia…

    Submitted 10 August, 2022; v1 submitted 1 August, 2022; originally announced August 2022.

    Comments: 26 pages, Figure 13, Machine Learning for Healthcare 2022

    Journal ref: Machine Learning for Healthcare 2022, JMLR Volume 182

  46. arXiv:2207.09081  [pdf, other]

    cs.LG cs.AI cs.RO stat.ME

    Generalizing Goal-Conditioned Reinforcement Learning with Variational Causal Reasoning

    Authors: Wenhao Ding, Haohong Lin, Bo Li, Ding Zhao

    Abstract: As a pivotal component to attaining generalizable solutions in human intelligence, reasoning provides great potential for reinforcement learning (RL) agents' generalization towards varied goals by summarizing part-to-whole arguments and discovering cause-and-effect relations. However, how to discover and represent causalities remains a huge gap that hinders the development of causal RL. In this pa…

    Submitted 17 May, 2023; v1 submitted 19 July, 2022; originally announced July 2022.

    Comments: Accepted to NeurIPS 2022

  47. arXiv:2207.08211  [pdf, other]

    stat.ME math.ST

    Nonlinear function-on-function regression by RKHS

    Authors: Peijun Sang, Bing Li

    Abstract: We propose a nonlinear function-on-function regression model where both the covariate and the response are random functions. The nonlinear regression is carried out in two steps: we first construct Hilbert spaces to accommodate the functional covariate and the functional response, and then build a second-layer Hilbert space for the covariate to capture nonlinearity. The second-layer space is assum…

    Submitted 17 July, 2022; originally announced July 2022.

  48. arXiv:2207.04613  [pdf, other]

    stat.ME math.ST stat.ML

    Nonlinear Sufficient Dimension Reduction for Distribution-on-Distribution Regression

    Authors: Qi Zhang, Bing Li, Lingzhou Xue

    Abstract: We introduce a new approach to nonlinear sufficient dimension reduction in cases where both the predictor and the response are distributional data, modeled as members of a metric space. Our key step is to build universal kernels (cc-universal) on the metric spaces, which results in reproducing kernel Hilbert spaces for the predictor and response that are rich enough to characterize the conditional…

    Submitted 24 April, 2023; v1 submitted 11 July, 2022; originally announced July 2022.

    Comments: 36 pages

  49. arXiv:2206.08120  [pdf, other]

    stat.ME

    Simultaneous Estimation of Graphical Models by Neighborhood Selection

    Authors: Ilias Moysidis, Bing Li

    Abstract: In many applications concerning statistical graphical models the data originate from several subpopulations that share similarities but have also significant differences. This raises the question of how to estimate several graphical models simultaneously. Compiling all the data together to estimate a single graph would ignore the differences among subpopulations. On the other hand, estimating a gr…

    Submitted 16 June, 2022; originally announced June 2022.

  50. arXiv:2206.02768  [pdf, other]

    stat.ML cs.LG

    The Neural Covariance SDE: Shaped Infinite Depth-and-Width Networks at Initialization

    Authors: Mufan Bill Li, Mihai Nica, Daniel M. Roy

    Abstract: The logit outputs of a feedforward neural network at initialization are conditionally Gaussian, given a random covariance matrix defined by the penultimate layer. In this work, we study the distribution of this random matrix. Recent work has shown that shaping the activation function as network depth grows large is necessary for this covariance matrix to be non-degenerate. However, the current inf…

    Submitted 14 June, 2023; v1 submitted 6 June, 2022; originally announced June 2022.

    Comments: 48 pages, 10 figures. Advances in Neural Information Processing Systems (2022)