
Showing 1–50 of 802 results for author: Li, Y

Searching in archive stat. Search in all archives.
  1. arXiv:2407.01621  [pdf, other]

    cs.LG q-bio.QM stat.ME stat.ML

    Deciphering interventional dynamical causality from non-intervention systems

    Authors: Jifan Shi, Yang Li, Juan Zhao, Siyang Leng, Kazuyuki Aihara, Luonan Chen, Wei Lin

    Abstract: Detecting and quantifying causality is a focal topic in the fields of science, engineering, and interdisciplinary studies. However, causal studies on non-intervention systems attract much attention but remain extremely challenging. To address this challenge, we propose a framework named Interventional Dynamical Causality (IntDC) for such non-intervention systems, along with its computational crite…

    Submitted 28 June, 2024; originally announced July 2024.

  2. arXiv:2406.17698  [pdf, other]

    stat.ML cs.LG

    Identifying Nonstationary Causal Structures with High-Order Markov Switching Models

    Authors: Carles Balsells-Rodas, Yixin Wang, Pedro A. M. Mediano, Yingzhen Li

    Abstract: Causal discovery in time series is a rapidly evolving field with a wide variety of applications in other areas such as climate science and neuroscience. Traditional approaches assume a stationary causal graph, which can be adapted to nonstationary time series with time-dependent effects or heterogeneous noise. In this work we address nonstationarity via regime-dependent causal structures. We first…

    Submitted 25 June, 2024; originally announced June 2024.

    Comments: CI4TS Workshop @UAI2024

  3. arXiv:2406.11666  [pdf, other]

    math.ST cs.LG stat.ML

    ROTI-GCV: Generalized Cross-Validation for right-ROTationally Invariant Data

    Authors: Kevin Luo, Yufan Li, Pragya Sur

    Abstract: Two key tasks in high-dimensional regularized regression are tuning the regularization strength for good predictions and estimating the out-of-sample risk. It is known that the standard approach -- $k$-fold cross-validation -- is inconsistent in modern high-dimensional settings. While leave-one-out and generalized cross-validation remain consistent in some high-dimensional cases, they become incon…

    Submitted 17 June, 2024; originally announced June 2024.

    Comments: 25 pages, 3 figures

  4. arXiv:2406.11501  [pdf, other]

    cs.LG cs.AI stat.ME

    Teleporter Theory: A General and Simple Approach for Modeling Cross-World Counterfactual Causality

    Authors: Jiangmeng Li, Bin Qin, Qirui Ji, Yi Li, Wenwen Qiang, Jianwen Cao, Fanjiang Xu

    Abstract: Leveraging the development of structural causal model (SCM), researchers can establish graphical models for exploring the causal mechanisms behind machine learning techniques. As the complexity of machine learning applications rises, single-world interventionism causal analysis encounters theoretical adaptation limitations. Accordingly, cross-world counterfactual approach extends our understanding…

    Submitted 18 June, 2024; v1 submitted 17 June, 2024; originally announced June 2024.

  5. arXiv:2406.11490  [pdf, other]

    cs.LG stat.ME

    Interventional Imbalanced Multi-Modal Representation Learning via $β$-Generalization Front-Door Criterion

    Authors: Yi Li, Jiangmeng Li, Fei Song, Qingmeng Zhu, Changwen Zheng, Wenwen Qiang

    Abstract: Multi-modal methods establish comprehensive superiority over uni-modal methods. However, the imbalanced contributions of different modalities to task-dependent predictions constantly degrade the discriminative performance of canonical multi-modal methods. Based on the contribution to task-dependent predictions, modalities can be identified as predominant and auxiliary modalities. Benchmark methods…

    Submitted 17 June, 2024; originally announced June 2024.

  6. arXiv:2406.10262  [pdf, other]

    cs.IR cs.AI math.OC stat.CO

    Fast solution to the fair ranking problem using the Sinkhorn algorithm

    Authors: Yuki Uehara, Shunnosuke Ikeda, Naoki Nishimura, Koya Ohashi, Yilin Li, Jie Yang, Deddy Jobson, Xingxia Zha, Takeshi Matsumoto, Noriyoshi Sukegawa, Yuichi Takano

    Abstract: In two-sided marketplaces such as online flea markets, recommender systems for providing consumers with personalized item rankings play a key role in promoting transactions between providers and consumers. Meanwhile, two-sided marketplaces face the problem of balancing consumer satisfaction and fairness among items to stimulate activity of item providers. Saito and Joachims (2022) devised an impac…

    Submitted 10 June, 2024; originally announced June 2024.

  7. arXiv:2406.09694  [pdf, other]

    stat.ML cs.LG

    An Efficient Approach to Regression Problems with Tensor Neural Networks

    Authors: Yongxin Li

    Abstract: This paper introduces a tensor neural network (TNN) to address nonparametric regression problems. Characterized by its distinct sub-network structure, the TNN effectively facilitates variable separation, thereby enhancing the approximation of complex, unknown functions. Our comparative analysis reveals that the TNN outperforms conventional Feed-Forward Networks (FFN) and Radial Basis Function Netw…

    Submitted 13 June, 2024; originally announced June 2024.

    MSC Class: 62J02; 68T05

  8. arXiv:2406.04374  [pdf, other]

    cs.IR cs.GT cs.LG stat.ML

    Dynamic Online Recommendation for Two-Sided Market with Bayesian Incentive Compatibility

    Authors: Yuantong Li, Guang Cheng, Xiaowu Dai

    Abstract: Recommender systems play a crucial role in internet economies by connecting users with relevant products or services. However, designing effective recommender systems faces two key challenges: (1) the exploration-exploitation tradeoff in balancing new product exploration against exploiting known preferences, and (2) dynamic incentive compatibility in accounting for users' self-interested behaviors…

    Submitted 4 June, 2024; originally announced June 2024.

  9. arXiv:2406.03707  [pdf, other]

    cs.LG cs.AI cs.CL stat.ML

    What Should Embeddings Embed? Autoregressive Models Represent Latent Generating Distributions

    Authors: Liyi Zhang, Michael Y. Li, Thomas L. Griffiths

    Abstract: Autoregressive language models have demonstrated a remarkable ability to extract latent structure from text. The embeddings from large language models have been shown to capture aspects of the syntax and semantics of language. But what {\em should} embeddings represent? We connect the autoregressive prediction objective to the idea of constructing predictive sufficient statistics to summarize the…

    Submitted 5 June, 2024; originally announced June 2024.

    Comments: 15 pages, 8 figures

    ACM Class: I.2; I.5

  10. arXiv:2406.00574  [pdf, other]

    math.OC math.ST stat.ML

    On the Convergence Rates of Set Membership Estimation of Linear Systems with Disturbances Bounded by General Convex Sets

    Authors: Haonan Xu, Yingying Li

    Abstract: This paper studies the uncertainty set estimation of system parameters of linear dynamical systems with bounded disturbances, which is motivated by robust (adaptive) constrained control. Departing from the confidence bounds of least square estimation from the machine-learning literature, this paper focuses on a method commonly used in (robust constrained) control literature: set membership estimat…

    Submitted 1 June, 2024; originally announced June 2024.

  11. arXiv:2406.00416  [pdf, other]

    stat.ML cs.LG eess.SP

    Representation and De-interleaving of Mixtures of Hidden Markov Processes

    Authors: Jiadi Bao, Mengtao Zhu, Yunjie Li, Shafei Wang

    Abstract: De-interleaving of the mixtures of Hidden Markov Processes (HMPs) generally depends on its representation model. Existing representation models consider Markov chain mixtures rather than hidden Markov, resulting in the lack of robustness to non-ideal situations such as observation noise or missing observations. Besides, de-interleaving methods utilize a search-based strategy, which is time-consumi…

    Submitted 1 June, 2024; originally announced June 2024.

    Comments: 13 pages, 9 figures, submitted to IEEE transactions on Signal Processing

  12. arXiv:2406.00196  [pdf, other]

    stat.ME stat.AP

    A Seamless Phase II/III Design with Dose Optimization for Oncology Drug Development

    Authors: Yuhan Li, Yiding Zhang, Gu Mi, Ji Lin

    Abstract: The US FDA's Project Optimus initiative that emphasizes dose optimization prior to marketing approval represents a pivotal shift in oncology drug development. It has a ripple effect for rethinking what changes may be made to conventional pivotal trial designs to incorporate a dose optimization component. Aligned with this initiative, we propose a novel Seamless Phase II/III Design with Dose Optimi…

    Submitted 31 May, 2024; originally announced June 2024.

  13. arXiv:2405.18722  [pdf, other]

    stat.ME

    Adaptive and Efficient Learning with Blockwise Missing and Semi-Supervised Data

    Authors: Yiming Li, Xuehan Yang, Ying Wei, Molei Liu

    Abstract: Data fusion is an important way to realize powerful and generalizable analyses across multiple sources. However, different capability of data collection across the sources has become a prominent issue in practice. This could result in the blockwise missingness (BM) of covariates troublesome for integration. Meanwhile, the high cost of obtaining gold-standard labels can cause the missingness of res…

    Submitted 28 May, 2024; originally announced May 2024.

  14. arXiv:2405.09362  [pdf, other]

    stat.ML cs.LG

    On the Saturation Effect of Kernel Ridge Regression

    Authors: Yicheng Li, Haobo Zhang, Qian Lin

    Abstract: The saturation effect refers to the phenomenon that the kernel ridge regression (KRR) fails to achieve the information theoretical lower bound when the smoothness of the underground truth function exceeds certain level. The saturation effect has been widely observed in practices and a saturation lower bound of KRR has been conjectured for decades. In this paper, we provide a proof of this long-sta…

    Submitted 28 May, 2024; v1 submitted 15 May, 2024; originally announced May 2024.

    Comments: ICLR 2023; Minor errors are corrected in this version

  15. arXiv:2405.03073  [pdf, other]

    math.OC stat.ML

    Convergence and Complexity Guarantee for Inexact First-order Riemannian Optimization Algorithms

    Authors: Yuchen Li, Laura Balzano, Deanna Needell, Hanbaek Lyu

    Abstract: We analyze inexact Riemannian gradient descent (RGD) where Riemannian gradients and retractions are inexactly (and cheaply) computed. Our focus is on understanding when inexact RGD converges and what is the complexity in the general nonconvex and constrained setting. We answer these questions in a general framework of tangential Block Majorization-Minimization (tBMM). We establish that tBMM conver…

    Submitted 9 May, 2024; v1 submitted 5 May, 2024; originally announced May 2024.

    Comments: 23 pages, 5 figures. ICML 2024. Appendix revised

  16. arXiv:2405.01251  [pdf, other]

    cs.LG stat.ML

    Revisiting semi-supervised training objectives for differentiable particle filters

    Authors: Jiaxi Li, John-Joseph Brady, Xiongjie Chen, Yunpeng Li

    Abstract: Differentiable particle filters combine the flexibility of neural networks with the probabilistic nature of sequential Monte Carlo methods. However, traditional approaches rely on the availability of labelled data, i.e., the ground truth latent state information, which is often difficult to obtain in real-world applications. This paper compares the effectiveness of two semi-supervised training obj…

    Submitted 2 May, 2024; originally announced May 2024.

    Comments: 5 pages, 2 figures

    MSC Class: 65C05; 62M20; 62M45; 62M05

  17. arXiv:2405.00742  [pdf, other]

    cs.CR cs.LG stat.ML

    Federated Graph Learning for EV Charging Demand Forecasting with Personalization Against Cyberattacks

    Authors: Yi Li, Renyou Xie, Chaojie Li, Yi Wang, Zhaoyang Dong

    Abstract: Mitigating cybersecurity risk in electric vehicle (EV) charging demand forecasting plays a crucial role in the safe operation of collective EV chargings, the stability of the power grid, and the cost-effective infrastructure expansion. However, existing methods either suffer from the data privacy issue and the susceptibility to cyberattacks or fail to consider the spatial correlation among differe…

    Submitted 30 April, 2024; originally announced May 2024.

    Comments: 11 pages, 4 figures

  18. arXiv:2404.15073  [pdf]

    stat.ME

    The Complex Estimand of Clone-Censor-Weighting When Studying Treatment Initiation Windows

    Authors: Michael Webster-Clark, Yi Li, Sophie Dell Aniello, Robert W. Platt

    Abstract: Clone-censor-weighting (CCW) is an analytic method for studying treatment regimens that are indistinguishable from one another at baseline without relying on landmark dates or creating immortal person time. One particularly interesting CCW application is estimating outcomes when starting treatment within specific time windows in observational data (e.g., starting a treatment within 30 days of hosp…

    Submitted 23 April, 2024; originally announced April 2024.

  19. arXiv:2404.14786  [pdf, other]

    cs.AI cs.LG stat.ME

    RealTCD: Temporal Causal Discovery from Interventional Data with Large Language Model

    Authors: Peiwen Li, Xin Wang, Zeyang Zhang, Yuan Meng, Fang Shen, Yue Li, Jialong Wang, Yang Li, Wenweu Zhu

    Abstract: In the field of Artificial Intelligence for Information Technology Operations, causal discovery is pivotal for operation and maintenance of graph construction, facilitating downstream industrial tasks such as root cause analysis. Temporal causal discovery, as an emerging method, aims to identify temporal causal relationships between variables directly from observations by utilizing interventional…

    Submitted 26 May, 2024; v1 submitted 23 April, 2024; originally announced April 2024.

  20. arXiv:2404.12481  [pdf, other]

    stat.ML cs.LG

    Understanding Optimal Feature Transfer via a Fine-Grained Bias-Variance Analysis

    Authors: Yufan Li, Subhabrata Sen, Ben Adlam

    Abstract: In the transfer learning paradigm models learn useful representations (or features) during a data-rich pretraining stage, and then use the pretrained representation to improve model performance on data-scarce downstream tasks. In this work, we explore transfer learning with the goal of optimizing downstream performance. We introduce a simple linear model that takes as input an arbitrary pretrained…

    Submitted 18 April, 2024; originally announced April 2024.

  21. arXiv:2404.11713  [pdf, ps, other]

    stat.ME

    Propensity Score Analysis with Guaranteed Subgroup Balance

    Authors: Yan Li, Yong-Fang Kuo, Liang Li

    Abstract: Estimating the causal treatment effects by subgroups is important in observational studies when the treatment effect heterogeneity may be present. Existing propensity score methods rely on a correctly specified propensity score model. Model misspecification results in biased treatment effect estimation and covariate imbalance. We proposed a new algorithm, the propensity score analysis with guarant…

    Submitted 17 April, 2024; originally announced April 2024.

  22. arXiv:2404.06168  [pdf]

    stat.AP

    Protection of Guizhou Miao Batik Culture Based on Knowledge Graph and Deep Learning

    Authors: Huafeng Quan, Yiting Li, Dashuai Liu, Yue Zhou

    Abstract: In the globalization trend, China's cultural heritage is in danger of gradually disappearing. The protection and inheritance of these precious cultural resources has become a critical task. This paper focuses on the Miao batik culture in Guizhou Province, China, and explores the application of knowledge graphs, natural language processing, and deep learning techniques in the promotion and protecti…

    Submitted 9 April, 2024; originally announced April 2024.

  23. arXiv:2404.06055  [pdf, ps, other]

    stat.AP

    Online/Offline Learning to Enable Robust Beamforming: Limited Feedback Meets Deep Generative Models

    Authors: Ying Li, Zhidi Lin, Kai Li, Michael Minyi Zhang

    Abstract: Robust beamforming is a pivotal technique in massive multiple-input multiple-output (MIMO) systems as it mitigates interference among user equipment (UE). One current risk-neutral approach to robust beamforming is the stochastic weighted minimum mean square error method (WMMSE). However, this method necessitates statistical channel information, which is typically inaccessible, particularly in fift…

    Submitted 9 April, 2024; originally announced April 2024.

  24. arXiv:2404.04865  [pdf, other]

    cs.LG cs.CV stat.ML

    On the Learnability of Out-of-distribution Detection

    Authors: Zhen Fang, Yixuan Li, Feng Liu, Bo Han, Jie Lu

    Abstract: Supervised learning aims to train a classifier under the assumption that training and test data are from the same distribution. To ease the above assumption, researchers have studied a more realistic setting: out-of-distribution (OOD) detection, where test data may come from classes that are unknown during training (i.e., OOD data). Due to the unavailability and diversity of OOD data, good general…

    Submitted 7 April, 2024; originally announced April 2024.

    Comments: Accepted by JMLR on 7 April 2024. This is a journal extension of the previous NeurIPS 2022 Outstanding Paper "Is Out-of-distribution Detection Learnable?" [arXiv:2210.14707]

  25. arXiv:2404.04859  [pdf, other]

    cs.LG stat.ML

    Demystifying Lazy Training of Neural Networks from a Macroscopic Viewpoint

    Authors: Yuqing Li, Tao Luo, Qixuan Zhou

    Abstract: In this paper, we advance the understanding of neural network training dynamics by examining the intricate interplay of various factors introduced by weight parameters in the initialization process. Motivated by the foundational work of Luo et al. (J. Mach. Learn. Res., Vol. 22, Iss. 1, No. 71, pp 3327-3373), we explore the gradient descent dynamics of neural networks through the lens of macroscop…

    Submitted 7 April, 2024; originally announced April 2024.

  26. arXiv:2404.04794  [pdf, other]

    stat.ME

    A Deep Learning Approach to Nonparametric Propensity Score Estimation with Optimized Covariate Balance

    Authors: Maosen Peng, Yan Li, Chong Wu, Liang Li

    Abstract: This paper proposes a novel propensity score weighting analysis. We define two sufficient and necessary conditions for a function of the covariates to be the propensity score. The first is "local balance", which ensures the conditional independence of covariates and treatment assignment across a dense grid of propensity score values. The second condition, "local calibration", guarantees that a bal…

    Submitted 6 April, 2024; originally announced April 2024.

    Comments: Corresponding authors: Chong Wu and Liang Li

  27. arXiv:2404.04399  [pdf, other]

    stat.ML cs.AI cs.LG stat.AP stat.ME

    Longitudinal Targeted Minimum Loss-based Estimation with Temporal-Difference Heterogeneous Transformer

    Authors: Toru Shirakawa, Yi Li, Yulun Wu, Sky Qiu, Yuxuan Li, Mingduo Zhao, Hiroyasu Iso, Mark van der Laan

    Abstract: We propose Deep Longitudinal Targeted Minimum Loss-based Estimation (Deep LTMLE), a novel approach to estimate the counterfactual mean of outcome under dynamic treatment policies in longitudinal problem settings. Our approach utilizes a transformer architecture with heterogeneous type embedding trained using temporal-difference learning. After obtaining an initial estimate using the transformer, f…

    Submitted 5 April, 2024; originally announced April 2024.

  28. arXiv:2404.01697  [pdf, other]

    stat.ML cs.LG

    Preventing Model Collapse in Gaussian Process Latent Variable Models

    Authors: Ying Li, Zhidi Lin, Feng Yin, Michael Minyi Zhang

    Abstract: Gaussian process latent variable models (GPLVMs) are a versatile family of unsupervised learning models commonly used for dimensionality reduction. However, common challenges in modeling data with GPLVMs include inadequate kernel flexibility and improper selection of the projection noise, leading to a type of model collapse characterized by vague latent representations that do not reflect the unde…

    Submitted 18 June, 2024; v1 submitted 2 April, 2024; originally announced April 2024.

    Comments: International Conference on Machine Learning (ICML), 2024

  29. arXiv:2404.01469  [pdf, other]

    stat.AP stat.ME

    A group testing based exploration of age-varying factors in chlamydia infections among Iowa residents

    Authors: Yizeng Li, Dewei Wang, Joshua M. Tebbs

    Abstract: Group testing, a method that screens subjects in pooled samples rather than individually, has been employed as a cost-effective strategy for chlamydia screening among Iowa residents. In efforts to deepen our understanding of chlamydia epidemiology in Iowa, several group testing regression models have been proposed. Different than previous approaches, we expand upon the varying coefficient model to…

    Submitted 6 May, 2024; v1 submitted 1 April, 2024; originally announced April 2024.

  30. arXiv:2403.15707  [pdf, other]

    cs.LG cs.AI cs.CV stat.ML

    Role of Locality and Weight Sharing in Image-Based Tasks: A Sample Complexity Separation between CNNs, LCNs, and FCNs

    Authors: Aakash Lahoti, Stefani Karp, Ezra Winston, Aarti Singh, Yuanzhi Li

    Abstract: Vision tasks are characterized by the properties of locality and translation invariance. The superior performance of convolutional neural networks (CNNs) on these tasks is widely attributed to the inductive bias of locality and weight sharing baked into their architecture. Existing attempts to quantify the statistical benefits of these biases in CNNs over locally connected convolutional neural net…

    Submitted 22 March, 2024; originally announced March 2024.

    Comments: 40 pages, 4 figures, Accepted to ICLR 2024, Spotlight

  31. arXiv:2403.11175  [pdf, ps, other]

    stat.ML cs.AI cs.IT cs.LG math.ST

    Prior-dependent analysis of posterior sampling reinforcement learning with function approximation

    Authors: Yingru Li, Zhi-Quan Luo

    Abstract: This work advances randomized exploration in reinforcement learning (RL) with function approximation modeled by linear mixture MDPs. We establish the first prior-dependent Bayesian regret bound for RL with function approximation; and refine the Bayesian regret analysis for posterior sampling reinforcement learning (PSRL), presenting an upper bound of ${\mathcal{O}}(d\sqrt{H^3 T \log T})$, where…

    Submitted 17 March, 2024; originally announced March 2024.

    Comments: Published in the 27th International Conference on Artificial Intelligence and Statistics (AISTATS)

  32. arXiv:2403.00869  [pdf, other]

    cs.LG stat.ML

    Enhancing Multivariate Time Series Forecasting with Mutual Information-driven Cross-Variable and Temporal Modeling

    Authors: Shiyi Qi, Liangjian Wen, Yiduo Li, Yuanhang Yang, Zhe Li, Zhongwen Rao, Lujia Pan, Zenglin Xu

    Abstract: Recent advancements have underscored the impact of deep learning techniques on multivariate time series forecasting (MTSF). Generally, these techniques are bifurcated into two categories: Channel-independence and Channel-mixing approaches. Although Channel-independence methods typically yield better results, Channel-mixing could theoretically offer improvements by leveraging inter-variable correla…

    Submitted 29 February, 2024; originally announced March 2024.

  33. arXiv:2402.18900  [pdf, ps, other]

    stat.ME stat.AP stat.ML

    Prognostic Covariate Adjustment for Logistic Regression in Randomized Controlled Trials

    Authors: Yunfan Li, Arman Sabbaghi, Jonathan R. Walsh, Charles K. Fisher

    Abstract: Randomized controlled trials (RCTs) with binary primary endpoints introduce novel challenges for inferring the causal effects of treatments. The most significant challenge is non-collapsibility, in which the conditional odds ratio estimand under covariate adjustment differs from the unconditional estimand in the logistic regression analysis of RCT data. This issue gives rise to apparent paradoxes,…

    Submitted 29 February, 2024; originally announced February 2024.

    Comments: 27 pages, 1 figure, 9 tables

    MSC Class: 62J12

  34. arXiv:2402.18392  [pdf, other]

    cs.LG cs.AI econ.EM stat.ML

    Unveiling the Potential of Robustness in Evaluating Causal Inference Models

    Authors: Yiyan Huang, Cheuk Hang Leung, Siyi Wang, Yijun Li, Qi Wu

    Abstract: The growing demand for personalized decision-making has led to a surge of interest in estimating the Conditional Average Treatment Effect (CATE). The intersection of machine learning and causal inference has yielded various effective CATE estimators. However, deploying these estimators in practice is often hindered by the absence of counterfactual labels, making it challenging to select the desira…

    Submitted 28 February, 2024; originally announced February 2024.

  35. arXiv:2402.15515  [pdf]

    cs.AI q-bio.QM stat.AP

    Feasibility of Identifying Factors Related to Alzheimer's Disease and Related Dementia in Real-World Data

    Authors: Aokun Chen, Qian Li, Yu Huang, Yongqiu Li, Yu-neng Chuang, Xia Hu, Serena Guo, Yonghui Wu, Yi Guo, Jiang Bian

    Abstract: A comprehensive view of factors associated with AD/ADRD will significantly aid in studies to develop new treatments for AD/ADRD and identify high-risk populations and patients for prevention efforts. In our study, we summarized the risk factors for AD/ADRD by reviewing existing meta-analyses and review articles on risk and preventive factors for AD/ADRD. In total, we extracted 477 risk factors in…

    Submitted 3 February, 2024; originally announced February 2024.

  36. arXiv:2402.14026  [pdf, ps, other]

    math.ST cs.DS cs.IT math.NA math.PR stat.ML

    Probability Tools for Sequential Random Projection

    Authors: Yingru Li

    Abstract: We introduce the first probabilistic framework tailored for sequential random projection, an approach rooted in the challenges of sequential decision-making under uncertainty. The analysis is complicated by the sequential dependence and high-dimensional nature of random variables, a byproduct of the adaptive mechanisms inherent in sequential decision processes. Our work features a novel constructi…

    Submitted 12 May, 2024; v1 submitted 16 February, 2024; originally announced February 2024.

    Comments: 12 pages, 1 figure

  37. arXiv:2402.12711  [pdf, ps, other]

    cs.LG stat.ML

    Achieving Near-Optimal Regret for Bandit Algorithms with Uniform Last-Iterate Guarantee

    Authors: Junyan Liu, Yunfan Li, Lin Yang

    Abstract: Existing performance measures for bandit algorithms such as regret, PAC bounds, or uniform-PAC (Dann et al., 2017), typically evaluate the cumulative performance, while allowing the play of an arbitrarily bad arm at any finite time t. Such a behavior can be highly detrimental in high-stakes applications. This paper introduces a stronger performance measure, the uniform last-iterate (ULI) guarantee…

    Submitted 19 February, 2024; originally announced February 2024.

  38. arXiv:2402.12653  [pdf, other]

    cs.SI stat.AP

    Unbiased Estimation for Total Treatment Effect Under Interference Using Aggregated Dyadic Data

    Authors: Lu Deng, Yilin Li, **g**g Zhang, Yong Wang, Chuan Chen

    Abstract: In social media platforms, user behavior is often influenced by interactions with other users, complicating the accurate estimation of causal effects in traditional A/B experiments. This study investigates situations where an individual's outcome can be broken down into the sum of multiple pairwise outcomes, a reflection of user interactions. These outcomes, referred to as dyadic data, are prevale…

    Submitted 19 February, 2024; originally announced February 2024.

  39. arXiv:2402.10232  [pdf, ps, other]

    stat.ML cs.DS cs.LG math.PR

    Simple, unified analysis of Johnson-Lindenstrauss with applications

    Authors: Yingru Li

    Abstract: We present a simple and unified analysis of the Johnson-Lindenstrauss (JL) lemma, a cornerstone in the field of dimensionality reduction critical for managing high-dimensional data. Our approach not only simplifies the understanding but also unifies various constructions under the JL framework, including spherical, binary-coin, sparse JL, Gaussian and sub-Gaussian models. This simplification and u…

    Submitted 27 February, 2024; v1 submitted 10 February, 2024; originally announced February 2024.

  40. arXiv:2402.10228  [pdf, other]

    cs.LG cs.AI stat.ML

    Q-Star Meets Scalable Posterior Sampling: Bridging Theory and Practice via HyperAgent

    Authors: Yingru Li, Jiawei Xu, Lei Han, Zhi-Quan Luo

    Abstract: We propose HyperAgent, a reinforcement learning (RL) algorithm based on the hypermodel framework for exploration in RL. HyperAgent allows for the efficient incremental approximation of posteriors associated with an optimal action-value function ($Q^\star$) without the need for conjugacy and follows the greedy policies w.r.t. these approximate posterior samples. We demonstrate that HyperAgent offer…

    Submitted 14 June, 2024; v1 submitted 5 February, 2024; originally announced February 2024.

    Comments: Proceedings of the $\mathit{41}^{st}$ International Conference on Machine Learning, Vienna, Austria. PMLR 235, 2024. Copyright 2024 by the author(s). Invited talk in Informs Optimization Conference 2024 and International Symposium on Mathematical Programming 2024

  41. arXiv:2402.09456  [pdf, other]

    cs.LG cs.AI cs.GT stat.ML

    Optimistic Thompson Sampling for No-Regret Learning in Unknown Games

    Authors: Yingru Li, Liangqi Liu, Wenqiang Pu, Hao Liang, Zhi-Quan Luo

    Abstract: This work tackles the complexities of multi-player scenarios in \emph{unknown games}, where the primary challenge lies in navigating the uncertainty of the environment through bandit feedback alongside strategic decision-making. We introduce Thompson Sampling (TS)-based algorithms that exploit the information of opponents' actions and reward structures, leading to a substantial reduction in experi…

    Submitted 24 February, 2024; v1 submitted 7 February, 2024; originally announced February 2024.

  42. arXiv:2402.08539  [pdf]

    cs.LG stat.AP

    Intelligent Diagnosis of Alzheimer's Disease Based on Machine Learning

    Authors: Mingyang Li, Hongyu Liu, Yixuan Li, Zejun Wang, Yuan Yuan, Honglin Dai

    Abstract: This study is based on the Alzheimer's Disease Neuroimaging Initiative (ADNI) dataset and aims to explore early detection and disease progression in Alzheimer's disease (AD). We employ innovative data preprocessing strategies, including the use of the random forest algorithm to fill missing data and the handling of outliers and invalid data, thereby fully mining and utilizing these limited data re…

    Submitted 13 February, 2024; originally announced February 2024.

  43. arXiv:2402.08182  [pdf, other]

    cs.LG stat.ML

    Variational Continual Test-Time Adaptation

    Authors: Fan Lyu, Kaile Du, Yuyang Li, Hanyu Zhao, Zhang Zhang, Guangcan Liu, Liang Wang

    Abstract: The prior drift is crucial in Continual Test-Time Adaptation (CTTA) methods that only use unlabeled test data, as it can cause significant error propagation. In this paper, we introduce VCoTTA, a variational Bayesian approach to measure uncertainties in CTTA. At the source stage, we transform a pre-trained deterministic model into a Bayesian Neural Network (BNN) via a variational warm-up strategy,…

    Submitted 12 February, 2024; originally announced February 2024.

  44. arXiv:2402.04084  [pdf, ps, other

    cs.LG cs.DS stat.ML

    Provably learning a multi-head attention layer

    Authors: Sitan Chen, Yuanzhi Li

    Abstract: The multi-head attention layer is one of the key components of the transformer architecture that sets it apart from traditional feed-forward models. Given a sequence length $k$, attention matrices $\mathbf{\Theta}_1,\ldots,\mathbf{\Theta}_m\in\mathbb{R}^{d\times d}$, and projection matrices $\mathbf{W}_1,\ldots,\mathbf{W}_m\in\mathbb{R}^{d\times d}$, the corresponding multi-head attention layer…

    Submitted 6 February, 2024; originally announced February 2024.

    Comments: 105 pages, comments welcome

  45. arXiv:2402.03502  [pdf, other

    cs.LG stat.ML

    How Does Unlabeled Data Provably Help Out-of-Distribution Detection?

    Authors: Xuefeng Du, Zhen Fang, Ilias Diakonikolas, Yixuan Li

    Abstract: Using unlabeled data to regularize the machine learning models has demonstrated promise for improving safety and reliability in detecting out-of-distribution (OOD) data. Harnessing the power of unlabeled in-the-wild data is non-trivial due to the heterogeneity of both in-distribution (ID) and OOD data. This lack of a clean set of OOD samples poses significant challenges in learning an optimal OOD…

    Submitted 5 February, 2024; originally announced February 2024.

    Comments: ICLR 2024

  46. arXiv:2402.00809  [pdf, other

    cs.LG stat.ML

    Position: Bayesian Deep Learning is Needed in the Age of Large-Scale AI

    Authors: Theodore Papamarkou, Maria Skoularidou, Konstantina Palla, Laurence Aitchison, Julyan Arbel, David Dunson, Maurizio Filippone, Vincent Fortuin, Philipp Hennig, José Miguel Hernández-Lobato, Aliaksandr Hubin, Alexander Immer, Theofanis Karaletsos, Mohammad Emtiyaz Khan, Agustinus Kristiadi, Yingzhen Li, Stephan Mandt, Christopher Nemeth, Michael A. Osborne, Tim G. J. Rudner, David Rügamer, Yee Whye Teh, Max Welling, Andrew Gordon Wilson, Ruqi Zhang

    Abstract: In the current landscape of deep learning research, there is a predominant emphasis on achieving high predictive accuracy in supervised tasks involving large image and language datasets. However, a broader perspective reveals a multitude of overlooked metrics, tasks, and data types, such as uncertainty, active and continual learning, and scientific data, that demand attention. Bayesian deep learni…

    Submitted 2 June, 2024; v1 submitted 1 February, 2024; originally announced February 2024.

  47. arXiv:2401.17573  [pdf

    stat.ML cs.LG eess.IV eess.SY

    Tensor-based process control and monitoring for semiconductor manufacturing with unstable disturbances

    Authors: Yanrong Li, Juan Du, Fugee Tsung, Wei Jiang

    Abstract: With the development and popularity of sensors installed in manufacturing systems, complex data are collected during manufacturing processes, which brings challenges for traditional process control methods. This paper proposes a novel process control and monitoring method for the complex structure of high-dimensional image-based overlay errors (modeled in tensor form), which are collected in semic…

    Submitted 30 January, 2024; originally announced January 2024.

    Comments: 30 pages, 5 figures

  48. arXiv:2401.17008  [pdf, other

    stat.ME

    A Unified Three-State Model Framework for Analysis of Treatment Crossover in Survival Trials

    Authors: Zile Zhao, Ye Li, Xiaodong Luo, Ray Bai

    Abstract: We present a unified three-state model (TSM) framework for evaluating treatment effects in clinical trials in the presence of treatment crossover. Researchers have proposed diverse methodologies to estimate the treatment effect that would have hypothetically been observed if treatment crossover had not occurred. However, there is little work on understanding the connections between these different…

    Submitted 30 January, 2024; originally announced January 2024.

    Comments: 27 pages, 5 figures, 2 tables

  49. arXiv:2401.15122  [pdf, other

    cs.LG cs.AI q-bio.BM q-bio.QM stat.ML

    A Multi-Grained Symmetric Differential Equation Model for Learning Protein-Ligand Binding Dynamics

    Authors: Shengchao Liu, Weitao Du, Yan**g Li, Zhuoxinran Li, Vignesh Bhethanabotla, Nakul Rampal, Omar Yaghi, Christian Borgs, Anima Anandkumar, Hongyu Guo, Jennifer Chayes

    Abstract: In drug discovery, molecular dynamics (MD) simulation for protein-ligand binding provides a powerful tool for predicting binding affinities, estimating transport properties, and exploring pocket sites. There has been a long history of improving the efficiency of MD simulations through better numerical methods and, more recently, by utilizing machine learning (ML) methods. Yet, challenges remain, s…

    Submitted 1 February, 2024; v1 submitted 26 January, 2024; originally announced January 2024.

  50. arXiv:2401.14549  [pdf, other

    stat.ME

    Privacy-preserving Quantile Treatment Effect Estimation for Randomized Controlled Trials

    Authors: Leon Yao, Paul Yiming Li, Jiannan Lu

    Abstract: In accordance with the principle of "data minimization", many internet companies are opting to record less data. However, this is often at odds with A/B testing efficacy. For experiments with units with multiple observations, one popular data minimizing technique is to aggregate data for each unit. However, exact quantile estimation requires the full observation-level data. In this paper, we devel…

    Submitted 25 January, 2024; originally announced January 2024.

    Comments: Accepted to 2023 CODE conference as a parallel presentation