
Showing 1–50 of 303 results for author: Li, L

Searching in archive stat.
  1. arXiv:2406.05666  [pdf, other]

    cs.LG cs.IR stat.ML

    General Distribution Learning: A theoretical framework for Deep Learning

    Authors: Binchuan Qi, Li Li, Wei Gong

    Abstract: There remain numerous unanswered research questions on deep learning (DL) within the classical learning theory framework. These include the remarkable generalization capabilities of overparametrized neural networks (NNs), the efficient optimization performance despite non-convexity of objectives, the mechanism of flat minima for generalization, and the exceptional performance of deep architectures…

    Submitted 26 June, 2024; v1 submitted 9 June, 2024; originally announced June 2024.

    Comments: arXiv admin note: text overlap with arXiv:2105.04026 by other authors

  2. arXiv:2406.03628  [pdf, other]

    stat.ML cs.LG

    Synthetic Oversampling: Theory and A Practical Approach Using LLMs to Address Data Imbalance

    Authors: Ryumei Nakada, Yichen Xu, Lexin Li, Linjun Zhang

    Abstract: Imbalanced data and spurious correlations are common challenges in machine learning and data science. Oversampling, which artificially increases the number of instances in the underrepresented classes, has been widely adopted to tackle these challenges. In this article, we introduce OPAL (OversamPling with Artificial LLM-generated data), a systematic oversamplin…

    Submitted 5 June, 2024; originally announced June 2024.

    Comments: 59 pages, 7 figures

  3. arXiv:2404.17019  [pdf, other]

    stat.ME cs.LG stat.ML

    Neyman Meets Causal Machine Learning: Experimental Evaluation of Individualized Treatment Rules

    Authors: Michael Lingzhi Li, Kosuke Imai

    Abstract: A century ago, Neyman showed how to evaluate the efficacy of treatment using a randomized experiment under a minimal set of assumptions. This classical repeated sampling framework serves as a basis of routine experimental analyses conducted by today's scientists across disciplines. In this paper, we demonstrate that Neyman's methodology can also be used to experimentally evaluate the efficacy of i…

    Submitted 25 April, 2024; originally announced April 2024.

  4. arXiv:2404.12290  [pdf, other]

    stat.ML cs.LG stat.CO stat.ME

    Debiased Distribution Compression

    Authors: Lingxiao Li, Raaz Dwivedi, Lester Mackey

    Abstract: Modern compression methods can summarize a target distribution $\mathbb{P}$ more succinctly than i.i.d. sampling but require access to a low-bias input sequence like a Markov chain converging quickly to $\mathbb{P}$. We introduce a new suite of compression methods suitable for compression with biased input sequences. Given $n$ points targeting the wrong distribution and quadratic time, Stein kerne…

    Submitted 26 May, 2024; v1 submitted 18 April, 2024; originally announced April 2024.

    Comments: Accepted to ICML 2024

  5. arXiv:2404.11713  [pdf, ps, other]

    stat.ME

    Propensity Score Analysis with Guaranteed Subgroup Balance

    Authors: Yan Li, Yong-Fang Kuo, Liang Li

    Abstract: Estimating the causal treatment effects by subgroups is important in observational studies when the treatment effect heterogeneity may be present. Existing propensity score methods rely on a correctly specified propensity score model. Model misspecification results in biased treatment effect estimation and covariate imbalance. We proposed a new algorithm, the propensity score analysis with guarant…

    Submitted 17 April, 2024; originally announced April 2024.

  6. arXiv:2404.04794  [pdf, other]

    stat.ME

    A Deep Learning Approach to Nonparametric Propensity Score Estimation with Optimized Covariate Balance

    Authors: Maosen Peng, Yan Li, Chong Wu, Liang Li

    Abstract: This paper proposes a novel propensity score weighting analysis. We define two sufficient and necessary conditions for a function of the covariates to be the propensity score. The first is "local balance", which ensures the conditional independence of covariates and treatment assignment across a dense grid of propensity score values. The second condition, "local calibration", guarantees that a bal…

    Submitted 6 April, 2024; originally announced April 2024.

    Comments: Corresponding authors: Chong Wu and Liang Li

  7. arXiv:2403.11348  [pdf, other]

    cs.LG cs.AI stat.ML

    COLEP: Certifiably Robust Learning-Reasoning Conformal Prediction via Probabilistic Circuits

    Authors: Mintong Kang, Nezihe Merve Gürel, Linyi Li, Bo Li

    Abstract: Conformal prediction has shown spurring performance in constructing statistically rigorous prediction sets for arbitrary black-box machine learning models, assuming the data is exchangeable. However, even small adversarial perturbations during the inference can violate the exchangeability assumption, challenge the coverage guarantees, and result in a subsequent decline in empirical coverage. In th…

    Submitted 17 March, 2024; originally announced March 2024.

    Comments: Accepted to ICLR 2024

  8. arXiv:2403.07031  [pdf, other]

    cs.LG stat.CO stat.ME stat.ML

    The Cram Method for Efficient Simultaneous Learning and Evaluation

    Authors: Zeyang Jia, Kosuke Imai, Michael Lingzhi Li

    Abstract: We introduce the "cram" method, a general and efficient approach to simultaneous learning and evaluation using a generic machine learning (ML) algorithm. In a single pass of batched data, the proposed method repeatedly trains an ML algorithm and tests its empirical performance. Because it utilizes the entire sample for both learning and evaluation, cramming is significantly more data-efficient tha…

    Submitted 11 March, 2024; originally announced March 2024.

  9. arXiv:2403.04568  [pdf, other]

    cs.LG stat.ML

    Improved Algorithm for Adversarial Linear Mixture MDPs with Bandit Feedback and Unknown Transition

    Authors: Long-Fei Li, Peng Zhao, Zhi-Hua Zhou

    Abstract: We study reinforcement learning with linear function approximation, unknown transition, and adversarial losses in the bandit feedback setting. Specifically, we focus on linear mixture MDPs whose transition kernel is a linear mixture model. We propose a new algorithm that attains an $\widetilde{O}(d\sqrt{HS^3K} + \sqrt{HSAK})$ regret with high probability, where $d$ is the dimension of feature mapp…

    Submitted 7 March, 2024; originally announced March 2024.

    Comments: AISTATS 2024

  10. arXiv:2402.02697  [pdf, ps, other]

    cs.LG stat.ML

    Deep Equilibrium Models are Almost Equivalent to Not-so-deep Explicit Models for High-dimensional Gaussian Mixtures

    Authors: Zenan Ling, Longbo Li, Zhanbo Feng, Yixuan Zhang, Feng Zhou, Robert C. Qiu, Zhenyu Liao

    Abstract: Deep equilibrium models (DEQs), as a typical implicit neural network, have demonstrated remarkable success on various tasks. There is, however, a lack of theoretical understanding of the connections and differences between implicit DEQs and explicit neural network models. In this paper, leveraging recent advances in random matrix theory (RMT), we perform an in-depth analysis on the eigenspectra of…

    Submitted 19 May, 2024; v1 submitted 4 February, 2024; originally announced February 2024.

    Comments: Accepted by ICML 2024

  11. arXiv:2401.17393  [pdf, other]

    stat.ME

    Estimating the EVSI with Gaussian Approximations and Spline-Based Series Methods

    Authors: Linke Li, Hawre Jalal, Anna Heath

    Abstract: Background. The Expected Value of Sample Information (EVSI) measures the expected benefits that could be obtained by collecting additional data. Estimating EVSI using the traditional nested Monte Carlo method is computationally expensive but the recently developed Gaussian approximation (GA) approach can efficiently estimate EVSI across different sample sizes. However, the conventional GA may resu…

    Submitted 30 January, 2024; originally announced January 2024.

    Comments: 11 pages, 2 figures, presented at 44th Medical Decision Making Annual North American Meeting

  12. arXiv:2401.16660  [pdf, ps, other]

    stat.ME

    A Nonparametric Approach for Estimating the Effective Sample Size in Gaussian Approximation of Expected Value of Sample Information

    Authors: Linke Li, Hawre Jalal, Anna Heath

    Abstract: The effective sample size (ESS) measures the informational value of a probability distribution in terms of an equivalent number of study participants. The ESS plays a crucial role in estimating the Expected Value of Sample Information (EVSI) through the Gaussian approximation approach. Despite the significance of ESS, existing ESS estimation methods within the Gaussian approximation framework are…

    Submitted 29 January, 2024; originally announced January 2024.

    Comments: 5 pages

  13. arXiv:2401.12369  [pdf, other]

    cs.LG stat.ME

    SubgroupTE: Advancing Treatment Effect Estimation with Subgroup Identification

    Authors: Seungyeon Lee, Ruoqi Liu, Wenyu Song, Lang Li, Ping Zhang

    Abstract: Precise estimation of treatment effects is crucial for evaluating intervention effectiveness. While deep learning models have exhibited promising performance in learning counterfactual representations for treatment effect estimation (TEE), a major limitation in most of these models is that they treat the entire population as a homogeneous group, overlooking the diversity of treatment effects acros…

    Submitted 22 January, 2024; originally announced January 2024.

  14. arXiv:2312.16769  [pdf, other]

    stat.ME q-bio.NC stat.AP

    Estimation and Inference for High-dimensional Multi-response Growth Curve Model

    Authors: Xin Zhou, Yin Xia, Lexin Li

    Abstract: A growth curve model (GCM) aims to characterize how an outcome variable evolves, develops and grows as a function of time, along with other predictors. It provides a particularly useful framework to model growth trend in longitudinal data. However, the estimation and inference of GCM with a large number of response variables faces numerous challenges, and remains underdeveloped. In this article, w…

    Submitted 27 December, 2023; originally announced December 2023.

  15. arXiv:2312.05756  [pdf]

    cs.CE cs.AI stat.ME

    A quantitative fusion strategy of stock picking and timing based on Particle Swarm Optimized-Back Propagation Neural Network and Multivariate Gaussian-Hidden Markov Model

    Authors: Huajian Li, Longjian Li, Jiajian Liang, Weinan Dai

    Abstract: In recent years, machine learning (ML) has brought effective approaches and novel techniques to economic decision, investment forecasting, and risk management, etc., coping with the variable and intricate nature of economic and financial environments. For the investment in stock market, this research introduces a pioneering quantitative fusion model combining stock timing and picking strategy by levera…

    Submitted 22 December, 2023; v1 submitted 9 December, 2023; originally announced December 2023.

    Comments: 12 pages, 6 figures, 4 tables, 26 references

  16. arXiv:2311.11543  [pdf, other]

    stat.ME stat.AP

    A Comparison of Parameter Estimation Methods for Shared Frailty Models

    Authors: Tingxuan Wu, Cindy Feng, Longhai Li

    Abstract: This paper compares six different parameter estimation methods for shared frailty models via a series of simulation studies. A shared frailty model is a survival model that incorporates a random effect term, where the frailties are common or shared among individuals within specific groups. Several parameter estimation methods are available for fitting shared frailty models, such as penalized parti…

    Submitted 20 November, 2023; originally announced November 2023.

  17. arXiv:2311.00878  [pdf, other]

    stat.ME stat.AP

    Backward Joint Model for Dynamic Prediction using Multivariate Longitudinal and Competing Risk Data

    Authors: Wenhao Li, Liang Li, Brad C. Astor, Wei Yang, Tom H. Greene

    Abstract: Joint modeling is a useful approach to dynamic prediction of clinical outcomes using longitudinally measured predictors. When the outcomes are competing risk events, fitting the conventional shared random effects joint model often involves intensive computation, especially when multiple longitudinal biomarkers are used as predictors, as is often desired in prediction problems. Motivated by a lo…

    Submitted 1 November, 2023; originally announced November 2023.

  18. arXiv:2310.19360  [pdf, other]

    cs.LG cs.AI cs.CV stat.ML

    Balance, Imbalance, and Rebalance: Understanding Robust Overfitting from a Minimax Game Perspective

    Authors: Yifei Wang, Liangchen Li, Jiansheng Yang, Zhouchen Lin, Yisen Wang

    Abstract: Adversarial Training (AT) has become arguably the state-of-the-art algorithm for extracting robust features. However, researchers recently notice that AT suffers from severe robust overfitting problems, particularly after learning rate (LR) decay. In this paper, we explain this phenomenon by viewing adversarial training as a dynamic minimax game between the model trainer and the attacker. Specific…

    Submitted 30 October, 2023; originally announced October 2023.

    Comments: Accepted by NeurIPS 2023

  19. arXiv:2310.16203  [pdf, other]

    stat.ME

    Multivariate Dynamic Mediation Analysis under a Reinforcement Learning Framework

    Authors: Lan Luo, Chengchun Shi, Jitao Wang, Zhenke Wu, Lexin Li

    Abstract: Mediation analysis is an important analytic tool commonly used in a broad range of scientific applications. In this article, we study the problem of mediation analysis when there are multivariate and conditionally dependent mediators, and when the variables are observed over multiple time points. The problem is challenging, because the effect of a mediator involves not only the path from the treat…

    Submitted 24 October, 2023; originally announced October 2023.

  20. arXiv:2310.07973  [pdf, other]

    stat.ME math.OC stat.AP stat.ML

    Statistical Performance Guarantee for Subgroup Identification with Generic Machine Learning

    Authors: Michael Lingzhi Li, Kosuke Imai

    Abstract: Across a wide array of disciplines, many researchers use machine learning (ML) algorithms to identify a subgroup of individuals who are likely to benefit from a treatment the most ("exceptional responders") or those who are harmed by it. A common approach to this subgroup identification problem consists of two steps. First, researchers estimate the conditional average treatment effect (CATE) usi…

    Submitted 20 December, 2023; v1 submitted 11 October, 2023; originally announced October 2023.

  21. arXiv:2306.11906  [pdf]

    stat.CO stat.OT

    Statistical thinking in simulation design: a continuing conversation on the balancing intercept problem

    Authors: Boyi Guo, Linzi Li, Jacqueline E. Rudolph

    Abstract: Epidemiologists have a growing interest in employing computational approaches to solve analytic problems, with simulation being arguably the most accessible among all approaches. While previous literature discussed the utility of simulation and demonstrated how to carry them out, few have focused on connecting underlying statistical concepts to these simulation approaches, creating gaps between th…

    Submitted 20 June, 2023; originally announced June 2023.

    Comments: 9 pages including title and abstract pages, references and supporting material, 1 figure

  22. arXiv:2306.05436  [pdf, other]

    stat.AP cs.CY

    Remaining Useful Life Modelling with an Escalator Health Condition Analytic System

    Authors: Inez M. Zwetsloot, Yu Lin, Jiaqi Qiu, Lishuai Li, William Ka Fai Lee, Edmond Yin San Yeung, Colman Yiu Wah Yeung, Chris Chun Long Wong

    Abstract: The refurbishment of an escalator is usually linked with its design life as recommended by the manufacturer. However, the actual useful life of an escalator should be determined by its operating condition which is affected by the runtime, workload, maintenance quality, vibration, etc., rather than age only. The objective of this project is to develop a comprehensive health condition analytic syste…

    Submitted 7 June, 2023; originally announced June 2023.

    Comments: 14 pages, 12 figures, 7 tables

  23. arXiv:2305.20028  [pdf, other]

    cs.LG stat.ML

    A Study of Bayesian Neural Network Surrogates for Bayesian Optimization

    Authors: Yucen Lily Li, Tim G. J. Rudner, Andrew Gordon Wilson

    Abstract: Bayesian optimization is a highly efficient approach to optimizing objective functions which are expensive to query. These objectives are typically represented by Gaussian process (GP) surrogate models which are easy to optimize and support exact inference. While standard GP surrogates have been well-established in Bayesian optimization, Bayesian neural networks (BNNs) have recently become practic…

    Submitted 8 May, 2024; v1 submitted 31 May, 2023; originally announced May 2023.

    Comments: ICLR 2024. Code available at https://github.com/yucenli/bnn-bo

  24. arXiv:2305.19244  [pdf, other]

    stat.ML cs.LG

    Testing for the Markov Property in Time Series via Deep Conditional Generative Learning

    Authors: Yunzhe Zhou, Chengchun Shi, Lexin Li, Qiwei Yao

    Abstract: The Markov property is widely imposed in analysis of time series data. Correspondingly, testing the Markov property, and relatedly, inferring the order of a Markov model, are of paramount importance. In this article, we propose a nonparametric test for the Markov property in high-dimensional time series via deep conditional generative learning. We also apply the test sequentially to determine the…

    Submitted 30 May, 2023; originally announced May 2023.

  25. arXiv:2305.15751  [pdf, other]

    stat.ME stat.CO

    High-dimensional Response Growth Curve Modeling for Longitudinal Neuroimaging Analysis

    Authors: Lu Wang, Xiang Lyu, Zhengwu Zhang, Lexin Li

    Abstract: There is increasing interest in modeling high-dimensional longitudinal outcomes in applications such as developmental neuroimaging research. Growth curve model offers a useful tool to capture both the mean growth pattern across individuals, as well as the dynamic changes of outcomes over time within each individual. However, when the number of outcomes is large, it becomes challenging and often in…

    Submitted 25 May, 2023; originally announced May 2023.

  26. arXiv:2305.11908  [pdf, other]

    cs.HC cs.LG q-bio.NC stat.ML

    Sequential Best-Arm Identification with Application to Brain-Computer Interface

    Authors: Xin Zhou, Botao Hao, Jian Kang, Tor Lattimore, Lexin Li

    Abstract: A brain-computer interface (BCI) is a technology that enables direct communication between the brain and an external device or computer system. It allows individuals to interact with the device using only their thoughts, and holds immense potential for a wide range of applications in medicine, rehabilitation, and human augmentation. An electroencephalogram (EEG) and event-related potential (ERP)-b…

    Submitted 17 May, 2023; originally announced May 2023.

  27. arXiv:2305.05890  [pdf, other]

    cs.LG stat.ME

    CUTS+: High-dimensional Causal Discovery from Irregular Time-series

    Authors: Yuxiao Cheng, Lianglong Li, Tingxiong Xiao, Zongren Li, Qin Zhong, Jinli Suo, Kunlun He

    Abstract: Causal discovery in time-series is a fundamental problem in the machine learning community, enabling causal reasoning and decision-making in complex scenarios. Recently, researchers successfully discover causality by combining neural networks with Granger causality, but their performances degrade largely when encountering high-dimensional data because of the highly redundant network design and hug…

    Submitted 16 August, 2023; v1 submitted 10 May, 2023; originally announced May 2023.

    Comments: Submitted to AAAI-24

  28. arXiv:2304.08184  [pdf, other]

    econ.EM stat.ME

    Adjustment with Many Regressors Under Covariate-Adaptive Randomizations

    Authors: Liang Jiang, Liyao Li, Ke Miao, Yichong Zhang

    Abstract: Our paper discovers a new trade-off of using regression adjustments (RAs) in causal inference under covariate-adaptive randomizations (CARs). On one hand, RAs can improve the efficiency of causal estimators by incorporating information from covariates that are not used in the randomization. On the other hand, RAs can degrade estimation efficiency due to their estimation errors, which are not asymp…

    Submitted 6 February, 2024; v1 submitted 17 April, 2023; originally announced April 2023.

    Comments: 75 pages, including appendix

  29. arXiv:2303.18163  [pdf, other]

    stat.ME

    Robust Tensor Factor Analysis

    Authors: Matteo Barigozzi, Yong He, Lingxiao Li, Lorenzo Trapani

    Abstract: We consider (robust) inference in the context of a factor model for tensor-valued sequences. We study the consistency of the estimated common factors and loadings space when using estimators based on minimising quadratic loss functions. Building on the observation that such loss functions are adequate only if sufficiently many moments exist, we extend our results to the case of heavy-tailed distri…

    Submitted 27 August, 2023; v1 submitted 31 March, 2023; originally announced March 2023.

  30. arXiv:2303.09616  [pdf, other]

    stat.ME

    Cross-validatory Z-Residual for Diagnosing Shared Frailty Models

    Authors: Tingxuan Wu, Cindy Feng, Longhai Li

    Abstract: Residual diagnostic methods play a critical role in assessing model assumptions and detecting outliers in statistical modelling. In the context of survival models with censored observations, Li et al. (2021) introduced the Z-residual, which follows an approximately normal distribution under the true model. This property makes it possible to use Z-residuals for diagnosing survival models in a way s…

    Submitted 16 March, 2023; originally announced March 2023.

    Comments: 32 pages, 14 figures

  31. arXiv:2303.02817  [pdf, other]

    stat.ME

    Huber Principal Component Analysis for Large-dimensional Factor Models

    Authors: Yong He, Lingxiao Li, Dong Liu, Wen-Xin Zhou

    Abstract: Factor models have been widely used in economics and finance. However, the heavy-tailed nature of macroeconomic and financial data is often neglected in the existing literature. To address this issue and achieve robustness, we propose an approach to estimate factor loadings and scores by minimizing the Huber loss function, which is motivated by the equivalence of conventional Principal Component A…

    Submitted 29 March, 2023; v1 submitted 5 March, 2023; originally announced March 2023.

  32. arXiv:2302.09106  [pdf, other]

    stat.ME

    Z-residual diagnostics for detecting misspecification of the functional form of covariates for shared frailty models

    Authors: Tingxuan Wu, Longhai Li, Cindy Feng

    Abstract: In survival analysis, the hazard function often depends on a set of covariates. Martingale and deviance residual are most widely used for examining the validity of the function form of covariates by checking whether there is a discernible trend in their scatterplot against continuous covariates. However, visual inspection of martingale and deviance residuals is often subjective. In addition, these…

    Submitted 17 February, 2023; originally announced February 2023.

    Comments: 21 pages, 7 figures

  33. arXiv:2301.09231  [pdf, other]

    cs.LG stat.AP stat.ML

    GP-NAS-ensemble: a model for NAS Performance Prediction

    Authors: Kunlong Chen, Liu Yang, Yitian Chen, Kunjin Chen, Yidan Xu, Lujun Li

    Abstract: It is of great significance to estimate the performance of a given model architecture without training in the application of Neural Architecture Search (NAS) as it may take a lot of time to evaluate the performance of an architecture. In this paper, a novel NAS framework called GP-NAS-ensemble is proposed to predict the performance of a neural network architecture with a small training dataset. We…

    Submitted 22 January, 2023; originally announced January 2023.

  34. arXiv:2301.08353  [pdf, other]

    cs.IR cs.AI cs.LG stat.ML

    AdaEnsemble: Learning Adaptively Sparse Structured Ensemble Network for Click-Through Rate Prediction

    Authors: YaChen Yan, Liubo Li

    Abstract: Learning feature interactions is crucial to success for large-scale CTR prediction in recommender systems and Ads ranking. Researchers and practitioners extensively proposed various neural network architectures for searching and modeling feature interactions. However, we observe that different datasets favor different neural network architectures and feature interaction types, suggesting that diff…

    Submitted 6 January, 2023; originally announced January 2023.

    Comments: arXiv admin note: text overlap with arXiv:2301.01089, arXiv:2301.08139

  35. arXiv:2301.06769  [pdf, ps, other]

    math.PR cs.LG stat.ML

    Geometric ergodicity of SGLD via reflection coupling

    Authors: Lei Li, Jian-Guo Liu, Yuliang Wang

    Abstract: We consider the geometric ergodicity of the Stochastic Gradient Langevin Dynamics (SGLD) algorithm under nonconvexity settings. Via the technique of reflection coupling, we prove the Wasserstein contraction of SGLD when the target distribution is log-concave only outside some compact set. The time discretization and the minibatch in SGLD introduce several difficulties when applying the reflection…

    Submitted 17 January, 2023; originally announced January 2023.

  36. arXiv:2301.01089  [pdf, other]

    cs.LG cs.AI cs.IR stat.ML

    xDeepInt: a hybrid architecture for modeling the vector-wise and bit-wise feature interactions

    Authors: YaChen Yan, Liubo Li

    Abstract: Learning feature interactions is the key to success for the large-scale CTR prediction and recommendation. In practice, handcrafted feature engineering usually requires exhaustive searching. In order to reduce the high cost of human efforts in feature engineering, researchers propose several deep neural networks (DNN)-based approaches to learn the feature interactions in an end-to-end fashion. How…

    Submitted 3 January, 2023; originally announced January 2023.

  37. arXiv:2211.00249  [pdf, other]

    stat.ML cs.LG stat.ME

    Robust Direct Learning for Causal Data Fusion

    Authors: Xinyu Li, Yilin Li, Qing Cui, Longfei Li, Jun Zhou

    Abstract: In the era of big data, the explosive growth of multi-source heterogeneous data offers many exciting challenges and opportunities for improving the inference of conditional average treatment effects. In this paper, we investigate homogeneous and heterogeneous causal data fusion problems under a general setting that allows for the presence of source-specific covariates. We provide a direct learning…

    Submitted 31 October, 2022; originally announced November 2022.

    Comments: 16 pages, 2 figures. Accepted for presentation at the 14th Asian Conference on Machine Learning (ACML 2022), and for publication in Proceedings of Machine Learning Research, Volume 189

  38. arXiv:2210.14420  [pdf, other]

    stat.ML cs.LG

    Optimizing Pessimism in Dynamic Treatment Regimes: A Bayesian Learning Approach

    Authors: Yunzhe Zhou, Zhengling Qi, Chengchun Shi, Lexin Li

    Abstract: In this article, we propose a novel pessimism-based Bayesian learning method for optimal dynamic treatment regimes in the offline setting. When the coverage condition does not hold, which is common for offline data, the existing solutions would produce sub-optimal policies. The pessimism principle addresses this issue by discouraging recommendation of actions that are less explored conditioning on…

    Submitted 21 February, 2023; v1 submitted 25 October, 2022; originally announced October 2022.

    Comments: 18 pages, 6 figures. Proceedings of the 26th International Conference on Artificial Intelligence and Statistics (AISTATS) 2023

  39. arXiv:2210.13400  [pdf, other]

    stat.ML cs.LG

    Sampling with Mollified Interaction Energy Descent

    Authors: Lingxiao Li, Qiang Liu, Anna Korba, Mikhail Yurochkin, Justin Solomon

    Abstract: Sampling from a target measure whose density is only known up to a normalization constant is a fundamental problem in computational statistics and machine learning. In this paper, we present a new optimization-based method for sampling called mollified interaction energy descent (MIED). MIED minimizes a new class of energies on probability measures called mollified interaction energies (MIEs). The…

    Submitted 1 March, 2023; v1 submitted 24 October, 2022; originally announced October 2022.

  40. arXiv:2210.11620  [pdf, other]

    cs.LG stat.ML

    LOT: Layer-wise Orthogonal Training on Improving $\ell_2$ Certified Robustness

    Authors: Xiaojun Xu, Linyi Li, Bo Li

    Abstract: Recent studies show that training deep neural networks (DNNs) with Lipschitz constraints are able to enhance adversarial robustness and other model properties such as stability. In this paper, we propose a layer-wise orthogonal training method (LOT) to effectively train 1-Lipschitz convolution layers via parametrizing an orthogonal matrix with an unconstrained matrix. We then efficiently compute t…

    Submitted 26 March, 2023; v1 submitted 20 October, 2022; originally announced October 2022.

    Comments: NeurIPS 2022

  41. arXiv:2210.11107  [pdf, other]

    stat.AP

    Graphical model inference with external network data

    Authors: Jack Jewson, Li Li, Laura Battaglia, Stephen Hansen, David Rossell, Piotr Zwiernik

    Abstract: We consider two applications where we study how dependence structure between many variables is linked to external network data. We first study the interplay between social media connectedness and the co-evolution of the COVID-19 pandemic across USA counties. We next study how the dependence between stock market returns across firms relates to similarities in economic and policy indicators fr…

    Submitted 13 November, 2023; v1 submitted 20 October, 2022; originally announced October 2022.

  42. arXiv:2210.08326  [pdf, ps, other]

    stat.ME cs.LG math.OC stat.ML

    Distributionally Robust Causal Inference with Observational Data

    Authors: Dimitris Bertsimas, Kosuke Imai, Michael Lingzhi Li

    Abstract: We consider the estimation of average treatment effects in observational studies and propose a new framework of robust causal inference with unobserved confounders. Our approach is based on distributionally robust optimization and proceeds in two steps. We first specify the maximal degree to which the distribution of unobserved potential outcomes may deviate from that of observed outcomes. We then…

    Submitted 2 February, 2023; v1 submitted 15 October, 2022; originally announced October 2022.

  43. arXiv:2210.08031  [pdf, other]

    cs.LG cs.AI cs.CV cs.NE stat.ML

    Neural Attentive Circuits

    Authors: Nasim Rahaman, Martin Weiss, Francesco Locatello, Chris Pal, Yoshua Bengio, Bernhard Schölkopf, Li Erran Li, Nicolas Ballas

    Abstract: Recent work has seen the development of general purpose neural architectures that can be trained to perform tasks across diverse data modalities. General purpose models typically make few assumptions about the underlying data-structure and are known to perform well in the large-data regime. At the same time, there has been growing interest in modular neural architectures that represent the data us…

    Submitted 19 October, 2022; v1 submitted 14 October, 2022; originally announced October 2022.

    Comments: To appear at NeurIPS 2022

  44. arXiv:2210.07536  [pdf, other

    cs.LG stat.ME stat.ML

    A Reinforcement Learning Approach to Estimating Long-term Treatment Effects

    Authors: Ziyang Tang, Yiheng Duan, Stephanie Zhang, Lihong Li

    Abstract: Randomized experiments (a.k.a. A/B tests) are a powerful tool for estimating treatment effects, to inform decision making in business, healthcare and other applications. In many problems, the treatment has a lasting effect that evolves over time. A limitation with randomized experiments is that they do not easily extend to measure long-term effects, since running long experiments is time-consumin…

    Submitted 14 October, 2022; originally announced October 2022.

  45. arXiv:2208.12483  [pdf, other

    cs.LG stat.ML

    Dynamic Regret of Online Markov Decision Processes

    Authors: Peng Zhao, Long-Fei Li, Zhi-Hua Zhou

    Abstract: We investigate online Markov Decision Processes (MDPs) with adversarially changing loss functions and known transitions. We choose dynamic regret as the performance measure, defined as the performance difference between the learner and any sequence of feasible changing policies. The measure is strictly stronger than the standard static regret that benchmarks the learner's performance with a fixed…

    Submitted 26 August, 2022; originally announced August 2022.

  46. arXiv:2208.05740  [pdf, other

    cs.LG cs.CR cs.CV math.OC stat.ML

    General Cutting Planes for Bound-Propagation-Based Neural Network Verification

    Authors: Huan Zhang, Shiqi Wang, Kaidi Xu, Linyi Li, Bo Li, Suman Jana, Cho-Jui Hsieh, J. Zico Kolter

    Abstract: Bound propagation methods, when combined with branch and bound, are among the most effective methods to formally verify properties of deep neural networks such as correctness, robustness, and safety. However, existing works cannot handle the general form of cutting plane constraints widely accepted in traditional solvers, which are crucial for strengthening verifiers with tightened convex relaxati…

    Submitted 4 December, 2022; v1 submitted 11 August, 2022; originally announced August 2022.

    Comments: Accepted by NeurIPS 2022. GCP-CROWN is part of the alpha-beta-CROWN verifier, the VNN-COMP 2022 winner

  47. arXiv:2207.09304  [pdf, ps, other

    math.PR cs.LG stat.ML

    A sharp uniform-in-time error estimate for Stochastic Gradient Langevin Dynamics

    Authors: Lei Li, Yuliang Wang

    Abstract: We establish a sharp uniform-in-time error estimate for the Stochastic Gradient Langevin Dynamics (SGLD), which is a popular sampling algorithm. Under mild assumptions, we obtain a uniform-in-time $O(η^2)$ bound for the KL-divergence between the SGLD iteration and the Langevin diffusion, where $η$ is the step size (or learning rate). Our analysis is also valid for varying step sizes. Based on this…

    Submitted 21 October, 2022; v1 submitted 19 July, 2022; originally announced July 2022.

  48. arXiv:2207.04922  [pdf, ps, other

    stat.ML cs.LG

    On uniform-in-time diffusion approximation for stochastic gradient descent

    Authors: Lei Li, Yuliang Wang

    Abstract: The diffusion approximation of stochastic gradient descent (SGD) in current literature is only valid on a finite time interval. In this paper, we establish the uniform-in-time diffusion approximation of SGD, by only assuming that the expected loss is strongly convex and some other mild conditions, without assuming the convexity of each random loss function. The main technique is to establish the e…

    Submitted 11 July, 2022; originally announced July 2022.

  49. arXiv:2207.03522  [pdf, other

    cs.LG cs.NE cs.SI physics.soc-ph stat.ML

    TF-GNN: Graph Neural Networks in TensorFlow

    Authors: Oleksandr Ferludin, Arno Eigenwillig, Martin Blais, Dustin Zelle, Jan Pfeifer, Alvaro Sanchez-Gonzalez, Wai Lok Sibon Li, Sami Abu-El-Haija, Peter Battaglia, Neslihan Bulut, Jonathan Halcrow, Filipe Miguel Gonçalves de Almeida, Pedro Gonnet, Liangze Jiang, Parth Kothari, Silvio Lattanzi, André Linhares, Brandon Mayer, Vahab Mirrokni, John Palowitch, Mihir Paradkar, Jennifer She, Anton Tsitsulin, Kevin Villela, Lisa Wang, et al. (2 additional authors not shown)

    Abstract: TensorFlow-GNN (TF-GNN) is a scalable library for Graph Neural Networks in TensorFlow. It is designed from the bottom up to support the kinds of rich heterogeneous graph data that occur in today's information ecosystems. In addition to enabling machine learning researchers and advanced developers, TF-GNN offers low-code solutions to empower the broader developer community in graph learning. Many…

    Submitted 23 July, 2023; v1 submitted 7 July, 2022; originally announced July 2022.

  50. arXiv:2206.09800  [pdf, other

    stat.ME

    Statistical Inference for Large-dimensional Tensor Factor Model by Iterative Projections

    Authors: Matteo Barigozzi, Yong He, Lingxiao Li, Lorenzo Trapani

    Abstract: Tensor Factor Models (TFM) are appealing dimension reduction tools for high-order large-dimensional tensor time series, and have wide applications in economics, finance and medical imaging. In this paper, we propose a projection estimator for the Tucker-decomposition based TFM, and provide its least-square interpretation which parallels the least-square interpretation of the Principal Component…

    Submitted 23 April, 2023; v1 submitted 20 June, 2022; originally announced June 2022.