Showing 1–50 of 310 results for author: Liu, S

Searching in archive stat.
  1. arXiv:2406.03596  [pdf]

    stat.ME

    A Multivariate Equivalence Test Based on Mahalanobis Distance with a Data-Driven Margin

    Authors: Chao Wang, Yu-Ting Weng, Shaobo Liu, Tengfei Li, Meiyu Shen, Yi Tsong

    Abstract: Multivariate equivalence testing is needed in a variety of scenarios for drug development. For example, drug products obtained from natural sources may contain many components for which the individual effects and/or their interactions on clinical efficacy and safety cannot be completely characterized. Such lack of sufficient characterization poses a challenge for both generic drug developers to de…

    Submitted 5 June, 2024; originally announced June 2024.

  2. arXiv:2406.02628  [pdf, ps, other]

    stat.ML cs.CC cs.DS cs.LG

    Replicability in High Dimensional Statistics

    Authors: Max Hopkins, Russell Impagliazzo, Daniel Kane, Sihan Liu, Christopher Ye

    Abstract: The replicability crisis is a major issue across nearly all areas of empirical science, calling for the formal study of replicability in statistics. Motivated in this context, [Impagliazzo, Lei, Pitassi, and Sorrell STOC 2022] introduced the notion of replicable learning algorithms, and gave basic procedures for $1$-dimensional tasks including statistical queries. In this work, we study the comput…

    Submitted 3 June, 2024; originally announced June 2024.

    Comments: 119 pages

    ACM Class: F.2.0

  3. arXiv:2406.01380  [pdf, other]

    cs.CV stat.AP

    Convolutional Unscented Kalman Filter for Multi-Object Tracking with Outliers

    Authors: Shiqi Liu, Wenhan Cao, Chang Liu, Tianyi Zhang, Shengbo Eben Li

    Abstract: Multi-object tracking (MOT) is an essential technique for navigation in autonomous driving. In tracking-by-detection systems, biases, false positives, and misses, which are referred to as outliers, are inevitable due to complex traffic scenarios. Recent tracking methods are based on filtering algorithms that overlook these outliers, leading to reduced tracking accuracy or even loss of the objects…

    Submitted 3 June, 2024; originally announced June 2024.

    Comments: 11 pages, 5 figures

  4. arXiv:2405.19058  [pdf, other]

    stat.ME

    Participation bias in the estimation of heritability and genetic correlation

    Authors: Shuang Song, Stefania Benonisdottir, Jun S. Liu, Augustine Kong

    Abstract: It is increasingly recognized that participation bias can pose problems for genetic studies. Recently, to overcome the challenge that genetic information of non-participants is unavailable, it is shown that by comparing the IBD (identity by descent) shared and not-shared segments among the participants, one can estimate the genetic component underlying participation. That, however, does not direct…

    Submitted 30 May, 2024; v1 submitted 29 May, 2024; originally announced May 2024.

  5. arXiv:2405.18081  [pdf, other]

    math.ST cs.IT math.PR stat.ML

    Optimality of Approximate Message Passing Algorithms for Spiked Matrix Models with Rotationally Invariant Noise

    Authors: Rishabh Dudeja, Songbin Liu, Junjie Ma

    Abstract: We study the problem of estimating a rank one signal matrix from an observed matrix generated by corrupting the signal with additive rotationally invariant noise. We develop a new class of approximate message-passing algorithms for this problem and provide a simple and concise characterization of their dynamics in the high-dimensional limit. At each iteration, these algorithms exploit prior knowle…

    Submitted 28 May, 2024; originally announced May 2024.

  6. arXiv:2405.15115  [pdf, other]

    cs.LG cs.CL stat.ML

    Towards Better Understanding of In-Context Learning Ability from In-Context Uncertainty Quantification

    Authors: Shang Liu, Zhongze Cai, Guanting Chen, Xiaocheng Li

    Abstract: Predicting simple function classes has been widely used as a testbed for developing theory and understanding of the trained Transformer's in-context learning (ICL) ability. In this paper, we revisit the training of Transformers on linear regression tasks, and different from all the existing literature, we consider a bi-objective prediction task of predicting both the conditional expectation…

    Submitted 23 May, 2024; originally announced May 2024.

  7. arXiv:2404.02343  [pdf, other]

    q-fin.PR cs.LG math.OC stat.ML

    Improved model-free bounds for multi-asset options using option-implied information and deep learning

    Authors: Evangelia Dragazi, Shuaiqiang Liu, Antonis Papapantoleon

    Abstract: We consider the computation of model-free bounds for multi-asset options in a setting that combines dependence uncertainty with additional information on the dependence structure. More specifically, we consider the setting where the marginal distributions are known and partial information, in the form of known prices for multi-asset options, is also available in the market. We provide a fundamenta…

    Submitted 2 April, 2024; originally announced April 2024.

    MSC Class: 91G20; 91G60; 68T07

  8. arXiv:2404.00481  [pdf, other]

    stat.ML cs.LG eess.SY

    Convolutional Bayesian Filtering

    Authors: Wenhan Cao, Shiqi Liu, Chang Liu, Zeyu He, Stephen S. -T. Yau, Shengbo Eben Li

    Abstract: Bayesian filtering serves as the mainstream framework of state estimation in dynamic systems. Its standard version utilizes total probability rule and Bayes' law alternatively, where how to define and compute conditional probability is critical to state distribution inference. Previously, the conditional probability is assumed to be exactly known, which represents a measure of the occurrence proba…

    Submitted 30 March, 2024; originally announced April 2024.

  9. arXiv:2403.13027  [pdf, other]

    cs.LG cs.CR cs.IT stat.ML

    Towards Better Statistical Understanding of Watermarking LLMs

    Authors: Zhongze Cai, Shang Liu, Hanzhao Wang, Huaiyang Zhong, Xiaocheng Li

    Abstract: In this paper, we study the problem of watermarking large language models (LLMs). We consider the trade-off between model distortion and detection ability and formulate it as a constrained optimization problem based on the green-red algorithm of Kirchenbauer et al. (2023a). We show that the optimal solution to the optimization problem enjoys a nice analytical property which provides a better under…

    Submitted 18 March, 2024; originally announced March 2024.

  10. arXiv:2403.12166  [pdf, other]

    cs.LG stat.ML

    The Power of Few: Accelerating and Enhancing Data Reweighting with Coreset Selection

    Authors: Mohammad Jafari, Yimeng Zhang, Yihua Zhang, Sijia Liu

    Abstract: As machine learning tasks continue to evolve, the trend has been to gather larger datasets and train increasingly larger models. While this has led to advancements in accuracy, it has also escalated computational costs to unsustainable levels. Addressing this, our work aims to strike a delicate balance between computational efficiency and model accuracy, a persisting challenge in the field. We int…

    Submitted 30 May, 2024; v1 submitted 18 March, 2024; originally announced March 2024.

    Comments: Accepted to ICASSP 2024

  11. arXiv:2403.07310  [pdf, other]

    stat.ML cs.LG

    How does promoting the minority fraction affect generalization? A theoretical study of the one-hidden-layer neural network on group imbalance

    Authors: Hongkang Li, Shuai Zhang, Yihua Zhang, Meng Wang, Sijia Liu, Pin-Yu Chen

    Abstract: Group imbalance has been a known problem in empirical risk minimization (ERM), where the achieved high average accuracy is accompanied by low accuracy in a minority group. Despite algorithmic efforts to improve the minority group accuracy, a theoretical generalization analysis of ERM on individual groups remains elusive. By formulating the group imbalance problem with the Gaussian Mixture Model, t…

    Submitted 19 March, 2024; v1 submitted 12 March, 2024; originally announced March 2024.

  12. arXiv:2402.07340  [pdf, other]

    cs.LG cs.IT cs.SI math.PR math.ST stat.ML

    Random Geometric Graph Alignment with Graph Neural Networks

    Authors: Suqi Liu, Morgane Austern

    Abstract: We characterize the performance of graph neural networks for graph alignment problems in the presence of vertex feature information. More specifically, given two graphs that are independent perturbations of a single random geometric graph with noisy sparse features, the task is to recover an unknown one-to-one mapping between the vertices of the two graphs. We show under certain conditions on the…

    Submitted 11 February, 2024; originally announced February 2024.

    Comments: 29 pages, 2 figures, 1 table

  13. arXiv:2402.06162  [pdf, other]

    stat.ML cs.LG

    Wasserstein proximal operators describe score-based generative models and resolve memorization

    Authors: Benjamin J. Zhang, Siting Liu, Wuchen Li, Markos A. Katsoulakis, Stanley J. Osher

    Abstract: We focus on the fundamental mathematical structure of score-based generative models (SGMs). We first formulate SGMs in terms of the Wasserstein proximal operator (WPO) and demonstrate that, via mean-field games (MFGs), the WPO formulation reveals mathematical structure that describes the inductive bias of diffusion and score-based models. In particular, MFGs yield optimality conditions in the form…

    Submitted 8 February, 2024; originally announced February 2024.

  14. arXiv:2401.17523  [pdf, other]

    cs.LG cs.CR stat.ML

    Game-Theoretic Unlearnable Example Generator

    Authors: Shuang Liu, Yihan Wang, Xiao-Shan Gao

    Abstract: Unlearnable example attacks are data poisoning attacks aiming to degrade the clean test accuracy of deep learning by adding imperceptible perturbations to the training samples, which can be formulated as a bi-level optimization problem. However, directly solving this optimization problem is intractable for deep neural networks. In this paper, we investigate unlearnable example attacks from a game-…

    Submitted 30 January, 2024; originally announced January 2024.

  15. arXiv:2401.15806  [pdf, ps, other]

    stat.ME math.ST

    Continuous-time structural failure time model for intermittent treatment

    Authors: Guanbo Wang, Siyi Liu, Shu Yang

    Abstract: The intermittent intake of treatment is commonly seen in patients with chronic disease. For example, patients with atrial fibrillation may need to discontinue the oral anticoagulants when they experience a certain surgery and re-initiate the treatment after the surgery. As another example, patients may skip a few days before they refill a treatment as planned. This treatment dispensation informati…

    Submitted 28 January, 2024; originally announced January 2024.

  16. arXiv:2401.15122  [pdf, other]

    cs.LG cs.AI q-bio.BM q-bio.QM stat.ML

    A Multi-Grained Symmetric Differential Equation Model for Learning Protein-Ligand Binding Dynamics

    Authors: Shengchao Liu, Weitao Du, Yanjing Li, Zhuoxinran Li, Vignesh Bhethanabotla, Nakul Rampal, Omar Yaghi, Christian Borgs, Anima Anandkumar, Hongyu Guo, Jennifer Chayes

    Abstract: In drug discovery, molecular dynamics (MD) simulation for protein-ligand binding provides a powerful tool for predicting binding affinities, estimating transport properties, and exploring pocket sites. There has been a long history of improving the efficiency of MD simulations through better numerical methods and, more recently, by utilizing machine learning (ML) methods. Yet, challenges remain, s…

    Submitted 1 February, 2024; v1 submitted 26 January, 2024; originally announced January 2024.

  17. arXiv:2401.06383  [pdf, other]

    stat.ME

    Decomposition with Monotone B-splines: Fitting and Testing

    Authors: Lijun Wang, Xiaodan Fan, Hongyu Zhao, Jun S. Liu

    Abstract: A univariate continuous function can always be decomposed as the sum of a non-increasing function and a non-decreasing one. Based on this property, we propose a non-parametric regression method that combines two spline-fitted monotone curves. We demonstrate by extensive simulations that, compared to standard spline-fitting methods, the proposed approach is particularly advantageous in high-noise s…

    Submitted 9 April, 2024; v1 submitted 12 January, 2024; originally announced January 2024.

  18. arXiv:2401.03206  [pdf, ps, other]

    cs.LG math.NA math.OC math.PR math.ST stat.ME stat.ML

    A Robbins--Monro Sequence That Can Exploit Prior Information For Faster Convergence

    Authors: Siwei Liu, Ke Ma, Stephan M. Goetz

    Abstract: We propose a new method to improve the convergence speed of the Robbins-Monro algorithm by introducing prior information about the target point into the Robbins-Monro iteration. We achieve the incorporation of prior information without the need of a -- potentially wrong -- regression model, which would also entail additional constraints. We show that this prior-information Robbins-Monro sequence i…

    Submitted 6 January, 2024; originally announced January 2024.

    Comments: 26 pages, 5 figures

    MSC Class: 62L20; 62L05; 62L10; 60G99; 60-08; 65B99; 65C99; 90C15

  19. arXiv:2401.02630  [pdf, other]

    cs.LG stat.AP

    Model-Agnostic Interpretation Framework in Machine Learning: A Comparative Study in NBA Sports

    Authors: Shun Liu

    Abstract: The field of machine learning has seen tremendous progress in recent years, with deep learning models delivering exceptional performance across a range of tasks. However, these models often come at the cost of interpretability, as they operate as opaque "black boxes" that obscure the rationale behind their decisions. This lack of transparency can limit understanding of the models' underlying princ…

    Submitted 4 January, 2024; originally announced January 2024.

  20. arXiv:2401.00139  [pdf, other]

    cs.AI cs.CL cs.LG stat.ME

    Is Knowledge All Large Language Models Needed for Causal Reasoning?

    Authors: Hengrui Cai, Shengjie Liu, Rui Song

    Abstract: This paper explores the causal reasoning of large language models (LLMs) to enhance their interpretability and reliability in advancing artificial intelligence. Despite the proficiency of LLMs in a range of tasks, their potential for understanding causality requires further exploration. We propose a novel causal attribution model that utilizes "do-operators" for constructing counterfactual scenar…

    Submitted 5 June, 2024; v1 submitted 29 December, 2023; originally announced January 2024.

    Comments: A Python implementation of our proposed method is available at https://github.com/ncsulsj/Causal_LLM

  21. arXiv:2312.02213  [pdf, other]

    cs.LG cs.AI cs.DB stat.AP

    JarviX: A LLM No code Platform for Tabular Data Analysis and Optimization

    Authors: Shang-Ching Liu, ShengKun Wang, Wenqi Lin, Chung-Wei Hsiung, Yi-Chen Hsieh, Yu-Ping Cheng, Sian-Hong Luo, Tsungyao Chang, Jianwei Zhang

    Abstract: In this study, we introduce JarviX, a sophisticated data analytics framework. JarviX is designed to employ Large Language Models (LLMs) to facilitate an automated guide and execute high-precision data analyses on tabular datasets. This framework emphasizes the significance of varying column types, capitalizing on state-of-the-art LLMs to generate concise data insight summaries, propose relevant an…

    Submitted 3 December, 2023; originally announced December 2023.

  22. arXiv:2311.13154  [pdf, other]

    cs.DS cs.IT cs.LG math.ST stat.ML

    Testing Closeness of Multivariate Distributions via Ramsey Theory

    Authors: Ilias Diakonikolas, Daniel M. Kane, Sihan Liu

    Abstract: We investigate the statistical task of closeness (or equivalence) testing for multidimensional distributions. Specifically, given sample access to two unknown distributions $\mathbf p, \mathbf q$ on $\mathbb R^d$, we want to distinguish between the case that $\mathbf p=\mathbf q$ versus $\|\mathbf p-\mathbf q\|_{A_k} > ε$, where $\|\mathbf p-\mathbf q\|_{A_k}$ denotes the generalized ${A}_k$ dista…

    Submitted 21 November, 2023; originally announced November 2023.

  23. arXiv:2311.07565  [pdf, other]

    cs.LG stat.ML

    Exploration via linearly perturbed loss minimisation

    Authors: David Janz, Shuai Liu, Alex Ayoub, Csaba Szepesvári

    Abstract: We introduce exploration via linear loss perturbations (EVILL), a randomised exploration method for structured stochastic bandit problems that works by solving for the minimiser of a linearly perturbed regularised negative log-likelihood function. We show that, for the case of generalised linear bandits, EVILL reduces to perturbed history exploration (PHE), a method where exploration is done by tr…

    Submitted 6 March, 2024; v1 submitted 13 November, 2023; originally announced November 2023.

  24. arXiv:2310.20090  [pdf, other]

    stat.ML cs.LG stat.CO

    Bridging the Gap Between Variational Inference and Wasserstein Gradient Flows

    Authors: Mingxuan Yi, Song Liu

    Abstract: Variational inference is a technique that approximates a target distribution by optimizing within the parameter space of variational families. On the other hand, Wasserstein gradient flows describe optimization within the space of probability measures where they do not necessarily admit a parametric density function. In this paper, we bridge the gap between these two methods. We demonstrate that,…

    Submitted 30 October, 2023; originally announced October 2023.

  25. arXiv:2310.15932  [pdf, other]

    cs.LG cs.DS math.ST stat.ML

    Online Robust Mean Estimation

    Authors: Daniel M. Kane, Ilias Diakonikolas, Hanshen Xiao, Sihan Liu

    Abstract: We study the problem of high-dimensional robust mean estimation in an online setting. Specifically, we consider a scenario where $n$ sensors are measuring some common, ongoing phenomenon. At each time step $t=1,2,\ldots,T$, the $i^{th}$ sensor reports its readings $x^{(i)}_t$ for that time step. The algorithm must then commit to its estimate $μ_t$ for the true mean value of the process at time…

    Submitted 24 October, 2023; originally announced October 2023.

    Comments: To appear in SODA2024

  26. arXiv:2309.16855  [pdf, other]

    stat.ME math.ST

    A Variational Spike-and-Slab Approach for Group Variable Selection

    Authors: Buyu Lin, Changhao Ge, Jun S. Liu

    Abstract: We introduce a class of generic spike-and-slab priors for high-dimensional linear regression with grouped variables and present a Coordinate-ascent Variational Inference (CAVI) algorithm for obtaining an optimal variational Bayes approximation. Using parameter expansion for a specific, yet comprehensive, family of slab distributions, we obtain a further gain in computational efficiency. The method…

    Submitted 28 September, 2023; originally announced September 2023.

    Comments: 64 pages, 6 figures

  27. arXiv:2309.16578  [pdf, other]

    stat.ML cs.LG physics.chem-ph

    Overcoming the Barrier of Orbital-Free Density Functional Theory for Molecular Systems Using Deep Learning

    Authors: He Zhang, Siyuan Liu, Jiacheng You, Chang Liu, Shuxin Zheng, Ziheng Lu, Tong Wang, Nanning Zheng, Bin Shao

    Abstract: Orbital-free density functional theory (OFDFT) is a quantum chemistry formulation that has a lower cost scaling than the prevailing Kohn-Sham DFT, which is increasingly desired for contemporary molecular research. However, its accuracy is limited by the kinetic energy density functional, which is notoriously hard to approximate for non-periodic molecular systems. Here we propose M-OFDFT, an OFDFT…

    Submitted 9 March, 2024; v1 submitted 28 September, 2023; originally announced September 2023.

    Comments: Published in Nature Computational Science, March 2024. Full paper with supplementary information

  28. arXiv:2309.12664  [pdf, other]

    stat.CO stat.ML

    Langevin Quasi-Monte Carlo

    Authors: Sifan Liu

    Abstract: Langevin Monte Carlo (LMC) and its stochastic gradient versions are powerful algorithms for sampling from complex high-dimensional distributions. To sample from a distribution with density $π(θ)\propto \exp(-U(θ))$, LMC iteratively generates the next sample by taking a step in the gradient direction $\nabla U$ with added Gaussian perturbations. Expectations w.r.t. the target distribution $π$ are…

    Submitted 22 September, 2023; originally announced September 2023.

  29. arXiv:2309.05762  [pdf, other]

    stat.ME stat.AP

    Statistical and Practical Considerations in Planning and Conduct of Dose Optimization Trials

    Authors: Ying Yuan, Heng Zhou, Suyu Liu

    Abstract: The US Food and Drug Administration launched Project Optimus with the aim of shifting the paradigm of dose-finding and selection towards identifying the optimal biological dose that offers the best balance between benefit and risk, rather than the maximum tolerated dose. However, achieving dose optimization is a challenging task that involves a variety of factors and is considerably more complicat…

    Submitted 11 September, 2023; originally announced September 2023.

  30. arXiv:2308.15370  [pdf, other]

    stat.ML cs.LG

    Multi-Response Heteroscedastic Gaussian Process Models and Their Inference

    Authors: Taehee Lee, Jun S. Liu

    Abstract: Despite the widespread utilization of Gaussian process models for versatile nonparametric modeling, they exhibit limitations in effectively capturing abrupt changes in function smoothness and accommodating relationships with heteroscedastic errors. Addressing these shortcomings, the heteroscedastic Gaussian process (HeGP) regression seeks to introduce flexibility by acknowledging the variability o…

    Submitted 30 August, 2023; v1 submitted 29 August, 2023; originally announced August 2023.

    Comments: submitted to the Journal of the American Statistical Association (JASA)

  31. arXiv:2308.10346  [pdf, other]

    stat.ME stat.CO

    An Exact Sampler for Inference after Polyhedral Model Selection

    Authors: Sifan Liu

    Abstract: Inference after model selection presents computational challenges when dealing with intractable conditional distributions. Markov chain Monte Carlo (MCMC) is a common method for sampling from these distributions, but its slow convergence often limits its practicality. In this work, we introduce a method tailored for selective inference in cases where the selection event can be characterized by a p…

    Submitted 20 August, 2023; originally announced August 2023.

  32. arXiv:2308.05061  [pdf, other]

    cs.LG math.NA stat.ML

    Fine-Tune Language Models as Multi-Modal Differential Equation Solvers

    Authors: Liu Yang, Siting Liu, Stanley J. Osher

    Abstract: In the growing domain of scientific machine learning, in-context operator learning has shown notable potential in building foundation models, as in this framework the model is trained to learn operators and solve differential equations using prompted data, during the inference stage without weight updates. However, the current model's overdependence on function data overlooks the invaluable human…

    Submitted 1 February, 2024; v1 submitted 9 August, 2023; originally announced August 2023.

  33. arXiv:2308.02360  [pdf, other]

    cs.LG stat.ML

    Intensity-free Integral-based Learning of Marked Temporal Point Processes

    Authors: Sishun Liu, Ke Deng, Xiuzhen Zhang, Yongli Ren

    Abstract: In the marked temporal point processes (MTPP), a core problem is to parameterize the conditional joint PDF (probability distribution function) $p^*(m,t)$ for inter-event time $t$ and mark $m$, conditioned on the history. The majority of existing studies predefine intensity functions. Their utility is challenged by specifying the intensity function's proper form, which is critical to balance expres…

    Submitted 7 August, 2023; v1 submitted 4 August, 2023; originally announced August 2023.

  34. arXiv:2307.02719  [pdf, ps, other]

    cs.LG stat.ML

    Understanding Uncertainty Sampling

    Authors: Shang Liu, Xiaocheng Li

    Abstract: Uncertainty sampling is a prevalent active learning algorithm that queries sequentially the annotations of data samples which the current prediction model is uncertain about. However, the usage of uncertainty sampling has been largely heuristic: (i) There is no consensus on the proper definition of "uncertainty" for a specific task under a specific loss; (ii) There is no theoretical guarantee that…

    Submitted 20 July, 2023; v1 submitted 5 July, 2023; originally announced July 2023.

    Comments: Update: add numerical illustrations and experiments; correct some typos and modify the numbering

  35. arXiv:2307.02126  [pdf, other]

    cs.LG stat.ML

    Robust Graph Structure Learning with the Alignment of Features and Adjacency Matrix

    Authors: Shaogao Lv, Gang Wen, Shiyu Liu, Linsen Wei, Ming Li

    Abstract: To improve the robustness of graph neural networks (GNN), graph structure learning (GSL) has attracted great interest due to the pervasiveness of noise in graph data. Many approaches have been proposed for GSL to jointly learn a clean graph structure and corresponding representations. To extend the previous work, this paper proposes a novel regularized GSL approach, particularly with an alignment…

    Submitted 5 July, 2023; originally announced July 2023.

  36. arXiv:2307.01748  [pdf, other]

    stat.ME astro-ph.IM stat.CO

    Monotone Cubic B-Splines with a Neural-Network Generator

    Authors: Lijun Wang, Xiaodan Fan, Huabai Li, Jun S. Liu

    Abstract: We present a method for fitting monotone curves using cubic B-splines, which is equivalent to putting a monotonicity constraint on the coefficients. We explore different ways of enforcing this constraint and analyze their theoretical and empirical properties. We propose two algorithms for solving the spline fitting problem: one that uses standard optimization techniques and one that trains a Multi…

    Submitted 17 November, 2023; v1 submitted 4 July, 2023; originally announced July 2023.

  37. arXiv:2307.00238  [pdf, other]

    stat.ML cs.LG

    Unified Transfer Learning Models in High-Dimensional Linear Regression

    Authors: Shuo Shuo Liu

    Abstract: Transfer learning plays a key role in modern data analysis when: (1) the target data are scarce but the source data are sufficient; (2) the distributions of the source and target data are heterogeneous. This paper develops an interpretable unified transfer learning model, termed as UTrans, which can detect both transferable variables and source data. More specifically, we establish the estimation…

    Submitted 29 January, 2024; v1 submitted 1 July, 2023; originally announced July 2023.

  38. arXiv:2306.00602  [pdf, other]

    stat.ML cs.LG stat.ME

    Approximate Stein Classes for Truncated Density Estimation

    Authors: Daniel J. Williams, Song Liu

    Abstract: Estimating truncated density models is difficult, as these models have intractable normalising constants and hard to satisfy boundary conditions. Score matching can be adapted to solve the truncated density estimation problem, but requires a continuous weighting function which takes zero at the boundary and is positive elsewhere. Evaluation of such a weighting function (and its gradient) often req…

    Submitted 12 April, 2024; v1 submitted 1 June, 2023; originally announced June 2023.

    Comments: Accepted to ICML 2023

  39. arXiv:2305.15577  [pdf, other]

    stat.ML cs.LG

    Minimizing $f$-Divergences by Interpolating Velocity Fields

    Authors: Song Liu, Jiahao Yu, Jack Simons, Mingxuan Yi, Mark Beaumont

    Abstract: Many machine learning problems can be seen as approximating a \textit{target} distribution using a \textit{particle} distribution by minimizing their statistical discrepancy. Wasserstein Gradient Flow can move particles along a path that minimizes the $f$-divergence between the target and particle distributions. To move particles, we need to calculate the corresponding velocity fields derived from…

    Submitted 6 June, 2024; v1 submitted 24 May, 2023; originally announced May 2023.

    Comments: This manuscript is an extended version of the ICML2024 version. The code for reproducing our results can be found at https://github.com/anewgithubname/gradest2

  40. arXiv:2305.12283  [pdf, ps, other]

    cs.LG stat.ME stat.ML

    Distribution-Free Model-Agnostic Regression Calibration via Nonparametric Methods

    Authors: Shang Liu, Zhongze Cai, Xiaocheng Li

    Abstract: In this paper, we consider the uncertainty quantification problem for regression models. Specifically, we consider an individual calibration objective for characterizing the quantiles of the prediction model. While such an objective is well-motivated from downstream tasks such as newsvendor cost, the existing methods have been largely heuristic and lack of statistical guarantee in terms of individ…

    Submitted 25 October, 2023; v1 submitted 20 May, 2023; originally announced May 2023.

    Comments: Accepted at NeurIPS 2023 and update a camera-ready version; Add some experiments and literature reviews

  41. arXiv:2305.12085  [pdf, other]

    cs.LG stat.ML

    Stability and Generalization of lp-Regularized Stochastic Learning for GCN

    Authors: Shiyu Liu, Linsen Wei, Shaogao Lv, Ming Li

    Abstract: Graph convolutional networks (GCN) are viewed as one of the most popular representations among the variants of graph neural networks over graph data and have shown powerful performance in empirical experiments. That $\ell_2$-based graph smoothing enforces the global smoothness of GCN, while (soft) $\ell_1$-based sparse graph learning tends to promote signal sparsity to trade for discontinuity. Thi…

    Submitted 19 June, 2023; v1 submitted 19 May, 2023; originally announced May 2023.

    Comments: Accepted to IJCAI 2023

  42. arXiv:2305.01379  [pdf, ps, other]

    stat.ML cs.LG eess.SP math.OC

    LogSpecT: Feasible Graph Learning Model from Stationary Signals with Recovery Guarantees

    Authors: Shangyuan Liu, Linglingzhi Zhu, Anthony Man-Cho So

    Abstract: Graph learning from signals is a core task in Graph Signal Processing (GSP). One of the most commonly used models to learn graphs from stationary signals is SpecT. However, its practical formulation rSpecT is known to be sensitive to hyperparameter selection and, even worse, to suffer from infeasibility. In this paper, we give the first condition that guarantees the infeasibility of rSpecT and des…

    Submitted 2 May, 2023; originally announced May 2023.

  43. arXiv:2304.07993  [pdf, other]

    cs.LG math.NA stat.ML

    In-Context Operator Learning with Data Prompts for Differential Equation Problems

    Authors: Liu Yang, Siting Liu, Tingwei Meng, Stanley J. Osher

    Abstract: This paper introduces a new neural-network-based approach, namely In-Context Operator Networks (ICON), to simultaneously learn operators from the prompted data and apply it to new questions during the inference stage, without any weight update. Existing methods are limited to using a neural network to approximate a specific equation solution or a specific operator, requiring retraining when switch…

    Submitted 19 September, 2023; v1 submitted 17 April, 2023; originally announced April 2023.

    Comments: The second and third authors contributed equally. This is an outdated preprint. Please refer to the updated version published in PNAS: www.pnas.org/doi/10.1073/pnas.2310142120 See code in https://github.com/LiuYangMage/in-context-operator-networks

  44. arXiv:2303.05485  [pdf, ps, other]

    cs.LG stat.ML

    Efficient Testable Learning of Halfspaces with Adversarial Label Noise

    Authors: Ilias Diakonikolas, Daniel M. Kane, Vasilis Kontonis, Sihan Liu, Nikos Zarifis

    Abstract: We give the first polynomial-time algorithm for the testable learning of halfspaces in the presence of adversarial label noise under the Gaussian distribution. In the recently introduced testable learning model, one is required to produce a tester-learner such that if the data passes the tester, then one can trust the output of the robust learner on the data. Our tester-learner runs in time…

    Submitted 9 March, 2023; originally announced March 2023.

  45. arXiv:2302.10655  [pdf, other]

    stat.ML cs.LG stat.ME

    Density Ratio Estimation and Neyman Pearson Classification with Missing Data

    Authors: Josh Givens, Song Liu, Henry W J Reeve

    Abstract: Density Ratio Estimation (DRE) is an important machine learning technique with many downstream applications. We consider the challenge of DRE with missing not at random (MNAR) data. In this setting, we show that using standard DRE methods leads to biased results while our proposal (M-KLIEP), an adaptation of the popular DRE procedure KLIEP, restores consistency. Moreover, we provide finite sample…

    Submitted 21 February, 2023; originally announced February 2023.

    Comments: 40 pages, 11 figures. To be published in the proceedings of AISTATS 2023

  46. Anytime-Valid Confidence Sequences in an Enterprise A/B Testing Platform

    Authors: Akash V. Maharaj, Ritwik Sinha, David Arbour, Ian Waudby-Smith, Simon Z. Liu, Moumita Sinha, Raghavendra Addanki, Aaditya Ramdas, Manas Garg, Viswanathan Swaminathan

    Abstract: A/B tests are the gold standard for evaluating digital experiences on the web. However, traditional "fixed-horizon" statistical methods are often incompatible with the needs of modern industry practitioners as they do not permit continuous monitoring of experiments. Frequent evaluation of fixed-horizon tests ("peeking") leads to inflated type-I error and can result in erroneous conclusions. We hav…

    Submitted 20 February, 2023; originally announced February 2023.

    Comments: 15 pages, 12 figures. Expanded version of ACM Web Conference Proceedings paper

    ACM Class: G.3

    Journal ref: Companion Proceedings of the ACM Web Conference 2023 (WWW '23 Companion)

  47. arXiv:2302.07547  [pdf, other]

    stat.AP cs.HC

    Multimodal N-of-1 trials: A Novel Personalized Healthcare Design

    Authors: Jingjing Fu, Shuheng Liu, Siqi Du, Siqiao Ruan, Xuliang Guo, Weiwei Pan, Abhishek Sharma, Stefan Konigorski

    Abstract: N-of-1 trials aim to estimate treatment effects on the individual level and can be applied to personalize a wide range of physical and digital interventions in mHealth. In this study, we propose and apply a framework for multimodal N-of-1 trials in order to allow the inclusion of health outcomes assessed through images, audio or videos. We illustrate the framework in a series of N-of-1 trials that…

    Submitted 15 February, 2023; originally announced February 2023.

  48. arXiv:2302.06015  [pdf, other]

    cs.LG cs.CV stat.ML

    A Theoretical Understanding of Shallow Vision Transformers: Learning, Generalization, and Sample Complexity

    Authors: Hongkang Li, Meng Wang, Sijia Liu, Pin-yu Chen

    Abstract: Vision Transformers (ViTs) with self-attention modules have recently achieved great empirical success in many vision tasks. Due to non-convex interactions across layers, however, theoretical learning and generalization analysis is mostly elusive. Based on a data model characterizing both label-relevant and label-irrelevant tokens, this paper provides the first theoretical analysis of training a sh…

    Submitted 11 November, 2023; v1 submitted 12 February, 2023; originally announced February 2023.

    Journal ref: ICLR 2023

  49. arXiv:2302.04611  [pdf, other]

    cs.LG cs.AI q-bio.QM stat.ML

    A Text-guided Protein Design Framework

    Authors: Shengchao Liu, Yanjing Li, Zhuoxinran Li, Anthony Gitter, Yutao Zhu, Jiarui Lu, Zhao Xu, Weili Nie, Arvind Ramanathan, Chaowei Xiao, Jian Tang, Hongyu Guo, Anima Anandkumar

    Abstract: Current AI-assisted protein design mainly utilizes protein sequential and structural information. Meanwhile, there exists tremendous knowledge curated by humans in the text format describing proteins' high-level functionalities. Yet, whether the incorporation of such text data can help protein design tasks has not been explored. To bridge this gap, we propose ProteinDT, a multi-modal framework tha…

    Submitted 3 December, 2023; v1 submitted 9 February, 2023; originally announced February 2023.

  50. arXiv:2302.01075  [pdf, other]

    stat.ML cs.LG

    MonoFlow: Rethinking Divergence GANs via the Perspective of Wasserstein Gradient Flows

    Authors: Mingxuan Yi, Zhanxing Zhu, Song Liu

    Abstract: The conventional understanding of adversarial training in generative adversarial networks (GANs) is that the discriminator is trained to estimate a divergence, and the generator learns to minimize this divergence. We argue that despite the fact that many variants of GANs were developed following this paradigm, the current theoretical understanding of GANs and their practical algorithms are inconsi…

    Submitted 8 August, 2023; v1 submitted 2 February, 2023; originally announced February 2023.

    Comments: ICML 2023