Showing 1–50 of 348 results for author: Liu, H

Searching in archive stat.

  1. arXiv:2407.01949  [pdf, other]

    stat.AP

    Mass-Balance MRV for Carbon Dioxide Removal by Enhanced Rock Weathering: Methods, Simulation, and Inference

    Authors: Mark Baum, Henry Liu, Lily Schacht, Jake Schneider, Mary Yap

    Abstract: Carbon dioxide will likely need to be removed from the atmosphere to avoid significant future warming and climate change. Technologies are being developed to remove large quantities of carbon from the atmosphere. Enhanced rock weathering (ERW), where fine-grained silicate minerals are spread on soil, is a promising carbon removal method that can also support crop yields and maintain overall soil h…

    Submitted 2 July, 2024; originally announced July 2024.

  2. arXiv:2407.01079  [pdf, ps, other]

    stat.ML cs.AI cs.LG

    On Statistical Rates and Provably Efficient Criteria of Latent Diffusion Transformers (DiTs)

    Authors: Jerry Yao-Chieh Hu, Weimin Wu, Zhuoru Li, Zhao Song, Han Liu

    Abstract: We investigate the statistical and computational limits of latent Diffusion Transformers (DiTs) under the low-dimensional linear latent space assumption. Statistically, we study the universal approximation and sample complexity of the DiTs score function, as well as the distribution recovery property of the initial data. Specifically, under mild data assumptions, we deri…

    Submitted 1 July, 2024; originally announced July 2024.

  3. arXiv:2407.00271  [pdf, other]

    math.DS physics.data-an stat.ML

    Minimum Reduced-Order Models via Causal Inference

    Authors: Nan Chen, Honghu Liu

    Abstract: Enhancing the sparsity of data-driven reduced-order models (ROMs) has gained increasing attention in recent years. In this work, we analyze an efficient approach to identifying skillful ROMs with a sparse structure using an information-theoretic indicator called causation entropy. The causation entropy quantifies in a statistical way the additional contribution of each term to the underlying dynam…

    Submitted 28 June, 2024; originally announced July 2024.

  4. arXiv:2406.19619  [pdf, other]

    stat.ML cs.LG math.ST

    ScoreFusion: fusing score-based generative models via Kullback-Leibler barycenters

    Authors: Hao Liu, Junze Ye, Jose Blanchet, Nian Si

    Abstract: We study the problem of fusing pre-trained (auxiliary) generative models to enhance the training of a target generative model. We propose using KL-divergence weighted barycenters as an optimal fusion mechanism, in which the barycenter weights are optimally trained to minimize a suitable loss for the target population. While computing the optimal KL-barycenter weights can be challenging, we demonst…

    Submitted 27 June, 2024; originally announced June 2024.

    Comments: 40 pages, 6 figures

  5. arXiv:2406.13936  [pdf, other]

    stat.ML cs.LG math.OC

    Communication-Efficient Adaptive Batch Size Strategies for Distributed Local Gradient Methods

    Authors: Tim Tsz-Kit Lau, Weijian Li, Chenwei Xu, Han Liu, Mladen Kolar

    Abstract: Modern deep neural networks often require distributed training with many workers due to their large size. As worker numbers increase, communication overheads become the main bottleneck in data-parallel minibatch stochastic gradient methods with per-iteration gradient synchronization. Local gradient methods like Local SGD reduce communication by only syncing after several local steps. Despite under…

    Submitted 19 June, 2024; originally announced June 2024.

  6. arXiv:2406.13197  [pdf, other]

    stat.ME

    Representation Transfer Learning for Semiparametric Regression

    Authors: Baihua He, Huihang Liu, Xinyu Zhang, Jian Huang

    Abstract: We propose a transfer learning method that utilizes data representations in a semiparametric regression model. Our aim is to perform statistical inference on the parameter of primary interest in the target model while accounting for potential nonlinear effects of confounding variables. We leverage knowledge from source domains, assuming that the sample size of the source data is substantially larg…

    Submitted 19 June, 2024; originally announced June 2024.

    Comments: 42 pages, 11 figures, 5 tables

    MSC Class: 62F99

  7. arXiv:2406.05822  [pdf, other]

    cs.LG stat.ML

    Symmetric Matrix Completion with ReLU Sampling

    Authors: Huikang Liu, Peng Wang, Longxiu Huang, Qing Qu, Laura Balzano

    Abstract: We study the problem of symmetric positive semi-definite low-rank matrix completion (MC) with deterministic entry-dependent sampling. In particular, we consider rectified linear unit (ReLU) sampling, where only positive entries are observed, as well as a generalization to threshold-based sampling. We first empirically demonstrate that the landscape of this MC problem is not globally benign: Gradie…

    Submitted 9 June, 2024; originally announced June 2024.

    Comments: 39 pages, 9 figures; This work has been accepted for publication in the Proceedings of the 41st International Conference on Machine Learning (ICML 2024)

  8. arXiv:2406.05320  [pdf, other]

    stat.ML cs.LG

    Deep Neural Networks are Adaptive to Function Regularity and Data Distribution in Approximation and Estimation

    Authors: Hao Liu, Jiahui Cheng, Wenjing Liao

    Abstract: Deep learning has exhibited remarkable results across diverse areas. To understand its success, substantial research has been directed towards its theoretical foundations. Nevertheless, the majority of these studies examine how well deep neural networks can model functions with uniform regularity. In this paper, we explore a different angle: how deep neural networks can adapt to different regulari…

    Submitted 7 June, 2024; originally announced June 2024.

  9. arXiv:2406.03136  [pdf, ps, other]

    cs.LG cs.AI cs.CC stat.ML

    Computational Limits of Low-Rank Adaptation (LoRA) for Transformer-Based Models

    Authors: Jerry Yao-Chieh Hu, Maojiang Su, En-Jui Kuo, Zhao Song, Han Liu

    Abstract: We study the computational limits of Low-Rank Adaptation (LoRA) update for finetuning transformer-based models using fine-grained complexity theory. Our key observation is that the existence of low-rank decompositions within the gradient computation of LoRA adaptation leads to possible algorithmic speedup. This allows us to (i) identify a phase transition behavior and (ii) prove the existence of n…

    Submitted 5 June, 2024; originally announced June 2024.

  10. arXiv:2406.01335  [pdf, other]

    quant-ph q-fin.ST stat.ML

    Statistics-Informed Parameterized Quantum Circuit via Maximum Entropy Principle for Data Science and Finance

    Authors: Xi-Ning Zhuang, Zhao-Yun Chen, Cheng Xue, Xiao-Fan Xu, Chao Wang, Huan-Yu Liu, Tai-Ping Sun, Yun-Jie Wang, Yu-Chun Wu, Guo-Ping Guo

    Abstract: Quantum machine learning has demonstrated significant potential in solving practical problems, particularly in statistics-focused areas such as data science and finance. However, challenges remain in preparing and learning statistical models on a quantum processor due to issues with trainability and interpretability. In this letter, we utilize the maximum entropy principle to design a statistics-i…

    Submitted 18 June, 2024; v1 submitted 3 June, 2024; originally announced June 2024.

    Comments: 19 pages, 5 figures

  11. arXiv:2405.18856  [pdf, other]

    stat.ME math.ST

    Inference under covariate-adaptive randomization with many strata

    Authors: Jiahui Xin, Hanzhong Liu, Wei Ma

    Abstract: Covariate-adaptive randomization is widely employed to balance baseline covariates in interventional studies such as clinical trials and experiments in development economics. Recent years have witnessed substantial progress in inference under covariate-adaptive randomization with a fixed number of strata. However, concerns have been raised about the impact of a large number of strata on its design…

    Submitted 29 May, 2024; originally announced May 2024.

  12. arXiv:2405.17490  [pdf, other]

    cs.LG stat.ML

    Revisit, Extend, and Enhance Hessian-Free Influence Functions

    Authors: Ziao Yang, Han Yue, Jian Chen, Hongfu Liu

    Abstract: Influence functions serve as crucial tools for assessing sample influence in model interpretation, subset training set selection, noisy label detection, and more. By employing the first-order Taylor extension, influence functions can estimate sample influence without the need for expensive model retraining. However, applying influence functions directly to deep models presents challenges, primaril…

    Submitted 24 May, 2024; originally announced May 2024.

  13. arXiv:2405.03579  [pdf, other]

    stat.AP cs.DB stat.ME

    Some Statistical and Data Challenges When Building Early-Stage Digital Experimentation and Measurement Capabilities

    Authors: C. H. Bryan Liu

    Abstract: Digital experimentation and measurement (DEM) capabilities -- the knowledge and tools necessary to run experiments with digital products, services, or experiences and measure their impact -- are fast becoming part of the standard toolkit of digital/data-driven organisations in guiding business decisions. Many large technology companies report having mature DEM capabilities, and several businesses…

    Submitted 6 May, 2024; originally announced May 2024.

    Comments: PhD thesis. Imperial College London. Official library version available on: https://spiral.imperial.ac.uk/handle/10044/1/110307

  14. arXiv:2404.08667  [pdf, other]

    eess.SY stat.AP

    Traffic State Estimation and Uncertainty Quantification at Signalized Intersections with Low Penetration Rate Vehicle Trajectory Data

    Authors: Xingmin Wang, Zihao Wang, Zachary Jerome, Henry X. Liu

    Abstract: This paper studies the traffic state estimation problem at signalized intersections with low penetration rate vehicle trajectory data. While many existing studies have proposed different methods to estimate unknown traffic states and parameters (e.g., penetration rate, queue length) with this data, most of them only provide a point estimation without knowing the uncertainty of these estimated valu…

    Submitted 1 April, 2024; originally announced April 2024.

  15. arXiv:2404.03900  [pdf, other]

    stat.ML cs.AI cs.LG cs.NE

    Nonparametric Modern Hopfield Models

    Authors: Jerry Yao-Chieh Hu, Bo-Yu Chen, Dennis Wu, Feng Ruan, Han Liu

    Abstract: We present a nonparametric construction for deep learning compatible modern Hopfield models and utilize this framework to debut an efficient variant. Our key contribution stems from interpreting the memory storage and retrieval processes in modern Hopfield models as a nonparametric regression problem subject to a set of query-memory pairs. Crucially, our framework not only recovers the known resul…

    Submitted 5 April, 2024; originally announced April 2024.

    Comments: 59 pages; Code available at https://github.com/MAGICS-LAB/NonparametricHopfield

  16. arXiv:2404.03830  [pdf, other]

    cs.LG cs.AI stat.ML

    BiSHop: Bi-Directional Cellular Learning for Tabular Data with Generalized Sparse Modern Hopfield Model

    Authors: Chenwei Xu, Yu-Chao Huang, Jerry Yao-Chieh Hu, Weijian Li, Ammar Gilani, Hsi-Sheng Goan, Han Liu

    Abstract: We introduce the Bi-Directional Sparse Hopfield Network (BiSHop), a novel end-to-end framework for deep tabular learning. BiSHop handles the two major challenges of deep tabular learning: non-rotationally invariant data structure and feature sparsity in tabular data. Our key motivation comes from the recently established connection between associative memory and a…

    Submitted 4 April, 2024; originally announced April 2024.

    Comments: 40 pages; Code available at https://github.com/MAGICS-LAB/BiSHop

  17. arXiv:2404.03828  [pdf, other]

    cs.LG cs.AI stat.ML

    Outlier-Efficient Hopfield Layers for Large Transformer-Based Models

    Authors: Jerry Yao-Chieh Hu, Pei-Hsuan Chang, Robin Luo, Hong-Yu Chen, Weijian Li, Wei-Po Wang, Han Liu

    Abstract: We introduce an Outlier-Efficient Modern Hopfield Model (termed $\mathrm{OutEffHop}$) and use it to address the outlier inefficiency problem of training gigantic transformer-based models. Our main contribution is a novel associative memory model facilitating outlier-efficient associative memory retrievals. Interestingly, this memory model manifests a model-based interpretation of an out…

    Submitted 26 June, 2024; v1 submitted 4 April, 2024; originally announced April 2024.

    Comments: Accepted at ICML 2024; v2 updated to camera-ready version; Code available at https://github.com/MAGICS-LAB/OutEffHop; Models are on Hugging Face: https://huggingface.co/collections/magicslabnu/outeffhop-6610fcede8d2cda23009a98f

  18. arXiv:2404.03827  [pdf, other]

    cs.LG cs.AI stat.ML

    Uniform Memory Retrieval with Larger Capacity for Modern Hopfield Models

    Authors: Dennis Wu, Jerry Yao-Chieh Hu, Teng-Yun Hsiao, Han Liu

    Abstract: We propose a two-stage memory retrieval dynamics for modern Hopfield models, termed $\mathtt{U\text{-}Hop}$, with enhanced memory capacity. Our key contribution is a learnable feature map $Φ$ which transforms the Hopfield energy function into kernel space. This transformation ensures convergence between the local minima of energy and the fixed points of retrieval dynamics within the kernel space.…

    Submitted 12 June, 2024; v1 submitted 4 April, 2024; originally announced April 2024.

    Comments: Accepted at ICML 2024; v2 updated to camera-ready version; Code available at https://github.com/MAGICS-LAB/UHop

  19. arXiv:2403.08194  [pdf, other]

    cs.LG stat.ML

    Unsupervised Learning of Hybrid Latent Dynamics: A Learn-to-Identify Framework

    Authors: Yubo Ye, Sumeet Vadhavkar, Xiajun Jiang, Ryan Missel, Huafeng Liu, Linwei Wang

    Abstract: Modern applications increasingly require unsupervised learning of latent dynamics from high-dimensional time-series. This presents a significant challenge of identifiability: many abstract latent representations may reconstruct observations, yet do they guarantee an adequate identification of the governing dynamics? This paper investigates this challenge from two angles: the use of physics inducti…

    Submitted 12 March, 2024; originally announced March 2024.

    Comments: Under Review

  20. arXiv:2402.12875  [pdf, other]

    cs.LG cs.CC stat.ML

    Chain of Thought Empowers Transformers to Solve Inherently Serial Problems

    Authors: Zhiyuan Li, Hong Liu, Denny Zhou, Tengyu Ma

    Abstract: Instructing the model to generate a sequence of intermediate steps, a.k.a., a chain of thought (CoT), is a highly effective method to improve the accuracy of large language models (LLMs) on arithmetics and symbolic reasoning tasks. However, the mechanism behind CoT remains unclear. This work provides a theoretical understanding of the power of CoT for decoder-only transformers through the lens of…

    Submitted 23 May, 2024; v1 submitted 20 February, 2024; originally announced February 2024.

    Comments: 38 pages, 10 figures. Accepted by ICLR 2024

  21. arXiv:2402.11215  [pdf, other]

    cs.LG math.OC stat.ML

    AdAdaGrad: Adaptive Batch Size Schemes for Adaptive Gradient Methods

    Authors: Tim Tsz-Kit Lau, Han Liu, Mladen Kolar

    Abstract: The choice of batch sizes in minibatch stochastic gradient optimizers is critical in large-scale model training for both optimization and generalization performance. Although large-batch training is arguably the dominant training paradigm for large-scale deep learning due to hardware advances, the generalization performance of the model deteriorates compared to small-batch training, leading to the…

    Submitted 28 May, 2024; v1 submitted 17 February, 2024; originally announced February 2024.

  22. arXiv:2402.08539  [pdf]

    cs.LG stat.AP

    Intelligent Diagnosis of Alzheimer's Disease Based on Machine Learning

    Authors: Mingyang Li, Hongyu Liu, Yixuan Li, Zejun Wang, Yuan Yuan, Honglin Dai

    Abstract: This study is based on the Alzheimer's Disease Neuroimaging Initiative (ADNI) dataset and aims to explore early detection and disease progression in Alzheimer's disease (AD). We employ innovative data preprocessing strategies, including the use of the random forest algorithm to fill missing data and the handling of outliers and invalid data, thereby fully mining and utilizing these limited data re…

    Submitted 13 February, 2024; originally announced February 2024.

  23. arXiv:2402.04520  [pdf, ps, other]

    cs.LG cs.AI stat.ML

    On Computational Limits of Modern Hopfield Models: A Fine-Grained Complexity Analysis

    Authors: Jerry Yao-Chieh Hu, Thomas Lin, Zhao Song, Han Liu

    Abstract: We investigate the computational limits of the memory retrieval dynamics of modern Hopfield models from the fine-grained complexity analysis. Our key contribution is the characterization of a phase transition behavior in the efficiency of all possible modern Hopfield models based on the norm of patterns. Specifically, we establish an upper bound criterion for the norm of input query patterns and m…

    Submitted 31 May, 2024; v1 submitted 6 February, 2024; originally announced February 2024.

    Comments: Accepted at ICML 2024; v2 corrected typos; v3 added clarifications and references; v4,5 updated to camera-ready version

  24. arXiv:2401.16667  [pdf, other]

    math.ST stat.AP stat.ME

    Sharp variance estimator and causal bootstrap in stratified randomized experiments

    Authors: Haoyang Yu, Ke Zhu, Hanzhong Liu

    Abstract: The design-based finite-population asymptotic theory provides a normal approximation for the sampling distribution of the average treatment effect estimator in stratified randomized experiments. The asymptotic variance could be estimated by a Neyman-type conservative variance estimator. However, the variance estimator can be overly conservative, and the asymptotic theory may fail in small samples.…

    Submitted 26 June, 2024; v1 submitted 29 January, 2024; originally announced January 2024.

  25. arXiv:2401.01872  [pdf, other]

    stat.ME stat.AP

    Multiple Imputation of Hierarchical Nonlinear Time Series Data with an Application to School Enrollment Data

    Authors: Daphne H. Liu, Adrian E. Raftery

    Abstract: International comparisons of hierarchical time series data sets based on survey data, such as annual country-level estimates of school enrollment rates, can suffer from large amounts of missing data due to differing coverage of surveys across countries and across times. A popular approach to handling missing data in these settings is through multiple imputation, which can be especially effective w…

    Submitted 3 January, 2024; originally announced January 2024.

    Comments: 36 pages, 5 figures

  26. arXiv:2312.17346  [pdf, other]

    cs.LG cs.AI cs.CV cs.NE stat.ML

    STanHop: Sparse Tandem Hopfield Model for Memory-Enhanced Time Series Prediction

    Authors: Dennis Wu, Jerry Yao-Chieh Hu, Weijian Li, Bo-Yu Chen, Han Liu

    Abstract: We present STanHop-Net (Sparse Tandem Hopfield Network) for multivariate time series prediction with memory-enhanced capabilities. At the heart of our approach is STanHop, a novel Hopfield-based neural network block, which sparsely learns and stores both temporal and cross-series representations in a data-dependent fashion. In essence, STanHop sequentially learns temporal representation and cross-s…

    Submitted 28 December, 2023; originally announced December 2023.

  27. arXiv:2312.16793  [pdf, other]

    cs.LG stat.ML

    Sparse PCA with Oracle Property

    Authors: Quanquan Gu, Zhaoran Wang, Han Liu

    Abstract: In this paper, we study the estimation of the $k$-dimensional sparse principal subspace of covariance matrix $Σ$ in the high-dimensional setting. We aim to recover the oracle principal subspace solution, i.e., the principal subspace estimator obtained assuming the true support is known a priori. To this end, we propose a family of estimators based on the semidefinite relaxation of sparse PCA with…

    Submitted 27 December, 2023; originally announced December 2023.

    Comments: 16 pages, 1 table. In NIPS 2014

  28. arXiv:2312.12477  [pdf, other]

    cs.LG cs.AI stat.ME

    When Graph Neural Network Meets Causality: Opportunities, Methodologies and An Outlook

    Authors: Wenzhao Jiang, Hao Liu, Hui Xiong

    Abstract: Graph Neural Networks (GNNs) have emerged as powerful representation learning tools for capturing complex dependencies within diverse graph-structured data. Despite their success in a wide range of graph mining tasks, GNNs have raised serious concerns regarding their trustworthiness, including susceptibility to distribution shift, biases towards certain populations, and lack of explainability. Rec…

    Submitted 17 June, 2024; v1 submitted 19 December, 2023; originally announced December 2023.

  29. arXiv:2312.09613  [pdf, other]

    cs.LG cs.AI stat.ML

    Rethinking Causal Relationships Learning in Graph Neural Networks

    Authors: Hang Gao, Chengyu Yao, Jiangmeng Li, Lingyu Si, Yifan Jin, Fengge Wu, Changwen Zheng, Huaping Liu

    Abstract: Graph Neural Networks (GNNs) demonstrate their significance by effectively modeling complex interrelationships within graph-structured data. To enhance the credibility and robustness of GNNs, it becomes exceptionally crucial to bolster their ability to capture causal relationships. However, despite recent advancements that have indeed strengthened GNNs from a causal learning perspective, conductin…

    Submitted 15 December, 2023; originally announced December 2023.

  30. arXiv:2312.03438  [pdf, ps, other]

    math.OC eess.SP stat.ML

    On the Estimation Performance of Generalized Power Method for Heteroscedastic Probabilistic PCA

    Authors: Jinxin Wang, Chonghe Jiang, Huikang Liu, Anthony Man-Cho So

    Abstract: The heteroscedastic probabilistic principal component analysis (PCA) technique, a variant of the classic PCA that considers data heterogeneity, is receiving more and more attention in the data science and signal processing communities. In this paper, to estimate the underlying low-dimensional linear subspace (simply called ground truth) from available heterogeneous data samples, we consider…

    Submitted 6 December, 2023; originally announced December 2023.

    Comments: 22 pages

  31. arXiv:2312.02591  [pdf, ps, other]

    stat.ME

    General Spatio-Temporal Factor Models for High-Dimensional Random Fields on a Lattice

    Authors: Matteo Barigozzi, Davide La Vecchia, Hang Liu

    Abstract: Motivated by the need for analysing large spatio-temporal panel data, we introduce a novel dimensionality reduction methodology for $n$-dimensional random fields observed across a number $S$ of spatial locations and $T$ time periods. We call it General Spatio-Temporal Factor Model (GSTFM). First, we provide the probabilistic and mathematical underpinning needed for the representation of a random fiel…

    Submitted 5 December, 2023; originally announced December 2023.

  32. arXiv:2312.01944  [pdf, other]

    stat.ME

    New Methods for Network Count Time Series

    Authors: Hengxu Liu, Guy Nason

    Abstract: The original generalized network autoregressive models are poor for modelling count data as they are based on the additive and constant noise assumptions, which are usually inappropriate for count data. We introduce two new models (GNARI and NGNAR) for count network time series by adapting and extending existing count-valued time series models. We present results on the statistical and asymptotic p…

    Submitted 4 December, 2023; originally announced December 2023.

    MSC Class: 62M10

  33. arXiv:2312.01266  [pdf, ps, other]

    stat.ME math.ST

    A unified framework for covariate adjustment under stratified randomization

    Authors: Fuyi Tu, Wei Ma, Hanzhong Liu

    Abstract: Randomization, as a key technique in clinical trials, can eliminate sources of bias and produce comparable treatment groups. In randomized experiments, the treatment effect is a parameter of general interest. Researchers have explored the validity of using linear models to estimate the treatment effect and perform covariate adjustment and thus improve the estimation efficiency. However, the relati…

    Submitted 2 December, 2023; originally announced December 2023.

  34. arXiv:2311.15539  [pdf]

    stat.CO

    A Novel Human-Based Meta-Heuristic Algorithm: Dragon Boat Optimization

    Authors: Xiang Li, Long Lan, Husam Lahza, Shaowu Yang, Shuihua Wang, Wenjing Yang, Hengzhu Liu, Yudong Zhang

    Abstract: (Aim) Dragon Boat Racing, a popular aquatic folklore team sport, is traditionally held during the Dragon Boat Festival. Inspired by this event, we propose a novel human-based meta-heuristic algorithm called dragon boat optimization (DBO) in this paper. (Method) It models the unique behaviors of each crew member on the dragon boat during the race by introducing social psychology mechanisms (social…

    Submitted 27 November, 2023; originally announced November 2023.

  35. arXiv:2310.18910  [pdf, other]

    cs.LG cs.AI cs.CV stat.ML

    InstanT: Semi-supervised Learning with Instance-dependent Thresholds

    Authors: Muyang Li, Runze Wu, Haoyu Liu, Jun Yu, Xun Yang, Bo Han, Tongliang Liu

    Abstract: Semi-supervised learning (SSL) has been a fundamental challenge in machine learning for decades. The primary family of SSL algorithms, known as pseudo-labeling, involves assigning pseudo-labels to confident unlabeled instances and incorporating them into the training set. Therefore, the selection criteria of confident instances are crucial to the success of SSL. Recently, there has been growing in…

    Submitted 29 October, 2023; originally announced October 2023.

    Comments: Accepted as poster for NeurIPS 2023

  36. arXiv:2310.11550  [pdf, ps, other]

    cs.LG cs.AI stat.ML

    Towards Optimal Regret in Adversarial Linear MDPs with Bandit Feedback

    Authors: Haolin Liu, Chen-Yu Wei, Julian Zimmert

    Abstract: We study online reinforcement learning in linear Markov decision processes with adversarial losses and bandit feedback, without prior knowledge on transitions or access to simulators. We introduce two algorithms that achieve improved regret performance compared to existing approaches. The first algorithm, although computationally inefficient, ensures a regret of…

    Submitted 17 October, 2023; originally announced October 2023.

  37. arXiv:2310.10767  [pdf, ps, other]

    cs.LG stat.ML

    Wide Neural Networks as Gaussian Processes: Lessons from Deep Equilibrium Models

    Authors: Tianxiang Gao, Xiaokai Huo, Hailiang Liu, Hongyang Gao

    Abstract: Neural networks with wide layers have attracted significant attention due to their equivalence to Gaussian processes, enabling perfect fitting of training data while maintaining generalization performance, known as benign overfitting. However, existing results mainly focus on shallow or finite-depth networks, necessitating a comprehensive analysis of wide neural networks with infinite-depth layers…

    Submitted 16 October, 2023; originally announced October 2023.

    Comments: Accepted by NeurIPS 2023

  38. arXiv:2310.06746  [pdf, other]

    cs.LG stat.ME stat.ML

    Causal Rule Learning: Enhancing the Understanding of Heterogeneous Treatment Effect via Weighted Causal Rules

    Authors: Ying Wu, Hanzhong Liu, Kai Ren, Xiangyu Chang

    Abstract: Interpretability is a key concern in estimating heterogeneous treatment effects using machine learning methods, especially for healthcare applications where high-stake decisions are often made. Inspired by the Predictive, Descriptive, Relevant framework of interpretability, we propose causal rule learning which finds a refined set of causal rules characterizing potential subgroups to estimate and…

    Submitted 10 October, 2023; originally announced October 2023.

  39. arXiv:2309.16240  [pdf, other]

    cs.LG cs.AI stat.ML

    Beyond Reverse KL: Generalizing Direct Preference Optimization with Diverse Divergence Constraints

    Authors: Chaoqi Wang, Yibo Jiang, Chenghao Yang, Han Liu, Yuxin Chen

    Abstract: The increasing capabilities of large language models (LLMs) raise opportunities for artificial general intelligence but concurrently amplify safety concerns, such as potential misuse of AI systems, necessitating effective AI alignment. Reinforcement Learning from Human Feedback (RLHF) has emerged as a promising pathway towards AI alignment but brings forth challenges due to its complexity and depe…

    Submitted 28 September, 2023; originally announced September 2023.

    Comments: Preprint

  40. arXiv:2309.12673  [pdf, other]

    cs.LG cs.AI cs.CV stat.ML

    On Sparse Modern Hopfield Model

    Authors: Jerry Yao-Chieh Hu, Donglin Yang, Dennis Wu, Chenwei Xu, Bo-Yu Chen, Han Liu

    Abstract: We introduce the sparse modern Hopfield model as a sparse extension of the modern Hopfield model. Like its dense counterpart, the sparse modern Hopfield model equips a memory-retrieval dynamics whose one-step approximation corresponds to the sparse attention mechanism. Theoretically, our key contribution is a principled derivation of a closed-form sparse Hopfield energy using the convex conjugate…

    Submitted 29 November, 2023; v1 submitted 22 September, 2023; originally announced September 2023.

    Comments: 37 pages, accepted at NeurIPS 2023. [v2] updated to match with camera-ready version. Code is available at https://github.com/MAGICS-LAB/SparseModernHopfield

  41. arXiv:2309.00814  [pdf, ps, other]

    cs.LG cs.AI stat.ML

    Bypassing the Simulator: Near-Optimal Adversarial Linear Contextual Bandits

    Authors: Haolin Liu, Chen-Yu Wei, Julian Zimmert

    Abstract: We consider the adversarial linear contextual bandit problem, where the loss vectors are selected fully adversarially and the per-round action set (i.e. the context) is drawn from a fixed distribution. Existing methods for this problem either require access to a simulator to generate free i.i.d. contexts, achieve a sub-optimal regret no better than $\widetilde{O}(T^{\frac{5}{6}})$, or are computat…

    Submitted 1 September, 2023; originally announced September 2023.

  42. arXiv:2307.13757  [pdf, other]

    cs.LG cs.HC stat.ME

    UPREVE: An End-to-End Causal Discovery Benchmarking System

    Authors: Suraj Jyothi Unni, Paras Sheth, Kaize Ding, Huan Liu, K. Selcuk Candan

    Abstract: Discovering causal relationships in complex socio-behavioral systems is challenging but essential for informed decision-making. We present Upload, PREprocess, Visualize, and Evaluate (UPREVE), a user-friendly web-based graphical user interface (GUI) designed to simplify the process of causal discovery. UPREVE allows users to run multiple algorithms simultaneously, visualize causal relationships, a…

    Submitted 25 July, 2023; originally announced July 2023.

    Comments: 8 pages, Accepted to SBP-BRiMS 2023

  43. arXiv:2306.08956  [pdf, other]

    cs.SD eess.AS stat.ML

    Multi-Loss Convolutional Network with Time-Frequency Attention for Speech Enhancement

    Authors: Liang Wan, Hongqing Liu, Yi Zhou, Jie Ji

    Abstract: The Dual-Path Convolution Recurrent Network (DPCRN) was proposed to effectively exploit time-frequency domain information. By combining the DPRNN module with Convolution Recurrent Network (CRN), the DPCRN obtained a promising performance in speech separation with a limited model size. In this paper, we explore self-attention in the DPCRN module and design a model called Multi-Loss Convolutional Ne…

    Submitted 15 June, 2023; originally announced June 2023.

  44. arXiv:2306.06252  [pdf, other]

    cs.LG stat.ML

    Feature Programming for Multivariate Time Series Prediction

    Authors: Alex Reneau, Jerry Yao-Chieh Hu, Chenwei Xu, Weijian Li, Ammar Gilani, Han Liu

    Abstract: We introduce the concept of programmable feature engineering for time series modeling and propose a feature programming framework. This framework generates large amounts of predictive features for noisy multivariate time series while allowing users to incorporate their inductive bias with minimal effort. The key motivation of our framework is to view any multivariate time series as a cumulative su…

    Submitted 9 June, 2023; originally announced June 2023.

    Comments: 21 pages, accepted to ICML2023. Code is available at https://github.com/SirAlex900/FeatureProgramming

  45. arXiv:2306.03266  [pdf, other]

    cs.LG stat.ML

    Extending the Design Space of Graph Neural Networks by Rethinking Folklore Weisfeiler-Lehman

    Authors: Jiarui Feng, Lecheng Kong, Hao Liu, Dacheng Tao, Fuhai Li, Muhan Zhang, Yixin Chen

    Abstract: Message passing neural networks (MPNNs) have emerged as the most popular framework of graph neural networks (GNNs) in recent years. However, their expressive power is limited by the 1-dimensional Weisfeiler-Lehman (1-WL) test. Some works are inspired by $k$-WL/FWL (Folklore WL) and design the corresponding neural versions. Despite the high expressive power, there are serious limitations in this li…

    Submitted 14 January, 2024; v1 submitted 5 June, 2023; originally announced June 2023.

    Comments: Accepted to NeurIPS 2023

  46. arXiv:2305.15988  [pdf, other]

    stat.ML cs.LG stat.CO stat.ME

    Non-Log-Concave and Nonsmooth Sampling via Langevin Monte Carlo Algorithms

    Authors: Tim Tsz-Kit Lau, Han Liu, Thomas Pock

    Abstract: We study the problem of approximate sampling from non-log-concave distributions, e.g., Gaussian mixtures, which is often challenging even in low dimensions due to their multimodality. We focus on performing this task via Markov chain Monte Carlo (MCMC) methods derived from discretizations of the overdamped Langevin diffusions, which are commonly known as Langevin Monte Carlo algorithms. Furthermor…

    Submitted 29 May, 2024; v1 submitted 25 May, 2023; originally announced May 2023.

  47. arXiv:2305.06584  [pdf, other]

    cs.LG math.OC stat.ML

    Active Learning in the Predict-then-Optimize Framework: A Margin-Based Approach

    Authors: Mo Liu, Paul Grigas, Heyuan Liu, Zuo-Jun Max Shen

    Abstract: We develop the first active learning method in the predict-then-optimize framework. Specifically, we develop a learning method that sequentially decides whether to request the "labels" of feature samples from an unlabeled data stream, where the labels correspond to the parameters of an optimization model for decision-making. Our active learning method is the first to be directly informed by the de…

    Submitted 11 May, 2023; originally announced May 2023.

  48. arXiv:2303.14900  [pdf, other]

    stat.AP

    Nonparametric approaches for analyzing carbon emission: from statistical and machine learning perspectives

    Authors: Yiming Ma, Hang Liu, Shanyong Wang

    Abstract: Linear regression models, especially the extended STIRPAT model, are routinely applied for analyzing carbon emissions data. However, since the relationship between carbon emissions and the influencing factors is complex, fitting a simple parametric model may not be an ideal solution. This paper investigated various nonparametric approaches in statistics and machine learning (ML) for modeling carbo…

    Submitted 26 March, 2023; originally announced March 2023.

  49. arXiv:2303.11054  [pdf, other]

    stat.AP

    Some novel aspects of quantile regression: local stationarity, random forests and optimal transportation

    Authors: Manon Felix, Davide La Vecchia, Hang Liu, Yiming Ma

    Abstract: This paper is written for a Festschrift in honour of Professor Marc Hallin and it proposes some developments on quantile regression. We connect our investigation to Marc's scientific production and we present some theoretical and methodological advances for quantiles estimation in non standard settings. We split our contributions in two parts. The first part is about conditional quantiles estimati…

    Submitted 9 September, 2023; v1 submitted 20 March, 2023; originally announced March 2023.

  50. arXiv:2303.09863  [pdf, other]

    stat.ML cs.LG

    Deep Nonparametric Estimation of Intrinsic Data Structures by Chart Autoencoders: Generalization Error and Robustness

    Authors: Hao Liu, Alex Havrilla, Rongjie Lai, Wenjing Liao

    Abstract: Autoencoders have demonstrated remarkable success in learning low-dimensional latent features of high-dimensional data across various applications. Assuming that data are sampled near a low-dimensional manifold, we employ chart autoencoders, which encode data into low-dimensional latent features on a collection of charts, preserving the topology and geometry of the data manifold. Our paper establi…

    Submitted 25 October, 2023; v1 submitted 17 March, 2023; originally announced March 2023.