Showing 1–50 of 443 results for author: Zhang, C

Searching in archive stat.
  1. arXiv:2406.14469  [pdf, other]

    cs.CE cs.AI cs.LG stat.ML

    Fusion of Movement and Naive Predictions for Point Forecasting in Univariate Random Walks

    Authors: Cheng Zhang

    Abstract: Traditional methods for point forecasting in univariate random walks often fail to surpass naive benchmarks due to data unpredictability. This study introduces a novel forecasting method that fuses movement prediction (binary classification) with naive forecasts for accurate one-step-ahead point forecasting. The method's efficacy is demonstrated through theoretical analysis, simulations, and real-…

    Submitted 24 June, 2024; v1 submitted 20 June, 2024; originally announced June 2024.

  2. arXiv:2406.10650  [pdf, other]

    stat.ML cs.LG

    The Implicit Bias of Adam on Separable Data

    Authors: Chenyang Zhang, Difan Zou, Yuan Cao

    Abstract: Adam has become one of the most favored optimizers in deep learning problems. Despite its success in practice, numerous mysteries persist regarding its theoretical understanding. In this paper, we study the implicit bias of Adam in linear logistic regression. Specifically, we show that when the training data are linearly separable, Adam converges towards a linear classifier that achieves the maxim…

    Submitted 15 June, 2024; originally announced June 2024.

    Comments: 33 pages, 2 figures

  3. arXiv:2405.19752  [pdf, ps, other]

    cs.LG cs.DS stat.ML

    Understanding Memory-Regret Trade-Off for Streaming Stochastic Multi-Armed Bandits

    Authors: Yuchen He, Zichun Ye, Chihao Zhang

    Abstract: We study the stochastic multi-armed bandit problem in the $P$-pass streaming model. In this problem, the $n$ arms are present in a stream and at most $m<n$ arms and their statistics can be stored in the memory. We give a complete characterization of the optimal regret in terms of $m, n$ and $P$. Specifically, we design an algorithm with…

    Submitted 30 May, 2024; originally announced May 2024.

  4. arXiv:2405.18997  [pdf, other]

    stat.ML cs.LG

    Kernel Semi-Implicit Variational Inference

    Authors: Ziheng Cheng, Longlin Yu, Tianyu Xie, Shiyue Zhang, Cheng Zhang

    Abstract: Semi-implicit variational inference (SIVI) extends traditional variational families with semi-implicit distributions defined in a hierarchical manner. Due to the intractable densities of semi-implicit distributions, classical SIVI often resorts to surrogates of evidence lower bound (ELBO) that would introduce biases for training. A recent advancement in SIVI, named SIVI-SM, utilizes an alternative…

    Submitted 29 May, 2024; originally announced May 2024.

    Comments: ICML 2024 camera ready

  5. arXiv:2405.18836  [pdf, other]

    stat.ME cs.LG

    Do Finetti: On Causal Effects for Exchangeable Data

    Authors: Siyuan Guo, Chi Zhang, Karthika Mohan, Ferenc Huszár, Bernhard Schölkopf

    Abstract: We study causal effect estimation in a setting where the data are not i.i.d. (independent and identically distributed). We focus on exchangeable data satisfying an assumption of independent causal mechanisms. Traditional causal effect estimation frameworks, e.g., relying on structural causal models and do-calculus, are typically limited to i.i.d. data and do not extend to more general exchangeable…

    Submitted 29 May, 2024; originally announced May 2024.

  6. arXiv:2405.16577  [pdf, other]

    stat.ML cs.LG

    Reflected Flow Matching

    Authors: Tianyu Xie, Yu Zhu, Longlin Yu, Tong Yang, Ziheng Cheng, Shiyue Zhang, Xiangyu Zhang, Cheng Zhang

    Abstract: Continuous normalizing flows (CNFs) learn an ordinary differential equation to transform prior samples into data. Flow matching (FM) has recently emerged as a simulation-free approach for training CNFs by regressing a velocity model towards the conditional velocity field. However, on constrained domains, the learned velocity model may lead to undesirable flows that result in highly unnatural sampl…

    Submitted 26 May, 2024; originally announced May 2024.

    Comments: ICML 2024 camera-ready

  7. arXiv:2405.12838  [pdf, ps, other]

    quant-ph stat.CO

    Quantum Non-Identical Mean Estimation: Efficient Algorithms and Fundamental Limits

    Authors: Jiachen Hu, Tongyang Li, Xinzhao Wang, Yecheng Xue, Chenyi Zhang, Han Zhong

    Abstract: We systematically investigate quantum algorithms and lower bounds for mean estimation given query access to non-identically distributed samples. On the one hand, we give quantum mean estimators with quadratic quantum speed-up given samples from different bounded or sub-Gaussian random variables. On the other hand, we prove that, in general, it is impossible for any quantum algorithm to achieve qua…

    Submitted 21 May, 2024; originally announced May 2024.

    Comments: 31 pages, 0 figure. To appear in the 19th Theory of Quantum Computation, Communication and Cryptography (TQC 2024)

  8. arXiv:2405.08005  [pdf, other]

    math.OC cs.AI cs.GT cs.LG stat.ML

    Graphon Mean Field Games with a Representative Player: Analysis and Learning Algorithm

    Authors: Fuzhong Zhou, Chenyu Zhang, Xu Chen, Xuan Di

    Abstract: We propose a discrete time graphon game formulation on continuous state and action spaces using a representative player to study stochastic games with heterogeneous interaction among agents. This formulation admits both philosophical and mathematical advantages, compared to a widely adopted formulation using a continuum of players. We prove the existence and uniqueness of the graphon equilibrium w…

    Submitted 4 June, 2024; v1 submitted 8 May, 2024; originally announced May 2024.

    Comments: Published as a conference paper at ICML 2024

  9. arXiv:2404.16023  [pdf, other]

    stat.AP cs.LG

    Learning Car-Following Behaviors Using Bayesian Matrix Normal Mixture Regression

    Authors: Chengyuan Zhang, Kehua Chen, Meixin Zhu, Hai Yang, Lijun Sun

    Abstract: Learning and understanding car-following (CF) behaviors are crucial for microscopic traffic simulation. Traditional CF models, though simple, often lack generalization capabilities, while many data-driven methods, despite their robustness, operate as "black boxes" with limited interpretability. To bridge this gap, this work introduces a Bayesian Matrix Normal Mixture Regression (MNMR) model that s…

    Submitted 24 April, 2024; originally announced April 2024.

    Comments: 6 pages, Accepted by the 35th IEEE Intelligent Vehicles Symposium

  10. arXiv:2404.13836  [pdf, other]

    stat.ME

    MultiFun-DAG: Multivariate Functional Directed Acyclic Graph

    Authors: Tian Lan, Ziyue Li, Junpeng Lin, Zhishuai Li, Lei Bai, Man Li, Fugee Tsung, Rui Zhao, Chen Zhang

    Abstract: Directed Acyclic Graphical (DAG) models efficiently formulate causal relationships in complex systems. Traditional DAGs assume nodes to be scalar variables, characterizing complex systems under a facile and oversimplified form. This paper considers that nodes can be multivariate functional data and thus proposes a multivariate functional DAG (MultiFun-DAG). It constructs a hidden bilinear multivar…

    Submitted 21 April, 2024; originally announced April 2024.

  11. arXiv:2404.13302  [pdf, other]

    stat.CO stat.ME

    Monte Carlo sampling with integrator snippets

    Authors: Christophe Andrieu, Mauro Camara Escudero, Chang Zhang

    Abstract: Assume interest is in sampling from a probability distribution $μ$ defined on $(\mathsf{Z},\mathscr{Z})$. We develop a framework to construct sampling algorithms taking full advantage of numerical integrators of ODEs, say $ψ\colon\mathsf{Z}\rightarrow\mathsf{Z}$ for one integration step, to explore $μ$ efficiently and robustly. The popular Hybrid/Hamiltonian Monte Carlo (HMC) algorithm [Duane, 198…

    Submitted 20 April, 2024; originally announced April 2024.

    MSC Class: 65C05; 65C35 ACM Class: I.6.8; G.3

  12. arXiv:2404.10942  [pdf, other]

    cs.LG cs.AI cs.CY stat.ME

    What Hides behind Unfairness? Exploring Dynamics Fairness in Reinforcement Learning

    Authors: Zhihong Deng, Jing Jiang, Guodong Long, Chengqi Zhang

    Abstract: In sequential decision-making problems involving sensitive attributes like race and gender, reinforcement learning (RL) agents must carefully consider long-term fairness while maximizing returns. Recent works have proposed many different types of fairness notions, but how unfairness arises in RL problems remains unclear. In this paper, we address this gap in the literature by investigating the sou…

    Submitted 28 April, 2024; v1 submitted 16 April, 2024; originally announced April 2024.

    Comments: 13 pages, 9 figures, accepted by IJCAI 2024

  13. arXiv:2404.08169  [pdf, other]

    stat.ME

    AutoGFI: Streamlined Generalized Fiducial Inference for Modern Inference Problems

    Authors: Wei Du, Jan Hannig, Thomas C. M. Lee, Yi Su, Chunzhe Zhang

    Abstract: The origins of fiducial inference trace back to the 1930s when R. A. Fisher first introduced the concept as a response to what he perceived as a limitation of Bayesian inference - the requirement for a subjective prior distribution on model parameters in cases where no prior information was available. However, Fisher's initial fiducial approach fell out of favor as complications arose, particularl…

    Submitted 11 April, 2024; originally announced April 2024.

  14. arXiv:2404.06969  [pdf, other]

    cs.LG stat.ML

    FiP: a Fixed-Point Approach for Causal Generative Modeling

    Authors: Meyer Scetbon, Joel Jennings, Agrin Hilmkil, Cheng Zhang, Chao Ma

    Abstract: Modeling true world data-generating processes lies at the heart of empirical science. Structural Causal Models (SCMs) and their associated Directed Acyclic Graphs (DAGs) provide an increasingly popular answer to such problems by defining the causal generative process that transforms random noise into observations. However, learning them from observational data poses an ill-posed and NP-hard invers…

    Submitted 14 April, 2024; v1 submitted 10 April, 2024; originally announced April 2024.

  15. arXiv:2404.04403  [pdf, other]

    stat.ME cs.AI

    Low-Rank Robust Subspace Tensor Clustering for Metro Passenger Flow Modeling

    Authors: Jiuyun Hu, Ziyue Li, Chen Zhang, Fugee Tsung, Hao Yan

    Abstract: Tensor clustering has become an important topic, specifically in spatio-temporal modeling, due to its ability to cluster spatial modes (e.g., stations or road segments) and temporal modes (e.g., time of the day or day of the week). Our motivating example is from subway passenger flow modeling, where similarities between stations are commonly found. However, the challenges lie in the innate high-di…

    Submitted 5 April, 2024; originally announced April 2024.

    Comments: Conditionally Accepted in INFORMS Journal of Data Science

  16. arXiv:2404.03329  [pdf]

    cs.LG eess.SP stat.ML

    DeepFunction: Deep Metric Learning-based Imbalanced Classification for Diagnosing Threaded Pipe Connection Defects using Functional Data

    Authors: Yukun Xie, Juan Du, Chen Zhang

    Abstract: In modern manufacturing, most of the product lines are conforming. Few products are nonconforming but with different defect types. The identification of defect types can help further root cause diagnosis of production lines. With the sensing development, signals of process variables can be collected in high resolution, which can be regarded as multichannel functional data. They have abundant infor…

    Submitted 24 April, 2024; v1 submitted 4 April, 2024; originally announced April 2024.

    Comments: Revised version for submission to IISE Transactions

  17. arXiv:2404.00220  [pdf, other]

    stat.ML cs.LG

    Partially-Observable Sequential Change-Point Detection for Autocorrelated Data via Upper Confidence Region

    Authors: Haijie Xu, Xiaochen Xian, Chen Zhang, Kaibo Liu

    Abstract: Sequential change point detection for multivariate autocorrelated data is a very common problem in practice. However, when the sensing resources are limited, only a subset of variables from the multivariate system can be observed at each sensing time point. This raises the problem of partially observable multi-sensor sequential change point detection. For it, we propose a detection scheme called a…

    Submitted 29 March, 2024; originally announced April 2024.

  18. arXiv:2404.00218  [pdf, other]

    stat.ML cs.LG

    Functional-Edged Network Modeling

    Authors: Haijie Xu, Chen Zhang

    Abstract: Contrasts with existing works which all consider nodes as functions and use edges to represent the relationships between different functions. We target at network modeling whose edges are functional data and transform the adjacency matrix into a functional adjacency tensor, introducing an additional dimension dedicated to function representation. Tucker functional decomposition is used for the fun…

    Submitted 29 March, 2024; originally announced April 2024.

  19. arXiv:2403.19851  [pdf, other]

    cs.CL cs.CR cs.LG stat.ML

    Localizing Paragraph Memorization in Language Models

    Authors: Niklas Stoehr, Mitchell Gordon, Chiyuan Zhang, Owen Lewis

    Abstract: Can we localize the weights and mechanisms used by a language model to memorize and recite entire paragraphs of its training data? In this paper, we show that while memorization is spread across multiple layers and model components, gradients of memorized paragraphs have a distinguishable spatial pattern, being larger in lower model layers than gradients of non-memorized examples. Moreover, the me…

    Submitted 28 March, 2024; originally announced March 2024.

  20. arXiv:2402.11156  [pdf, other]

    stat.ML cs.LG

    Efficient Low-Rank Matrix Estimation, Experimental Design, and Arm-Set-Dependent Low-Rank Bandits

    Authors: Kyoungseok Jang, Chicheng Zhang, Kwang-Sung Jun

    Abstract: We study low-rank matrix trace regression and the related problem of low-rank matrix bandits. Assuming access to the distribution of the covariates, we propose a novel low-rank matrix estimation method called LowPopArt and provide its recovery guarantee that depends on a novel quantity denoted by B(Q) that characterizes the hardness of the problem, where Q is the covariance matrix of the measureme…

    Submitted 8 June, 2024; v1 submitted 16 February, 2024; originally announced February 2024.

  21. arXiv:2401.17504  [pdf, other]

    cs.LG stat.ME

    CaMU: Disentangling Causal Effects in Deep Model Unlearning

    Authors: Shaofei Shen, Chenhao Zhang, Alina Bialkowski, Weitong Chen, Miao Xu

    Abstract: Machine unlearning requires removing the information of forgetting data while keeping the necessary information of remaining data. Despite recent advancements in this area, existing methodologies mainly focus on the effect of removing forgetting data without considering the negative impact this can have on the information of the remaining data, resulting in significant performance degradation afte…

    Submitted 30 January, 2024; originally announced January 2024.

    Comments: Full version of the paper accepted for the SDM 24 conference

  22. arXiv:2312.07727  [pdf, other]

    stat.ME math.ST

    Two-sample inference for sparse functional data

    Authors: Chi Zhang, Peijun Sang, Yingli Qin

    Abstract: We propose a novel test procedure for comparing mean functions across two groups within the reproducing kernel Hilbert space (RKHS) framework. Our proposed method is adept at handling sparsely and irregularly sampled functional data when observation times are random for each subject. Conventional approaches, which are built upon functional principal components analysis, usually assume a homogeneou…

    Submitted 29 December, 2023; v1 submitted 12 December, 2023; originally announced December 2023.

  23. arXiv:2311.12293  [pdf]

    stat.ME stat.AP

    Sample size calculation based on the difference in restricted mean time lost for clinical trials with competing risks

    Authors: Xiang Geng, Zhao** Li, Chengfeng Zhang, Yanjie Wang, Haoning Shen, Zhiheng Huang, Yawen Hou, Zheng Chen

    Abstract: Computation of sample size is important when designing clinical trials. The presence of competing risks makes the design of clinical trials with time-to-event endpoints cumbersome. A model based on the subdistribution hazard ratio (SHR) is commonly used for trials under competing risks. However, this approach has some limitations related to model assumptions and clinical interpretation. Considerin…

    Submitted 20 November, 2023; originally announced November 2023.

  24. arXiv:2311.11563  [pdf]

    stat.ME stat.AP

    Time-varying effect in the competing risks based on restricted mean time lost

    Authors: Zhiyin Yu, Zhao** Li, Chengfeng Zhang, Yawen Hou, Derun Zhou, Zheng Chen

    Abstract: Patients with breast cancer tend to die from other diseases, so for studies that focus on breast cancer, a competing risks model is more appropriate. Considering subdistribution hazard ratio, which is used often, limited to model assumptions and clinical interpretation, we aimed to quantify the effects of prognostic factors by an absolute indicator, the difference in restricted mean time lost (RMT…

    Submitted 20 November, 2023; originally announced November 2023.

  25. arXiv:2311.03989  [pdf, other]

    cs.LG cs.AI stat.ME

    Learned Causal Method Prediction

    Authors: Shantanu Gupta, Cheng Zhang, Agrin Hilmkil

    Abstract: For a given causal question, it is important to efficiently decide which causal inference method to use for a given dataset. This is challenging because causal methods typically rely on complex and difficult-to-verify assumptions, and cross-validation is not applicable since ground truth causal quantities are unobserved. In this work, we propose CAusal Method Predictor (CAMP), a framework for pred…

    Submitted 8 November, 2023; v1 submitted 7 November, 2023; originally announced November 2023.

  26. arXiv:2311.01902  [pdf, other]

    cs.LG stat.ME

    High Precision Causal Model Evaluation with Conditional Randomization

    Authors: Chao Ma, Cheng Zhang

    Abstract: The gold standard for causal model evaluation involves comparing model predictions with true effects estimated from randomized controlled trials (RCT). However, RCTs are not always feasible or ethical to perform. In contrast, conditionally randomized experiments based on inverse probability weighting (IPW) offer a more realistic approach but may suffer from high estimation variance. To tackle this…

    Submitted 3 November, 2023; originally announced November 2023.

    Comments: NeurIPS 2023 Camera Ready version

  27. arXiv:2310.20224  [pdf, other]

    stat.ML cs.AI cs.LG stat.AP

    Choose A Table: Tensor Dirichlet Process Multinomial Mixture Model with Graphs for Passenger Trajectory Clustering

    Authors: Ziyue Li, Hao Yan, Chen Zhang, Lijun Sun, Wolfgang Ketter, Fugee Tsung

    Abstract: Passenger clustering based on trajectory records is essential for transportation operators. However, existing methods cannot easily cluster the passengers due to the hierarchical structure of the passenger trip information, including multiple trips within each passenger and multi-dimensional information about each trip. Furthermore, existing approaches rely on an accurate specification of the clus…

    Submitted 31 October, 2023; originally announced October 2023.

    Comments: Accepted in ACM SIGSPATIAL 2023. arXiv admin note: substantial text overlap with arXiv:2306.13794

  28. arXiv:2310.17153  [pdf, other]

    cs.LG stat.ME

    Hierarchical Semi-Implicit Variational Inference with Application to Diffusion Model Acceleration

    Authors: Longlin Yu, Tianyu Xie, Yu Zhu, Tong Yang, Xiangyu Zhang, Cheng Zhang

    Abstract: Semi-implicit variational inference (SIVI) has been introduced to expand the analytical variational families by defining expressive semi-implicit distributions in a hierarchical manner. However, the single-layer architecture commonly used in current SIVI methods can be insufficient when the target posterior has complicated structures. In this paper, we propose hierarchical semi-implicit variationa…

    Submitted 26 October, 2023; originally announced October 2023.

    Comments: 25 pages, 13 figures, NeurIPS 2023

  29. arXiv:2310.16516  [pdf, other]

    stat.ML cs.LG

    Particle-based Variational Inference with Generalized Wasserstein Gradient Flow

    Authors: Ziheng Cheng, Shiyue Zhang, Longlin Yu, Cheng Zhang

    Abstract: Particle-based variational inference methods (ParVIs) such as Stein variational gradient descent (SVGD) update the particles based on the kernelized Wasserstein gradient flow for the Kullback-Leibler (KL) divergence. However, the design of kernels is often non-trivial and can be restrictive for the flexibility of the method. Recent works show that functional gradient flow approximations with quadr…

    Submitted 25 October, 2023; originally announced October 2023.

  30. arXiv:2310.16428  [pdf, ps, other]

    stat.AP

    Similarity-driven and Task-driven Models for Diversity of Opinion in Crowdsourcing Markets

    Authors: Chen Jason Zhang, Yunrui Liu, Pengcheng Zeng, Ting Wu, Lei Chen, Pan Hui, Fei Hao

    Abstract: The recent boom in crowdsourcing has opened up a new avenue for utilizing human intelligence in the realm of data analysis. This innovative approach provides a powerful means for connecting online workers to tasks that cannot effectively be done solely by machines or conducted by professional experts due to cost constraints. Within the field of social science, four elements are required to constru…

    Submitted 28 February, 2024; v1 submitted 25 October, 2023; originally announced October 2023.

    Comments: 37 pages, 11 figures

  31. arXiv:2310.16336  [pdf, other]

    cs.LG stat.ML

    SMURF-THP: Score Matching-based UnceRtainty quantiFication for Transformer Hawkes Process

    Authors: Zichong Li, Yanbo Xu, Simiao Zuo, Haoming Jiang, Chao Zhang, Tuo Zhao, Hongyuan Zha

    Abstract: Transformer Hawkes process models have shown to be successful in modeling event sequence data. However, most of the existing training methods rely on maximizing the likelihood of event sequences, which involves calculating some intractable integral. Moreover, the existing methods fail to provide uncertainty quantification for model predictions, e.g., confidence intervals for the predicted event's…

    Submitted 24 October, 2023; originally announced October 2023.

  32. arXiv:2310.15411  [pdf, ps, other]

    cs.LG stat.ML

    Efficient Active Learning Halfspaces with Tsybakov Noise: A Non-convex Optimization Approach

    Authors: Yinan Li, Chicheng Zhang

    Abstract: We study the problem of computationally and label efficient PAC active learning $d$-dimensional halfspaces with Tsybakov Noise~\citep{tsybakov2004optimal} under structured unlabeled data distributions. Inspired by~\cite{diakonikolas2020learning}, we prove that any approximate first-order stationary point of a smooth nonconvex loss function yields a halfspace with a low excess error guarantee. In l…

    Submitted 23 October, 2023; originally announced October 2023.

    Comments: 29 pages

  33. arXiv:2310.11428  [pdf, other]

    cs.LG math.OC stat.ML

    Butterfly Effects of SGD Noise: Error Amplification in Behavior Cloning and Autoregression

    Authors: Adam Block, Dylan J. Foster, Akshay Krishnamurthy, Max Simchowitz, Cyril Zhang

    Abstract: This work studies training instabilities of behavior cloning with deep neural networks. We observe that minibatch SGD updates to the policy network during training result in sharp oscillations in long-horizon rewards, despite negligibly affecting the behavior cloning loss. We empirically disentangle the statistical and computational causes of these oscillations, and find them to stem from the chao…

    Submitted 17 October, 2023; originally announced October 2023.

  34. arXiv:2310.09553  [pdf, other]

    q-bio.PE cs.LG stat.ML

    ARTree: A Deep Autoregressive Model for Phylogenetic Inference

    Authors: Tianyu Xie, Cheng Zhang

    Abstract: Designing flexible probabilistic models over tree topologies is important for developing efficient phylogenetic inference methods. To do that, previous works often leverage the similarity of tree topologies via hand-engineered heuristic features which would require pre-sampled tree topologies and may suffer from limited approximation capability. In this paper, we propose a deep autoregressive mode…

    Submitted 14 October, 2023; originally announced October 2023.

    Comments: NeurIPS 2023 spotlight

  35. arXiv:2310.08843  [pdf]

    stat.AP

    A Longitudinal Analysis about the Effect of Air Pollution on Astigmatism for Children and Young Adults

    Authors: Lin An, Qiuyue Hu, Jieying Guan, Yingting Zhu, Chenyao Jiang, Xiaoyun Zhong, Shuyue Ma, Dongmei Yu, Canyang Zhang, Yehong Zhuo, Peiwu Qin

    Abstract: Purpose: This study aimed to investigate the correlation between air pollution and astigmatism, considering the detrimental effects of air pollution on respiratory, cardiovascular, and eye health. Methods: A longitudinal study was conducted with 127,709 individuals aged 4-27 years from 9 cities in Guangdong Province, China, spanning from 2019 to 2021. Astigmatism was measured using cylinder values…

    Submitted 13 October, 2023; originally announced October 2023.

  36. arXiv:2310.00809  [pdf, other]

    cs.LG cs.AI stat.ME stat.ML

    Towards Causal Foundation Model: on Duality between Causal Inference and Attention

    Authors: Jiaqi Zhang, Joel Jennings, Agrin Hilmkil, Nick Pawlowski, Cheng Zhang, Chao Ma

    Abstract: Foundation models have brought changes to the landscape of machine learning, demonstrating sparks of human-level intelligence across a diverse array of tasks. However, a gap persists in complex tasks such as causal inference, primarily due to challenges associated with intricate reasoning steps and high numerical precision requirements. In this work, we take a first step towards building causally-…

    Submitted 3 June, 2024; v1 submitted 1 October, 2023; originally announced October 2023.

  37. arXiv:2309.03800  [pdf, other]

    cs.LG cs.AI stat.ML

    Pareto Frontiers in Neural Feature Learning: Data, Compute, Width, and Luck

    Authors: Benjamin L. Edelman, Surbhi Goel, Sham Kakade, Eran Malach, Cyril Zhang

    Abstract: In modern deep learning, algorithmic choices (such as width, depth, and learning rate) are known to modulate nuanced resource tradeoffs. This work investigates how these complexities necessarily arise for feature learning in the presence of computational-statistical gaps. We begin by considering offline sparse parity learning, a supervised classification problem which admits a statistical query lo…

    Submitted 30 October, 2023; v1 submitted 7 September, 2023; originally announced September 2023.

    Comments: v2: NeurIPS 2023 camera-ready updates

  38. arXiv:2308.10014  [pdf, other]

    stat.ML cs.LG stat.ME

    Semi-Implicit Variational Inference via Score Matching

    Authors: Longlin Yu, Cheng Zhang

    Abstract: Semi-implicit variational inference (SIVI) greatly enriches the expressiveness of variational families by considering implicit variational distributions defined in a hierarchical manner. However, due to the intractable densities of variational distributions, current SIVI approaches often use surrogate evidence lower bounds (ELBOs) or employ expensive inner-loop MCMC runs for unbiased ELBOs for tra…

    Submitted 19 August, 2023; originally announced August 2023.

    Comments: 17 pages, 8 figures; ICLR 2023

  39. arXiv:2308.04428  [pdf, other]

    stat.ML cs.LG eess.SY

    Meta-Learning Operators to Optimality from Multi-Task Non-IID Data

    Authors: Thomas T. C. K. Zhang, Leonardo F. Toso, James Anderson, Nikolai Matni

    Abstract: A powerful concept behind much of the recent progress in machine learning is the extraction of common features across data from heterogeneous sources or tasks. Intuitively, using all of one's data to learn a common representation function benefits both computational effort and statistical generalization by leaving a smaller number of parameters to fine-tune on a given task. Toward theoretically gr…

    Submitted 8 August, 2023; originally announced August 2023.

  40. arXiv:2307.13917  [pdf, other]

    cs.LG stat.ME

    BayesDAG: Gradient-Based Posterior Inference for Causal Discovery

    Authors: Yashas Annadani, Nick Pawlowski, Joel Jennings, Stefan Bauer, Cheng Zhang, Wenbo Gong

    Abstract: Bayesian causal discovery aims to infer the posterior distribution over causal models from observed data, quantifying epistemic uncertainty and benefiting downstream tasks. However, computational challenges arise due to joint inference over combinatorial space of Directed Acyclic Graphs (DAGs) and nonlinear functions. Despite recent progress towards efficient posterior inference over DAGs, existin…

    Submitted 8 December, 2023; v1 submitted 25 July, 2023; originally announced July 2023.

    Comments: NeurIPS 2023

  41. arXiv:2307.11685  [pdf, other]

    q-fin.TR cs.LG stat.ML

    Towards Generalizable Reinforcement Learning for Trade Execution

    Authors: Chuheng Zhang, Yitong Duan, Xiaoyu Chen, Jianyu Chen, Jian Li, Li Zhao

    Abstract: Optimized trade execution is to sell (or buy) a given amount of assets in a given time with the lowest possible trading cost. Recently, reinforcement learning (RL) has been applied to optimized trade execution to learn smarter policies from market data. However, we find that many existing RL methods exhibit considerable overfitting which prevents them from real deployment. In this paper, we provid…

    Submitted 11 May, 2023; originally announced July 2023.

    Comments: Accepted by IJCAI-23

  42. arXiv:2307.07320  [pdf, other]

    math.ST cs.LG stat.ML

    Adaptive Linear Estimating Equations

    Authors: Mufang Ying, Koulik Khamaru, Cun-Hui Zhang

    Abstract: Sequential data collection has emerged as a widely adopted technique for enhancing the efficiency of data gathering processes. Despite its advantages, such data collection mechanism often introduces complexities to the statistical inference procedure. For instance, the ordinary least squares (OLS) estimator in an adaptive linear regression model can exhibit non-normal asymptotic behavior, posing c…

    Submitted 7 November, 2023; v1 submitted 14 July, 2023; originally announced July 2023.

    Comments: Paper is accepted at NeurIPS 2023

  43. arXiv:2307.07264  [pdf, ps, other]

    cs.LG cs.DS stat.ML

    On Interpolating Experts and Multi-Armed Bandits

    Authors: Houshuang Chen, Yuchen He, Chihao Zhang

    Abstract: Learning with expert advice and multi-armed bandit are two classic online decision problems which differ on how the information is observed in each round of the game. We study a family of problems interpolating the two. For a vector $\mathbf{m}=(m_1,\dots,m_K)\in \mathbb{N}^K$, an instance of $\mathbf{m}$-MAB indicates that the arms are partitioned into $K$ groups and the $i$-th group contains…

    Submitted 4 August, 2023; v1 submitted 14 July, 2023; originally announced July 2023.

  44. arXiv:2307.07191  [pdf, other]

    cs.LG stat.ML

    Benchmarks and Custom Package for Electrical Load Forecasting

    Authors: Zhixian Wang, Qingsong Wen, Chaoli Zhang, Liang Sun, Leandro Von Krannichfeldt, Yi Wang

    Abstract: Load forecasting is of great significance in the power industry as it can provide a reference for subsequent tasks such as power grid dispatch, thus bringing huge economic benefits. However, there are many differences between load forecasting and traditional time series forecasting. On the one hand, load forecasting aims to minimize the cost of subsequent tasks such as power grid dispatch, rather…

    Submitted 14 July, 2023; originally announced July 2023.

  45. arXiv:2307.04226  [pdf, other]

    physics.geo-ph stat.ML

    Seismic Data Interpolation based on Denoising Diffusion Implicit Models with Resampling

    Authors: Xiaoli Wei, Chunxia Zhang, Hongtao Wang, Chengli Tan, Deng Xiong, Baisong Jiang, Jiangshe Zhang, Sang-Woon Kim

    Abstract: The incompleteness of the seismic data caused by missing traces along the spatial extension is a common issue in seismic acquisition due to the existence of obstacles and economic constraints, which severely impairs the imaging quality of subsurface geological structures. Recently, deep learning-based seismic interpolation methods have attained promising progress, while achieving stable training of…

    Submitted 13 July, 2023; v1 submitted 9 July, 2023; originally announced July 2023.

    Comments: 14 pages, 13 figures

  46. arXiv:2307.03340  [pdf, other]

    stat.AP

    Calibrating Car-Following Models via Bayesian Dynamic Regression

    Authors: Chengyuan Zhang, Wenshuo Wang, Lijun Sun

    Abstract: Car-following behavior modeling is critical for understanding traffic flow dynamics and developing high-fidelity microscopic simulation models. Most existing impulse-response car-following models prioritize computational efficiency and interpretability by using a parsimonious nonlinear function based on immediate preceding state observations. However, this approach disregards historical informatio…

    Submitted 11 June, 2024; v1 submitted 6 July, 2023; originally announced July 2023.

  47. arXiv:2307.03034  [pdf, ps, other]

    stat.ML cs.LG

    PCL-Indexability and Whittle Index for Restless Bandits with General Observation Models

    Authors: Keqin Liu, Chengzhong Zhang

    Abstract: In this paper, we consider a general observation model for restless multi-armed bandit problems. The operation of the player needs to be based on certain feedback mechanism that is error-prone due to resource constraints or environmental or intrinsic noises. By establishing a general probabilistic model for dynamics of feedback/observation, we formulate the problem as a restless bandit with a coun…

    Submitted 3 July, 2024; v1 submitted 6 July, 2023; originally announced July 2023.

  48. arXiv:2306.15744  [pdf, ps, other]

    cs.LG cs.DS stat.ML

    Ticketed Learning-Unlearning Schemes

    Authors: Badih Ghazi, Pritish Kamath, Ravi Kumar, Pasin Manurangsi, Ayush Sekhari, Chiyuan Zhang

    Abstract: We consider the learning--unlearning paradigm defined as follows. First given a dataset, the goal is to learn a good predictor, such as one minimizing a certain loss. Subsequently, given any subset of examples that wish to be unlearnt, the goal is to learn, without the knowledge of the original training dataset, a good predictor that is identical to the predictor that would have been produced when…

    Submitted 27 June, 2023; originally announced June 2023.

    Comments: Conference on Learning Theory (COLT) 2023

  49. arXiv:2306.14388  [pdf, other]

    stat.ME

    Nonlinear Functional Principal Component Analysis Using Neural Networks

    Authors: Rou Zhong, Chunming Zhang, Jingxiao Zhang

    Abstract: Functional principal component analysis (FPCA) is an important technique for dimension reduction in functional data analysis (FDA). Classical FPCA method is based on the Karhunen-Loève expansion, which assumes a linear structure of the observed functional data. However, the assumption may not always be satisfied, and the FPCA method can become inefficient when the data deviates from the linear ass…

    Submitted 25 June, 2023; originally announced June 2023.

  50. arXiv:2306.13949  [pdf]

    stat.ME stat.AP

    Analysis of dynamic restricted mean survival time based on pseudo-observations

    Authors: Zijing Yang, Chengfeng Zhang, Yawen Hou, Zheng Chen

    Abstract: In clinical follow-up studies with a time-to-event end point, the difference in the restricted mean survival time (RMST) is a suitable substitute for the hazard ratio (HR). However, the RMST only measures the survival of patients over a period of time from the baseline and cannot reflect changes in life expectancy over time. Based on the RMST, we study the conditional restricted mean survival time…

    Submitted 24 June, 2023; originally announced June 2023.

    Comments: Biometrics. 2023

    Report number: 13891