
Showing 1–50 of 318 results for author: Chen, Z

Searching in archive stat.
  1. arXiv:2407.01111  [pdf, other]

    cs.LG cs.AI stat.ML

    Proximity Matters: Local Proximity Preserved Balancing for Treatment Effect Estimation

    Authors: Hao Wang, Zhichao Chen, Yuan Shen, Jiajun Fan, Zhaoran Liu, Degui Yang, Xinggao Liu, Haoxuan Li

    Abstract: Heterogeneous treatment effect (HTE) estimation from observational data poses significant challenges due to treatment selection bias. Existing methods address this bias by minimizing distribution discrepancies between treatment groups in latent space, focusing on global alignment. However, the fruitful aspect of local proximity, where similar units exhibit similar outcomes, is often overlooked. In…

    Submitted 1 July, 2024; originally announced July 2024.

    Comments: Code is available at https://anonymous.4open.science/status/ncr-B697

  2. arXiv:2407.00797  [pdf, other]

    stat.ME

    A placement-value based approach to concave ROC analysis

    Authors: Soutik Ghosal, Zhen Chen

    Abstract: The receiver operating characteristic (ROC) curve is an important graphic tool for evaluating a test in a wide range of disciplines. While useful, an ROC curve can cross the chance line, either by having an S-shape or a hook at the extreme specificity. These non-concave ROC curves are sub-optimal according to decision theory, as there are points that are superior than those corresponding to the po…

    Submitted 30 June, 2024; originally announced July 2024.

    Comments: 18 pages, 6 figures, 2 tables

  3. arXiv:2406.19021  [pdf, other]

    stat.ME

    Nonlinear Multivariate Function-on-function Regression with Variable Selection

    Authors: Xu Haijie, Zhang Chen

    Abstract: This paper proposes a multivariate nonlinear function-on-function regression model, which allows both the response and the covariates can be multi-dimensional functions. The model is built upon the multivariate functional reproducing kernel Hilbert space (RKHS) theory. It predicts the response function by linearly combining each covariate function in their respective functional RKHS, and extends t…

    Submitted 27 June, 2024; originally announced June 2024.

  4. arXiv:2406.18603  [pdf, other]

    stat.AP cs.LG

    Confidence interval estimation of mixed oil length with conditional diffusion model

    Authors: Yanfeng Yang, Lihong Zhang, Ziqi Chen, Miaomiao Yu, Lei Chen

    Abstract: Accurately estimating the mixed oil length plays a big role in the economic benefit for oil pipeline network. While various proposed methods have tried to predict the mixed oil length, they often exhibit an extremely high probability (around 50\%) of underestimating it. This is attributed to their failure to consider the statistical variability inherent in the estimated length of mixed oil. To add…

    Submitted 19 June, 2024; originally announced June 2024.

  5. arXiv:2406.16530  [pdf, other]

    stat.ML cs.LG stat.CO

    Conditional Bayesian Quadrature

    Authors: Zonghao Chen, Masha Naslidnyk, Arthur Gretton, François-Xavier Briol

    Abstract: We propose a novel approach for estimating conditional or parametric expectations in the setting where obtaining samples or evaluating integrands is costly. Through the framework of probabilistic numerical methods (such as Bayesian quadrature), our novel approach allows to incorporates prior information about the integrands especially the prior smoothness knowledge about the integrands and the con…

    Submitted 24 June, 2024; originally announced June 2024.

    Journal ref: Conference on Uncertainty in Artificial Intelligence (UAI) 2024

  6. arXiv:2406.15762  [pdf, other]

    cs.LG stat.ML

    Rethinking the Diffusion Models for Numerical Tabular Data Imputation from the Perspective of Wasserstein Gradient Flow

    Authors: Zhichao Chen, Haoxuan Li, Fangyikang Wang, Odin Zhang, Hu Xu, Xiaoyu Jiang, Zhihuan Song, Eric H. Wang

    Abstract: Diffusion models (DMs) have gained attention in Missing Data Imputation (MDI), but there remain two long-neglected issues to be addressed: (1). Inaccurate Imputation, which arises from inherently sample-diversification-pursuing generative process of DMs. (2). Difficult Training, which stems from intricate design required for the mask matrix in model training stage. To address these concerns within…

    Submitted 22 June, 2024; originally announced June 2024.

  7. arXiv:2406.14399  [pdf, other]

    cs.LG cs.CV physics.ao-ph stat.ML

    WEATHER-5K: A Large-scale Global Station Weather Dataset Towards Comprehensive Time-series Forecasting Benchmark

    Authors: Tao Han, Song Guo, Zhenghao Chen, Wanghan Xu, Lei Bai

    Abstract: Global Station Weather Forecasting (GSWF) is crucial for various sectors, including aviation, agriculture, energy, and disaster preparedness. Recent advancements in deep learning have significantly improved the accuracy of weather predictions by optimizing models based on public meteorological data. However, existing public datasets for GSWF optimization and benchmarking still suffer from signific…

    Submitted 20 June, 2024; originally announced June 2024.

    Comments: 26 pages,13 figures

  8. arXiv:2406.12205  [pdf, other]

    cs.LG cs.AI cs.IT math.ST stat.ML

    Order-Optimal Instance-Dependent Bounds for Offline Reinforcement Learning with Preference Feedback

    Authors: Zhirui Chen, Vincent Y. F. Tan

    Abstract: We consider offline reinforcement learning (RL) with preference feedback in which the implicit reward is a linear function of an unknown parameter. Given an offline dataset, our objective consists in ascertaining the optimal action for each state, with the ultimate goal of minimizing the {\em simple regret}. We propose an algorithm, \underline{RL} with \underline{L}ocally \underline{O}ptimal \unde…

    Submitted 17 June, 2024; originally announced June 2024.

    Comments: Accepted to Models of Human Feedback for AI Alignment Workshop, ICML 2024

  9. arXiv:2406.04690  [pdf, other]

    cs.LG stat.ML

    Higher-order Structure Based Anomaly Detection on Attributed Networks

    Authors: Xu Yuan, Na Zhou, Shuo Yu, Huafei Huang, Zhikui Chen, Feng Xia

    Abstract: Anomaly detection (such as telecom fraud detection and medical image detection) has attracted the increasing attention of people. The complex interaction between multiple entities widely exists in the network, which can reflect specific human behavior patterns. Such patterns can be modeled by higher-order network structures, thus benefiting anomaly detection on attributed networks. However, due to…

    Submitted 7 June, 2024; originally announced June 2024.

  10. arXiv:2406.01335  [pdf, other]

    quant-ph q-fin.ST stat.ML

    Statistics-Informed Parameterized Quantum Circuit via Maximum Entropy Principle for Data Science and Finance

    Authors: Xi-Ning Zhuang, Zhao-Yun Chen, Cheng Xue, Xiao-Fan Xu, Chao Wang, Huan-Yu Liu, Tai-Ping Sun, Yun-Jie Wang, Yu-Chun Wu, Guo-Ping Guo

    Abstract: Quantum machine learning has demonstrated significant potential in solving practical problems, particularly in statistics-focused areas such as data science and finance. However, challenges remain in preparing and learning statistical models on a quantum processor due to issues with trainability and interpretability. In this letter, we utilize the maximum entropy principle to design a statistics-i…

    Submitted 18 June, 2024; v1 submitted 3 June, 2024; originally announced June 2024.

    Comments: 19 pages, 5 figures

  11. arXiv:2406.00630  [pdf, other]

    stat.ML cs.LG

    On Non-asymptotic Theory of Recurrent Neural Networks in Temporal Point Processes

    Authors: Zhiheng Chen, Guanhua Fang, Wen Yu

    Abstract: Temporal point process (TPP) is an important tool for modeling and predicting irregularly timed events across various domains. Recently, the recurrent neural network (RNN)-based TPPs have shown practical advantages over traditional parametric TPP models. However, in the current literature, it remains nascent in understanding neural TPPs from theoretical viewpoints. In this paper, we establish the…

    Submitted 2 June, 2024; originally announced June 2024.

  12. arXiv:2405.16130  [pdf, ps, other]

    cs.LG stat.ME

    Automating the Selection of Proxy Variables of Unmeasured Confounders

    Authors: Feng Xie, Zhengming Chen, Shanshan Luo, Wang Miao, Ruichu Cai, Zhi Geng

    Abstract: Recently, interest has grown in the use of proxy variables of unobserved confounding for inferring the causal effect in the presence of unmeasured confounders from observational data. One difficulty inhibiting the practical use is finding valid proxy variables of unobserved confounding to a target causal effect of interest. These proxy variables are typically justified by background knowledge. In…

    Submitted 25 May, 2024; originally announced May 2024.

  13. arXiv:2405.13962  [pdf, other]

    stat.ML cs.LG

    Learning heavy-tailed distributions with Wasserstein-proximal-regularized $α$-divergences

    Authors: Ziyu Chen, Hyemin Gu, Markos A. Katsoulakis, Luc Rey-Bellet, Wei Zhu

    Abstract: In this paper, we propose Wasserstein proximals of $α$-divergences as suitable objective functionals for learning heavy-tailed distributions in a stable manner. First, we provide sufficient, and in some cases necessary, relations among data dimension, $α$, and the decay rate of data distributions for the Wasserstein-proximal-regularized divergence to be finite. Finite-sample convergence rates for…

    Submitted 22 May, 2024; originally announced May 2024.

    Comments: 23 pages, 7 figures

  14. arXiv:2404.17734  [pdf, other]

    stat.ME stat.AP

    Manipulating a Continuous Instrumental Variable in an Observational Study of Premature Babies: Algorithm, Partial Identification Bounds, and Inference under Randomization and Biased Randomization Assumptions

    Authors: Zhe Chen, Min Haeng Cho, Bo Zhang

    Abstract: Regionalization of intensive care for premature babies refers to a triage system of mothers with high-risk pregnancies to hospitals of varied capabilities based on risks faced by infants. Due to the limited capacity of high-level hospitals, which are equipped with advanced expertise to provide critical care, understanding the effect of delivering premature babies at such hospitals on infant mortal…

    Submitted 26 April, 2024; originally announced April 2024.

  15. arXiv:2404.15760  [pdf, other]

    cs.LG cs.AI stat.ML

    Debiasing Machine Unlearning with Counterfactual Examples

    Authors: Ziheng Chen, Jia Wang, Jun Zhuang, Abbavaram Gowtham Reddy, Fabrizio Silvestri, ** Huang, Kaushiki Nag, Kun Kuang, Xin Ning, Gabriele Tolomei

    Abstract: The right to be forgotten (RTBF) seeks to safeguard individuals from the enduring effects of their historical actions by implementing machine-learning techniques. These techniques facilitate the deletion of previously acquired knowledge without requiring extensive model retraining. However, they often overlook a critical issue: unlearning processes bias. This bias emerges from two main sources: (1…

    Submitted 24 April, 2024; originally announced April 2024.

  16. arXiv:2404.12376  [pdf, other]

    cs.LG math.OC stat.ML

    Matching the Statistical Query Lower Bound for k-sparse Parity Problems with Stochastic Gradient Descent

    Authors: Yiwen Kou, Zixiang Chen, Quanquan Gu, Sham M. Kakade

    Abstract: The $k$-parity problem is a classical problem in computational complexity and algorithmic theory, serving as a key benchmark for understanding computational classes. In this paper, we solve the $k$-parity problem with stochastic gradient descent (SGD) on two-layer fully-connected neural networks. We demonstrate that SGD can efficiently solve the $k$-sparse parity problem on a $d$-dimensional hyper…

    Submitted 18 April, 2024; originally announced April 2024.

    Comments: 36 pages, 7 figures, 3 tables

  17. arXiv:2404.10004  [pdf]

    cs.LG physics.soc-ph stat.AP

    A Strategy Transfer and Decision Support Approach for Epidemic Control in Experience Shortage Scenarios

    Authors: X. Xiao, P. Chen, X. Cao, K. Liu, L. Deng, D. Zhao, Z. Chen, Q. Deng, F. Yu, H. Zhang

    Abstract: Epidemic outbreaks can cause critical health concerns and severe global economic crises. For countries or regions with new infectious disease outbreaks, it is essential to generate preventive strategies by learning lessons from others with similar risk profiles. A Strategy Transfer and Decision Support Approach (STDSA) is proposed based on the profile similarity evaluation. There are four steps in…

    Submitted 9 April, 2024; originally announced April 2024.

    Comments: 20 pages, 9 figures

  18. arXiv:2404.08472  [pdf, other]

    cs.LG stat.ML

    TSLANet: Rethinking Transformers for Time Series Representation Learning

    Authors: Emadeldeen Eldele, Mohamed Ragab, Zhenghua Chen, Min Wu, Xiaoli Li

    Abstract: Time series data, characterized by its intrinsic long and short-range dependencies, poses a unique challenge across analytical applications. While Transformer-based models excel at capturing long-range dependencies, they face limitations in noise sensitivity, computational efficiency, and overfitting with smaller datasets. In response, we introduce a novel Time Series Lightweight Adaptive Network…

    Submitted 6 May, 2024; v1 submitted 12 April, 2024; originally announced April 2024.

    Comments: Accepted in ICML 2024

  19. arXiv:2403.17481  [pdf, ps, other]

    stat.ME

    A Type of Nonlinear Fréchet Regressions

    Authors: Lu Lin, Ze Chen

    Abstract: The existing Fréchet regression is actually defined within a linear framework, since the weight function in the Fréchet objective function is linearly defined, and the resulting Fréchet regression function is identified to be a linear model when the random object belongs to a Hilbert space. Even for nonparametric and semiparametric Fréchet regressions, which are usually nonlinear, the existing met…

    Submitted 26 March, 2024; v1 submitted 26 March, 2024; originally announced March 2024.

  20. arXiv:2403.16523  [pdf, other]

    stat.ML cs.AI cs.LG

    Causal Discovery from Poisson Branching Structural Causal Model Using High-Order Cumulant with Path Analysis

    Authors: Jie Qiao, Yu Xiang, Zhengming Chen, Ruichu Cai, Zhifeng Hao

    Abstract: Count data naturally arise in many fields, such as finance, neuroscience, and epidemiology, and discovering causal structure among count data is a crucial task in various scientific and industrial scenarios. One of the most common characteristics of count data is the inherent branching structure described by a binomial thinning operator and an independent Poisson distribution that captures both br…

    Submitted 25 March, 2024; originally announced March 2024.

    Comments: Accepted by AAAI-2024

  21. arXiv:2402.12724  [pdf, other]

    stat.ME q-bio.GN stat.AP

    Controlled Variable Selection from Summary Statistics Only? A Solution via GhostKnockoffs and Penalized Regression

    Authors: Zhaomeng Chen, Zihuai He, Benjamin B. Chu, Jiaqi Gu, Tim Morrison, Chiara Sabatti, Emmanuel Candès

    Abstract: Identifying which variables do influence a response while controlling false positives pervades statistics and data science. In this paper, we consider a scenario in which we only have access to summary statistics, such as the values of marginal empirical correlations between each dependent variable of potential interest and the response. This situation may arise due to privacy concerns, e.g., to a…

    Submitted 20 February, 2024; originally announced February 2024.

  22. arXiv:2402.10834  [pdf, other]

    stat.AP cs.CY

    Agent-based Simulation Evaluation of CBD Tolling: A Case Study from New York City

    Authors: Qingnan Liang, Ruili Yao, Ruixuan Zhang, Zhibin Chen, Guoyuan Wu

    Abstract: Congestion tollings have been widely developed and adopted as an effective tool to mitigate urban traffic congestion and enhance transportation system sustainability. Nevertheless, these tolling schemes are often tailored on a city-by-city or even area-by-area basis, and the cost of conducting field experiments often makes the design and evaluation process challenging. In this work, we leverage MA…

    Submitted 16 February, 2024; originally announced February 2024.

    Comments: Accepted by 2024 IEEE Forum on Integrated and Sustainable Transportation Systems

  23. arXiv:2402.10210  [pdf, other]

    cs.LG cs.AI cs.CL cs.CV stat.ML

    Self-Play Fine-Tuning of Diffusion Models for Text-to-Image Generation

    Authors: Huizhuo Yuan, Zixiang Chen, Kaixuan Ji, Quanquan Gu

    Abstract: Fine-tuning Diffusion Models remains an underexplored frontier in generative artificial intelligence (GenAI), especially when compared with the remarkable progress made in fine-tuning Large Language Models (LLMs). While cutting-edge diffusion models such as Stable Diffusion (SD) and SDXL rely on supervised fine-tuning, their performance inevitably plateaus after seeing a certain volume of data. Re…

    Submitted 15 February, 2024; originally announced February 2024.

    Comments: 28 pages, 8 figures, 10 tables

  24. arXiv:2402.10062  [pdf, other]

    cs.LG stat.ML

    Optimal Parameter and Neuron Pruning for Out-of-Distribution Detection

    Authors: Chao Chen, Zhihang Fu, Kai Liu, Ze Chen, Mingyuan Tao, Jieping Ye

    Abstract: For a machine learning model deployed in real world scenarios, the ability of detecting out-of-distribution (OOD) samples is indispensable and challenging. Most existing OOD detection methods focused on exploring advanced training skills or training-free tricks to prevent the model from yielding overconfident confidence score for unknown samples. The training-based methods require expensive traini…

    Submitted 4 February, 2024; originally announced February 2024.

    Comments: Accepted by NeurIPS 2023. 19 pages

    Journal ref: NeurIPS 2023

  25. arXiv:2402.09723  [pdf, other]

    stat.ML cs.AI cs.CL cs.LG

    Efficient Prompt Optimization Through the Lens of Best Arm Identification

    Authors: Chengshuai Shi, Kun Yang, Zihan Chen, Jundong Li, Jing Yang, Cong Shen

    Abstract: The remarkable instruction-following capability of large language models (LLMs) has sparked a growing interest in automatically finding good prompts, i.e., prompt optimization. Most existing works follow the scheme of selecting from a pre-generated pool of candidate prompts. However, these designs mainly focus on the generation strategy, while limited attention has been paid to the selection metho…

    Submitted 30 May, 2024; v1 submitted 15 February, 2024; originally announced February 2024.

  26. arXiv:2402.09702  [pdf, other]

    cs.LG stat.ML

    Sparse and Faithful Explanations Without Sparse Models

    Authors: Yiyang Sun, Zhi Chen, Vittorio Orlandi, Tong Wang, Cynthia Rudin

    Abstract: Even if a model is not globally sparse, it is possible for decisions made from that model to be accurately and faithfully described by a small number of features. For instance, an application for a large loan might be denied to someone because they have no credit history, which overwhelms any evidence towards their creditworthiness. In this work, we introduce the Sparse Explanation Value (SEV), a…

    Submitted 8 March, 2024; v1 submitted 14 February, 2024; originally announced February 2024.

    Comments: Accepted in AISTATS 2024

  27. arXiv:2402.02399  [pdf, other]

    cs.LG cs.AI stat.AP stat.ML

    FreDF: Learning to Forecast in Frequency Domain

    Authors: Hao Wang, Licheng Pan, Zhichao Chen, Degui Yang, Sen Zhang, Yifei Yang, Xinggao Liu, Haoxuan Li, Dacheng Tao

    Abstract: Time series modeling is uniquely challenged by the presence of autocorrelation in both historical and label sequences. Current research predominantly focuses on handling autocorrelation within the historical sequence but often neglects its presence in the label sequence. Specifically, emerging forecast models mainly conform to the direct forecast (DF) paradigm, generating multi-step forecasts unde…

    Submitted 4 February, 2024; originally announced February 2024.

  28. arXiv:2402.02357  [pdf, other]

    cs.LG stat.ME

    Multi-modal Causal Structure Learning and Root Cause Analysis

    Authors: Lecheng Zheng, Zhengzhang Chen, Jingrui He, Haifeng Chen

    Abstract: Effective root cause analysis (RCA) is vital for swiftly restoring services, minimizing losses, and ensuring the smooth operation and management of complex systems. Previous data-driven RCA methods, particularly those employing causal discovery techniques, have primarily focused on constructing dependency or causal graphs for backtracking the root causes. However, these methods often fall short as…

    Submitted 4 February, 2024; originally announced February 2024.

    Comments: Accepted by the Web Conference 2024

  29. arXiv:2401.18017  [pdf, ps, other]

    stat.ML cs.LG

    Causal Discovery by Kernel Deviance Measures with Heterogeneous Transforms

    Authors: Tim Tse, Zhitang Chen, Shengyu Zhu, Yue Liu

    Abstract: The discovery of causal relationships in a set of random variables is a fundamental objective of science and has also recently been argued as being an essential component towards real machine intelligence. One class of causal discovery techniques are founded based on the argument that there are inherent structural asymmetries between the causal and anti-causal direction which could be leveraged in…

    Submitted 31 January, 2024; originally announced January 2024.

  30. arXiv:2401.18012  [pdf, other]

    stat.ML cs.LG

    Causal Coordinated Concurrent Reinforcement Learning

    Authors: Tim Tse, Isaac Chan, Zhitang Chen

    Abstract: In this work, we propose a novel algorithmic framework for data sharing and coordinated exploration for the purpose of learning more data-efficient and better performing policies under a concurrent reinforcement learning (CRL) setting. In contrast to other work which make the assumption that all agents act under identical environments, we relax this restriction and instead consider the formulation…

    Submitted 31 January, 2024; originally announced January 2024.

  31. arXiv:2401.14535  [pdf, other]

    cs.LG cs.CV stat.ME

    CaRiNG: Learning Temporal Causal Representation under Non-Invertible Generation Process

    Authors: Guangyi Chen, Yifan Shen, Zhenhao Chen, Xiangchen Song, Yuewen Sun, Weiran Yao, Xiao Liu, Kun Zhang

    Abstract: Identifying the underlying time-delayed latent causal processes in sequential data is vital for grasping temporal dynamics and making downstream reasoning. While some recent methods can robustly identify these latent causal variables, they rely on strict assumptions about the invertible generation process from latent variables to observed data. However, these assumptions are often hard to satisfy…

    Submitted 30 May, 2024; v1 submitted 25 January, 2024; originally announced January 2024.

    Comments: To appear at ICML 2024, 24 pages

  32. arXiv:2401.09073  [pdf, other]

    cs.LG cs.AI cs.IT math.ST stat.ML

    Fixed-Budget Differentially Private Best Arm Identification

    Authors: Zhirui Chen, P. N. Karthik, Yeow Meng Chee, Vincent Y. F. Tan

    Abstract: We study best arm identification (BAI) in linear bandits in the fixed-budget regime under differential privacy constraints, when the arm rewards are supported on the unit interval. Given a finite budget $T$ and a privacy parameter $\varepsilon>0$, the goal is to minimise the error probability in finding the arm with the largest mean after $T$ sampling rounds, subject to the constraint that the pol…

    Submitted 17 January, 2024; originally announced January 2024.

    Comments: Accepted to ICLR 2024

  33. arXiv:2401.01335  [pdf, other]

    cs.LG cs.AI cs.CL stat.ML

    Self-Play Fine-Tuning Converts Weak Language Models to Strong Language Models

    Authors: Zixiang Chen, Yihe Deng, Huizhuo Yuan, Kaixuan Ji, Quanquan Gu

    Abstract: Harnessing the power of human-annotated data through Supervised Fine-Tuning (SFT) is pivotal for advancing Large Language Models (LLMs). In this paper, we delve into the prospect of growing a strong LLM out of a weak one without the need for acquiring additional human-annotated data. We propose a new fine-tuning method called Self-Play fIne-tuNing (SPIN), which starts from a supervised fine-tuned…

    Submitted 14 June, 2024; v1 submitted 2 January, 2024; originally announced January 2024.

    Comments: 22 pages, 6 figures, 7 tables. In ICML 2024

  34. arXiv:2312.17199  [pdf, other]

    stat.ML cs.AI cs.LG

    Tractable Function-Space Variational Inference in Bayesian Neural Networks

    Authors: Tim G. J. Rudner, Zonghao Chen, Yee Whye Teh, Yarin Gal

    Abstract: Reliable predictive uncertainty estimation plays an important role in enabling the deployment of neural networks to safety-critical settings. A popular approach for estimating the predictive uncertainty of neural networks is to define a prior distribution over the network parameters, infer an approximate posterior distribution, and use it to make stochastic predictions. However, explicit inference…

    Submitted 28 December, 2023; originally announced December 2023.

    Comments: Published in Advances in Neural Information Processing Systems 35 (NeurIPS 2022)

  35. arXiv:2312.15148  [pdf, other]

    cs.LG cs.IT eess.SP stat.ML

    Personalized Federated Learning with Attention-based Client Selection

    Authors: Zihan Chen, Jundong Li, Cong Shen

    Abstract: Personalized Federated Learning (PFL) relies on collective data knowledge to build customized models. However, non-IID data between clients poses significant challenges, as collaborating with clients who have diverse data distributions can harm local model performance, especially with limited training data. To address this issue, we propose FedACS, a new PFL algorithm with an Attention-based Clien…

    Submitted 22 December, 2023; originally announced December 2023.

  36. arXiv:2312.12823  [pdf]

    stat.ME stat.AP

    Detecting Multiple Change Points in Distributional Sequences Derived from Structural Health Monitoring Data: An Application to Bridge Damage Detection

    Authors: Xinyi Lei, Zhicheng Chen

    Abstract: Detecting damage in critical structures using monitored data is a fundamental task of structural health monitoring, which is extremely important for maintaining structures' safety and life-cycle management. Based on statistical pattern recognition paradigm, damage detection can be conducted by assessing changes in the distribution of properly extracted damage-sensitive features (DSFs). This can be…

    Submitted 20 March, 2024; v1 submitted 20 December, 2023; originally announced December 2023.

  37. arXiv:2312.12206  [pdf, other]

    cs.LG cs.AI stat.ME

    Identification of Causal Structure in the Presence of Missing Data with Additive Noise Model

    Authors: Jie Qiao, Zhengming Chen, Jianhua Yu, Ruichu Cai, Zhifeng Hao

    Abstract: Missing data are an unavoidable complication frequently encountered in many causal discovery tasks. When a missing process depends on the missing values themselves (known as self-masking missingness), the recovery of the joint distribution becomes unattainable, and detecting the presence of such self-masking missingness remains a perplexing challenge. Consequently, due to the inability to reconstr…

    Submitted 19 December, 2023; originally announced December 2023.

    Comments: Accepted by AAAI-2024

  38. arXiv:2312.09758  [pdf, other]

    cs.LG cs.AI stat.ME

    Diagnosing and Rectifying Fake OOD Invariance: A Restructured Causal Approach

    Authors: Ziliang Chen, Yongsen Zheng, Zhao-Rong Lai, Quanlong Guan, Liang Lin

    Abstract: Invariant representation learning (IRL) encourages the prediction from invariant causal features to labels de-confounded from the environments, advancing the technical roadmap of out-of-distribution (OOD) generalization. Despite spotlights around, recent theoretical results verified that some causal features recovered by IRLs merely pretend domain-invariantly in the training environments but fail…

    Submitted 15 December, 2023; originally announced December 2023.

    Comments: AAAI-2024

  39. arXiv:2312.09393  [pdf]

    stat.AP

    Bi-scale Car-following Model Calibration for Corridor Based on Trajectory

    Authors: Keke Long, Haotian Shi, Zhiwei Chen, Zhaohui Liang, Xiaopeng Li, Felipe de Souza

    Abstract: The precise estimation of macroscopic traffic parameters, such as travel time and fuel consumption, is essential for the optimization of traffic management systems. Despite its importance, the comprehensive acquisition of vehicle trajectory data for the calculation of these macroscopic measures presents a challenge. To bridge this gap, this study aims to calibrate car-following models capable of p…

    Submitted 14 December, 2023; originally announced December 2023.

  40. arXiv:2312.09193  [pdf, other]

    cs.LG cs.AI stat.ML

    Fast Sampling via Discrete Non-Markov Diffusion Models

    Authors: Zixiang Chen, Huizhuo Yuan, Yongqian Li, Yiwen Kou, Junkai Zhang, Quanquan Gu

    Abstract: Discrete diffusion models have emerged as powerful tools for high-quality data generation. Despite their success in discrete spaces, such as text generation tasks, the acceleration of discrete diffusion models remains under explored. In this paper, we propose a discrete non-Markov diffusion model, which admits an accelerated reverse sampling for discrete data generation. Our method significantly r…

    Submitted 27 June, 2024; v1 submitted 14 December, 2023; originally announced December 2023.

    Comments: 33 pages, 5 figures, 12 tables

  41. arXiv:2311.12293  [pdf]

    stat.ME stat.AP

    Sample size calculation based on the difference in restricted mean time lost for clinical trials with competing risks

    Authors: Xiang Geng, Zhao** Li, Chengfeng Zhang, Yanjie Wang, Haoning Shen, Zhiheng Huang, Yawen Hou, Zheng Chen

    Abstract: Computation of sample size is important when designing clinical trials. The presence of competing risks makes the design of clinical trials with time-to-event endpoints cumbersome. A model based on the subdistribution hazard ratio (SHR) is commonly used for trials under competing risks. However, this approach has some limitations related to model assumptions and clinical interpretation. Considerin…

    Submitted 20 November, 2023; originally announced November 2023.

  42. arXiv:2311.11563  [pdf]

    stat.ME stat.AP

    Time-varying effect in the competing risks based on restricted mean time lost

    Authors: Zhiyin Yu, Zhao** Li, Chengfeng Zhang, Yawen Hou, Derun Zhou, Zheng Chen

    Abstract: Patients with breast cancer tend to die from other diseases, so for studies that focus on breast cancer, a competing risks model is more appropriate. Considering subdistribution hazard ratio, which is used often, limited to model assumptions and clinical interpretation, we aimed to quantify the effects of prognostic factors by an absolute indicator, the difference in restricted mean time lost (RMT…

    Submitted 20 November, 2023; originally announced November 2023.

  43. arXiv:2310.20145  [pdf, other]

    cs.LG stat.ML

    Efficient Robust Bayesian Optimization for Arbitrary Uncertain Inputs

    Authors: Lin Yang, Junlong Lyu, Wenlong Lyu, Zhitang Chen

    Abstract: Bayesian Optimization (BO) is a sample-efficient optimization algorithm widely employed across various applications. In some challenging BO tasks, input uncertainty arises due to the inevitable randomness in the optimization process, such as machining errors, execution noise, or contextual variability. This uncertainty deviates the input from the intended value before evaluation, resulting in sign…

    Submitted 3 November, 2023; v1 submitted 30 October, 2023; originally announced October 2023.

    Comments: Accepted by NeurIPS 2023

  44. arXiv:2310.18935  [pdf, other

    cs.LG math.OC stat.ML

    Implicit Bias of Gradient Descent for Two-layer ReLU and Leaky ReLU Networks on Nearly-orthogonal Data

    Authors: Yiwen Kou, Zixiang Chen, Quanquan Gu

    Abstract: The implicit bias towards solutions with favorable properties is believed to be a key reason why neural networks trained by gradient-based optimization can generalize well. While the implicit bias of gradient flow has been widely studied for homogeneous neural networks (including ReLU and leaky ReLU networks), the implicit bias of gradient descent is currently only understood for smooth neural net…

    Submitted 29 October, 2023; originally announced October 2023.

    Comments: 55 pages, 7 figures. In NeurIPS 2023

  45. arXiv:2310.18884  [pdf, other

    cs.LG stat.ML

    Simple and Asymmetric Graph Contrastive Learning without Augmentations

    Authors: Teng Xiao, Huaisheng Zhu, Zhengyu Chen, Suhang Wang

    Abstract: Graph Contrastive Learning (GCL) has shown superior performance in representation learning in graph-structured data. Despite their success, most existing GCL methods rely on prefabricated graph augmentation and homophily assumptions. Thus, they fail to generalize well to heterophilic graphs where connected nodes may have different class labels and dissimilar features. In this paper, we study the p…

    Submitted 24 February, 2024; v1 submitted 28 October, 2023; originally announced October 2023.

    Comments: NeurIPS 2023 Main Track

  46. arXiv:2310.18286  [pdf, other

    cs.LG stat.AP stat.ML

    Optimal Transport for Treatment Effect Estimation

    Authors: Hao Wang, Zhichao Chen, Jiajun Fan, Haoxuan Li, Tianqiao Liu, Weiming Liu, Quanyu Dai, Yichao Wang, Zhenhua Dong, Ruiming Tang

    Abstract: Estimating conditional average treatment effect from observational data is highly challenging due to the existence of treatment selection bias. Prevalent methods mitigate this issue by aligning distributions of different treatment groups in the latent space. However, there are two critical problems that these methods fail to address: (1) mini-batch sampling effects (MSE), which causes misalignment…

    Submitted 27 October, 2023; originally announced October 2023.

    Comments: Accepted as NeurIPS 2023 Poster

  47. arXiv:2310.15069  [pdf, other

    stat.ME q-bio.GN stat.AP

    Second-order group knockoffs with applications to GWAS

    Authors: Benjamin B Chu, Jiaqi Gu, Zhaomeng Chen, Tim Morrison, Emmanuel Candes, Zihuai He, Chiara Sabatti

    Abstract: Conditional testing via the knockoff framework allows one to identify -- among large number of possible explanatory variables -- those that carry unique information about an outcome of interest, and also provides a false discovery rate guarantee on the selection. This approach is particularly well suited to the analysis of genome wide association studies (GWAS), which have the goal of identifying…

    Submitted 3 March, 2024; v1 submitted 23 October, 2023; originally announced October 2023.

    Comments: 46 pages, 10 figures, 2 tables, 3 algorithms

  48. arXiv:2310.14399  [pdf, other

    stat.ME stat.AP

    The role of randomization inference in unraveling individual treatment effects in early phase vaccine trials

    Authors: Zhe Chen, Xinran Li, Bo Zhang

    Abstract: Randomization inference is a powerful tool in early phase vaccine trials when estimating the causal effect of a regimen against a placebo or another regimen. Randomization-based inference often focuses on testing either Fisher's sharp null hypothesis of no treatment effect for any participant or Neyman's weak null hypothesis of no sample average treatment effect. Many recent efforts have explored…

    Submitted 26 February, 2024; v1 submitted 22 October, 2023; originally announced October 2023.

  49. arXiv:2310.09493  [pdf, other

    stat.ME stat.AP

    Summary Statistics Knockoffs Inference with Family-wise Error Rate Control

    Authors: Catherine Xinrui Yu, Jiaqi Gu, Zhaomeng Chen, Zihuai He

    Abstract: Testing multiple hypotheses of conditional independence with provable error rate control is a fundamental problem with various applications. To infer conditional independence with family-wise error rate (FWER) control when only summary statistics of marginal dependence are accessible, we adopt GhostKnockoff to directly generate knockoff copies of summary statistics and propose a new filter to sele…

    Submitted 14 October, 2023; originally announced October 2023.

    Comments: 35 pages

  50. arXiv:2310.08391  [pdf, other

    stat.ML cs.LG

    How Many Pretraining Tasks Are Needed for In-Context Learning of Linear Regression?

    Authors: Jingfeng Wu, Difan Zou, Zixiang Chen, Vladimir Braverman, Quanquan Gu, Peter L. Bartlett

    Abstract: Transformers pretrained on diverse tasks exhibit remarkable in-context learning (ICL) capabilities, enabling them to solve unseen tasks solely based on input contexts without adjusting model parameters. In this paper, we study ICL in one of its simplest setups: pretraining a linearly parameterized single-layer linear attention model for linear regression with a Gaussian prior. We establish a stati…

    Submitted 14 March, 2024; v1 submitted 12 October, 2023; originally announced October 2023.

    Comments: ICLR 2024 Camera Ready