Showing 1–28 of 28 results for author: Wei, H

  1. arXiv:2406.07455  [pdf, other]

    cs.LG stat.ML

    Reinforcement Learning from Human Feedback without Reward Inference: Model-Free Algorithm and Instance-Dependent Analysis

    Authors: Qining Zhang, Honghao Wei, Lei Ying

    Abstract: In this paper, we study reinforcement learning from human feedback (RLHF) under an episodic Markov decision process with a general trajectory-wise reward model. We developed a model-free RLHF best policy identification algorithm, called $\mathsf{BSAD}$, without explicit reward model inference, which is a critical intermediate step in the contemporary RLHF paradigms for training large language mode… ▽ More

    Submitted 11 June, 2024; originally announced June 2024.

  2. arXiv:2401.05517  [pdf, other]

    stat.ME econ.EM math.ST

    On Efficient Inference of Causal Effects with Multiple Mediators

    Authors: Haoyu Wei, Hengrui Cai, Chengchun Shi, Rui Song

    Abstract: This paper provides robust estimators and efficient inference of causal effects involving multiple interacting mediators. Most existing works either impose a linear model assumption among the mediators or are restricted to handle conditionally independent mediators given the exposure. To overcome these limitations, we define causal and individual mediation effects in a general setting, and employ… ▽ More

    Submitted 10 January, 2024; originally announced January 2024.

    MSC Class: 62A09; 62G05; 62G35

  3. arXiv:2312.15595  [pdf, other]

    stat.ML cs.LG econ.EM

    Zero-Inflated Bandits

    Authors: Haoyu Wei, Runzhe Wan, Lei Shi, Rui Song

    Abstract: Many real applications of bandits have sparse non-zero rewards, leading to slow learning rates. A careful distribution modeling that utilizes problem-specific structures is known as critical to estimation efficiency in the statistics literature, yet is under-explored in bandits. To fill the gap, we initiate the study of zero-inflated bandits, where the reward is modeled as a classic semi-parametri… ▽ More

    Submitted 24 December, 2023; originally announced December 2023.

  4. arXiv:2311.04550  [pdf, other]

    cs.LG stat.ML

    Regression with Cost-based Rejection

    Authors: Xin Cheng, Yuzhou Cao, Haobo Wang, Hongxin Wei, Bo An, Lei Feng

    Abstract: Learning with rejection is an important framework that can refrain from making predictions to avoid critical mispredictions by balancing between prediction and rejection. Previous studies on cost-based rejection only focused on the classification setting, which cannot handle the continuous and infinite target space in the regression setting. In this paper, we investigate a novel regression problem… ▽ More

    Submitted 8 November, 2023; originally announced November 2023.

    Comments: Accepted by NeurIPS 2023

  5. arXiv:2306.04746  [pdf, other]

    stat.ME cs.CL cs.LG stat.ML

    Using Imperfect Surrogates for Downstream Inference: Design-based Supervised Learning for Social Science Applications of Large Language Models

    Authors: Naoki Egami, Musashi Hinck, Brandon M. Stewart, Hanying Wei

    Abstract: In computational social science (CSS), researchers analyze documents to explain social and political phenomena. In most scenarios, CSS researchers first obtain labels for documents and then explain labels using interpretable regression analyses in the second step. One increasingly common way to annotate documents cheaply at scale is through large language models (LLMs). However, like other scalabl… ▽ More

    Submitted 14 January, 2024; v1 submitted 7 June, 2023; originally announced June 2023.

    Comments: 37th Conference on Neural Information Processing Systems (NeurIPS 2023)

  6. arXiv:2303.07287  [pdf, other]

    stat.ML cs.LG econ.EM

    Tight Non-asymptotic Inference via Sub-Gaussian Intrinsic Moment Norm

    Authors: Huiming Zhang, Haoyu Wei, Guang Cheng

    Abstract: In non-asymptotic learning, variance-type parameters of sub-Gaussian distributions are of paramount importance. However, directly estimating these parameters using the empirical moment generating function (MGF) is infeasible. To address this, we suggest using the sub-Gaussian intrinsic moment norm [Buldygin and Kozachenko (2000), Theorem 1.3] achieved by maximizing a sequence of normalized moments… ▽ More

    Submitted 19 January, 2024; v1 submitted 13 March, 2023; originally announced March 2023.

  7. arXiv:2209.01289  [pdf, other]

    stat.OT stat.CO stat.ML

    elhmc: An R Package for Hamiltonian Monte Carlo Sampling in Bayesian Empirical Likelihood

    Authors: Dang Trung Kien, Neo Han Wei, Sanjay Chaudhuri

    Abstract: In this article, we describe an R package for sampling from an empirical likelihood-based posterior using a Hamiltonian Monte Carlo method. Empirical likelihood-based methodologies have been used in Bayesian modeling of many problems of interest in recent times. This semiparametric procedure can easily combine the flexibility of a non-parametric distribution estimator together with the interp… ▽ More

    Submitted 2 September, 2022; originally announced September 2022.

  8. arXiv:2206.04733  [pdf, other]

    stat.AP eess.SY math.ST

    On Low-Complexity Quickest Intervention of Mutated Diffusion Processes Through Local Approximation

    Authors: Qining Zhang, Honghao Wei, Weina Wang, Lei Ying

    Abstract: We consider the problem of controlling a mutated diffusion process with an unknown mutation time. The problem is formulated as the quickest intervention problem with the mutation modeled by a change-point, which is a generalization of the quickest change-point detection (QCD). Our goal is to intervene in the mutated process as soon as possible while maintaining a low intervention cost with optimal… ▽ More

    Submitted 9 June, 2022; originally announced June 2022.

  9. arXiv:2202.05612  [pdf, other]

    stat.ML cs.LG math.ST

    High-dimensional Inference and FDR Control for Simulated Markov Random Fields

    Authors: Haoyu Wei, Xiaoyu Lei, Yixin Han, Huiming Zhang

    Abstract: Identifying important features linked to a response variable is a fundamental task in various scientific domains. This article explores statistical inference for simulated Markov random fields in high-dimensional settings. We introduce a methodology based on Markov Chain Monte Carlo Maximum Likelihood Estimation (MCMC-MLE) with Elastic-net regularization. Under mild conditions on the MCMC method,… ▽ More

    Submitted 19 January, 2024; v1 submitted 11 February, 2022; originally announced February 2022.

  10. arXiv:2111.01301  [pdf, other]

    math.ST econ.EM stat.ML

    Asymptotic in a class of network models with an increasing sub-Gamma degree sequence

    Authors: Jing Luo, Haoyu Wei, Xiaoyu Lei, Jiaxin Guo

    Abstract: For the differential privacy under the sub-Gamma noise, we derive the asymptotic properties of a class of network models with binary values with a general link function. In this paper, we release the degree sequences of the binary networks under a general noisy mechanism with the discrete Laplace mechanism as a special case. We establish the asymptotic result including both consistency and asympto… ▽ More

    Submitted 10 November, 2023; v1 submitted 1 November, 2021; originally announced November 2021.

    Comments: arXiv admin note: text overlap with arXiv:2002.12733 by other authors

    MSC Class: 62E20; 62F12

  11. arXiv:2110.03552  [pdf, ps, other]

    stat.ME econ.EM math.ST stat.ML

    Heterogeneous Overdispersed Count Data Regressions via Double Penalized Estimations

    Authors: Shaomin Li, Haoyu Wei, Xiaoyu Lei

    Abstract: This paper studies the non-asymptotic merits of the double $\ell_1$-regularized for heterogeneous overdispersed count data via negative binomial regressions. Under the restricted eigenvalue conditions, we prove the oracle inequalities for Lasso estimators of two partial regression coefficients for the first time, using concentration inequalities of empirical processes. Furthermore, derived from th… ▽ More

    Submitted 7 February, 2022; v1 submitted 7 October, 2021; originally announced October 2021.

  12. arXiv:2107.04855  [pdf, ps, other]

    cs.LG stat.ML

    Kernel Mean Estimation by Marginalized Corrupted Distributions

    Authors: Xiaobo Xia, Shuo Shan, Mingming Gong, Nannan Wang, Fei Gao, Haikun Wei, Tongliang Liu

    Abstract: Estimating the kernel mean in a reproducing kernel Hilbert space is a critical component in many kernel learning algorithms. Given a finite sample, the standard estimate of the target kernel mean is the empirical average. Previous works have shown that better estimators can be constructed by shrinkage methods. In this work, we propose to corrupt data examples with noise from known distributions an… ▽ More

    Submitted 10 July, 2021; originally announced July 2021.

  13. arXiv:2107.00179  [pdf]

    math.ST cs.DC cs.LG stat.ML

    Distributed Nonparametric Function Estimation: Optimal Rate of Convergence and Cost of Adaptation

    Authors: T. Tony Cai, Hongji Wei

    Abstract: Distributed minimax estimation and distributed adaptive estimation under communication constraints for Gaussian sequence model and white noise model are studied. The minimax rate of convergence for distributed estimation over a given Besov class, which serves as a benchmark for the cost of adaptation, is established. We then quantify the exact communication cost for adaptation and construct an opt… ▽ More

    Submitted 30 June, 2021; originally announced July 2021.

    MSC Class: 62F30

  14. arXiv:2103.07920  [pdf, other]

    stat.ME

    A two-way factor model for high-dimensional matrix data

    Authors: Gao Zhigen, Yuan Chaofeng, Jing Bingyi, Huang Wei, Guo Jianhua

    Abstract: In this article, we introduce a two-way factor model for a high-dimensional data matrix and study the properties of the maximum likelihood estimation (MLE). The proposed model assumes separable effects of row and column attributes and captures the correlation across rows and columns with low-dimensional hidden factors. The model inherits the dimension-reduction feature of classical factor models b… ▽ More

    Submitted 15 March, 2021; v1 submitted 14 March, 2021; originally announced March 2021.

    Comments: 35 pages, 5 figures

  15. arXiv:2102.02450  [pdf, other]

    math.ST math.PR stat.ML

    Sharper Sub-Weibull Concentrations

    Authors: Huiming Zhang, Haoyu Wei

    Abstract: Constant-specified and exponential concentration inequalities play an essential role in the finite-sample theory of machine learning and high-dimensional statistics area. We obtain sharper and constants-specified concentration inequalities for the sum of independent sub-Weibull random variables, which leads to a mixture of two tails: sub-Gaussian for small deviations and sub-Weibull for large devi… ▽ More

    Submitted 26 June, 2022; v1 submitted 4 February, 2021; originally announced February 2021.

    MSC Class: 60E15; 62F25; 62F99

    Journal ref: Mathematics. 2022; 10(13):2252

  16. arXiv:2010.01652  [pdf, other]

    cs.LG stat.ML

    FORK: A Forward-Looking Actor For Model-Free Reinforcement Learning

    Authors: Honghao Wei, Lei Ying

    Abstract: In this paper, we propose a new type of Actor, named forward-looking Actor or FORK for short, for Actor-Critic algorithms. FORK can be easily integrated into a model-free Actor-Critic algorithm. Our experiments on six Box2D and MuJoCo environments with continuous state and action spaces demonstrate significant performance improvement FORK can bring to the state-of-the-art algorithms. A variation o… ▽ More

    Submitted 29 September, 2021; v1 submitted 4 October, 2020; originally announced October 2020.

  17. arXiv:2009.06573  [pdf, other]

    cs.AI cs.MM stat.ML

    Themes Informed Audio-visual Correspondence Learning

    Authors: Runze Su, Fei Tao, Xudong Liu, Haoran Wei, Xiaorong Mei, Zhiyao Duan, Lei Yuan, Ji Liu, Yuying Xie

    Abstract: The applications of short-term user-generated video (UGV), such as Snapchat, and Youtube short-term videos, booms recently, raising lots of multimodal machine learning tasks. Among them, learning the correspondence between audio and visual information from videos is a challenging one. Most previous work of the audio-visual correspondence(AVC) learning only investigated constrained videos or simple… ▽ More

    Submitted 19 October, 2020; v1 submitted 14 September, 2020; originally announced September 2020.

    Comments: Submitting to ICASSP 2021

  18. arXiv:2008.01613  [pdf, other]

    cs.LG cs.HC stat.ML

    Peer-inspired Student Performance Prediction in Interactive Online Question Pools with Graph Neural Network

    Authors: Haotian Li, Huan Wei, Yong Wang, Yangqiu Song, Huamin Qu

    Abstract: Student performance prediction is critical to online education. It can benefit many downstream tasks on online learning platforms, such as estimating dropout rates, facilitating strategic intervention, and enabling adaptive online learning. Interactive online question pools provide students with interesting interactive questions to practice their knowledge in online education. However, little rese… ▽ More

    Submitted 15 August, 2020; v1 submitted 4 August, 2020; originally announced August 2020.

    Comments: 8 pages, 8 figures. Accepted at CIKM 2020

  19. arXiv:2003.02752  [pdf, other]

    cs.CV cs.LG stat.ML

    Combating noisy labels by agreement: A joint training method with co-regularization

    Authors: Hongxin Wei, Lei Feng, Xiangyu Chen, Bo An

    Abstract: Deep Learning with noisy labels is a practically challenging problem in weakly supervised learning. The state-of-the-art approaches "Decoupling" and "Co-teaching+" claim that the "disagreement" strategy is crucial for alleviating the problem of learning with noisy labels. In this paper, we start from a different perspective and propose a robust learning paradigm called JoCoR, which aims to reduce… ▽ More

    Submitted 22 April, 2020; v1 submitted 5 March, 2020; originally announced March 2020.

    Comments: Accepted by CVPR 2020; Code is available at: https://github.com/hongxin001/JoCoR. arXiv admin note: text overlap with arXiv:1901.04215 by other authors

  20. arXiv:2002.03328  [pdf, other]

    cs.LG cs.CV stat.ML

    Kullback-Leibler Divergence-Based Out-of-Distribution Detection with Flow-Based Generative Models

    Authors: Yufeng Zhang, Jialu Pan, Wanwei Liu, Zhenbang Chen, Ji Wang, Zhiming Liu, Kenli Li, Hongmei Wei

    Abstract: Recent research has revealed that deep generative models including flow-based models and Variational Autoencoders may assign higher likelihoods to out-of-distribution (OOD) data than in-distribution (ID) data. However, we cannot sample OOD data from the model. This counterintuitive phenomenon has not been satisfactorily explained and brings obstacles to OOD detection with flow-based models. In thi… ▽ More

    Submitted 2 March, 2023; v1 submitted 9 February, 2020; originally announced February 2020.

  21. arXiv:2001.08877  [pdf, other]

    math.ST cs.DC cs.IT cs.LG stat.ML

    Distributed Gaussian Mean Estimation under Communication Constraints: Optimal Rates and Communication-Efficient Algorithms

    Authors: T. Tony Cai, Hongji Wei

    Abstract: We study distributed estimation of a Gaussian mean under communication constraints in a decision theoretical framework. Minimax rates of convergence, which characterize the tradeoff between the communication costs and statistical accuracy, are established in both the univariate and multivariate settings. Communication-efficient and statistically optimal procedures are developed. In the univariate… ▽ More

    Submitted 23 January, 2020; originally announced January 2020.

  22. arXiv:1906.02903  [pdf, other]

    math.ST cs.LG stat.ME stat.ML

    Transfer Learning for Nonparametric Classification: Minimax Rate and Adaptive Classifier

    Authors: T. Tony Cai, Hongji Wei

    Abstract: Human learners have the natural ability to use knowledge gained in one setting for learning in a different but related setting. This ability to transfer knowledge from one task to another is essential for effective learning. In this paper, we study transfer learning in the context of nonparametric classification based on observations from different distributions under the posterior drift model, wh… ▽ More

    Submitted 7 June, 2019; originally announced June 2019.

  23. arXiv:1905.04722  [pdf, other]

    cs.LG cs.AI stat.ML

    Learning Phase Competition for Traffic Signal Control

    Authors: Guanjie Zheng, Yuanhao Xiong, Xinshi Zang, Jie Feng, Hua Wei, Huichu Zhang, Yong Li, Kai Xu, Zhenhui Li

    Abstract: Increasingly available city data and advanced learning techniques have empowered people to improve the efficiency of our city functions. Among them, improving the urban transportation efficiency is one of the most prominent topics. Recent studies have proposed to use reinforcement learning (RL) for traffic signal control. Different from traditional transportation approaches which rely heavily on p… ▽ More

    Submitted 12 May, 2019; originally announced May 2019.

  24. arXiv:1904.10642  [pdf, ps, other]

    cs.LG stat.ML

    Towards Combining On-Off-Policy Methods for Real-World Applications

    Authors: Kai-Chun Hu, Chen-Huan Pi, Ting Han Wei, I-Chen Wu, Stone Cheng, Yi-Wei Dai, Wei-Yuan Ye

    Abstract: In this paper, we point out a fundamental property of the objective in reinforcement learning, with which we can reformulate the policy gradient objective into a perceptron-like loss function, removing the need to distinguish between on and off policy training. Namely, we posit that it is sufficient to only update a policy $\pi$ for cases that satisfy the condition $A\left(\frac{\pi}{\mu}-1\right)\leq 0$, where $A$ is… ▽ More

    Submitted 24 April, 2019; originally announced April 2019.

  25. arXiv:1904.08117  [pdf, other]

    cs.LG cs.AI stat.ML

    A Survey on Traffic Signal Control Methods

    Authors: Hua Wei, Guanjie Zheng, Vikash Gayah, Zhenhui Li

    Abstract: Traffic signal control is an important and challenging real-world problem, which aims to minimize the travel time of vehicles by coordinating their movements at the road intersections. Current traffic signal control systems in use still rely heavily on oversimplified information and rule-based methods, although we now have richer data, more computing power and advanced methods to drive the develop… ▽ More

    Submitted 16 January, 2020; v1 submitted 17 April, 2019; originally announced April 2019.

    Comments: 32 pages

    MSC Class: 68Txx

  26. arXiv:1903.04887  [pdf, ps, other]

    cs.SI cs.LG stat.ML

    QuickStop: A Markov Optimal Stopping Approach for Quickest Misinformation Detection

    Authors: Honghao Wei, Xiaohan Kang, Weina Wang, Lei Ying

    Abstract: This paper combines data-driven and model-driven methods for real-time misinformation detection. Our algorithm, named QuickStop, is an optimal stopping algorithm based on a probabilistic information spreading model obtained from labeled data. The algorithm consists of an offline machine learning algorithm for learning the probabilistic information spreading model and an online optimal stopping alg… ▽ More

    Submitted 5 December, 2019; v1 submitted 4 March, 2019; originally announced March 2019.

  27. arXiv:1809.08360  [pdf, other]

    cs.LG stat.ML

    Comment on "All-optical machine learning using diffractive deep neural networks"

    Authors: Haiqing Wei, Gang Huang, Xiuqing Wei, Yanlong Sun, Hongbin Wang

    Abstract: Lin et al. (Reports, 7 September 2018, p. 1004) reported a remarkable proposal that employs a passive, strictly linear optical setup to perform pattern classifications. But interpreting the multilayer diffractive setup as a deep neural network and advocating it as an all-optical deep learning framework are not well justified and represent a mischaracterization of the system by overlooking its defi… ▽ More

    Submitted 20 November, 2018; v1 submitted 21 September, 2018; originally announced September 2018.

    Comments: 5 pages

  28. arXiv:1805.12243  [pdf, other]

    cs.CV cs.LG stat.ML

    Novel Video Prediction for Large-scale Scene using Optical Flow

    Authors: Henglai Wei, Xiaochuan Yin, Penghong Lin

    Abstract: Making predictions of future frames is a critical challenge in autonomous driving research. Most of the existing methods for video prediction attempt to generate future frames in simple and fixed scenes. In this paper, we propose a novel and effective optical flow conditioned method for the task of video prediction with an application to complex urban scenes. In contrast with previous work, the pr… ▽ More

    Submitted 30 May, 2018; originally announced May 2018.