
Showing 1–50 of 108 results for author: He, J

Searching in archive stat.
  1. arXiv:2405.13353  [pdf, other]

    stat.ME stat.ML

    Adaptive Bayesian Multivariate Spline Knot Inference with Prior Specifications on Model Complexity

    Authors: Junhui He, Ying Yang, Jian Kang

    Abstract: In multivariate spline regression, the number and locations of knots influence the performance and interpretability significantly. However, due to non-differentiability and varying dimensions, there is no desirable frequentist method to make inference on knots. In this article, we propose a fully Bayesian approach for knot inference in multivariate spline regression. The existing Bayesian method o… ▽ More

    Submitted 22 May, 2024; originally announced May 2024.

  2. arXiv:2405.13342  [pdf, other]

    stat.ME math.ST

    Scalable Bayesian inference for heat kernel Gaussian processes on manifolds

    Authors: Junhui He, Guoxuan Ma, Jian Kang, Ying Yang

    Abstract: We develop scalable manifold learning methods and theory, motivated by the problem of estimating manifold of fMRI activation in the Human Connectome Project (HCP). We propose the Fast Graph Laplacian Estimation for Heat Kernel Gaussian Processes (FLGP) in the natural exponential family model. FLGP handles large sample sizes $ n $, preserves the intrinsic geometry of data, and significantly reduces… ▽ More

    Submitted 22 May, 2024; originally announced May 2024.

  3. arXiv:2405.02810  [pdf, other]

    math.NA stat.ML

    Adaptive deep density approximation for stochastic dynamical systems

    Authors: Junjie He, Qifeng Liao, Xiaoliang Wan

    Abstract: In this paper we consider adaptive deep neural network approximation for stochastic dynamical systems. Based on the Liouville equation associated with the stochastic dynamical systems, a new temporal KRnet (tKRnet) is proposed to approximate the probability density functions (PDFs) of the state variables. The tKRnet gives an explicit density model for the solution of the Liouville equation, which… ▽ More

    Submitted 5 May, 2024; originally announced May 2024.

    Comments: 24 pages, 13 figures

    MSC Class: 34F05; 60H35; 62M45; 65C30

  4. arXiv:2405.00576  [pdf, other]

    q-fin.RM stat.ME

    Calibration of the rating transition model for high and low default portfolios

    Authors: Jian He, Asma Khedher, Peter Spreij

    Abstract: In this paper we develop Maximum likelihood (ML) based algorithms to calibrate the model parameters in credit rating transition models. Since the credit rating transition models are not Gaussian linear models, the celebrated Kalman filter is not suitable to compute the likelihood of observed migrations. Therefore, we develop a Laplace approximation of the likelihood function and as a result the Ka… ▽ More

    Submitted 1 May, 2024; originally announced May 2024.

    MSC Class: 91G40; 91G60; 65D15

  5. arXiv:2404.12648  [pdf, ps, other]

    cs.LG stat.ML

    Sample-efficient Learning of Infinite-horizon Average-reward MDPs with General Function Approximation

    Authors: Jianliang He, Han Zhong, Zhuoran Yang

    Abstract: We study infinite-horizon average-reward Markov decision processes (AMDPs) in the context of general function approximation. Specifically, we propose a novel algorithmic framework named Local-fitted Optimization with OPtimism (LOOP), which incorporates both model-based and value-based incarnations. In particular, LOOP features a novel construction of confidence sets and a low-switching policy upda… ▽ More

    Submitted 19 April, 2024; originally announced April 2024.

    Comments: ICLR 2024

  6. arXiv:2404.08164  [pdf, other]

    stat.ML cs.AI cs.CL cs.LG

    Language Model Prompt Selection via Simulation Optimization

    Authors: Haoting Zhang, Jinghai He, Rhonda Righter, Zeyu Zheng

    Abstract: With the advancement in generative language models, the selection of prompts has gained significant attention in recent years. A prompt is an instruction or description provided by the user, serving as a guide for the generative language model in content generation. Despite existing methods for prompt selection that are based on human labor, we consider facilitating this selection through simulati… ▽ More

    Submitted 19 May, 2024; v1 submitted 11 April, 2024; originally announced April 2024.

  7. arXiv:2402.09401  [pdf, other]

    cs.LG cs.AI cs.CL math.OC stat.ML

    Reinforcement Learning from Human Feedback with Active Queries

    Authors: Kaixuan Ji, Jiafan He, Quanquan Gu

    Abstract: Aligning large language models (LLM) with human preference plays a key role in building modern generative models and can be achieved by reinforcement learning from human feedback (RLHF). Despite their superior performance, current RLHF approaches often require a large amount of human-labelled preference data, which is expensive to collect. In this paper, inspired by the success of active learning,… ▽ More

    Submitted 14 February, 2024; originally announced February 2024.

    Comments: 28 pages, 1 figure, 4 tables

  8. arXiv:2402.08998  [pdf, other]

    cs.LG stat.ML

    Nearly Minimax Optimal Regret for Learning Linear Mixture Stochastic Shortest Path

    Authors: Qiwei Di, Jiafan He, Dongruo Zhou, Quanquan Gu

    Abstract: We study the Stochastic Shortest Path (SSP) problem with a linear mixture transition kernel, where an agent repeatedly interacts with a stochastic environment and seeks to reach certain goal state while minimizing the cumulative cost. Existing works often assume a strictly positive lower bound of the cost function or an upper bound of the expected length for the optimal policy. In this paper, we p… ▽ More

    Submitted 14 February, 2024; originally announced February 2024.

    Comments: 28 pages, 1 figure, In ICML 2023

  9. arXiv:2402.08991  [pdf, ps, other]

    stat.ML cs.LG

    Towards Robust Model-Based Reinforcement Learning Against Adversarial Corruption

    Authors: Chenlu Ye, Jiafan He, Quanquan Gu, Tong Zhang

    Abstract: This study tackles the challenges of adversarial corruption in model-based reinforcement learning (RL), where the transition dynamics can be corrupted by an adversary. Existing studies on corruption-robust RL mostly focus on the setting of model-free RL, where robust least-square regression is often employed for value function estimation. However, these techniques cannot be directly applied to mod… ▽ More

    Submitted 14 February, 2024; v1 submitted 14 February, 2024; originally announced February 2024.

  10. arXiv:2402.02357  [pdf, other]

    cs.LG stat.ME

    Multi-modal Causal Structure Learning and Root Cause Analysis

    Authors: Lecheng Zheng, Zhengzhang Chen, Jingrui He, Haifeng Chen

    Abstract: Effective root cause analysis (RCA) is vital for swiftly restoring services, minimizing losses, and ensuring the smooth operation and management of complex systems. Previous data-driven RCA methods, particularly those employing causal discovery techniques, have primarily focused on constructing dependency or causal graphs for backtracking the root causes. However, these methods often fall short as… ▽ More

    Submitted 4 February, 2024; originally announced February 2024.

    Comments: Accepted by the Web Conference 2024

  11. arXiv:2402.00152  [pdf, other]

    cs.LG math.NA stat.ML

    Deeper or Wider: A Perspective from Optimal Generalization Error with Sobolev Loss

    Authors: Yahong Yang, Juncai He

    Abstract: Constructing the architecture of a neural network is a challenging pursuit for the machine learning community, and the dilemma of whether to go deeper or wider remains a persistent question. This paper explores a comparison between deeper neural networks (DeNNs) with a flexible number of layers and wider neural networks (WeNNs) with limited hidden layers, focusing on their optimal generalization e… ▽ More

    Submitted 12 May, 2024; v1 submitted 31 January, 2024; originally announced February 2024.

    Comments: arXiv admin note: text overlap with arXiv:2310.10766, arXiv:2305.08466

    MSC Class: 68T05

  12. arXiv:2401.04933  [pdf, other]

    cs.LG stat.ML

    Rethinking Test-time Likelihood: The Likelihood Path Principle and Its Application to OOD Detection

    Authors: Sicong Huang, Jiawei He, Kry Yik Chau Lui

    Abstract: While likelihood is attractive in theory, its estimates by deep generative models (DGMs) are often broken in practice, and perform poorly for out of distribution (OOD) Detection. Various recent works started to consider alternative scores and achieved better performances. However, such recipes do not come with provable guarantees, nor is it clear that their choices extract sufficient information.… ▽ More

    Submitted 10 January, 2024; originally announced January 2024.

  13. arXiv:2401.00085  [pdf, other]

    cs.CE stat.CO

    A dimension reduction approach for loss valuation in credit risk modelling

    Authors: Jian He, Asma Khedher, Peter Spreij

    Abstract: This paper addresses the ``curse of dimensionality'' in the loss valuation of credit risk models. A dimension reduction methodology based on the Bayesian filter and smoother is proposed. This methodology is designed to achieve a fast and accurate loss valuation algorithm in credit risk modelling, but it can also be extended to valuation models of other risk types. The proposed methodology is gener… ▽ More

    Submitted 29 December, 2023; originally announced January 2024.

    Comments: 43 pages

    MSC Class: 62P05; 91G40

  14. arXiv:2312.07145  [pdf, other]

    cs.LG stat.ML

    Contextual Bandits with Online Neural Regression

    Authors: Rohan Deb, Yikun Ban, Shiliang Zuo, Jingrui He, Arindam Banerjee

    Abstract: Recent works have shown a reduction from contextual bandits to online regression under a realizability assumption [Foster and Rakhlin, 2020, Foster and Krishnamurthy, 2021]. In this work, we investigate the use of neural networks for such online regression and associated Neural Contextual Bandits (NeuCBs). Using existing results for wide networks, one can readily show a ${\mathcal{O}}(\sqrt{T})$ r… ▽ More

    Submitted 12 December, 2023; originally announced December 2023.

  15. arXiv:2311.15238  [pdf, other]

    cs.LG math.OC stat.ML

    A Nearly Optimal and Low-Switching Algorithm for Reinforcement Learning with General Function Approximation

    Authors: Heyang Zhao, Jiafan He, Quanquan Gu

    Abstract: The exploration-exploitation dilemma has been a central challenge in reinforcement learning (RL) with complex model classes. In this paper, we propose a new algorithm, Monotonic Q-Learning with Upper Confidence Bound (MQL-UCB) for RL with general function approximation. Our key algorithmic design includes (1) a general deterministic policy-switching strategy that achieves low switching cost, (2) a… ▽ More

    Submitted 26 November, 2023; originally announced November 2023.

    Comments: 52 pages, 1 table

  16. arXiv:2311.01709  [pdf, other]

    stat.ME stat.ML

    Causal inference with Machine Learning-Based Covariate Representation

    Authors: Yuhang Wu, Jinghai He, Zeyu Zheng

    Abstract: Utilizing covariate information has been a powerful approach to improve the efficiency and accuracy for causal inference, which support massive amount of randomized experiments run on data-driven enterprises. However, state-of-art approaches can become practically unreliable when the dimension of covariate increases to just 50, whereas experiments on large platforms can observe even higher dimensi… ▽ More

    Submitted 3 November, 2023; originally announced November 2023.

  17. arXiv:2310.01380  [pdf, other]

    cs.LG math.OC stat.ML

    Pessimistic Nonlinear Least-Squares Value Iteration for Offline Reinforcement Learning

    Authors: Qiwei Di, Heyang Zhao, Jiafan He, Quanquan Gu

    Abstract: Offline reinforcement learning (RL), where the agent aims to learn the optimal policy based on the data collected by a behavior policy, has attracted increasing attention in recent years. While offline RL with linear function approximation has been extensively studied with optimal results achieved under certain assumptions, many works shift their interest to offline RL with non-linear function app… ▽ More

    Submitted 2 October, 2023; originally announced October 2023.

    Comments: 43 pages, 1 table

  18. arXiv:2308.11026  [pdf, other]

    stat.ME

    Harnessing The Collective Wisdom: Fusion Learning Using Decision Sequences From Diverse Sources

    Authors: Trambak Banerjee, Bowen Gang, Jianliang He

    Abstract: Learning from the collective wisdom of crowds enhances the transparency of scientific findings by incorporating diverse perspectives into the decision-making process. Synthesizing such collective wisdom is related to the statistical notion of fusion learning from multiple data sources or studies. However, fusing inferences from diverse sources is challenging since cross-source heterogeneity and po… ▽ More

    Submitted 21 August, 2023; originally announced August 2023.

    Comments: 29 pages and 10 figures. Under review at a journal

  19. arXiv:2305.19185  [pdf, other]

    cs.LG cs.IT stat.ML

    Compression with Bayesian Implicit Neural Representations

    Authors: Zongyu Guo, Gergely Flamich, Jiajun He, Zhibo Chen, José Miguel Hernández-Lobato

    Abstract: Many common types of data can be represented as functions that map coordinates to signal values, such as pixel locations to RGB values in the case of an image. Based on this view, data can be compressed by overfitting a compact neural network to its functional representation and then encoding the network weights. However, most current solutions for this are inefficient, as quantization to low-bit… ▽ More

    Submitted 29 October, 2023; v1 submitted 30 May, 2023; originally announced May 2023.

    Comments: Accepted as a Spotlight paper in NeurIPS 2023. Updated camera-ready version

  20. arXiv:2305.08359  [pdf, other]

    cs.LG math.OC stat.ML

    Horizon-free Reinforcement Learning in Adversarial Linear Mixture MDPs

    Authors: Kaixuan Ji, Qingyue Zhao, Jiafan He, Weitong Zhang, Quanquan Gu

    Abstract: Recent studies have shown that episodic reinforcement learning (RL) is no harder than bandits when the total reward is bounded by $1$, and proved regret bounds that have a polylogarithmic dependence on the planning horizon $H$. However, it remains an open question that if such results can be carried over to adversarial RL, where the reward is adversarially chosen at each episode. In this paper, we… ▽ More

    Submitted 15 May, 2023; originally announced May 2023.

    Comments: 34 pages

  21. arXiv:2305.08350  [pdf, other]

    cs.LG math.OC stat.ML

    Uniform-PAC Guarantees for Model-Based RL with Bounded Eluder Dimension

    Authors: Yue Wu, Jiafan He, Quanquan Gu

    Abstract: Recently, there has been remarkable progress in reinforcement learning (RL) with general function approximation. However, all these works only provide regret or sample complexity guarantees. It is still an open question if one can achieve stronger performance guarantees, i.e., the uniform probably approximate correctness (Uniform-PAC) guarantee that can imply both a sub-linear regret bound and a p… ▽ More

    Submitted 15 May, 2023; originally announced May 2023.

    Comments: 21 pages, 1 table. To appear in UAI 2023

  22. arXiv:2303.09390  [pdf, other]

    cs.LG stat.ML

    On the Interplay Between Misspecification and Sub-optimality Gap in Linear Contextual Bandits

    Authors: Weitong Zhang, Jiafan He, Zhiyuan Fan, Quanquan Gu

    Abstract: We study linear contextual bandits in the misspecified setting, where the expected reward function can be approximated by a linear function class up to a bounded misspecification level $ζ>0$. We propose an algorithm based on a novel data selection scheme, which only selects the contextual vectors with large uncertainty for online regression. We show that, when the misspecification level $ζ$ is dom… ▽ More

    Submitted 16 March, 2023; originally announced March 2023.

    Comments: 28 pages, 2 figures, 2 tables

  23. arXiv:2303.03582  [pdf, other]

    stat.ME math.ST stat.AP

    Statistical inferences for complex dependence of multimodal imaging data

    Authors: Jinyuan Chang, Jing He, Jian Kang, Mingcong Wu

    Abstract: Statistical analysis of multimodal imaging data is a challenging task, since the data involves high-dimensionality, strong spatial correlations and complex data structures. In this paper, we propose rigorous statistical testing procedures for making inferences on the complex dependence of multimodal imaging data. Motivated by the analysis of multi-task fMRI data in the Human Connectome Project (HC… ▽ More

    Submitted 6 March, 2023; originally announced March 2023.

  24. arXiv:2302.10371  [pdf, other]

    cs.LG math.OC stat.ML

    Variance-Dependent Regret Bounds for Linear Bandits and Reinforcement Learning: Adaptivity and Computational Efficiency

    Authors: Heyang Zhao, Jiafan He, Dongruo Zhou, Tong Zhang, Quanquan Gu

    Abstract: Recently, several studies (Zhou et al., 2021a; Zhang et al., 2021b; Kim et al., 2021; Zhou and Gu, 2022) have provided variance-dependent regret bounds for linear contextual bandits, which interpolates the regret for the worst-case regime and the deterministic reward regime. However, these algorithms are either computationally intractable or unable to handle unknown variance of the noise. In this… ▽ More

    Submitted 20 February, 2023; originally announced February 2023.

    Comments: 43 pages, 2 tables

  25. arXiv:2301.01107  [pdf]

    stat.CO cs.LG

    Computing the Performance of A New Adaptive Sampling Algorithm Based on The Gittins Index in Experiments with Exponential Rewards

    Authors: James K. He, Sofía S. Villar, Lida Mavrogonatou

    Abstract: Designing experiments often requires balancing between learning about the true treatment effects and earning from allocating more samples to the superior treatment. While optimal algorithms for the Multi-Armed Bandit Problem (MABP) provide allocation policies that optimally balance learning and earning, they tend to be computationally expensive. The Gittins Index (GI) is a solution to the MABP tha… ▽ More

    Submitted 3 January, 2023; originally announced January 2023.

    Comments: Accepted by Computing Conference, London 2023

  26. arXiv:2212.06132  [pdf, ps, other]

    cs.LG math.OC stat.ML

    Nearly Minimax Optimal Reinforcement Learning for Linear Markov Decision Processes

    Authors: Jiafan He, Heyang Zhao, Dongruo Zhou, Quanquan Gu

    Abstract: We study reinforcement learning (RL) with linear function approximation. For episodic time-inhomogeneous linear Markov decision processes (linear MDPs) whose transition probability can be parameterized as a linear function of a given feature mapping, we propose the first computationally efficient algorithm that achieves the nearly minimax optimal regret $\tilde O(d\sqrt{H^3K})$, where $d$ is the d… ▽ More

    Submitted 3 November, 2023; v1 submitted 12 December, 2022; originally announced December 2022.

    Comments: 33 pages, 1 table. In ICML 2023

  27. arXiv:2212.01539  [pdf, other]

    cs.LG stat.ML

    Exploring the Limits of Differentially Private Deep Learning with Group-wise Clipping

    Authors: Jiyan He, Xuechen Li, Da Yu, Huishuai Zhang, Janardhan Kulkarni, Yin Tat Lee, Arturs Backurs, Nenghai Yu, Jiang Bian

    Abstract: Differentially private deep learning has recently witnessed advances in computational efficiency and privacy-utility trade-off. We explore whether further improvements along the two axes are possible and provide affirmative answers leveraging two instantiations of \emph{group-wise clipping}. To reduce the compute time overhead of private learning, we show that \emph{per-layer clipping}, where the… ▽ More

    Submitted 3 December, 2022; originally announced December 2022.

    Comments: 25 pages

  28. arXiv:2210.00423  [pdf, other]

    cs.LG stat.ML

    Improved Algorithms for Neural Active Learning

    Authors: Yikun Ban, Yuheng Zhang, Hanghang Tong, Arindam Banerjee, Jingrui He

    Abstract: We improve the theoretical and empirical performance of neural-network(NN)-based active learning algorithms for the non-parametric streaming setting. In particular, we introduce two regret metrics by minimizing the population loss that are more suitable in active learning than the one used in state-of-the-art (SOTA) related work. Then, the proposed algorithm leverages the powerful representation o… ▽ More

    Submitted 16 January, 2023; v1 submitted 2 October, 2022; originally announced October 2022.

    Comments: Published on NeurIPS 2022

  29. arXiv:2209.06998  [pdf, other]

    stat.ML cs.LG

    Stochastic Tree Ensembles for Estimating Heterogeneous Effects

    Authors: Nikolay Krantsevich, Jingyu He, P. Richard Hahn

    Abstract: Determining subgroups that respond especially well (or poorly) to specific interventions (medical or policy) requires new supervised learning methods tailored specifically for causal inference. Bayesian Causal Forest (BCF) is a recent method that has been documented to perform well on data generating processes with strong confounding of the sort that is plausible in many applications. This paper d… ▽ More

    Submitted 14 September, 2022; originally announced September 2022.

    Comments: 12 pages, 1 figure

  30. arXiv:2207.03106  [pdf, other]

    cs.LG stat.ML

    A Simple and Provably Efficient Algorithm for Asynchronous Federated Contextual Linear Bandits

    Authors: Jiafan He, Tianhao Wang, Yifei Min, Quanquan Gu

    Abstract: We study federated contextual linear bandits, where $M$ agents cooperate with each other to solve a global contextual linear bandit problem with the help of a central server. We consider the asynchronous setting, where all agents work independently and the communication between one agent and the server will not trigger other agents' communication. We propose a simple algorithm named \texttt{FedLin… ▽ More

    Submitted 7 July, 2022; originally announced July 2022.

    Comments: 25 pages, 1 figure, 2 tables

  31. Neural Bandit with Arm Group Graph

    Authors: Yunzhe Qi, Yikun Ban, Jingrui He

    Abstract: Contextual bandits aim to identify among a set of arms the optimal one with the highest reward based on their contextual information. Motivated by the fact that the arms usually exhibit group behaviors and the mutual impacts exist among groups, we introduce a new model, Arm Group Graph (AGG), where the nodes represent the groups of arms and the weighted edges formulate the correlations among group… ▽ More

    Submitted 9 June, 2022; v1 submitted 7 June, 2022; originally announced June 2022.

    Comments: Accepted to SIGKDD 2022

  32. arXiv:2205.06811  [pdf, other]

    cs.LG stat.ML

    Nearly Optimal Algorithms for Linear Contextual Bandits with Adversarial Corruptions

    Authors: Jiafan He, Dongruo Zhou, Tong Zhang, Quanquan Gu

    Abstract: We study the linear contextual bandit problem in the presence of adversarial corruption, where the reward at each round is corrupted by an adversary, and the corruption level (i.e., the sum of corruption magnitudes over the horizon) is $C\geq 0$. The best-known algorithms in this setting are limited in that they either are computationally inefficient or require a strong assumption on the corruptio… ▽ More

    Submitted 9 July, 2022; v1 submitted 13 May, 2022; originally announced May 2022.

    Comments: 25 pages, 1 table. This version simplifies the proof of the regret upper bound in Version 1, and provides a stronger result for the lower bound

  33. arXiv:2204.10963  [pdf, other]

    stat.ME econ.EM stat.CO stat.ML

    Local Gaussian process extrapolation for BART models with applications to causal inference

    Authors: Meijiang Wang, Jingyu He, P. Richard Hahn

    Abstract: Bayesian additive regression trees (BART) is a semi-parametric regression model offering state-of-the-art performance on out-of-sample prediction. Despite this success, standard implementations of BART typically provide inaccurate prediction and overly narrow prediction intervals at points outside the range of the training data. This paper proposes a novel extrapolation strategy that grafts Gaussi… ▽ More

    Submitted 24 February, 2023; v1 submitted 22 April, 2022; originally announced April 2022.

  34. Network of Low-cost Air Quality Sensor for Monitoring Indoor, Outdoor, and Personal PM2.5 Exposure in Seattle during the 2020 Wildfire Season

    Authors: Jiayang He, Ching-Hsuan Huang, Nanhsun Yuan, Elena Austin, Edmund Seto, Igor Novosselov

    Abstract: The increased frequency of wildfires in the Western United States has raised public concerns. Exposure to wildfire smoke has been linked to an increased risk of cancer and cardiorespiratory morbidity. Evidence-driven interventions can alleviate the adverse health impact of wildfire smoke. Public health guidance during wildfires is based on regional air quality data with limited spatiotemporal reso… ▽ More

    Submitted 26 March, 2022; originally announced March 2022.

  35. arXiv:2203.00614  [pdf, other]

    cs.LG math.NA stat.ML

    Side Effects of Learning from Low-dimensional Data Embedded in a Euclidean Space

    Authors: Juncai He, Richard Tsai, Rachel Ward

    Abstract: The low-dimensional manifold hypothesis posits that the data found in many applications, such as those involving natural images, lie (approximately) on low-dimensional manifolds embedded in a high-dimensional Euclidean space. In this setting, a typical neural network defines a function that takes a finite number of vectors in the embedding space as input. However, one often needs to consider evalu… ▽ More

    Submitted 4 February, 2023; v1 submitted 1 March, 2022; originally announced March 2022.

    Comments: 53 pages (11 pages for Appendix), 24 figures

  36. arXiv:2202.13603  [pdf, other]

    cs.LG math.OC stat.ML

    Optimal Online Generalized Linear Regression with Stochastic Noise and Its Application to Heteroscedastic Bandits

    Authors: Heyang Zhao, Dongruo Zhou, Jiafan He, Quanquan Gu

    Abstract: We study the problem of online generalized linear regression in the stochastic setting, where the label is generated from a generalized linear model with possibly unbounded additive noise. We provide a sharp analysis of the classical follow-the-regularized-leader (FTRL) algorithm to cope with the label noise. More specifically, for $σ$-sub-Gaussian label noise, our analysis provides a regret upper… ▽ More

    Submitted 27 March, 2023; v1 submitted 28 February, 2022; originally announced February 2022.

    Comments: 27 pages, 3 figures. In this updated version, we have changed the paper title, added new theoretical results on the FTRL algorithm and mainly focused on stochastic online regression. Refer to arXiv:2202.13603v1 for the previous version, which contains more results on heteroscedastic nonlinear bandits

  37. arXiv:2201.05759  [pdf, other]

    cs.LG cs.CY stat.ML

    FairIF: Boosting Fairness in Deep Learning via Influence Functions with Validation Set Sensitive Attributes

    Authors: Haonan Wang, Ziwei Wu, Jingrui He

    Abstract: Most fair machine learning methods either highly rely on the sensitive information of the training samples or require a large modification on the target models, which hinders their practical application. To address this issue, we propose a two-stage training algorithm named FAIRIF. It minimizes the loss over the reweighted data set (second stage) where the sample weights are computed to balance th… ▽ More

    Submitted 23 December, 2023; v1 submitted 15 January, 2022; originally announced January 2022.

  38. arXiv:2201.01051  [pdf]

    cs.CR eess.SP stat.ML

    Open Access Dataset for Electromyography based Multi-code Biometric Authentication

    Authors: Ashirbad Pradhan, Jiayuan He, Ning Jiang

    Abstract: Recently, surface electromyogram (EMG) has been proposed as a novel biometric trait for addressing some key limitations of current biometrics, such as spoofing and liveness. The EMG signals possess a unique characteristic: they are inherently different for individuals (biometrics), and they can be customized to realize multi-length codes or passwords (for example, by performing different gestures)… ▽ More

    Submitted 5 January, 2022; v1 submitted 4 January, 2022; originally announced January 2022.

    Comments: manuscript for open access dataset (paper and appendix)

    Journal ref: Sci Data 9, 733 (2022)

  39. Modelling matrix time series via a tensor CP-decomposition

    Authors: Jinyuan Chang, Jing He, Lin Yang, Qiwei Yao

    Abstract: We consider to model matrix time series based on a tensor CP-decomposition. Instead of using an iterative algorithm which is the standard practice for estimating CP-decompositions, we propose a new and one-pass estimation procedure based on a generalized eigenanalysis constructed from the serial dependence structure of the underlying process. To overcome the intricacy of solving a rank-reduced gen… ▽ More

    Submitted 25 July, 2022; v1 submitted 31 December, 2021; originally announced December 2021.

    Journal ref: Journal of the Royal Statistical Society Series B 2023, Vol. 85, pp. 127-148

  40. arXiv:2110.12727  [pdf, other]

    cs.LG math.OC stat.ML

    Learning Stochastic Shortest Path with Linear Function Approximation

    Authors: Yifei Min, Jiafan He, Tianhao Wang, Quanquan Gu

    Abstract: We study the stochastic shortest path (SSP) problem in reinforcement learning with linear function approximation, where the transition kernel is represented as a linear mixture of unknown models. We call this class of SSP problems as linear mixture SSPs. We propose a novel algorithm with Hoeffding-type confidence sets for learning the linear mixture SSP, which can attain an… ▽ More

    Submitted 5 July, 2022; v1 submitted 25 October, 2021; originally announced October 2021.

    Comments: 46 pages, 1 figure. In ICML 2022

  41. arXiv:2110.10133  [pdf, other]

    cs.LG cs.CR math.OC stat.ML

    Locally Differentially Private Reinforcement Learning for Linear Mixture Markov Decision Processes

    Authors: Chonghua Liao, Jiafan He, Quanquan Gu

    Abstract: Reinforcement learning (RL) algorithms can be used to provide personalized services, which rely on users' private and sensitive data. To protect the users' privacy, privacy-preserving RL algorithms are in demand. In this paper, we study RL with linear function approximation and local differential privacy (LDP) guarantees. We propose a novel $(\varepsilon, δ)$-LDP algorithm for learning a class of… ▽ More

    Submitted 19 October, 2021; originally announced October 2021.

    Comments: 25 pages, 2 figures

  42. arXiv:2110.03833  [pdf, other]

    stat.ME

    A Maximum Weighted Logrank Test in Detecting Crossing Hazards

    Authors: Huan Cheng, Jianghua He

    Abstract: In practice, the logrank test is the most widely used method for testing the equality of survival distributions. It is the optimal method under the proportional hazard assumption. However, since non-proportional hazards are often encountered in oncology trials, alternative tests have been proposed. The maximum weighted logrank test was shown to be robust in general situations. In this manuscript,… ▽ More

    Submitted 7 October, 2021; originally announced October 2021.

    Comments: 22 pages, 6 figures

  43. arXiv:2110.03177  [pdf, other]

    cs.LG stat.ML

    EE-Net: Exploitation-Exploration Neural Networks in Contextual Bandits

    Authors: Yikun Ban, Yuchen Yan, Arindam Banerjee, Jingrui He

    Abstract: In this paper, we propose a novel neural exploration strategy in contextual bandits, EE-Net, distinct from the standard UCB-based and TS-based approaches. Contextual multi-armed bandits have been studied for decades with various applications. To solve the exploitation-exploration tradeoff in bandits, there are three main techniques: epsilon-greedy, Thompson Sampling (TS), and Upper Confidence Boun… ▽ More

    Submitted 12 May, 2022; v1 submitted 7 October, 2021; originally announced October 2021.

    Comments: Published on ICLR 2022

  44. Approximation Properties of Deep ReLU CNNs

    Authors: Juncai He, Lin Li, Jinchao Xu

    Abstract: This paper focuses on establishing $L^2$ approximation properties for deep ReLU convolutional neural networks (CNNs) in two-dimensional space. The analysis is based on a decomposition theorem for convolutional kernels with a large spatial size and multi-channels. Given the decomposition result, the property of the ReLU activation function, and a specific structure for channels, a universal approxi…

    Submitted 26 June, 2022; v1 submitted 1 September, 2021; originally announced September 2021.

    Comments: 30 pages

    MSC Class: 41A30; 68T07; 65D40

    Journal ref: Research in the Mathematical Sciences, 2022

  45. arXiv:2106.11935  [pdf, other]

    cs.LG math.OC stat.ML

    Provably Efficient Representation Selection in Low-rank Markov Decision Processes: From Online to Offline RL

    Authors: Weitong Zhang, Jiafan He, Dongruo Zhou, Amy Zhang, Quanquan Gu

    Abstract: The success of deep reinforcement learning (DRL) lies in its ability to learn a representation that is well-suited for the exploration and exploitation task. To understand how the choice of representation can improve the efficiency of reinforcement learning (RL), we study representation selection for a class of low-rank Markov Decision Processes (MDPs) where the transition kernel can be represente…

    Submitted 14 February, 2024; v1 submitted 22 June, 2021; originally announced June 2021.

    Comments: 32 pages, 2 figures, 7 tables, In UAI 2023

  46. arXiv:2106.11612  [pdf, ps, other]

    cs.LG math.OC stat.ML

    Uniform-PAC Bounds for Reinforcement Learning with Linear Function Approximation

    Authors: Jiafan He, Dongruo Zhou, Quanquan Gu

    Abstract: We study reinforcement learning (RL) with linear function approximation. Existing algorithms for this problem only have high-probability regret and/or Probably Approximately Correct (PAC) sample complexity guarantees, which cannot guarantee the convergence to the optimal policy. In this paper, in order to overcome the limitation of existing algorithms, we propose a new algorithm called FLUTE, whic…

    Submitted 31 December, 2021; v1 submitted 22 June, 2021; originally announced June 2021.

    Comments: 27 pages. In NeurIPS 2021

  47. arXiv:2106.01906   

    stat.ME stat.CO stat.ML

    Bayesian Inference for Gamma Models

    Authors: Jingyu He, Nicholas Polson, Jianeng Xu

    Abstract: We use the theory of normal variance-mean mixtures to derive a data augmentation scheme for models that include gamma functions. Our methodology applies to many situations in statistics and machine learning, including Multinomial-Dirichlet distributions, Negative binomial regression, Poisson-Gamma hierarchical models, Extreme value models, to name but a few. All of those models include a gamma fun…

    Submitted 21 June, 2021; v1 submitted 3 June, 2021; originally announced June 2021.

    Comments: Duplicate submission of arXiv:1905.12141 Please check arXiv:1905.12141 for future update

  48. arXiv:2102.12679  [pdf, other]

    cs.LG stat.ML

    Variational Selective Autoencoder: Learning from Partially-Observed Heterogeneous Data

    Authors: Yu Gong, Hossein Hajimirsadeghi, Jiawei He, Thibaut Durand, Greg Mori

    Abstract: Learning from heterogeneous data poses challenges such as combining data from various sources and of different types. Meanwhile, heterogeneous data are often associated with missingness in real-world applications due to heterogeneity and noise of input sources. In this work, we propose the variational selective autoencoder (VSAE), a general framework to learn representations from partially-observe…

    Submitted 24 February, 2021; originally announced February 2021.

    Comments: International Conference on Artificial Intelligence and Statistics (AISTATS) 2021

  49. arXiv:2102.08940  [pdf, other]

    cs.LG math.OC stat.ML

    Near-optimal Policy Optimization Algorithms for Learning Adversarial Linear Mixture MDPs

    Authors: Jiafan He, Dongruo Zhou, Quanquan Gu

    Abstract: Learning Markov decision processes (MDPs) in the presence of the adversary is a challenging problem in reinforcement learning (RL). In this paper, we study RL in episodic MDPs with adversarial reward and full information feedback, where the unknown transition probability function is a linear function of a given feature mapping, and the reward function can change arbitrarily episode by episode. We…

    Submitted 20 April, 2022; v1 submitted 17 February, 2021; originally announced February 2021.

    Comments: 22 pages, 1 figure. In AISTATS 2022

  50. arXiv:2102.06539  [pdf, other]

    cs.LG cs.AI stat.ML

    Jacobian Determinant of Normalizing Flows

    Authors: Huadong Liao, Jiawei He

    Abstract: Normalizing flows learn a diffeomorphic mapping between the target and base distribution, while the Jacobian determinant of that mapping forms another real-valued function. In this paper, we show that the Jacobian determinant mapping is unique for the given distributions, hence the likelihood objective of flows has a unique global optimum. In particular, the likelihood for a class of flows is expl…

    Submitted 17 February, 2021; v1 submitted 12 February, 2021; originally announced February 2021.

    Comments: 14 pages