Showing 1–33 of 33 results for author: Lian, H

Searching in archive stat.
  1. arXiv:2406.12212  [pdf, other]

    stat.AP stat.ME

    Identifying Genetic Variants for Obesity Incorporating Prior Insights: Quantile Regression with Insight Fusion for Ultra-high Dimensional Data

    Authors: Jiantong Wang, Heng Lian, Yan Yu, Heping Zhang

    Abstract: Obesity is widely recognized as a critical and pervasive health concern. We strive to identify important genetic risk factors from hundreds of thousands of single nucleotide polymorphisms (SNPs) for obesity. We propose and apply a novel Quantile Regression with Insight Fusion (QRIF) approach that can integrate insights from established studies or domain knowledge to simultaneously select variables…

    Submitted 17 June, 2024; originally announced June 2024.

    Comments: This article is submitted to Journal of the American Statistical Association

  2. arXiv:2405.14652  [pdf, ps, other]

    stat.ME

    Statistical inference for high-dimensional convoluted rank regression

    Authors: Leheng Cai, Xu Guo, Heng Lian, Liping Zhu

    Abstract: High-dimensional penalized rank regression is a powerful tool for modeling high-dimensional data due to its robustness and estimation efficiency. However, the non-smoothness of the rank loss brings great challenges to the computation. To solve this critical issue, high-dimensional convoluted rank regression is recently proposed, and penalized convoluted rank regression estimators are introduced. H…

    Submitted 23 May, 2024; originally announced May 2024.

  3. arXiv:2405.02539  [pdf, ps, other]

    stat.ME

    Distributed Iterative Hard Thresholding for Variable Selection in Tobit Models

    Authors: Changxin Yang, Zhongyi Zhu, Heng Lian

    Abstract: While extensive research has been conducted on high-dimensional data and on regression with left-censored responses, simultaneously addressing these complexities remains challenging, with only a few proposed methods available. In this paper, we utilize the Iterative Hard Thresholding (IHT) algorithm on the Tobit model in such a setting. Theoretical analysis demonstrates that our estimator converge…

    Submitted 3 May, 2024; originally announced May 2024.

  4. arXiv:2105.01278  [pdf, other]

    stat.ME

    Nonparametric Quantile Regression for Homogeneity Pursuit in Panel Data Models

    Authors: Xiaoyu Zhang, Di Wang, Heng Lian, Guodong Li

    Abstract: Many panel data have the latent subgroup effect on individuals, and it is important to correctly identify these groups since the efficiency of resulting estimators can be improved significantly by pooling the information of individuals within each group. However, the currently assumed parametric and semiparametric relationship between the response and predictors may be misspecified, which leads to…

    Submitted 22 August, 2022; v1 submitted 3 May, 2021; originally announced May 2021.

    Comments: To appear at the Journal of Business & Economic Statistics

  5. arXiv:1909.06624  [pdf, other]

    stat.ME

    High-dimensional vector autoregressive time series modeling via tensor decomposition

    Authors: Di Wang, Yao Zheng, Heng Lian, Guodong Li

    Abstract: The classical vector autoregressive model is a fundamental tool for multivariate time series analysis. However, it involves too many parameters when the number of time series and lag order are even moderately large. This paper proposes to rearrange the transition matrices of the model into a tensor form such that the parameter space can be restricted along three directions simultaneously via tenso…

    Submitted 3 November, 2020; v1 submitted 14 September, 2019; originally announced September 2019.

  6. arXiv:1802.03511  [pdf, other]

    stat.ME

    A General Framework For Frequentist Model Averaging

    Authors: Priyam Mitra, Heng Lian, Ritwik Mitra, Hua Liang, Min-ge Xie

    Abstract: Model selection strategies have been routinely employed to determine a model for data analysis in statistics, and further study and inference then often proceed as though the selected model were the true model that were known a priori. This practice does not account for the uncertainty introduced by the selection process and the fact that the selected model can possibly be a wrong one. Model avera…

    Submitted 9 February, 2018; originally announced February 2018.

  7. arXiv:1708.05487  [pdf, ps, other]

    stat.ML

    Debiased distributed learning for sparse partial linear models in high dimensions

    Authors: Shaogao Lv, Heng Lian

    Abstract: Although various distributed machine learning schemes have been proposed recently for pure linear models and fully nonparametric models, little attention has been paid to distributed optimization for semi-parametric models with multiple-level structures (e.g. sparsity, linearity and nonlinearity). To address these issues, the current paper proposes a new communication-efficient distributed learn…

    Submitted 3 November, 2019; v1 submitted 17 August, 2017; originally announced August 2017.

  8. arXiv:1701.03772  [pdf, other]

    stat.ME math.ST

    Additive Partially Linear Models for Massive Heterogeneous Data

    Authors: Binhuan Wang, Yixin Fang, Heng Lian, Hua Liang

    Abstract: We consider an additive partially linear framework for modelling massive heterogeneous data. The major goal is to extract multiple common features simultaneously across all sub-populations while exploring heterogeneity of each sub-population. We propose an aggregation type of estimators for the commonality parameters that possess the asymptotic optimal bounds and the asymptotic distributions as if…

    Submitted 28 December, 2018; v1 submitted 13 January, 2017; originally announced January 2017.

  9. arXiv:1511.01124  [pdf, ps, other]

    stat.ME

    Greedy Forward Regression for Variable Screening

    Authors: Ming-Yen Cheng, Sanying Feng, Gaorong Li, Heng Lian

    Abstract: Two popular variable screening methods under the ultra-high dimensional setting with the desirable sure screening property are the sure independence screening (SIS) and the forward regression (FR). Both are classical variable screening methods and recently have attracted greater attention under the new light of high-dimensional data analysis. We consider a new and simple screening method that inco…

    Submitted 3 November, 2015; originally announced November 2015.

  10. arXiv:1402.1649  [pdf, ps, other]

    stat.ME

    Variable Selection and Estimation for Partially Linear Single-index Models with Longitudinal Data

    Authors: Gaorong Li, Peng Lai, Heng Lian

    Abstract: In this paper, we consider the partially linear single-index models with longitudinal data. To deal with the variable selection problem in this context, we propose a penalized procedure combined with two bias correction methods, resulting in the bias-corrected generalized estimating equation (GEE) and the bias-corrected quadratic inference function (QIF), which can take into account the correlatio…

    Submitted 7 February, 2014; originally announced February 2014.

    Comments: to appear in Statistics and Computing

  11. Letter to the Editor

    Authors: Yuao Hu, Ye Tian, Heng Lian

    Abstract: The paper by Alfons, Croux and Gelper (2013), Sparse least trimmed squares regression for analyzing high-dimensional large data sets, considered a combination of least trimmed squares (LTS) and lasso penalty for robust and sparse high-dimensional regression. In a recent paper [She and Owen (2011)], a method for outlier detection based on a sparsity penalty on the mean shift parameter was proposed…

    Submitted 9 December, 2013; originally announced December 2013.

    Comments: Published at http://dx.doi.org/10.1214/13-AOAS640 in the Annals of Applied Statistics (http://www.imstat.org/aoas/) by the Institute of Mathematical Statistics (http://www.imstat.org)

    Report number: IMS-AOAS-AOAS640

    Journal ref: Annals of Applied Statistics 2013, Vol. 7, No. 2, 1244-1246

  12. arXiv:1309.6058  [pdf, ps, other]

    stat.ME

    Reduced-rank Regression in Sparse Multivariate Varying-Coefficient Models with High-dimensional Covariates

    Authors: Heng Lian, Shujie Ma

    Abstract: In genetic studies, not only can the number of predictors obtained from microarray measurements be extremely large, there can also be multiple response variables. Motivated by such a situation, we consider semiparametric dimension reduction methods in sparse multivariate regression models. Previous studies on joint variable and rank selection have focused on parametric models while here we conside…

    Submitted 24 September, 2013; originally announced September 2013.

  13. arXiv:1307.2668  [pdf, ps, other]

    stat.CO stat.ME

    Bayesian Quantile Regression for Partially Linear Additive Models

    Authors: Yuao Hu, Kaifeng Zhao, Heng Lian

    Abstract: In this article, we develop a semiparametric Bayesian estimation and model selection approach for partially linear additive models in conditional quantile regression. The asymmetric Laplace distribution provides a mechanism for Bayesian inferences of quantile regression models based on the check loss. The advantage of this new method is that nonlinear, linear and zero function components can be se…

    Submitted 10 July, 2013; originally announced July 2013.

  14. arXiv:1211.4080  [pdf, ps, other]

    stat.ME

    Minimax Prediction for Functional Linear Regression with Functional Responses in Reproducing Kernel Hilbert Spaces

    Authors: Heng Lian

    Abstract: In this article, we consider convergence rates in functional linear regression with functional responses, where the linear coefficient lies in a reproducing kernel Hilbert space (RKHS). Without assuming that the reproducing kernel and the covariate covariance kernel are aligned, or assuming polynomial rate of decay of the eigenvalues of the covariance kernel, convergence rates in prediction risk a…

    Submitted 17 November, 2012; originally announced November 2012.

  15. arXiv:1110.0219  [pdf, ps, other]

    stat.CO stat.ME

    Bayesian Quantile Regression for Single-Index Models

    Authors: Yuao Hu, Robert B. Gramacy, Heng Lian

    Abstract: Using an asymmetric Laplace distribution, which provides a mechanism for Bayesian inference of quantile regression models, we develop a fully Bayesian approach to fitting single-index models in conditional quantile regression. In this work, we use a Gaussian process prior for the unknown nonparametric link function and a Laplace distribution on the index vector, with the latter motivated by the re…

    Submitted 29 December, 2011; v1 submitted 2 October, 2011; originally announced October 2011.

    Comments: 26 pages, 8 figures, 10 tables

  16. arXiv:1108.3904  [pdf, ps, other]

    stat.ME

    Shrinkage Estimation and Selection for Multiple Functional Regression

    Authors: Heng Lian

    Abstract: Functional linear regression is a useful extension of simple linear regression and has been investigated by many researchers. However, functional variable selection problems when multiple functional observations exist, which is the counterpart in the functional context of multiple linear regression, is seldom studied. Here we propose a method using group smoothly clipped absolute deviation penalty…

    Submitted 19 August, 2011; originally announced August 2011.

  17. arXiv:1108.1260  [pdf, ps, other]

    stat.ME

    Bias-corrected GEE estimation and smooth-threshold GEE variable selection for single-index models with clustered data

    Authors: Peng Lai, Qihua Wang, Heng Lian

    Abstract: In this paper, we present a generalized estimating equations based estimation approach and a variable selection procedure for single-index models when the observed data are clustered. Unlike the case of independent observations, bias-correction is necessary when general working correlation matrices are used in the estimating equations. Our variable selection procedure based on smooth-threshold est…

    Submitted 5 August, 2011; originally announced August 2011.

  18. arXiv:1107.4861  [pdf, ps, other]

    stat.ME

    Semiparametric Bayesian Information Criterion for Model Selection in Ultra-high Dimensional Additive Models

    Authors: Heng Lian

    Abstract: For linear models with a diverging number of parameters, it has recently been shown that modified versions of Bayesian information criterion (BIC) can identify the true model consistently. However, in many cases there is little justification that the effects of the covariates are actually linear. Thus a semiparametric model such as the additive model studied here, is a viable alternative. We demon…

    Submitted 25 July, 2011; originally announced July 2011.

  19. arXiv:1009.4241  [pdf, other]

    stat.ME stat.CO

    Gaussian process single-index models as emulators for computer experiments

    Authors: Robert B. Gramacy, Heng Lian

    Abstract: A single-index model (SIM) provides for parsimonious multi-dimensional nonlinear regression by combining parametric (linear) projection with univariate nonparametric (non-linear) regression models. We show that a particular Gaussian process (GP) formulation is simple to work with and ideal as an emulator for some types of computer experiment as it can outperform the canonical separable GP regressi…

    Submitted 17 August, 2011; v1 submitted 21 September, 2010; originally announced September 2010.

    Comments: 23 pages, 9 figures, 1 table

  20. arXiv:1008.2271  [pdf, ps, other]

    stat.ME

    Flexible Shrinkage Estimation in High-Dimensional Varying Coefficient Models

    Authors: Heng Lian

    Abstract: We consider the problem of simultaneous variable selection and constant coefficient identification in high-dimensional varying coefficient models based on B-spline basis expansion. Both objectives can be considered as some type of model selection problems and we show that they can be achieved by a double shrinkage strategy. We apply the adaptive group Lasso penalty in models involving a diverging…

    Submitted 13 August, 2010; originally announced August 2010.

    Comments: 26 pages

  21. arXiv:1008.1647  [pdf, ps, other]

    stat.ME

    Gaussian Process Models for Nonparametric Functional Regression with Functional Responses

    Authors: Heng Lian

    Abstract: Recently a nonparametric functional model with functional responses has been proposed within the functional reproducing kernel Hilbert spaces (fRKHS) framework. Motivated by its superior performance and also its limitations, we propose a Gaussian process model whose posterior mode coincides with the fRKHS estimator. The Bayesian approach has several advantages compared to its predecessor. Firstly, th…

    Submitted 10 August, 2010; originally announced August 2010.

  22. arXiv:1005.5085  [pdf, ps, other]

    stat.CO

    A simple and efficient algorithm for fused lasso signal approximator with convex loss function

    Authors: Heng Lian

    Abstract: We consider the augmented Lagrangian method (ALM) as a solver for the fused lasso signal approximator (FLSA) problem. The ALM is a dual method in which squares of the constraint functions are added as penalties to the Lagrangian. In order to apply this method to FLSA, two types of auxiliary variables are introduced to transform the original unconstrained minimization problem into a linearly constr…

    Submitted 27 May, 2010; originally announced May 2010.

  23. arXiv:0910.1027  [pdf, ps, other]

    math.ST stat.ME

    Time-varying Coefficients Estimation in Differential Equation Models with Noisy Time-varying Covariates

    Authors: Heng Lian

    Abstract: We study the problem of estimating time-varying coefficients in ordinary differential equations. Current theory only applies to the case when the associated state variables are observed without measurement errors as presented in \cite{chenwu08b,chenwu08}. The difficulty arises from the quadratic functional of observations that one needs to deal with instead of the linear functional that appears…

    Submitted 6 October, 2009; originally announced October 2009.

  24. arXiv:0909.1123  [pdf, ps, other]

    stat.ME

    Shrinkage Tuning Parameter Selection in Precision Matrices Estimation

    Authors: Heng Lian

    Abstract: Recent literature provides many computational and modeling approaches for covariance matrix estimation in penalized Gaussian graphical models, but relatively little study has been carried out on the choice of the tuning parameter. This paper tries to fill this gap by focusing on the problem of shrinkage parameter selection when estimating sparse precision matrices using the penalized likeliho…

    Submitted 6 September, 2009; originally announced September 2009.

  25. arXiv:0908.0618  [pdf, ps, other]

    stat.ME

    Functional Partial Linear Model

    Authors: Heng Lian

    Abstract: When predicting scalar responses in the situation where the explanatory variables are functions, it is sometimes the case that some functional variables are related to responses linearly while other variables have more complicated relationships with the responses. In this paper, we propose a new semi-parametric model to take advantage of both parametric and nonparametric functional modeling. Asymp…

    Submitted 27 November, 2012; v1 submitted 5 August, 2009; originally announced August 2009.

  26. arXiv:0906.0434  [pdf, ps, other]

    cs.CV math.NA stat.ME

    Total Variation, Adaptive Total Variation and Nonconvex Smoothly Clipped Absolute Deviation Penalty for Denoising Blocky Images

    Authors: Aditya Chopra, Heng Lian

    Abstract: The total variation-based image denoising model has been generalized and extended in numerous ways, improving its performance in different contexts. We propose a new penalty function motivated by the recent progress in the statistical literature on high-dimensional variable selection. Using a particular instantiation of the majorization-minimization algorithm, the optimization problem can be eff…

    Submitted 2 June, 2009; originally announced June 2009.

  27. arXiv:0904.2906  [pdf, ps, other]

    stat.ME stat.CO

    Sparse Bayesian Hierarchical Modeling of High-dimensional Clustering Problems

    Authors: Heng Lian

    Abstract: Clustering is one of the most widely used procedures in the analysis of microarray data, for example with the goal of discovering cancer subtypes based on observed heterogeneity of genetic marks between different tissues. It is well-known that in such high-dimensional settings, the existence of many noise variables can overwhelm the few signals embedded in the high-dimensional space. We propose…

    Submitted 19 April, 2009; originally announced April 2009.

  28. arXiv:0904.0843  [pdf, ps, other]

    stat.ME

    Empirical Likelihood Confidence Intervals for Nonparametric Functional Data Analysis

    Authors: Heng Lian

    Abstract: We consider the problem of constructing confidence intervals for nonparametric functional data analysis using empirical likelihood. In this doubly infinite-dimensional context, we demonstrate the Wilks's phenomenon and propose a bias-corrected construction that requires neither undersmoothing nor direct bias estimation. We also extend our results to partially linear regression involving function…

    Submitted 6 April, 2009; originally announced April 2009.

  29. arXiv:0812.2628  [pdf, ps, other]

    stat.ME math.ST

    Nonparametric Estimation of Variance Function for Functional Data

    Authors: Heng Lian

    Abstract: This article investigates nonparametric estimation of variance functions for functional data when the mean function is unknown. We obtain asymptotic results for the kernel estimator based on squared residuals. Similar to the finite dimensional case, our asymptotic result shows the smoothness of the unknown mean function has an effect on the rate of convergence. Our simulation studies demonstrate…

    Submitted 14 December, 2008; originally announced December 2008.

  30. arXiv:0810.2010  [pdf, ps, other]

    stat.ME

    A note on conditional Akaike information for Poisson regression with random effects

    Authors: Heng Lian

    Abstract: A popular model selection approach for generalized linear mixed-effects models is the Akaike information criterion, or AIC. Among others, \cite{vaida05} pointed out the distinction between the marginal and conditional inference depending on the focus of research. The conditional AIC was derived for the linear mixed-effects model which was later generalized by \cite{liang08}. We show that the sim…

    Submitted 11 October, 2008; originally announced October 2008.

    Comments: 7 pages, 1 figure

  31. arXiv:0712.1342  [pdf, ps, other]

    stat.ME

    Stochastic adaptation of importance sampler

    Authors: Heng Lian

    Abstract: Improving efficiency of importance sampler is at the center of research in Monte Carlo methods. While adaptive approach is usually difficult within the Markov Chain Monte Carlo framework, the counterpart in importance sampling can be justified and validated easily. We propose an iterative adaptation method for learning the proposal distribution of an importance sampler based on stochastic approx…

    Submitted 10 December, 2007; originally announced December 2007.

    Comments: 11 pages, minor changes

  32. arXiv:0709.1309  [pdf, ps, other]

    stat.CO stat.ME

    Bayes and empirical Bayes changepoint problems

    Authors: Heng Lian

    Abstract: We generalize the approach of Liu and Lawrence (1999) for multiple changepoint problems where the number of changepoints is unknown. The approach is based on dynamic programming recursion for efficient calculation of the marginal probability of the data with the hidden parameters integrated out. For the estimation of the hyperparameters, we propose to use Monte Carlo EM when training data are av…

    Submitted 10 September, 2007; originally announced September 2007.

  33. arXiv:0709.1307  [pdf, ps, other]

    stat.AP

    MOST: detecting cancer differential gene expression

    Authors: Heng Lian

    Abstract: We propose a new statistic for the detection of differentially expressed genes, when the genes are activated only in a subset of the samples. Statistics designed for this unconventional circumstance have proved to be valuable for most cancer studies, where oncogenes are activated for a small number of disease samples. Previous efforts made in this direction include COPA, OS and ORT. We propose a…

    Submitted 10 September, 2007; originally announced September 2007.

    Journal ref: Biostatistics, 2008 9(3):411-418