Showing 1–14 of 14 results for author: Hui, F K C

Searching in archive stat.
  1. arXiv:2407.01883

    stat.ME

    Robust Linear Mixed Models using Hierarchical Gamma-Divergence

    Authors: Shonosuke Sugasawa, Francis K. C. Hui, Alan H. Welsh

    Abstract: Linear mixed models (LMMs), which typically assume normality for both the random effects and error terms, are a popular class of methods for analyzing longitudinal and clustered data. However, such models can be sensitive to outliers, and this can lead to poor statistical results (e.g., biased inference on model parameters and inaccurate prediction of random effects) if the data are contaminated.…

    Submitted 1 July, 2024; originally announced July 2024.

    Comments: 30 pages (main) + 6 pages (supplement)

  2. arXiv:2403.11562

    stat.ME

    A Comparison of Joint Species Distribution Models for Percent Cover Data

    Authors: Pekka Korhonen, Francis K. C. Hui, Jenni Niku, Sara Taskinen, Bert van der Veen

    Abstract: 1. Joint species distribution models (JSDMs) have gained considerable traction among ecologists over the past decade, due to their capacity to answer a wide range of questions at both the species- and the community-level. The family of generalized linear latent variable models in particular has proven popular for building JSDMs, being able to handle many response types including presence-absence d…

    Submitted 18 March, 2024; originally announced March 2024.

  3. arXiv:2402.12803

    stat.ME math.ST stat.AP

    Joint Mean and Correlation Regression Models for Multivariate Data

    Authors: Zhi Yang Tho, Francis K. C. Hui, Tao Zou

    Abstract: We propose a new joint mean and correlation regression model for correlated multivariate discrete responses, that simultaneously regresses the mean of each response against a set of covariates, and the correlations between responses against a set of similarity/distance measures. A set of joint estimating equations are formulated to construct an estimator of both the mean regression coefficients an…

    Submitted 20 February, 2024; originally announced February 2024.

  4. arXiv:2402.12719

    stat.ME math.ST

    Restricted maximum likelihood estimation in generalized linear mixed models

    Authors: Luca Maestrini, Francis K. C. Hui, Alan H. Welsh

    Abstract: Restricted maximum likelihood (REML) estimation is a widely accepted and frequently used method for fitting linear mixed models, with its principal advantage being that it produces less biased estimates of the variance components. However, the concept of REML does not immediately generalize to the setting of non-normally distributed responses, and it is not always clear the extent to which, either…

    Submitted 20 February, 2024; originally announced February 2024.

  5. arXiv:2401.13379

    stat.ME math.ST stat.AP

    An Ising Similarity Regression Model for Modeling Multivariate Binary Data

    Authors: Zhi Yang Tho, Francis K. C. Hui, Tao Zou

    Abstract: Understanding the dependence structure between response variables is an important component in the analysis of correlated multivariate data. This article focuses on modeling dependence structures in multivariate binary data, motivated by a study aiming to understand how patterns in different U.S. senators' votes are determined by similarities (or lack thereof) in their attributes, e.g., political…

    Submitted 24 January, 2024; originally announced January 2024.

  6. arXiv:2310.13858

    stat.ME

    Likelihood-based surrogate dimension reduction

    Authors: Linh H. Nghiem, Francis K. C. Hui, Samuel Mueller, A. H. Welsh

    Abstract: We consider the problem of surrogate sufficient dimension reduction, that is, estimating the central subspace of a regression model, when the covariates are contaminated by measurement error. When no measurement error is present, a likelihood-based dimension reduction method that relies on maximizing the likelihood of a Gaussian inverse regression model on the Grassmann manifold is well-known to h…

    Submitted 20 October, 2023; originally announced October 2023.

  7. arXiv:2310.05548

    stat.ME stat.AP

    Cokrig-and-Regress for Spatially Misaligned Environmental Data

    Authors: Z. Y. Tho, F. K. C. Hui, A. H. Welsh, T. Zou

    Abstract: Spatially misaligned data, where the response and covariates are observed at different spatial locations, commonly arise in many environmental studies. Much of the statistical literature on handling spatially misaligned data has been devoted to the case of a single covariate and a linear relationship between the response and this covariate. Motivated by spatially misaligned data collected on air p…

    Submitted 9 October, 2023; originally announced October 2023.

  8. arXiv:2107.02627

    stat.ME

    Fast, universal estimation of latent variable models using extended variational approximations

    Authors: Pekka Korhonen, Francis K. C. Hui, Jenni Niku, Sara Taskinen

    Abstract: Generalized linear latent variable models (GLLVMs) are a class of methods for analyzing multi-response data which has garnered considerable popularity in recent years, for example, in the analysis of multivariate abundance data in ecology. One of the main features of GLLVMs is their capacity to handle a variety of response types, such as (overdispersed) counts, binomial responses, (semi-)continuo…

    Submitted 6 July, 2021; originally announced July 2021.

  9. arXiv:2104.09838

    stat.ME

    Sparse Sliced Inverse Regression via Cholesky Matrix Penalization

    Authors: Linh Nghiem, Francis K. C. Hui, Samuel Mueller, A. H. Welsh

    Abstract: We introduce a new sparse sliced inverse regression estimator called Cholesky matrix penalization and its adaptive version for achieving sparsity in estimating the dimensions of the central subspace. The new estimators use the Cholesky decomposition of the covariance matrix of the covariates and include a regularization term in the objective function to achieve sparsity in a computationally effici…

    Submitted 20 April, 2021; originally announced April 2021.

  10. arXiv:2104.09812

    stat.ME

    Screening methods for linear errors-in-variables models in high dimensions

    Authors: Linh Nghiem, Francis K. C. Hui, Samuel Mueller, A. H. Welsh

    Abstract: Microarray studies, in order to identify genes associated with an outcome of interest, usually produce noisy measurements for a large number of gene expression features from a small number of subjects. One common approach to analyzing such high-dimensional data is to use linear errors-in-variables models; however, current methods for fitting such models are computationally expensive. In this paper…

    Submitted 20 April, 2021; originally announced April 2021.

  11. arXiv:2010.02469

    cs.LG stat.CO stat.ML

    Generalized Matrix Factorization: efficient algorithms for fitting generalized linear latent variable models to large data arrays

    Authors: Łukasz Kidziński, Francis K. C. Hui, David I. Warton, Trevor Hastie

    Abstract: Unmeasured or latent variables are often the cause of correlations between multivariate measurements, which are studied in a variety of fields such as psychology, ecology, and medicine. For Gaussian measurements, there are classical tools such as factor analysis or principal component analysis with a well-established theory and fast algorithms. Generalized Linear Latent Variable models (GLLVMs) ge…

    Submitted 27 January, 2022; v1 submitted 6 October, 2020; originally announced October 2020.

  12. Symbolic Formulae for Linear Mixed Models

    Authors: Emi Tanaka, Francis K. C. Hui

    Abstract: A statistical model is a mathematical representation of an often simplified or idealised data-generating process. In this paper, we focus on a particular type of statistical model, called linear mixed models (LMMs), that is widely used in many disciplines, e.g., agriculture, ecology, econometrics, psychology. Mixed models, also commonly known as multi-level, nested, hierarchical or panel data models…

    Submitted 19 November, 2019; originally announced November 2019.

  13. Multi-species distribution modeling using penalized mixture of regressions

    Authors: Francis K. C. Hui, David I. Warton, Scott D. Foster

    Abstract: Multi-species distribution modeling, which relates the occurrence of multiple species to environmental variables, is an important tool used by ecologists for both predicting the distribution of species in a community and identifying the important variables driving species co-occurrences. Recently, Dunstan, Foster and Darnell [Ecol. Model. 222 (2011) 955-963] proposed using finite mixture of regres…

    Submitted 16 September, 2015; originally announced September 2015.

    Comments: Published at http://dx.doi.org/10.1214/15-AOAS813 in the Annals of Applied Statistics (http://www.imstat.org/aoas/) by the Institute of Mathematical Statistics (http://www.imstat.org)

    Report number: IMS-AOAS-AOAS813

    Journal ref: Annals of Applied Statistics 2015, Vol. 9, No. 2, 866-882

  14. arXiv:1211.3460

    stat.ME math.ST

    A Nonparametric Measure of Local Association for two-way Contingency Tables

    Authors: Francis K. C. Hui, Gery Geenens

    Abstract: In contingency table analysis, the odds ratio is a commonly applied measure used to summarize the degree of association between two categorical variables, say R and S. Suppose now that for each individual in the table, a vector of continuous variables X is also observed. It is then vital to analyze whether and how the degree of association varies with X. In this work, we extend the classical odds…

    Submitted 14 November, 2012; originally announced November 2012.