Showing 1–5 of 5 results for author: Westerhuis, J A

  1. All Sparse PCA Models Are Wrong, But Some Are Useful. Part I: Computation of Scores, Residuals and Explained Variance

    Authors: J. Camacho, A. K. Smilde, E. Saccenti, J. A. Westerhuis

    Abstract: Sparse Principal Component Analysis (sPCA) is a popular matrix factorization approach based on Principal Component Analysis (PCA) that combines variance maximization and sparsity with the ultimate goal of improving data interpretation. When moving from PCA to sPCA, there are a number of implications that the practitioner needs to be aware of. A relevant one is that scores and loadings in sPCA may…

    Submitted 9 July, 2019; originally announced July 2019.

    Journal ref: Chemometrics and Intelligent Laboratory Systems 196 (2020) 103907

  2. arXiv:1904.10279  [pdf, other]

    stat.ME

    Heterofusion: Fusing genomics data of different measurement scales

    Authors: Age K. Smilde, Yipeng Song, Johan A. Westerhuis, Henk A. L. Kiers, Nanne Aben, Lodewyk F. A. Wessels

    Abstract: In systems biology, it is becoming increasingly common to measure biochemical entities at different levels of the same biological system. Hence, data fusion problems are abundant in the life sciences. With the availability of a multitude of measuring techniques, one of the central problems is the heterogeneity of the data. In this paper, we discuss a specific form of heterogeneity, namely that of…

    Submitted 23 April, 2019; originally announced April 2019.

  3. Logistic principal component analysis via non-convex singular value thresholding

    Authors: Yipeng Song, Johan A. Westerhuis, Age K. Smilde

    Abstract: Multivariate binary data is becoming abundant in current biological research. Logistic principal component analysis (PCA) is one of the commonly used tools to explore the relationships inside a multivariate binary data set by exploiting the underlying low rank structure. We re-expressed the logistic PCA model based on the latent variable interpretation of the generalized linear model on binary dat…

    Submitted 25 February, 2019; originally announced February 2019.

    Comments: 19 pages, 14 figures

    Journal ref: Chemometrics and Intelligent Laboratory Systems 204 (2020) 104089

  4. arXiv:1902.06241  [pdf, other]

    stat.ME stat.ML

    Separating common (global and local) and distinct variation in multiple mixed types data sets

    Authors: Yipeng Song, Johan A. Westerhuis, Age K. Smilde

    Abstract: Multiple sets of measurements on the same objects obtained from different platforms may reflect partially complementary information of the studied system. The integrative analysis of such data sets not only provides us with the opportunity of a deeper understanding of the studied system, but also introduces some new statistical challenges. First, the separation of information that is common across…

    Submitted 23 December, 2019; v1 submitted 17 February, 2019; originally announced February 2019.

    Comments: 32 pages, 14 figures

    Journal ref: Journal of Chemometrics 34 (2020) e3197

  5. arXiv:1807.04982  [pdf, other]

    stat.ME

    Generalized simultaneous component analysis of binary and quantitative data

    Authors: Yipeng Song, Johan A. Westerhuis, Nanne Aben, Lodewyk F. A. Wessels, Patrick J. F. Groenen, Age K. Smilde

    Abstract: In the current era of systems biological research there is a need for the integrative analysis of binary and quantitative genomics data sets measured on the same objects. One standard tool of exploring the underlying dependence structure present in multiple quantitative data sets is simultaneous component analysis (SCA) model. However, it does not have any provisions when a part of the data are bi…

    Submitted 3 June, 2019; v1 submitted 13 July, 2018; originally announced July 2018.

    Comments: 19 pages, 10 figures