-
Tuning-Free Disentanglement via Projection
Authors:
Yue Bai,
Leo L. Duan
Abstract:
In representation learning and non-linear dimension reduction, there is great interest in learning 'disentangled' latent variables, where each sub-coordinate almost uniquely controls a facet of the observed data. While many regularization approaches have been proposed for variational autoencoders, heuristic tuning is required to balance disentanglement against loss in reconstruction accuracy -- due to the unsupervised nature of the problem, there is no principled way to find an optimal weight for regularization. Motivated to completely bypass regularization, we consider a projection strategy: modifying the canonical Gaussian encoder, we add a layer of scaling and rotation to the Gaussian mean, such that the marginal correlations among latent sub-coordinates become exactly zero. This achieves a theoretically maximal disentanglement, as guaranteed by zero cross-correlation between one latent sub-coordinate and the observed varying with the rest. Unlike regularization, the extra projection layer does not impact the flexibility of the previous encoder layers, leading to almost no loss in expressiveness. This approach is simple to implement in practice. Our numerical experiments demonstrate very good performance, with no tuning required.
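The "scaling and rotation" idea in the abstract can be illustrated with a generic whitening step. The following is only a minimal sketch, assuming a batch of encoder means is available as a NumPy array; the function name `decorrelate_means`, the `eps` stabilizer, and the use of an eigendecomposition are illustrative assumptions, not the authors' published layer.

```python
import numpy as np

def decorrelate_means(mu, eps=1e-8):
    """Rotate and scale a batch of latent means (rows of mu) so that
    their sample cross-correlations are zero.

    Hypothetical helper: a standard PCA-whitening step, used here only
    to illustrate a projection made of a rotation plus a scaling.
    """
    centered = mu - mu.mean(axis=0, keepdims=True)
    cov = centered.T @ centered / (mu.shape[0] - 1)
    # eigh returns eigenvalues and orthonormal eigenvectors of a symmetric matrix
    eigvals, eigvecs = np.linalg.eigh(cov)
    # Rotation by eigvecs, then per-coordinate scaling by 1 / sqrt(eigenvalue)
    return centered @ eigvecs / np.sqrt(eigvals + eps)

# Synthetic, deliberately correlated "encoder means" (assumed data)
rng = np.random.default_rng(0)
mixing = rng.normal(size=(4, 4))
mu = rng.normal(size=(500, 4)) @ mixing

z = decorrelate_means(mu)
corr = np.corrcoef(z, rowvar=False)  # sample correlation matrix of the projected means
```

After the projection, the off-diagonal entries of `corr` are zero up to floating-point error, which is the sample analogue of the zero marginal correlation described above.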
Submitted 5 September, 2019; v1 submitted 27 June, 2019;
originally announced June 2019.
-
Latent Simplex Position Model: High Dimensional Multi-view Clustering with Uncertainty Quantification
Authors:
Leo L Duan
Abstract:
High dimensional data often contain multiple facets, and several clustering patterns can co-exist under different variable subspaces, also known as the views. While multi-view clustering algorithms were proposed, the uncertainty quantification remains difficult --- a particular challenge is in the high complexity of estimating the cluster assignment probability under each view, and sharing information among views. In this article, we propose an approximate Bayes approach --- treating the similarity matrices generated over the views as rough first-stage estimates for the co-assignment probabilities; in its Kullback-Leibler neighborhood, we obtain a refined low-rank matrix, formed by the pairwise product of simplex coordinates. Interestingly, each simplex coordinate directly encodes the cluster assignment uncertainty. For multi-view clustering, we let each view draw a parameterization from a few candidates, leading to dimension reduction. With high model flexibility, the estimation can be efficiently carried out as a continuous optimization problem, hence enjoys gradient-based computation. The theory establishes the connection of this model to a random partition distribution under multiple views. Compared to single-view clustering approaches, substantially more interpretable results are obtained when clustering brains from a human traumatic brain injury study, using high-dimensional gene expression data.
KEY WORDS: Co-regularized Clustering, Consensus, PAC-Bayes, Random Cluster Graph, Variable Selection
Submitted 7 October, 2019; v1 submitted 21 March, 2019;
originally announced March 2019.
-
Bayesian Distance Clustering
Authors:
Leo L Duan,
David B Dunson
Abstract:
Model-based clustering is widely used in a variety of application areas. However, fundamental concerns remain about robustness. In particular, results can be sensitive to the choice of kernel representing the within-cluster data density. Leveraging properties of pairwise differences between data points, we propose a class of Bayesian distance clustering methods, which rely on modeling the likelihood of the pairwise distances in place of the original data. Although some information in the data is discarded, we gain substantial robustness to modeling assumptions. The proposed approach represents an appealing middle ground between distance- and model-based clustering, drawing advantages from each of these canonical approaches. We illustrate dramatic gains in the ability to infer clusters that are not well represented by the usual choices of kernel. A simulation study is included to assess performance relative to competitors, and we apply the approach to clustering of brain genome expression data.
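As a small illustration of the object this abstract works with, the sketch below computes a pairwise Euclidean distance matrix from raw data. It covers only that preprocessing step; the choice of Euclidean distance and the helper name `pairwise_distances` are assumptions made here, and the likelihood model for the distances is not reproduced.

```python
import numpy as np

def pairwise_distances(x):
    """Return the n-by-n matrix of Euclidean distances between rows of x.

    Hypothetical helper for illustration; the distance choice is an assumption.
    """
    # Broadcasting builds all pairwise differences x_i - x_j at once
    diff = x[:, None, :] - x[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))

# Three points in the plane (assumed toy data)
points = np.array([[0.0, 0.0], [3.0, 4.0], [6.0, 8.0]])
dist = pairwise_distances(points)
```

The resulting matrix is symmetric with a zero diagonal, so only its upper triangle carries information.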
Keywords: Distance-based clustering; Mixture model; Model-based clustering; Model misspecification; Pairwise distance matrix; Partial likelihood; Robustness.
Submitted 25 June, 2019; v1 submitted 19 October, 2018;
originally announced October 2018.
-
Mixed-Stationary Gaussian Process for Flexible Non-Stationary Modeling of Spatial Outcomes
Authors:
Leo L. Duan,
Xia Wang,
Rhonda D. Szczesniak
Abstract:
Gaussian processes (GPs) are commonplace in spatial statistics. Although many non-stationary models have been developed, there is arguably a lack of flexibility compared to equipping each location with its own parameters. However, the latter suffers from intractable computation and can lead to overfitting. Taking the instantaneous stationarity idea, we construct a non-stationary GP with the stationarity parameter individually set at each location. Then we utilize a non-parametric mixture model to reduce the effective number of unique parameters. Unlike a simple mixture of independent GPs, the mixture in stationarity allows the components to be spatially correlated, leading to improved prediction efficiency. Theoretical properties are examined and a linearly scalable algorithm is provided. The application is shown through several simulated scenarios as well as massive spatiotemporally correlated temperature data.
Submitted 17 July, 2018;
originally announced July 2018.