-
Nested stochastic block model for simultaneously clustering networks and nodes
Authors:
Nathaniel Josephs,
Arash A. Amini,
Marina Paez,
Lizhen Lin
Abstract:
We introduce the nested stochastic block model (NSBM) to cluster a collection of networks while simultaneously detecting communities within each network. NSBM has several appealing features including the ability to work on unlabeled networks with potentially different node sets, the flexibility to model heterogeneous communities, and the means to automatically select the number of classes for the networks and the number of communities within each network. This is accomplished via a Bayesian model, with a novel application of the nested Dirichlet process (NDP) as a prior to jointly model the between-network and within-network clusters. The dependency introduced by the network data creates nontrivial challenges for the NDP, especially in the development of efficient samplers. For posterior inference, we propose several Markov chain Monte Carlo algorithms including a standard Gibbs sampler, a collapsed Gibbs sampler, and two blocked Gibbs samplers that ultimately return two levels of clustering labels from both within and across the networks. Extensive simulation studies are carried out which demonstrate that the model provides very accurate estimates of both levels of the clustering structure. We also apply our model to two social network datasets that cannot be analyzed using any previous method in the literature due to the anonymity of the nodes and the varying number of nodes in each network.
Submitted 18 July, 2023;
originally announced July 2023.
-
Hierarchical Stochastic Block Model for Community Detection in Multiplex Networks
Authors:
Arash A. Amini,
Marina S. Paez,
Lizhen Lin
Abstract:
Multiplex networks have become increasingly prevalent in many fields and have emerged as a powerful tool for modeling the complexity of real networks. There is a critical need for developing inference models for multiplex networks that can take into account potential dependencies across different layers, particularly when the aim is community detection. We add to a limited literature by proposing a novel and efficient Bayesian model for community detection in multiplex networks. A key feature of our approach is the ability to model varying communities at different network layers. In contrast, many existing models assume the same communities for all layers. Moreover, our model automatically picks up the necessary number of communities at each layer (as validated by real data examples). This is appealing, since deciding the number of communities is a challenging aspect of community detection, and especially so in the multiplex setting if one allows the communities to change across layers. Borrowing ideas from hierarchical Bayesian modeling, we use a hierarchical Dirichlet prior to model community labels across layers, allowing dependency in their structure. Given the community labels, a stochastic block model (SBM) is assumed for each layer. We develop an efficient slice sampler for sampling the posterior distribution of the community labels as well as the link probabilities between communities. In doing so, we address some unique challenges posed by coupling the complex likelihood of SBM with the hierarchical nature of the prior on the labels. An extensive empirical validation is performed on simulated and real data, demonstrating the superior performance of the model over single-layer alternatives, as well as the ability to uncover interesting structures in real networks.
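As a rough illustration of the statement that, given the community labels, each layer follows a stochastic block model, the sketch below draws one undirected layer from fixed labels and link probabilities. It is not the authors' code: the function name `sample_sbm`, the example labels, and the probability values are assumptions made for illustration, and the hierarchical prior and the slice sampler are not shown.

```python
import random


def sample_sbm(labels, link_probs, seed=0):
    """Draw a symmetric adjacency matrix without self-loops from an SBM.

    labels[i] is the community of node i; link_probs[a][b] is the
    probability of an edge between a node in community a and a node
    in community b. All names here are illustrative assumptions.
    """
    rng = random.Random(seed)
    n = len(labels)
    adj = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            # Each pair is an independent Bernoulli draw whose success
            # probability depends only on the two community labels.
            p = link_probs[labels[i]][labels[j]]
            if rng.random() < p:
                adj[i][j] = adj[j][i] = 1
    return adj


# Hypothetical two-community layer with dense within-community links.
labels = [0, 0, 0, 1, 1, 1]
link_probs = [[0.9, 0.1], [0.1, 0.8]]
adj = sample_sbm(labels, link_probs)
```

In a multiplex setting one would call such a function once per layer, with layer-specific labels, which is where the dependency across layers described in the abstract would enter.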
Submitted 12 February, 2023; v1 submitted 29 March, 2019;
originally announced April 2019.
-
Exact slice sampler for Hierarchical Dirichlet Processes
Authors:
Arash A. Amini,
Marina Paez,
Lizhen Lin,
Zahra S. Razaee
Abstract:
We propose an exact slice sampler for the hierarchical Dirichlet process (HDP) and its associated mixture models (Teh et al., 2006). Although there are existing MCMC algorithms for sampling from the HDP, a slice sampler has been missing from the literature. Slice sampling is well known for its desirable properties, including its fast mixing and its natural potential for parallelization. On the other hand, the hierarchical nature of HDPs poses challenges to adopting a full-fledged slice sampler that automatically truncates all the infinite measures involved without ad hoc modifications. In this work, we adopt the powerful idea of Bayesian variable augmentation to address this challenge. By introducing new latent variables, we obtain a full factorization of the joint distribution that is suitable for slice sampling. Our algorithm has several appealing features such as (1) fast mixing; (2) remaining exact while allowing natural truncation of the underlying infinite-dimensional measures, as in Kalli et al. (2011), resulting in updates of only a finite number of necessary atoms and weights in each iteration; and (3) being naturally suited to parallel implementations. The underlying principle for joint factorization of the full likelihood is simple and can be applied to many other settings, such as designing sampling algorithms for general dependent Dirichlet process (DDP) models.
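The truncation idea mentioned in feature (2) can be sketched for a single, non-hierarchical Dirichlet process. The code below is an illustrative assumption, not the HDP sampler of the paper: it generates stick-breaking weights only until the unassigned mass falls below the smallest slice variable, at which point no remaining atom can exceed any slice. The name `sticks_until` and the parameter values are hypothetical.

```python
import random


def sticks_until(u_min, alpha, rng):
    """Generate stick-breaking weights until the leftover mass is below u_min.

    Every later weight is bounded by the leftover mass, so once that mass
    drops below the smallest slice variable u_min, no unseen atom can be
    selected and the infinite measure can be truncated without error.
    """
    if u_min <= 0.0:
        raise ValueError("u_min must be positive for the loop to terminate")
    weights = []
    remaining = 1.0
    while remaining >= u_min:
        v = rng.betavariate(1.0, alpha)  # stick proportion, Beta(1, alpha)
        weights.append(remaining * v)
        remaining *= 1.0 - v
    return weights, remaining


# Hypothetical values: smallest slice 0.05, concentration alpha = 1.
weights, leftover = sticks_until(u_min=0.05, alpha=1.0, rng=random.Random(0))
```

The number of weights returned is random and typically small, which is what keeps each iteration finite; handling the two levels of an HDP jointly is the harder problem the abstract addresses.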
Submitted 21 March, 2019;
originally announced March 2019.
-
Modeling with a Large Class of Unimodal Multivariate Distributions
Authors:
Marina S. Paez,
Stephen G. Walker
Abstract:
In this paper we introduce a new class of multivariate unimodal distributions, motivated by Khintchine's representation. We start by proposing a univariate model whose support covers all the unimodal distributions on the real line. The proposed class of unimodal distributions can be naturally extended to higher dimensions by using the multivariate Gaussian copula. Under both univariate and multivariate settings, we provide MCMC algorithms to perform inference about the model parameters and predictive densities. The methodology is illustrated with univariate and bivariate examples, and with variables taken from a real dataset.
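For readers unfamiliar with Khintchine's representation: a random variable X is unimodal about zero exactly when it can be written as X = U Z, with U uniform on (0, 1) and independent of Z. The sketch below only illustrates that representation by simulation; it is not the model class proposed in the paper, and the function name `sample_unimodal` and the Gaussian choice for Z are assumptions.

```python
import random


def sample_unimodal(draw_z, n, seed=0):
    """Draw n values of X = U * Z, with U ~ Uniform(0, 1) independent of Z.

    draw_z is any function that takes a random.Random instance and
    returns one draw of Z; the product is unimodal with mode at zero.
    """
    rng = random.Random(seed)
    return [rng.random() * draw_z(rng) for _ in range(n)]


# Hypothetical choice: Z standard normal, giving a symmetric unimodal X.
xs = sample_unimodal(lambda rng: rng.gauss(0.0, 1.0), 1000)
```

Changing the distribution of Z changes the shape of X while keeping the mode at zero, which is the flexibility a unimodal model class can exploit.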
Submitted 24 June, 2015;
originally announced June 2015.