Search | arXiv e-print repository

arXiv:1906.04582

Intertemporal Community Detection in Human Mobility Networks

Authors: Mark He, Joseph Glasser, Shankar Bhamidi, Nikhil Kaza

Abstract: We introduce a community detection method that finds clusters in network time-series by introducing an algorithm that finds significantly interconnected nodes across time. These connections are either increasing, decreasing, or constant over time. Significance of nodal connectivity within a set is judged using the Weighted Configuration Null Model at each time-point, then a novel significance-testing scheme is used to assess connectivity at all time points and the direction of its time-trend. We apply this method to bikeshare networks in New York City and Chicago and taxicab pickups and dropoffs in New York to find and illustrate patterns in human mobility in urban zones. Results show stark geographical patterns in clusters that are growing and declining in relative usage across time and potentially elucidate latent economic or demographic trends.

Submitted 3 April, 2020; v1 submitted 10 June, 2019; originally announced June 2019.

Comments: 29 pages

arXiv:1903.06029

doi 10.1371/journal.pone.0230941

Demarcating Geographic Regions using Community Detection in Commuting Networks with Significant Self-Loops

Authors: Mark He, Joseph Glasser, Nathaniel Pritchard, Shankar Bhamidi, Nikhil Kaza

Abstract: We develop a method to identify statistically significant communities in a weighted network with a high proportion of self-looping weights. We use this method to find overlapping agglomerations of U.S. counties by representing inter-county commuting as a weighted network. We identify three types of communities; non-nodal, nodal and monads, which correspond to different types of regions. The results suggest that traditional regional delineations that rely on ad hoc thresholds do not account for important and pervasive connections that extend far beyond expected metropolitan boundaries or megaregions.

Submitted 26 March, 2020; v1 submitted 13 March, 2019; originally announced March 2019.

Comments: 38 pages

arXiv:1810.01300

Sampling-based Estimation of In-degree Distribution with Applications to Directed Complex Networks

Authors: Nelson Antunes, Shankar Bhamidi, Tianjian Guo, Vladas Pipiras, Bang Wang

Abstract: The focus of this work is on estimation of the in-degree distribution in directed networks from sampling network nodes or edges. A number of sampling schemes are considered, including random sampling with and without replacement, and several approaches based on random walks with possible jumps. When sampling nodes, it is assumed that only the out-edges of that node are visible, that is, the in-degree of that node is not observed. The suggested estimation of the in-degree distribution is based on two approaches. The inversion approach exploits the relation between the original and sample in-degree distributions, and can estimate the bulk of the in-degree distribution, but not the tail of the distribution. The tail of the in-degree distribution is estimated through an asymptotic approach, which itself has two versions: one assuming a power-law tail and the other for a tail of general form. The two estimation approaches are examined on synthetic and real networks, with good performance results, especially striking for the asymptotic approach.

Submitted 2 October, 2018; originally announced October 2018.

Comments: 30 pages, 6 figures

arXiv:1610.06511

Community extraction in multilayer networks with heterogeneous community structure

Authors: James D. Wilson, John Palowitch, Shankar Bhamidi, Andrew B. Nobel

Abstract: Multilayer networks are a useful way to capture and model multiple, binary or weighted relationships among a fixed group of objects. While community detection has proven to be a useful exploratory technique for the analysis of single-layer networks, the development of community detection methods for multilayer networks is still in its infancy. We propose and investigate a procedure, called Multilayer Extraction, that identifies densely connected vertex-layer sets in multilayer networks. Multilayer Extraction makes use of a significance based score that quantifies the connectivity of an observed vertex-layer set through comparison with a fixed degree random graph model. Multilayer Extraction directly handles networks with heterogeneous layers where community structure may be different from layer to layer. The procedure can capture overlapping communities, as well as background vertex-layer pairs that do not belong to any community. We establish consistency of the vertex-layer set optimizer of our proposed multilayer score under the multilayer stochastic block model. We investigate the performance of Multilayer Extraction on three applications and a test bed of simulations. Our theoretical and numerical evaluations suggest that Multilayer Extraction is an effective exploratory tool for analyzing complex multilayer networks. Publicly available R software for Multilayer Extraction is available at https://github.com/jdwilson4/MultilayerExtraction.

Submitted 7 November, 2017; v1 submitted 20 October, 2016; originally announced October 2016.

Comments: 46 pages. Accepted at the Journal of Machine Learning Research (11/17)

arXiv:1601.05630

Significance-based community detection in weighted networks

Authors: John Palowitch, Shankar Bhamidi, Andrew B. Nobel

Abstract: Community detection is the process of grouping strongly connected nodes in a network. Many community detection methods for un-weighted networks have a theoretical basis in a null model. Communities discovered by these methods therefore have interpretations in terms of statistical significance. In this paper, we introduce a null for weighted networks called the continuous configuration model. We use the model both as a tool for community detection and for simulating weighted networks with null nodes. First, we propose a community extraction algorithm for weighted networks which incorporates iterative hypothesis testing under the null. We prove a central limit theorem for edge-weight sums and asymptotic consistency of the algorithm under a weighted stochastic block model. We then incorporate the algorithm in a community detection method called CCME. To benchmark the method, we provide a simulation framework incorporating the null to plant "background" nodes in weighted networks with communities. We show that the empirical performance of CCME on these simulations is competitive with existing methods, particularly when overlapping communities and background nodes are present. To further validate the method, we present two real-world networks with potential background nodes and analyze them with CCME, yielding results that reveal macro-features of the corresponding systems.

Submitted 23 October, 2017; v1 submitted 21 January, 2016; originally announced January 2016.

Comments: Code and supplemental info available at http://stats.johnpalowitch.com/ccme. V3 changes: based on lengthy referee revision process, new theoretical sections added, + major organizational changes. V2 changes: grant info added, 1 reference added, bibliography section moved to end, condensed bib line spacing, corrected typos

arXiv:1308.0777

doi 10.1214/14-AOAS760

A testing based extraction algorithm for identifying significant communities in networks

Authors: James D. Wilson, Simi Wang, Peter J. Mucha, Shankar Bhamidi, Andrew B. Nobel

Abstract: A common and important problem arising in the study of networks is how to divide the vertices of a given network into one or more groups, called communities, in such a way that vertices of the same community are more interconnected than vertices belonging to different ones. We propose and investigate a testing based community detection procedure called Extraction of Statistically Significant Communities (ESSC). The ESSC procedure is based on $p$-values for the strength of connection between a single vertex and a set of vertices under a reference distribution derived from a conditional configuration network model. The procedure automatically selects both the number of communities in the network and their size. Moreover, ESSC can handle overlapping communities and, unlike the majority of existing methods, identifies "background" vertices that do not belong to a well-defined community. The method has only one parameter, which controls the stringency of the hypothesis tests. We investigate the performance and potential use of ESSC and compare it with a number of existing methods, through a validation study using four real network data sets. In addition, we carry out a simulation study to assess the effectiveness of ESSC in networks with various types of community structure, including networks with overlapping communities and those with background vertices. These results suggest that ESSC is an effective exploratory tool for the discovery of relevant community structure in complex network systems. Data and software are available at http://www.unc.edu/~jameswd/research.html.

Submitted 3 December, 2014; v1 submitted 4 August, 2013; originally announced August 2013.

Comments: Published in the Annals of Applied Statistics (http://www.imstat.org/aoas/) at http://dx.doi.org/10.1214/14-AOAS760 by the Institute of Mathematical Statistics (http://www.imstat.org)

Report number: IMS-AOAS-AOAS760

Journal ref: Annals of Applied Statistics 2014, Vol. 8, No. 3, 1853-1891
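
Illustrative note on the entry above: the ESSC abstract describes $p$-values for the strength of connection between a single vertex and a set of vertices under a configuration-model reference distribution. The sketch below is one hedged way to picture such a test, not the authors' implementation. It assumes an unweighted, undirected graph and approximates the reference distribution by a binomial in which each of the vertex's edges lands in the set with probability equal to the set's share of total degree. The function names (`binomial_upper_tail`, `vertex_set_p_value`) and the toy graph are hypothetical.

```python
from math import comb


def binomial_upper_tail(k, n, p):
    """Return P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, j) * p**j * (1 - p) ** (n - j) for j in range(k, n + 1))


def vertex_set_p_value(adj, u, candidate_set):
    """Upper-tail p-value for the number of edges from u into candidate_set.

    Assumption (not taken from the paper's text): each of u's deg(u) edges
    falls in candidate_set independently with probability equal to the
    set's total degree divided by the graph's total degree.
    """
    degree_u = len(adj[u])
    total_degree = sum(len(neighbors) for neighbors in adj.values())
    set_degree = sum(len(adj[v]) for v in candidate_set)
    observed = sum(1 for v in adj[u] if v in candidate_set)
    p_set = set_degree / total_degree
    return binomial_upper_tail(observed, degree_u, p_set)


# Hypothetical toy graph: two triangles {0, 1, 2} and {3, 4, 5} joined by edge 2-3.
toy_graph = {
    0: [1, 2],
    1: [0, 2],
    2: [0, 1, 3],
    3: [2, 4, 5],
    4: [3, 5],
    5: [3, 4],
}

inside = vertex_set_p_value(toy_graph, 0, {1, 2})   # vertex tied to the set
outside = vertex_set_p_value(toy_graph, 5, {1, 2})  # vertex with no edges into the set
print(inside, outside)
```

In this toy case the vertex from the same triangle gets the smaller $p$-value. A full procedure of the kind the abstract describes would repeat such tests across vertices and apply a significance threshold, which is the single stringency parameter the abstract mentions.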

Showing 1–6 of 6 results for author: Bhamidi, S