-
Targeting influence in a harmonic opinion model
Authors:
Zachary M. Boyd,
Nicolas Fraiman,
Jeremy L. Marzuola,
Peter J. Mucha,
Braxton Osting
Abstract:
Influence propagation in social networks is a central problem in modern social network analysis, with important societal applications in politics and advertising. A large body of work has focused on cascading models, viral marketing, and finite-horizon diffusion. There is, however, a need for more developed, mathematically principled \emph{adversarial models}, in which multiple, opposed actors strategically select nodes whose influence will maximally sway the crowd to their point of view.
In the present work, we develop and analyze such a model based on harmonic functions and linear diffusion. We prove that our general problem is NP-hard and that the objective function is monotone and submodular; consequently, we can greedily approximate the solution within a constant factor. Introducing and analyzing a convex relaxation, we show that the problem can be approximately solved using smooth optimization methods. We illustrate the effectiveness of our approach on a variety of example networks.
Submitted 28 June, 2024;
originally announced July 2024.
-
Link Prediction Accuracy on Real-World Networks Under Non-Uniform Missing Edge Patterns
Authors:
Xie He,
Amir Ghasemian,
Eun Lee,
Alice Schwarze,
Aaron Clauset,
Peter J. Mucha
Abstract:
Real-world network datasets are typically obtained in ways that fail to capture all edges. The patterns of missing data are often non-uniform as they reflect biases and other shortcomings of different data collection methods. Nevertheless, uniform missing data is a common assumption made when no additional information is available about the underlying missing-edge pattern, and link prediction methods are frequently tested against uniformly missing edges. To investigate the impact of different missing-edge patterns on link prediction accuracy, we employ 9 link prediction algorithms from 4 different families to analyze 20 different missing-edge patterns that we categorize into 5 groups. Our comparative simulation study, spanning 250 real-world network datasets from 6 different domains, provides a detailed picture of the significant variations in the performance of different link prediction algorithms in these different settings. With this study, we aim to provide a guide for future researchers to help them select a link prediction algorithm that is well suited to their sampled network data, considering the data collection process and application domain.
Submitted 30 April, 2024; v1 submitted 26 January, 2024;
originally announced January 2024.
-
Correlation networks: Interdisciplinary approaches beyond thresholding
Authors:
Naoki Masuda,
Zachary M. Boyd,
Diego Garlaschelli,
Peter J. Mucha
Abstract:
Many empirical networks originate from correlational data, arising in domains as diverse as psychology, neuroscience, genomics, microbiology, finance, and climate science. Specialized algorithms and theory have been developed in different application domains for working with such networks, as well as in statistics, network science, and computer science, often with limited communication between practitioners in different fields. This leaves significant room for cross-pollination across disciplines. A central challenge is that it is not always clear how to best transform correlation matrix data into networks for the application at hand, and probably the most widespread method, i.e., thresholding on the correlation value to create either unweighted or weighted networks, suffers from multiple problems. In this article, we review various methods of constructing and analyzing correlation networks, ranging from thresholding and its improvements to weighted networks, regularization, dynamic correlation networks, threshold-free approaches, and more. Finally, we propose and discuss a variety of key open questions currently confronting this field.
Submitted 15 November, 2023;
originally announced November 2023.
-
Escape times for subgraph detection and graph partitioning
Authors:
Zachary M. Boyd,
Nicolas Fraiman,
Jeremy L. Marzuola,
Peter J. Mucha,
Braxton Osting
Abstract:
We provide a rearrangement based algorithm for fast detection of subgraphs of $k$ vertices with long escape times for directed or undirected networks. Complementing other notions of densest subgraphs and graph cuts, our method is based on the mean hitting time required for a random walker to leave a designated set and hit the complement. We provide a new relaxation of this notion of hitting time on a given subgraph and use that relaxation to construct a fast subgraph detection algorithm and a generalization to $K$-partitioning schemes. Using a modification of the subgraph detector on each component, we propose a graph partitioner that identifies regions where random walks live for comparably large times. Importantly, our method implicitly respects the directed nature of the data for directed graphs while also being applicable to undirected graphs. We apply the partitioning method for community detection to a large class of model and real-world data sets.
Submitted 24 December, 2022;
originally announced December 2022.
-
Ranking Edges by their Impact on the Spectral Complexity of Information Diffusion over Networks
Authors:
Jeremy Kazimer,
Manlio de Domenico,
Peter J. Mucha,
Dane Taylor
Abstract:
Despite the numerous ways now available to quantify which parts or subsystems of a network are most important, there remains a lack of centrality measures that are related to the complexity of information flows and are derived directly from entropy measures. Here, we introduce a ranking of edges based on how each edge's removal would change a system's von Neumann entropy (VNE), which is a spectral-entropy measure that has been adapted from quantum information theory to quantify the complexity of information dynamics over networks. We show that a direct calculation of such rankings is computationally inefficient (or unfeasible) for large networks: e.g.\ the scaling is $\mathcal{O}(N^3)$ per edge for networks with $N$ nodes. To overcome this limitation, we employ spectral perturbation theory to estimate VNE perturbations and derive an approximate edge-ranking algorithm that is accurate and fast to compute, scaling as $\mathcal{O}(N)$ per edge. Focusing on a form of VNE that is associated with a transport operator $e^{-β{ L}}$, where ${ L}$ is a graph Laplacian matrix and $β>0$ is a diffusion timescale parameter, we apply this approach to diverse applications including a network encoding polarized voting patterns of the 117th U.S. Senate, a multimodal transportation system including roads and metro lines in London, and a multiplex brain network encoding correlated human brain activity. Our experiments highlight situations where the edges that are considered to be most important for information diffusion complexity can dramatically change as one considers short, intermediate and long timescales $β$ for diffusion.
Submitted 7 May, 2024; v1 submitted 26 October, 2022;
originally announced October 2022.
-
One Node at a Time: Node-Level Network Classification
Authors:
Saray Shai,
Isaac Jacobs,
Peter J. Mucha
Abstract:
Network classification aims to group networks (or graphs) into distinct categories based on their structure. We study the connection between classification of a network and of its constituent nodes, and whether nodes from networks in different groups are distinguishable based on structural node characteristics such as centrality and clustering coefficient. We demonstrate, using various network datasets and random network models, that a classifier can be trained to accurately predict the network category of a given node (without seeing the whole network), implying that complex networks display distinct structural patterns even at the node level. Finally, we discuss two applications of node-level network classification: (i) whole-network classification from small samples of nodes, and (ii) network bootstrapping.
Submitted 3 August, 2022;
originally announced August 2022.
-
A metric on directed graphs and Markov chains based on hitting probabilities
Authors:
Zachary M. Boyd,
Nicolas Fraiman,
Jeremy L. Marzuola,
Peter J. Mucha,
Braxton Osting,
Jonathan Weare
Abstract:
The shortest-path, commute time, and diffusion distances on undirected graphs have been widely employed in applications such as dimensionality reduction, link prediction, and trip planning. Increasingly, there is interest in using asymmetric structure of data derived from Markov chains and directed graphs, but few metrics are specifically adapted to this task. We introduce a metric on the state space of any ergodic, finite-state, time-homogeneous Markov chain and, in particular, on any Markov chain derived from a directed graph. Our construction is based on hitting probabilities, with nearness in the metric space related to the transfer of random walkers from one node to another at stationarity. Notably, our metric is insensitive to shortest and average walk distances, thus giving new information compared to existing metrics. We use possible degeneracies in the metric to develop an interesting structural theory of directed graphs and explore a related quotienting procedure. Our metric can be computed in $O(n^3)$ time, where $n$ is the number of states, and in examples we scale up to $n=10,000$ nodes and $\approx 38M$ edges on a desktop computer. In several examples, we explore the nature of the metric, compare it to alternative methods, and demonstrate its utility for weak recovery of community structure in dense graphs, visualization, structure recovering, dynamics exploration, and multiscale cluster detection.
Submitted 18 January, 2021; v1 submitted 25 June, 2020;
originally announced June 2020.
-
Multilayer Modularity Belief Propagation To Assess Detectability Of Community Structure
Authors:
William H. Weir,
Benjamin Walker,
Lenka Zdeborová,
Peter J. Mucha
Abstract:
Modularity based community detection encompasses a number of widely used, efficient heuristics for identification of structure in networks. Recently, a belief propagation approach to modularity optimization provided a useful guide for identifying non-trivial structure in single-layer networks in a way that other optimization heuristics have not. In this paper, we extend modularity belief propagation to multilayer networks. As part of this development, we also directly incorporate a resolution parameter. We show that adjusting the resolution parameter affects the convergence properties of the algorithm and yields different community structures than the baseline. We compare our approach with a widely used community detection tool, GenLouvain, across a range of synthetic, multilayer benchmark networks, demonstrating that our method performs comparably to the state of the art. Finally, we demonstrate the practical advantages of the additional information provided by our tool by way of two real-world network examples. We show how the convergence properties of the algorithm can be used in selecting the appropriate resolution and coupling parameters and how the node-level marginals provide an interpretation for the strength of attachment to the identified communities. We have released our tool as a Python package for convenient use.
Submitted 3 July, 2020; v1 submitted 13 August, 2019;
originally announced August 2019.
-
Supracentrality Analysis of Temporal Networks with Directed Interlayer Coupling
Authors:
Dane Taylor,
Mason A. Porter,
Peter J. Mucha
Abstract:
We describe centralities in temporal networks using a supracentrality framework to study centrality trajectories, which characterize how the importances of nodes change in time. We study supracentrality generalizations of eigenvector-based centralities, a family of centrality measures for time-independent networks that includes PageRank, hub and authority scores, and eigenvector centrality. We start with a sequence of adjacency matrices, each of which represents a time layer of a network at a different point or interval of time. Coupling centrality matrices across time layers with weighted interlayer edges yields a \emph{supracentrality matrix} $\mathbb{C}(ω)$, where $ω$ controls the extent to which centrality trajectories change over time. We can flexibly tune the weight and topology of the interlayer coupling to cater to different scientific applications. The entries of the dominant eigenvector of $\mathbb{C}(ω)$ represent \emph{joint centralities}, which simultaneously quantify the importance of every node in every time layer. Inspired by probability theory, we also compute \emph{marginal} and \emph{conditional centralities}. We illustrate how to adjust the coupling between time layers to tune the extent to which nodes' centrality trajectories are influenced by the oldest and newest time layers. We support our findings by analysis in the limits of small and large $ω$.
Submitted 18 September, 2019; v1 submitted 14 June, 2019;
originally announced June 2019.
-
Tunable Eigenvector-Based Centralities for Multiplex and Temporal Networks
Authors:
Dane Taylor,
Mason A. Porter,
Peter J. Mucha
Abstract:
Characterizing the importances (i.e., centralities) of nodes in social, biological, and technological networks is a core topic in both network science and data science. We present a linear-algebraic framework that generalizes eigenvector-based centralities, including PageRank and hub/authority scores, to provide a common framework for two popular classes of multilayer networks: multiplex networks (which have layers that encode different types of relationships) and temporal networks (in which the relationships change over time). Our approach involves the study of joint, marginal, and conditional "supracentralities" that one can calculate from the dominant eigenvector of a supracentrality matrix [Taylor et al., 2017], which couples centrality matrices that are associated with individual network layers. We extend this prior work (which was restricted to temporal networks with layers that are coupled by adjacent-in-time coupling) by allowing the layers to be coupled through a (possibly asymmetric) interlayer-adjacency matrix $\tilde{\bf A}$, where the entry $\tilde{A}_{tt'} \geq 0$ encodes the coupling between layers $t$ and $t'$. Our framework provides a unifying foundation for centrality analysis of multiplex and temporal networks; it also illustrates a complicated dependency of the supracentralities on the topology and weights of interlayer coupling. By scaling $\tilde{\bf A}$ by an interlayer-coupling strength $ω\ge0$ and developing a singular perturbation theory for the limits of weak ($ω\to0^+$) and strong coupling ($ω\to\infty$), we also reveal an interesting dependence of supracentralities on the dominant left and right eigenvectors of $\tilde{\bf A}$.
Submitted 3 August, 2020; v1 submitted 3 April, 2019;
originally announced April 2019.
-
Local Symmetry and Global Structure in Adaptive Voter Models
Authors:
Philip S. Chodrow,
Peter J. Mucha
Abstract:
Adaptive voter models (AVMs) are simple mechanistic systems that model the emergence of mesoscopic structure from local networked processes driven by conflict and homophily. AVMs display rich behavior, including a phase transition from a fully-fragmented regime of "echo-chambers" to a regime of persistent disagreement governed by low-dimensional quasistable manifolds. Many extant methods for approximating the behavior of AVMs are either restricted in scope, expensive in computation, or inaccurate in predicting important statistics. In this work, we develop a novel, second-order moment closure approximation method for binary-state rewire-to-random and rewire-to-same model variants. We incorporate a small amount of noise via a random mutation term, which renders the system ergodic. Using ergodicity, we then approximate the voting process, which is non-Markovian in the second moments of the system, with a Markovian term near the phase transition. This approximation exploits an asymmetry between different classes of voting events. The resulting scheme enables us to predict the location of the phase transition and the active edge density in the regime of persistent disagreement, across the entire space of parameters and opinion densities. Numerically, our results are nearly exact for the rewire-to-random model, and competitive with other current approaches for the rewire-to-same model. Moreover, our computations display constant scaling in the mean degree, enabling approximations for denser systems than previously possible. We conclude with suggestions for model refinements and extensions.
Submitted 24 December, 2019; v1 submitted 13 December, 2018;
originally announced December 2018.
-
A Map Equation with Metadata: Varying the Role of Attributes in Community Detection
Authors:
Scott Emmons,
Peter J. Mucha
Abstract:
Much of the community detection literature studies structural communities, communities defined solely by the connectivity patterns of the network. Often, networks contain additional metadata which can inform community detection such as the grade and gender of students in a high school social network. In this work, we introduce a tuning parameter to the content map equation that allows users of the Infomap community detection algorithm to control the metadata's relative importance for identifying network structure. On synthetic networks, we show that our algorithm can overcome the structural detectability limit when the metadata is well-aligned with community structure. On real-world networks, we show how our algorithm can achieve greater mutual information with the metadata at a cost in the traditional map equation. Our tuning parameter, like the focusing knob of a microscope, allows users to "zoom in" and "zoom out" on communities with varying levels of focus on the metadata.
Submitted 22 September, 2019; v1 submitted 24 October, 2018;
originally announced October 2018.
-
Infectivity Enhances Prediction of Viral Cascades in Twitter
Authors:
Weihua Li,
Skyler J. Cranmer,
Zhiming Zheng,
Peter J. Mucha
Abstract:
Models of contagion dynamics, originally developed for infectious diseases, have proven relevant to the study of information, news, and political opinions in online social systems. Modelling diffusion processes and predicting viral information cascades are important problems in network science. Yet, many studies of information cascades neglect the variation in infectivity across different pieces of information. Here, we employ early-time observations of online cascades to estimate the infectivity of distinct pieces of information. Using simulations and data from real-world Twitter retweets, we demonstrate that these estimated infectivities can be used to improve predictions about the virality of an information cascade. Developing our simulations to mimic the real-world data, we consider the effect of the limited effective time for transmission of a cascade and demonstrate that a simple model for slow but non-negligible decay of the infectivity captures the essential properties of retweet distributions. These results demonstrate the interplay between the intrinsic infectivity of a tweet and the complex network environment within which it diffuses, strongly influencing the likelihood of becoming a viral cascade.
Submitted 11 September, 2018;
originally announced September 2018.
-
Testing Alignment of Node Attributes with Network Structure Through Label Propagation
Authors:
Natalie Stanley,
Marc Niethammer,
Peter J. Mucha
Abstract:
Attributed network data is becoming increasingly common across fields, as we are often equipped with information about nodes in addition to their pairwise connectivity patterns. This extra information can manifest as a classification, or as a multidimensional vector of features. Recently developed methods that seek to extend community detection approaches to attributed networks have explored how to most effectively combine connectivity and attribute information to identify quality communities. These methods often rely on some assumption of the dependency relationships between attributes and connectivity. In this work, we seek to develop a statistical test to assess whether node attributes align with network connectivity. The objective is to quantitatively evaluate whether nodes with similar connectivity patterns also have similar attributes. To address this problem, we use a node sampling and label propagation approach. We apply our method to several synthetic examples that explore how network structure and attribute characteristics affect the empirical p-value computed by our method. Finally, we apply the test to a network generated from a single cell mass cytometry (CyTOF) dataset and show that our test can identify markers associated with distinct subpopulations of single cells.
Submitted 18 May, 2018;
originally announced May 2018.
-
Stochastic Block Models with Multiple Continuous Attributes
Authors:
Natalie Stanley,
Thomas Bonacci,
Roland Kwitt,
Marc Niethammer,
Peter J. Mucha
Abstract:
The stochastic block model (SBM) is a probabilistic model for community structure in networks. Typically, only the adjacency matrix is used to perform SBM parameter inference. In this paper, we consider circumstances in which nodes have an associated vector of continuous attributes that are also used to learn the node-to-community assignments and corresponding SBM parameters. While this assumption is not realistic for every application, our model assumes that the attributes associated with the nodes in a network's community can be described by a common multivariate Gaussian model. In this augmented, attributed SBM, the objective is to simultaneously learn the SBM connectivity probabilities with the multivariate Gaussian parameters describing each community. While there are recent examples in the literature that combine connectivity and attribute information to inform community detection, our model is the first augmented stochastic block model to handle multiple continuous attributes. This provides the flexibility in biological data to, for example, augment connectivity information with continuous measurements from multiple experimental modalities. Because the lack of labeled network data often makes community detection results difficult to validate, we highlight the usefulness of our model for two network prediction tasks: link prediction and collaborative filtering. As a result of fitting this attributed stochastic block model, one can predict the attribute vector or connectivity patterns for a new node in the event of the complementary source of information (connectivity or attributes, respectively). We also highlight two biological examples where the attributed stochastic block model provides satisfactory performance in the link prediction and collaborative filtering tasks.
Submitted 7 March, 2018;
originally announced March 2018.
-
Social Clustering in Epidemic Spread on Coevolving Networks
Authors:
Hsuan-Wei Lee,
Nishant Malik,
Feng Shi,
Peter J. Mucha
Abstract:
Even though transitivity is a central structural feature of social networks, its influence on epidemic spread on coevolving networks has remained relatively unexplored. Here we introduce and study an adaptive SIS epidemic model wherein the infection and network coevolve with non-trivial probability to close triangles during edge rewiring, leading to substantial reinforcement of network transitivity. This new model provides a unique opportunity to study the role of transitivity in altering the SIS dynamics on a coevolving network. Using numerical simulations and Approximate Master Equations (AME), we identify and examine a rich set of dynamical features in the new model. In many cases, the AME including transitivity reinforcement provides accurate predictions of stationary-state disease prevalences and network degree distributions. Furthermore, for some parameter settings, the AME accurately trace the temporal evolution of the system. We show that higher transitivity reinforcement in the model leads to lower levels of infective individuals in the population, when closing a triangle is the dominant rewiring mechanism. These methods and results may be useful in developing ideas and modeling strategies for controlling SIS type epidemics.
Submitted 2 June, 2019; v1 submitted 16 July, 2017;
originally announced July 2017.
-
Compressing networks with super nodes
Authors:
Natalie Stanley,
Roland Kwitt,
Marc Niethammer,
Peter J. Mucha
Abstract:
Community detection is a commonly used technique for identifying groups in a network based on similarities in connectivity patterns. To facilitate community detection in large networks, we recast the network to be partitioned into a smaller network of 'super nodes', each super node comprising one or more nodes in the original network. To define the seeds of our super nodes, we apply the 'CoreHD' ranking from dismantling and decycling. We test our approach through the analysis of two common methods for community detection: modularity maximization with the Louvain algorithm and maximum likelihood optimization for fitting a stochastic block model. Our results highlight that applying community detection to the compressed network of super nodes is significantly faster while successfully producing partitions that are more aligned with the local network connectivity, more stable across multiple (stochastic) runs within and between community detection algorithms, and overlap well with the results obtained using the full network.
Submitted 13 June, 2017;
originally announced June 2017.
-
Post-processing partitions to identify domains of modularity optimization
Authors:
William H. Weir,
Scott Emmons,
Ryan Gibson,
Dane Taylor,
Peter J. Mucha
Abstract:
We introduce the Convex Hull of Admissible Modularity Partitions (CHAMP) algorithm to prune and prioritize different network community structures identified across multiple runs of possibly various computational heuristics. Given a set of partitions, CHAMP identifies the domain of modularity optimization for each partition (i.e., the parameter-space domain where it has the largest modularity relative to the input set), discarding partitions with empty domains to obtain the subset of partitions that are "admissible" candidate community structures that remain potentially optimal over indicated parameter domains. Importantly, CHAMP can be used for multi-dimensional parameter spaces, such as those for multilayer networks where one includes a resolution parameter and interlayer coupling. Using the results from CHAMP, a user can more appropriately select robust community structures by observing the sizes of domains of optimization and the pairwise comparisons between partitions in the admissible subset. We demonstrate the utility of CHAMP with several example networks. In these examples, CHAMP focuses attention onto pruned subsets of admissible partitions that are 20-to-1785 times smaller than the sets of unique partitions obtained by community detection heuristics that were input into CHAMP.
Submitted 21 August, 2017; v1 submitted 12 June, 2017;
originally announced June 2017.
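The idea of a "domain of modularity optimization" in the CHAMP abstract above can be illustrated in the simplest, one-parameter case. The sketch below is not the authors' implementation. It assumes that each partition's modularity is linear in a single resolution parameter, Q_i(gamma) = a_i - gamma * p_i, and that the coefficients (a_i, p_i) have already been computed. The function name `domains_of_optimization` and its arguments are hypothetical.

```python
def domains_of_optimization(coefficients, gamma_min, gamma_max):
    """Find where each partition has the largest modularity on [gamma_min, gamma_max].

    coefficients: list of (a, p) pairs, one per partition, under the assumption
                  that Q_i(gamma) = a_i - gamma * p_i.
    Returns a list of (partition_index, gamma_lo, gamma_hi) tuples covering the
    interval. Partitions that never appear have empty domains (not admissible).
    """
    # Candidate breakpoints: interval ends plus every pairwise crossing inside it.
    breakpoints = {gamma_min, gamma_max}
    for i, (a_i, p_i) in enumerate(coefficients):
        for a_j, p_j in coefficients[i + 1:]:
            if p_i != p_j:  # parallel lines never cross
                gamma = (a_i - a_j) / (p_i - p_j)
                if gamma_min < gamma < gamma_max:
                    breakpoints.add(gamma)
    points = sorted(breakpoints)

    domains = []
    for lo, hi in zip(points, points[1:]):
        # No two lines cross strictly inside (lo, hi), so the midpoint decides
        # which partition is on top over the whole sub-interval.
        mid = 0.5 * (lo + hi)
        best = max(range(len(coefficients)),
                   key=lambda k: coefficients[k][0] - mid * coefficients[k][1])
        if domains and domains[-1][0] == best:
            domains[-1] = (best, domains[-1][1], hi)  # extend the previous domain
        else:
            domains.append((best, lo, hi))
    return domains
```

For example, `domains_of_optimization([(1.0, 1.0), (0.6, 0.2), (0.5, 0.5)], 0.0, 2.0)` assigns a domain to the first two partitions only, so the third would be pruned. This brute-force sweep only conveys the idea for a single parameter. The multi-dimensional case described in the abstract needs a genuine convex-hull or half-space computation.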
-
Case studies in network community detection
Authors:
Saray Shai,
Natalie Stanley,
Clara Granell,
Dane Taylor,
Peter J. Mucha
Abstract:
Community structure describes the organization of a network into subgraphs that contain a prevalence of edges within each subgraph and relatively few edges across boundaries between subgraphs. The development of community-detection methods has occurred across disciplines, with numerous and varied algorithms proposed to find communities. As we present in this Chapter via several case studies, community detection is not just an "end game" unto itself, but rather a step in the analysis of network data which is then useful for furthering research in the disciplinary domain of interest. These case-study examples arise from diverse applications, ranging from social and political science to neuroscience and genetics, and we have chosen them to demonstrate key aspects of community detection and to highlight that community detection, in practice, should be directed by the application at hand.
Submitted 5 May, 2017;
originally announced May 2017.
-
Network-ensemble comparisons with stochastic rewiring and von Neumann entropy
Authors:
Zichao Li,
Peter J. Mucha,
Dane Taylor
Abstract:
Assessing whether a given network is typical or atypical for a random-network ensemble (i.e., network-ensemble comparison) has widespread applications ranging from null-model selection and hypothesis testing to clustering and classifying networks. We develop a framework for network-ensemble comparison by subjecting the network to stochastic rewiring. We study two rewiring processes, uniform and degree-preserved rewiring, which yield random-network ensembles that converge to the Erdős-Rényi and configuration-model ensembles, respectively. We study convergence through von Neumann entropy (VNE), a network summary statistic measuring information content based on the spectra of a Laplacian matrix, and develop a perturbation analysis for the expected effect of rewiring on VNE. Our analysis yields an estimate for how many rewires are required for a given network to resemble a typical network from an ensemble, offering a computationally efficient quantity for network-ensemble comparison that does not require simulation of the corresponding rewiring process.
Submitted 29 November, 2017; v1 submitted 4 April, 2017;
originally announced April 2017.
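The degree-preserved rewiring named in the abstract above is commonly realized as a double-edge swap. The sketch below shows one such step and is an illustration only. The abstract does not spell out the exact rewiring rule, so the swap convention, the rejection of self-loops and duplicate edges, and the name `degree_preserving_rewire` are all assumptions.

```python
import random

def degree_preserving_rewire(edges, rng=random):
    """Attempt one double-edge swap on an undirected simple graph.

    edges: set of (u, v) tuples with u < v; modified in place.
    Returns True if a swap was performed, False if the attempt was rejected.
    """
    if len(edges) < 2:
        return False
    # Sample two distinct edges (sorted list so that sampling is reproducible).
    e1, e2 = rng.sample(sorted(edges), 2)
    a, b = e1
    c, d = e2
    if rng.random() < 0.5:
        c, d = d, c  # choose one of the two possible pairings at random
    # Proposed replacement: (a, b), (c, d) -> (a, d), (c, b).
    if a == d or c == b:
        return False  # would create a self-loop
    new1 = (min(a, d), max(a, d))
    new2 = (min(c, b), max(c, b))
    if new1 in edges or new2 in edges:
        return False  # would create a duplicate edge
    edges.difference_update({e1, e2})
    edges.update({new1, new2})
    return True
```

Each accepted swap removes one edge end and adds one edge end at each of the four nodes involved, so every node keeps its degree. Repeating the step many times randomizes the graph while leaving the degree sequence fixed.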
-
Evolutionary prisoner's dilemma games coevolving on adaptive networks
Authors:
Hsuan-Wei Lee,
Nishant Malik,
Peter J. Mucha
Abstract:
We study a model for switching strategies in the Prisoner's Dilemma game on adaptive networks of player pairings that coevolve as players attempt to maximize their return. We use a node-based strategy model wherein each player follows one strategy at a time (cooperate or defect) across all of its neighbors, changing that strategy and possibly changing partners in response to local changes in the network of player pairing and in the strategies used by connected partners. We compare and contrast numerical simulations with existing pair approximation differential equations for describing this system, as well as more accurate equations developed here using the framework of approximate master equations. We explore the parameter space of the model, demonstrating the relatively high accuracy of the approximate master equations for describing the system observations made from simulations. We study two variations of this partner-switching model to investigate the system evolution, predict stationary states, and compare the total utilities and other qualitative differences between these two model variants.
Submitted 23 July, 2017; v1 submitted 16 February, 2017;
originally announced February 2017.
-
Feature-Based Classification of Networks
Authors:
Ian Barnett,
Nishant Malik,
Marieke L. Kuijjer,
Peter J. Mucha,
Jukka-Pekka Onnela
Abstract:
Network representations of systems from various scientific and societal domains are neither completely random nor fully regular, but instead appear to contain recurring structural building blocks. These features tend to be shared by networks belonging to the same broad class, such as the class of social networks or the class of biological networks. At a finer scale of classification within each such class, networks describing more similar systems tend to have more similar features. This occurs presumably because networks representing similar purposes or constructions would be expected to be generated by a shared set of domain specific mechanisms, and it should therefore be possible to classify these networks into categories based on their features at various structural levels. Here we describe and demonstrate a new, hybrid approach that combines manual selection of features of potential interest with existing automated classification methods. In particular, selecting well-known and well-studied features that have been used throughout social network analysis and network science and then classifying with methods such as random forests that are of special utility in the presence of feature collinearity, we find that we achieve higher accuracy, in shorter computation time, with greater interpretability of the network classification results.
Submitted 19 October, 2016;
originally announced October 2016.
-
Super-resolution community detection for layer-aggregated multilayer networks
Authors:
Dane Taylor,
Rajmonda S. Caceres,
Peter J. Mucha
Abstract:
Applied network science often involves preprocessing network data before applying a network-analysis method, and there is typically a theoretical disconnect between these steps. For example, it is common to aggregate time-varying network data into windows prior to analysis, and the tradeoffs of this preprocessing are not well understood. Focusing on the problem of detecting small communities in multilayer networks, we study the effects of layer aggregation by developing random-matrix theory for modularity matrices associated with layer-aggregated networks with $N$ nodes and $L$ layers, which are drawn from an ensemble of Erdős-Rényi networks. We study phase transitions in which eigenvectors localize onto communities (allowing their detection) and which occur for a given community provided its size surpasses a detectability limit $K^*$. When layers are aggregated via a summation, we obtain $K^*\varpropto \mathcal{O}(\sqrt{NL}/T)$, where $T$ is the number of layers across which the community persists. Interestingly, if $T$ is allowed to vary with $L$ then summation-based layer aggregation enhances small-community detection even if the community persists across a vanishing fraction of layers, provided that $T/L$ decays more slowly than $\mathcal{O}(L^{-1/2})$. Moreover, we find that thresholding the summation can in some cases cause $K^*$ to decay exponentially, decreasing by orders of magnitude in a phenomenon we call super-resolution community detection. That is, layer aggregation with thresholding is a nonlinear data filter enabling detection of communities that are otherwise too small to detect. Importantly, different thresholds generally enhance the detectability of communities having different properties, illustrating that community detection can be obscured if one analyzes network data using a single threshold.
Submitted 13 July, 2017; v1 submitted 14 September, 2016;
originally announced September 2016.
-
Enhanced detectability of community structure in multilayer networks through layer aggregation
Authors:
Dane Taylor,
Saray Shai,
Natalie Stanley,
Peter J. Mucha
Abstract:
Many systems are naturally represented by a multilayer network in which edges exist in multiple layers that encode different, but potentially related, types of interactions, and it is important to understand limitations on the detectability of community structure in these networks. Using random matrix theory, we analyze detectability limitations for multilayer (specifically, multiplex) stochastic block models (SBMs) in which L layers are derived from a common SBM. We study the effect of layer aggregation on detectability for several aggregation methods, including summation of the layers' adjacency matrices for which we show the detectability limit vanishes as O(L^{-1/2}) with increasing number of layers, L. Importantly, we find a similar scaling behavior when the summation is thresholded at an optimal value, providing insight into the common - but not well understood - practice of thresholding pairwise-interaction data to obtain sparse network representations.
Submitted 4 May, 2016; v1 submitted 16 November, 2015;
originally announced November 2015.
-
A Local Perspective on Community Structure in Multilayer Networks
Authors:
Lucas G. S. Jeub,
Michael W. Mahoney,
Peter J. Mucha,
Mason A. Porter
Abstract:
The analysis of multilayer networks is among the most active areas of network science, and there are now several methods to detect dense "communities" of nodes in multilayer networks. One way to define a community is as a set of nodes that trap a diffusion-like dynamical process (usually a random walk) for a long time. In this view, communities are sets of nodes that create bottlenecks to the spreading of a dynamical process on a network. We analyze the local behavior of different random walks on multiplex networks (which are multilayer networks in which different layers correspond to different types of edges) and show that they have very different bottlenecks that hence correspond to rather different notions of what it means for a set of nodes to be a good community. This has direct implications for the behavior of community-detection methods that are based on these random walks.
Submitted 22 May, 2016; v1 submitted 17 October, 2015;
originally announced October 2015.
-
Clustering Network Layers With the Strata Multilayer Stochastic Block Model
Authors:
Natalie Stanley,
Saray Shai,
Dane Taylor,
Peter J. Mucha
Abstract:
Multilayer networks are a useful data structure for simultaneously capturing multiple types of relationships between a set of nodes. In such networks, each relational definition gives rise to a layer. While each layer provides its own set of information, community structure across layers can be collectively utilized to discover and quantify underlying relational patterns between nodes. To concisely extract information from a multilayer network, we propose to identify and combine sets of layers with meaningful similarities in community structure. In this paper, we describe the "strata multilayer stochastic block model" (sMLSBM), a probabilistic model for multilayer community structure. The central extension of the model is that there exist groups of layers, called "strata", which are defined such that all layers in a given stratum have community structure described by a common stochastic block model (SBM). That is, layers in a stratum exhibit similar node-to-community assignments and SBM probability parameters. Fitting the sMLSBM to a multilayer network provides a joint clustering that yields node-to-community and layer-to-stratum assignments, which cooperatively aid one another during inference. We describe an algorithm for separating layers into their appropriate strata and an inference technique for estimating the SBM parameters for each stratum. We demonstrate our method using synthetic networks and a multilayer network inferred from data collected in the Human Microbiome Project.
Submitted 9 October, 2015; v1 submitted 7 July, 2015;
originally announced July 2015.
-
Eigenvector-Based Centrality Measures for Temporal Networks
Authors:
Dane Taylor,
Sean A. Myers,
Aaron Clauset,
Mason A. Porter,
Peter J. Mucha
Abstract:
Numerous centrality measures have been developed to quantify the importances of nodes in time-independent networks, and many of them can be expressed as the leading eigenvector of some matrix. With the increasing availability of network data that changes in time, it is important to extend such eigenvector-based centrality measures to time-dependent networks. In this paper, we introduce a principled generalization of network centrality measures that is valid for any eigenvector-based centrality. We consider a temporal network with N nodes as a sequence of T layers that describe the network during different time windows, and we couple centrality matrices for the layers into a supra-centrality matrix of size NT × NT whose dominant eigenvector gives the centrality of each node i at each time t. We refer to this eigenvector and its components as a joint centrality, as it reflects the importances of both the node i and the time layer t. We also introduce the concepts of marginal and conditional centralities, which facilitate the study of centrality trajectories over time. We find that the strength of coupling between layers is important for determining multiscale properties of centrality, such as localization phenomena and the time scale of centrality changes. In the strong-coupling regime, we derive expressions for time-averaged centralities, which are given by the zeroth-order terms of a singular perturbation expansion. We also study first-order terms to obtain first-order-mover scores, which concisely describe the magnitude of nodes' centrality changes over time. As examples, we apply our method to three empirical temporal networks: the United States Ph.D. exchange in mathematics, costarring relationships among top-billed actors during the Golden Age of Hollywood, and citations of decisions from the United States Supreme Court.
Submitted 21 September, 2016; v1 submitted 5 July, 2015;
originally announced July 2015.
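The supra-centrality construction described in the abstract above can be sketched in a few lines. This is a hedged illustration, not the paper's code. It assumes that the diagonal blocks are the per-layer centrality matrices scaled by a parameter `epsilon`, and that each node is coupled with unit weight to its own copy in the adjacent time layers. The function names are hypothetical, and the matrices are assumed to be nonnegative.

```python
def supra_centrality_matrix(layer_matrices, epsilon):
    """Assemble an NT x NT matrix from T layer centrality matrices of size N x N."""
    T = len(layer_matrices)
    N = len(layer_matrices[0])
    M = [[0.0] * (N * T) for _ in range(N * T)]
    for t, C in enumerate(layer_matrices):
        # Diagonal block: the centrality matrix of layer t, scaled by epsilon.
        for i in range(N):
            for j in range(N):
                M[t * N + i][t * N + j] = epsilon * C[i][j]
        # Off-diagonal blocks: identity coupling to adjacent layers (assumed).
        for s in (t - 1, t + 1):
            if 0 <= s < T:
                for i in range(N):
                    M[t * N + i][s * N + i] = 1.0
    return M

def dominant_eigenvector(M, iterations=1000):
    """Power iteration on M + I, normalized so that the entries sum to 1.

    The identity shift leaves the eigenvectors unchanged and avoids the
    oscillation that plain power iteration can show on periodic matrices.
    """
    n = len(M)
    v = [1.0 / n] * n
    for _ in range(iterations):
        w = [v[i] + sum(M[i][j] * v[j] for j in range(n)) for i in range(n)]
        total = sum(abs(x) for x in w)
        v = [x / total for x in w]
    return v

def joint_centrality(v, num_nodes, num_layers):
    """Reshape the eigenvector so that joint[t][i] is node i's score at time t."""
    return [[v[t * num_nodes + i] for i in range(num_nodes)]
            for t in range(num_layers)]
```

Given `joint = joint_centrality(v, N, T)`, summing `joint[t][i]` over `t` gives one natural reading of a node's marginal centrality, and summing over `i` gives a per-layer score. The abstract's conditional centralities and first-order-mover scores are not reproduced here.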
-
Topological data analysis of contagion maps for examining spreading processes on networks
Authors:
Dane Taylor,
Florian Klimm,
Heather A. Harrington,
Miroslav Kramar,
Konstantin Mischaikow,
Mason A. Porter,
Peter J. Mucha
Abstract:
Social and biological contagions are influenced by the spatial embeddedness of networks. Historically, many epidemics spread as a wave across part of the Earth's surface; however, in modern contagions long-range edges -- for example, due to airline transportation or communication media -- allow clusters of a contagion to appear in distant locations. Here we study the spread of contagions on networks through a methodology grounded in topological data analysis and nonlinear dimension reduction. We construct "contagion maps" that use multiple contagions on a network to map the nodes as a point cloud. By analyzing the topology, geometry, and dimensionality of manifold structure in such point clouds, we reveal insights to aid in the modeling, forecast, and control of spreading processes. Our approach highlights contagion maps also as a viable tool for inferring low-dimensional structure in networks.
Submitted 29 July, 2015; v1 submitted 5 August, 2014;
originally announced August 2014.
-
Think Locally, Act Locally: The Detection of Small, Medium-Sized, and Large Communities in Large Networks
Authors:
Lucas G. S. Jeub,
Prakash Balachandran,
Mason A. Porter,
Peter J. Mucha,
Michael W. Mahoney
Abstract:
It is common in the study of networks to investigate meso-scale features to try to gain an understanding of network structure and function. For example, numerous algorithms have been developed to try to identify "communities," which are typically construed as sets of nodes with denser connections internally than with the remainder of a network. In this paper, we adopt a complementary perspective that "communities" are associated with bottlenecks of locally-biased dynamical processes that begin at seed sets of nodes, and we employ several different community-identification procedures (using diffusion-based and geodesic-based dynamics) to investigate community quality as a function of community size. Using several empirical and synthetic networks, we identify several distinct scenarios for "size-resolved community structure" that can arise in real (and realistic) networks. Depending on which scenario holds, one may or may not be able to successfully identify "good" communities in a given network, the manner in which different small communities fit together to form meso-scale network structures can be very different, and processes such as viral propagation and information diffusion can exhibit very different dynamics. In addition, our results suggest that, for many large realistic networks, the output of locally-biased methods that focus on communities that are centered around a given seed node might have better conceptual grounding and greater practical utility than the output of global community-detection methods. They also illustrate subtler structural properties that are important to consider in the development of better benchmark networks to test methods for community detection.
[Note: Because of space limitations in the arXiv's abstract field, this is an abridged version of the paper's abstract.]
Submitted 8 October, 2014; v1 submitted 15 March, 2014;
originally announced March 2014.
-
Kantian fractionalization predicts the conflict propensity of the international system
Authors:
Skyler J. Cranmer,
Elizabeth J. Menninga,
Peter J. Mucha
Abstract:
The study of complex social and political phenomena with the perspective and methods of network science has proven fruitful in a variety of areas, including applications in political science and more narrowly the field of international relations. We propose a new line of research in the study of international conflict by showing that the multiplex fractionalization of the international system (which we label Kantian fractionalization) is a powerful predictor of the propensity for violent interstate conflict, a key indicator of the system's stability. In so doing, we also demonstrate the first use of multislice modularity for community detection in a multiplex network application. Even after controlling for established system-level conflict indicators, we find that Kantian fractionalization contributes more to model fit for violent interstate conflict than previously established measures. Moreover, evaluating the influence of each of the constituent networks shows that joint democracy plays little, if any, role in predicting system stability, thus challenging a major empirical finding of the international relations literature. Lastly, a series of Granger causal tests shows that the temporal variability of Kantian fractionalization is consistent with a causal relationship with the prevalence of conflict in the international system. This causal relationship has real-world policy implications as changes in Kantian fractionalization could serve as an early warning sign of international instability.
Submitted 1 February, 2014;
originally announced February 2014.
-
Network Structure and Biased Variance Estimation in Respondent Driven Sampling
Authors:
Ashton M. Verdery,
Ted Mouw,
Shawn Bauldry,
Peter J. Mucha
Abstract:
This paper explores bias in the estimation of sampling variance in Respondent Driven Sampling (RDS). Prior methodological work on RDS has focused on its problematic assumptions and the biases and inefficiencies of its estimators of the population mean. Nonetheless, researchers have given only slight attention to the topic of estimating sampling variance in RDS, despite the importance of variance estimation for the construction of confidence intervals and hypothesis tests. In this paper, we show that the estimators of RDS sampling variance rely on a critical assumption that the network is First Order Markov (FOM) with respect to the dependent variable of interest. We demonstrate, through intuitive examples, mathematical generalizations, and computational experiments that current RDS variance estimators will always underestimate the population sampling variance of RDS in empirical networks that do not conform to the FOM assumption. Analysis of 215 observed university and school networks from Facebook and Add Health indicates that the FOM assumption is violated in every empirical network we analyze, and that these violations lead to substantially biased RDS estimators of sampling variance. We propose and test two alternative variance estimators that show some promise for reducing biases, but which also illustrate the limits of estimating sampling variance with only partial information on the underlying population social network.
Submitted 4 December, 2015; v1 submitted 19 September, 2013;
originally announced September 2013.
-
Role of social environment and social clustering in spread of opinions in co-evolving networks
Authors:
Nishant Malik,
Peter J. Mucha
Abstract:
Taking a pragmatic approach to the processes involved in the phenomena of collective opinion formation, we investigate two specific modifications to the co-evolving network voter model of opinion formation, studied by Holme and Newman [1]. First, we replace the rewiring probability parameter by a distribution of probability of accepting or rejecting opinions between individuals, accounting for the asymmetric influences in relationships among individuals in a social group. Second, we modify the rewiring step by a path-length-based preference for rewiring that reinforces local clustering. We have investigated the influences of these modifications on the outcomes of the simulations of this model. We found that varying the shape of the distribution of probability of accepting or rejecting opinions can lead to the emergence of two qualitatively distinct final states, one having several isolated connected components each in internal consensus leading to the existence of a diverse set of opinions and the other having one single dominant connected component with each node within it having the same opinion. Furthermore, and more importantly, we found that the initial clustering in the network can also induce similar transitions. Our investigation also brings forward that these transitions are governed by a weak and complex dependence on system size. We found that the networks in the final states of the model have rich structural properties including the small world property for some parameter regimes. [1] P. Holme and M. Newman, Phys. Rev. E 74, 056108 (2006).
Submitted 12 August, 2013; v1 submitted 8 August, 2013;
originally announced August 2013.
-
A testing based extraction algorithm for identifying significant communities in networks
Authors:
James D. Wilson,
Simi Wang,
Peter J. Mucha,
Shankar Bhamidi,
Andrew B. Nobel
Abstract:
A common and important problem arising in the study of networks is how to divide the vertices of a given network into one or more groups, called communities, in such a way that vertices of the same community are more interconnected than vertices belonging to different ones. We propose and investigate a testing based community detection procedure called Extraction of Statistically Significant Communities (ESSC). The ESSC procedure is based on $p$-values for the strength of connection between a single vertex and a set of vertices under a reference distribution derived from a conditional configuration network model. The procedure automatically selects both the number of communities in the network and their size. Moreover, ESSC can handle overlapping communities and, unlike the majority of existing methods, identifies "background" vertices that do not belong to a well-defined community. The method has only one parameter, which controls the stringency of the hypothesis tests. We investigate the performance and potential use of ESSC and compare it with a number of existing methods, through a validation study using four real network data sets. In addition, we carry out a simulation study to assess the effectiveness of ESSC in networks with various types of community structure, including networks with overlapping communities and those with background vertices. These results suggest that ESSC is an effective exploratory tool for the discovery of relevant community structure in complex network systems. Data and software are available at http://www.unc.edu/~jameswd/research.html.
Submitted 3 December, 2014; v1 submitted 4 August, 2013;
originally announced August 2013.
-
A multi-opinion evolving voter model with infinitely many phase transitions
Authors:
Feng Shi,
Peter J. Mucha,
Rick Durrett
Abstract:
We consider an idealized model in which individuals' changing opinions and their social network coevolve, with disagreements between neighbors in the network resolved either through one imitating the opinion of the other or by reassignment of the discordant edge. Specifically, an interaction between $x$ and one of its neighbors $y$ leads to $x$ imitating $y$ with probability $(1-α)$ and otherwise (i.e., with probability $α$) $x$ cutting its tie to $y$ in order to instead connect to a randomly chosen individual. Building on previous work about the two-opinion case, we study the multiple-opinion situation, finding that the model has infinitely many phase transitions. Moreover, the formulas describing the end states of these processes are remarkably simple when expressed as a function of $β= α/(1-α)$.
Submitted 29 March, 2013;
originally announced March 2013.
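The update rule described in this abstract can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' code: the function names `evolving_voter_step` and `simulate`, the choice of a uniformly random discordant edge, the coin flip deciding which endpoint plays the role of $x$, the random multigraph used as a starting network, and the tolerance of duplicate edges are all assumptions made here for brevity.

```python
import random


def evolving_voter_step(edges, opinion, alpha, rng):
    """Apply one update in place; return False if no discordant edge remains."""
    # Discordant edges join two individuals holding different opinions.
    discordant = [i for i, (u, v) in enumerate(edges) if opinion[u] != opinion[v]]
    if not discordant:
        return False
    i = rng.choice(discordant)      # assumption: uniform choice of discordant edge
    x, y = edges[i]
    if rng.random() < 0.5:          # assumption: either endpoint may act as x
        x, y = y, x
    if rng.random() < 1 - alpha:
        opinion[x] = opinion[y]     # x imitates y with probability 1 - alpha
    else:
        # With probability alpha, x cuts its tie to y and connects to a
        # randomly chosen individual (self-loops excluded; duplicate edges
        # are tolerated in this sketch).
        z = rng.randrange(len(opinion))
        while z == x:
            z = rng.randrange(len(opinion))
        edges[i] = (x, z)
    return True


def simulate(n, m, k, alpha, seed=0, max_steps=100_000):
    """Run from a random start: n individuals, m edges, k opinions."""
    rng = random.Random(seed)
    opinion = [rng.randrange(k) for _ in range(n)]
    edges = []
    while len(edges) < m:
        u, v = rng.randrange(n), rng.randrange(n)
        if u != v:
            edges.append((u, v))
    steps = 0
    while steps < max_steps and evolving_voter_step(edges, opinion, alpha, rng):
        steps += 1
    return opinion, edges, steps
```

A call such as `simulate(200, 400, 3, alpha=0.3)` returns the final opinions, the final edge list, and the number of updates performed. The number of edges is conserved by construction, since every update either changes one opinion or moves one endpoint of an existing edge.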
-
Dynamics on Modular Networks with Heterogeneous Correlations
Authors:
Sergey Melnik,
Mason A. Porter,
Peter J. Mucha,
James P. Gleeson
Abstract:
We develop a new ensemble of modular random graphs in which degree-degree correlations can be different in each module and the inter-module connections are defined by the joint degree-degree distribution of nodes for each pair of modules. We present an analytical approach that allows one to analyze several types of binary dynamics operating on such networks, and we illustrate our approach using bond percolation, site percolation, and the Watts threshold model. The new network ensemble generalizes existing models (e.g., the well-known configuration model and LFR networks) by allowing a heterogeneous distribution of degree-degree correlations across modules, which is important for the consideration of nonidentical interacting networks.
Submitted 4 February, 2014; v1 submitted 7 July, 2012;
originally announced July 2012.
-
Robust Detection of Dynamic Community Structure in Networks
Authors:
Danielle S. Bassett,
Mason A. Porter,
Nicholas F. Wymbs,
Scott T. Grafton,
Jean M. Carlson,
Peter J. Mucha
Abstract:
We describe techniques for the robust detection of community structure in some classes of time-dependent networks. Specifically, we consider the use of statistical null models for facilitating the principled identification of structural modules in semi-decomposable systems. Null models play an important role both in the optimization of quality functions such as modularity and in the subsequent assessment of the statistical validity of identified community structure. We examine the sensitivity of such methods to model parameters and show how comparisons to null models can help identify system scales. By considering a large number of optimizations, we quantify the variance of network diagnostics over optimizations (`optimization variance') and over randomizations of network structure (`randomization variance'). Because the modularity quality function typically has a large number of nearly-degenerate local optima for networks constructed using real data, we develop a method to construct representative partitions that uses a null model to correct for statistical noise in sets of partitions. To illustrate our results, we employ ensembles of time-dependent networks extracted from both nonlinear oscillators and empirical neuroscience data.
Submitted 12 April, 2013; v1 submitted 19 June, 2012;
originally announced June 2012.
-
Core-Periphery Structure in Networks
Authors:
M. Puck Rombach,
Mason A. Porter,
James H. Fowler,
Peter J. Mucha
Abstract:
Intermediate-scale (or `meso-scale') structures in networks have received considerable attention, as the algorithmic detection of such structures makes it possible to discover network features that are not apparent either at the local scale of nodes and edges or at the global scale of summary statistics. Numerous types of meso-scale structures can occur in networks, but investigations of such features have focused predominantly on the identification and study of community structure. In this paper, we develop a new method to investigate the meso-scale feature known as core-periphery structure, which entails identifying densely-connected core nodes and sparsely-connected periphery nodes. In contrast to communities, the nodes in a core are also reasonably well-connected to those in the periphery. Our new method of computing core-periphery structure can identify multiple cores in a network and takes different possible cores into account. We illustrate the differences between our method and several existing methods for identifying which nodes belong to a core, and we use our technique to examine core-periphery structure in examples of friendship, collaboration, transportation, and voting networks.
Submitted 2 April, 2013; v1 submitted 13 February, 2012;
originally announced February 2012.
-
Social Structure of Facebook Networks
Authors:
Amanda L. Traud,
Peter J. Mucha,
Mason A. Porter
Abstract:
We study the social structure of Facebook "friendship" networks at one hundred American colleges and universities at a single point in time, and we examine the roles of user attributes - gender, class year, major, high school, and residence - at these institutions. We investigate the influence of common attributes at the dyad level in terms of assortativity coefficients and regression models. We then examine larger-scale groupings by detecting communities algorithmically and comparing them to network partitions based on the user characteristics. We thereby compare the relative importances of different characteristics at different institutions, finding for example that common high school is more important to the social organization of large institutions and that the importance of common major varies significantly between institutions. Our calculations illustrate how microscopic and macroscopic perspectives give complementary insights on the social organization at universities and suggest future studies to investigate such phenomena further.
Submitted 10 February, 2011;
originally announced February 2011.
-
Accuracy of Mean-Field Theory for Dynamics on Real-World Networks
Authors:
James P. Gleeson,
Sergey Melnik,
Jonathan A. Ward,
Mason A. Porter,
Peter J. Mucha
Abstract:
Mean-field analysis is an important tool for understanding dynamics on complex networks. However, surprisingly little attention has been paid to the question of whether mean-field predictions are accurate, and this is particularly true for real-world networks with clustering and modular structure. In this paper, we compare mean-field predictions to numerical simulation results for dynamical processes running on 21 real-world networks and demonstrate that the accuracy of the theory depends not only on the mean degree of the networks but also on the mean first-neighbor degree. We show that mean-field theory can give (unexpectedly) accurate results for certain dynamics on disassortative real-world networks even when the mean degree is as low as 4.
Submitted 4 January, 2012; v1 submitted 16 November, 2010;
originally announced November 2010.
-
Community Structure in the United Nations General Assembly
Authors:
Kevin T. Macon,
Peter J. Mucha,
Mason A. Porter
Abstract:
We study the community structure of networks representing voting on resolutions in the United Nations General Assembly. We construct networks from the voting records of the separate annual sessions between 1946 and 2008 in three different ways: (1) by considering voting similarities as weighted unipartite networks; (2) by considering voting similarities as weighted, signed unipartite networks; and (3) by examining signed bipartite networks in which countries are connected to resolutions. For each formulation, we detect communities by optimizing network modularity using an appropriate null model. We compare and contrast the results that we obtain for these three different network representations. In so doing, we illustrate the need to consider multiple resolution parameters and explore the effectiveness of each network representation for identifying voting groups amidst the large amount of agreement typical in General Assembly votes.
Submitted 18 October, 2010;
originally announced October 2010.
-
Communities in Networks
Authors:
Mason A. Porter,
Jukka-Pekka Onnela,
Peter J. Mucha
Abstract:
We survey some of the concepts, methods, and applications of community detection, which has become an increasingly important area of network science. To help ease newcomers into the field, we provide a guide to available methodology and open problems, and discuss why scientists from diverse backgrounds are interested in these problems. As a running theme, we emphasize the connections of community detection to problems in statistical physics and computational optimization.
Submitted 9 September, 2009; v1 submitted 22 February, 2009;
originally announced February 2009.
-
Community Structure in the United States House of Representatives
Authors:
Mason A. Porter,
Peter J. Mucha,
M. E. J. Newman,
A. J. Friend
Abstract:
We investigate the networks of committee and subcommittee assignments in the United States House of Representatives from the 101st--108th Congresses, with the committees connected by ``interlocks'' or common membership. We examine the community structure in these networks using several methods, revealing strong links between certain committees as well as an intrinsic hierarchical structure in the House as a whole. We identify structural changes, including additional hierarchical levels and higher modularity, resulting from the 1994 election, in which the Republican party earned majority status in the House for the first time in more than forty years. We also combine our network approach with analysis of roll call votes using singular value decomposition to uncover correlations between the political and organizational structure of House committees.
Submitted 19 July, 2007; v1 submitted 4 February, 2006;
originally announced February 2006.
-
A network analysis of committees in the United States House of Representatives
Authors:
Mason A. Porter,
Peter J. Mucha,
M. E. J. Newman,
Casey M. Warmbrand
Abstract:
Network theory provides a powerful tool for the representation and analysis of complex systems of interacting agents. Here we investigate the United States House of Representatives network of committees and subcommittees, with committees connected according to ``interlocks'' or common membership. Analysis of this network reveals clearly the strong links between different committees, as well as the intrinsic hierarchical structure within the House as a whole. We show that network theory, combined with the analysis of roll call votes using singular value decomposition, successfully uncovers political and organizational correlations between committees in the House without the need to incorporate other political information.
Submitted 17 May, 2005;
originally announced May 2005.