-
The simpliciality of higher-order networks
Authors:
Nicholas W. Landry,
Jean-Gabriel Young,
Nicole Eikmeier
Abstract:
Higher-order networks are widely used to describe complex systems in which interactions can involve more than two entities at once. In this paper, we focus on inclusion within higher-order networks, referring to situations where specific entities participate in an interaction, and subsets of those entities also interact with each other. Traditional modeling approaches to higher-order networks tend to either not consider inclusion at all (e.g., hypergraph models) or explicitly assume perfect and complete inclusion (e.g., simplicial complex models). To allow for a more nuanced assessment of inclusion in higher-order networks, we introduce the concept of "simpliciality" and several corresponding measures. Contrary to current modeling practice, we show that empirically observed systems rarely lie at either end of the simpliciality spectrum. In addition, we show that generative models fitted to these datasets struggle to capture their inclusion structure. These findings suggest new modeling directions for the field of higher-order network science.
Submitted 7 March, 2024; v1 submitted 26 August, 2023;
originally announced August 2023.
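One way to make the idea of inclusion in the abstract above concrete is sketched below. This is only an illustration under stated assumptions: hyperedges are given as sets of node labels, and the hypothetical function `inclusion_fraction` reports the share of hyperedges with three or more nodes whose every sub-interaction of two or more nodes is also present. It is not claimed to match the simpliciality measures defined in the paper.

```python
from itertools import combinations


def is_fully_included(edge, edge_set, min_size=2):
    """Return True if every proper subset of `edge` with at least
    `min_size` nodes is itself a hyperedge in `edge_set`."""
    nodes = tuple(edge)
    for k in range(min_size, len(nodes)):
        for subset in combinations(nodes, k):
            if frozenset(subset) not in edge_set:
                return False
    return True


def inclusion_fraction(hyperedges, min_size=2):
    """Fraction of hyperedges larger than `min_size` whose sub-interactions
    are all present (hypothetical illustrative measure).

    Returns NaN when no hyperedge is large enough to be tested."""
    edge_set = {frozenset(e) for e in hyperedges}
    candidates = [e for e in edge_set if len(e) > min_size]
    if not candidates:
        return float("nan")
    hits = sum(is_fully_included(e, edge_set, min_size) for e in candidates)
    return hits / len(candidates)


# {1, 2, 3} contains all of its pairs; {3, 4, 5} contains none of them.
edges = [{1, 2, 3}, {1, 2}, {1, 3}, {2, 3}, {3, 4, 5}]
print(inclusion_fraction(edges))  # 0.5
```

A value of 1 corresponds to the simplicial-complex assumption of complete inclusion, and intermediate values correspond to the partial inclusion the abstract reports for empirical systems.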
-
Nonbacktracking spectral clustering of nonuniform hypergraphs
Authors:
Philip Chodrow,
Nicole Eikmeier,
Jamie Haddock
Abstract:
Spectral methods offer a tractable, global framework for clustering in graphs via eigenvector computations on graph matrices. Hypergraph data, in which entities interact on edges of arbitrary size, poses challenges for matrix representations and therefore for spectral clustering. We study spectral clustering for nonuniform hypergraphs based on the hypergraph nonbacktracking operator. After reviewing the definition of this operator and its basic properties, we prove a theorem of Ihara-Bass type which allows eigenpair computations to take place on a smaller matrix, often enabling faster computation. We then propose an alternating algorithm for inference in a hypergraph stochastic blockmodel via linearized belief-propagation which involves a spectral clustering step again using nonbacktracking operators. We provide proofs related to this algorithm that both formalize and extend several previous results. We pose several conjectures about the limits of spectral methods and detectability in hypergraph stochastic blockmodels in general, supporting these with in-expectation analysis of the eigenpairs of our studied operators. We perform experiments on real and synthetic data that demonstrate the benefits of hypergraph methods over graph-based ones when interactions of different sizes carry different information about cluster structure.
Submitted 3 September, 2022; v1 submitted 26 April, 2022;
originally announced April 2022.
-
Modeling COVID-19 Spread in Small Colleges
Authors:
Riti Bahl,
Nicole Eikmeier,
Alexandra Fraser,
Matthew Junge,
Felicia Keesing,
Kukai Nakahata,
Lily Z. Wang
Abstract:
We develop an agent-based model on a network meant to capture features unique to COVID-19 spread through a small residential college. We find that a safe reopening requires strong policy from administrators combined with cautious behavior from students. Strong policy includes weekly screening tests with quick turnaround and halving the campus population. Cautious behavior from students means wearing facemasks, socializing less, and showing up for COVID-19 testing. We also find that comprehensive testing and facemasks are the most effective single interventions, building closures can lead to infection spikes in other areas depending on student behavior, and faster return of test results significantly reduces total infections.
Submitted 21 August, 2020;
originally announced August 2020.
-
Emergence of Hierarchy in Networked Endorsement Dynamics
Authors:
Mari Kawakatsu,
Philip S. Chodrow,
Nicole Eikmeier,
Daniel B. Larremore
Abstract:
Many social and biological systems are characterized by enduring hierarchies, including those organized around prestige in academia, dominance in animal groups, and desirability in online dating. Despite their ubiquity, the general mechanisms that explain the creation and endurance of such hierarchies are not well understood. We introduce a generative model for the dynamics of hierarchies using time-varying networks in which new links are formed based on the preferences of nodes in the current network and old links are forgotten over time. The model produces a range of hierarchical structures, ranging from egalitarianism to bistable hierarchies, and we derive critical points that separate these regimes in the limit of long system memory. Importantly, our model supports statistical inference, allowing for a principled comparison of generative mechanisms using data. We apply the model to study hierarchical structures in empirical data on hiring patterns among mathematicians, dominance relations among parakeets, and friendships among members of a fraternity, observing several persistent patterns as well as interpretable differences in the generative mechanisms favored by each. Our work contributes to the growing literature on statistically grounded models of time-varying networks.
Submitted 7 May, 2021; v1 submitted 8 July, 2020;
originally announced July 2020.
-
On Large-Scale Dynamic Topic Modeling with Nonnegative CP Tensor Decomposition
Authors:
Miju Ahn,
Nicole Eikmeier,
Jamie Haddock,
Lara Kassab,
Alona Kryshchenko,
Kathryn Leonard,
Deanna Needell,
R. W. M. A. Madushani,
Elena Sizikova,
Chuntian Wang
Abstract:
There is currently an unprecedented demand for large-scale temporal data analysis due to the explosive growth of data. Dynamic topic modeling has been widely used in social and data sciences with the goal of learning latent topics that emerge, evolve, and fade over time. Previous work on dynamic topic modeling primarily employs the method of nonnegative matrix factorization (NMF), where slices of the data tensor are each factorized into the product of lower-dimensional nonnegative matrices. With this approach, however, information contained in the temporal dimension of the data is often neglected or underutilized. To overcome this issue, we propose instead adopting the method of nonnegative CANDECOMP/PARAFAC (CP) tensor decomposition (NNCPD), where the data tensor is directly decomposed into a minimal sum of outer products of nonnegative vectors, thereby preserving the temporal information. The viability of NNCPD is demonstrated through application to both synthetic and real data, where significantly improved results are obtained compared to those of typical NMF-based methods. The advantages of NNCPD over such approaches are studied and discussed. To the best of our knowledge, this is the first time that NNCPD has been utilized for the purpose of dynamic topic modeling, and our findings will be transformative for both applications and further developments.
Submitted 14 October, 2020; v1 submitted 2 January, 2020;
originally announced January 2020.
-
Centrality in dynamic competition networks
Authors:
Anthony Bonato,
Nicole Eikmeier,
David F. Gleich,
Rehan Malik
Abstract:
Competition networks are formed via adversarial interactions between actors. The Dynamic Competition Hypothesis predicts that influential actors in competition networks should have a large number of common out-neighbors with many other nodes. We empirically study this idea as a centrality score and find the measure predictive of importance in several real-world networks including food webs, conflict networks, and voting data from Survivor.
Submitted 15 September, 2019;
originally announced September 2019.
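The centrality idea in the abstract above, counting common out-neighbors with other nodes, can be sketched as follows. This is a minimal illustration that assumes the network is given as a list of directed `(source, target)` pairs; the function name `common_out_neighbor_scores` is hypothetical, and the exact definition and normalization used in the paper may differ.

```python
def common_out_neighbor_scores(edges):
    """For each node u, sum over all other nodes v the number of
    out-neighbors that u and v share (illustrative scoring only)."""
    out = {}
    for u, v in edges:
        out.setdefault(u, set()).add(v)
        out.setdefault(v, set())  # make sure sink nodes are scored too
    return {
        u: sum(len(out[u] & out[v]) for v in out if v != u)
        for u in out
    }


# "a" and "b" both target "c" and "d"; "c" targets only "d".
edges = [("a", "c"), ("b", "c"), ("a", "d"), ("b", "d"), ("c", "d")]
print(common_out_neighbor_scores(edges))
# {'a': 3, 'c': 2, 'b': 3, 'd': 0}
```

In this toy network, "a" and "b" score highest because they share both of their targets with each other and one target with "c".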
-
Chase-escape with death on trees
Authors:
Erin Beckman,
Keisha Cook,
Nicole Eikmeier,
Sarai Hernandez-Torres,
Matthew Junge
Abstract:
Chase-escape is a competitive growth process in which red particles spread to adjacent uncolored sites, while blue particles overtake adjacent red particles. We introduce the variant in which red particles die and describe the phase diagram for the resulting process on infinite d-ary trees. A novel connection to weighted Catalan numbers makes it possible to characterize the critical behavior.
Submitted 26 May, 2020; v1 submitted 4 September, 2019;
originally announced September 2019.
-
Triangle Preferential Attachment Has Power-law Degrees and Eigenvalues; Eigenvalues Are More Stable to Network Sampling
Authors:
Nicole Eikmeier,
David F. Gleich
Abstract:
Preferential attachment models are a common class of graph models which have been used to explain why power-law distributions appear in the degree sequences of real network data. One of the things they lack, however, is higher-order network clustering, including non-trivial clustering coefficients. In this paper we present a specific Triangle Generalized Preferential Attachment Model (TGPA) that, by construction, has nontrivial clustering. We further prove that this model has a power-law in both the degree distribution and eigenvalue spectra. We use this model to investigate a recent finding that power-laws are more reliably observed in the eigenvalue spectra of real-world networks than in their degree distribution. One conjectured explanation for this is that the spectrum of the graph is more robust than the degree distribution to the various sampling strategies that would have been employed to collect the real-world data. Consequently, we generate random TGPA models that provably have a power-law in both, and sample subgraphs via forest fire, depth-first, and random edge models. We find that the samples show a power-law in the spectra even when only 30% of the network is seen, whereas there is a large chance that the degrees will not show a power-law. Our TGPA model shows this behavior much more clearly than a standard preferential attachment model. This provides one possible explanation for why power-laws may be seen frequently in the spectra of real-world data.
Submitted 29 April, 2019;
originally announced April 2019.
-
The HyperKron Graph Model for higher-order features
Authors:
Nicole Eikmeier,
Arjun S. Ramani,
David F. Gleich
Abstract:
Graph models have long been used in lieu of real data which can be expensive and hard to come by. A common class of models constructs a matrix of probabilities, and samples an adjacency matrix by flipping a weighted coin for each entry. Examples include the Erdős-Rényi model, Chung-Lu model, and the Kronecker model. Here we present the HyperKron Graph model: an extension of the Kronecker Model, but with a distribution over hyperedges. We prove that we can efficiently generate graphs from this model in order proportional to the number of edges times a small log-factor, and find that in practice the runtime is linear with respect to the number of edges. We illustrate a number of useful features of the HyperKron model including non-trivial clustering and highly skewed degree distributions. Finally, we fit the HyperKron model to real-world networks, and demonstrate the model's flexibility with a complex application of the HyperKron model to networks with coherent feed-forward loops.
Submitted 10 September, 2018;
originally announced September 2018.
-
Dynamic Competition Networks: detecting alliances and leaders
Authors:
Anthony Bonato,
Nicole Eikmeier,
David F. Gleich,
Rehan Malik
Abstract:
We consider social networks of competing agents that evolve dynamically over time. Such dynamic competition networks are directed, where a directed edge from node $u$ to node $v$ corresponds to a negative social interaction. We present a novel hypothesis that serves as a predictive tool to uncover alliances and leaders within dynamic competition networks. Our focus in the present study is to validate it on competitive networks arising from social game shows such as Survivor and Big Brother.
Submitted 11 April, 2018; v1 submitted 1 February, 2018;
originally announced March 2018.
-
Coin-flipping, ball-dropping, and grass-hopping for generating random graphs from matrices of edge probabilities
Authors:
Arjun S. Ramani,
Nicole Eikmeier,
David F. Gleich
Abstract:
Common models for random graphs, such as Erdős-Rényi and Kronecker graphs, correspond to generating random adjacency matrices where each entry is non-zero based on a large matrix of probabilities. Generating an instance of a random graph based on these models is easy, although inefficient, by flipping biased coins (i.e. sampling binomial random variables) for each possible edge. This process is inefficient because most large graph models correspond to sparse graphs where the vast majority of coin flips will result in no edges. We describe some not-entirely-well-known, but not-entirely-unknown, techniques that will enable us to sample a graph by finding only the coin flips that will produce edges. Our analogies for these procedures are ball-dropping, which is easier to implement, but may need extra work due to duplicate edges, and grass-hopping, which results in no duplicated work or extra edges.
Grass-hopping does this using geometric random variables. In order to use this idea on complex probability matrices such as those in Kronecker graphs, we decompose the problem into three steps, each of which is an independently useful computational primitive: (i) enumerating non-decreasing sequences, (ii) unranking multiset permutations, and (iii) decoding and encoding z-curve and Morton codes and permutations. The third step is the result of a new connection between repeated Kronecker product operations and Morton codes. Throughout, we draw connections to ideas underlying applied math and computer science including coupon collector problems.
Submitted 11 September, 2017;
originally announced September 2017.
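The geometric-skipping idea in the abstract above can be sketched for the simplest case, an Erdős-Rényi probability matrix where every entry has the same probability `p`. This is a minimal illustration under stated assumptions, not the paper's Kronecker procedure: the function name `grass_hop_er` is hypothetical, the graph is treated as directed with self-loops allowed (every one of the `n * n` matrix entries is a possible edge), and the gap to the next successful coin flip is drawn by inverse-transform sampling of a geometric random variable.

```python
import math
import random


def grass_hop_er(n, p, rng=None):
    """Sample the non-zero entries of an n-by-n adjacency matrix in which
    each entry is 1 independently with probability p, visiting only the
    entries that become edges (hypothetical helper, Erdős-Rényi case only)."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be in [0, 1]")
    if p == 0.0:
        return []
    rng = rng or random.Random()
    total = n * n
    edges = []
    idx = -1  # linear index of the last successful coin flip
    while True:
        if p == 1.0:
            gap = 1  # every flip succeeds
        else:
            u = 1.0 - rng.random()  # uniform on (0, 1], avoids log(0)
            # Geometric gap on {1, 2, ...}: P(gap > k) = (1 - p) ** k
            gap = 1 + int(math.floor(math.log(u) / math.log1p(-p)))
        idx += gap
        if idx >= total:
            break
        edges.append(divmod(idx, n))  # linear index -> (row, column)
    return edges


sample = grass_hop_er(1000, 0.001, random.Random(0))
print(len(sample))  # close to the expected n * n * p = 1000 edges
```

The loop runs once per generated edge (plus one final hop past the end), so the work scales with the number of edges rather than with the `n * n` coin flips of the naive approach, and no duplicate edges can arise because the index only moves forward.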