-
Inference of Edge Correlations in Multilayer Networks
Authors:
A. Roxana Pamfil,
Sam D. Howison,
Mason A. Porter
Abstract:
Many recent developments in network analysis have focused on multilayer networks, which one can use to encode time-dependent interactions, multiple types of interactions, and other complications that arise in complex systems. Like their monolayer counterparts, multilayer networks in applications often have mesoscale features, such as community structure. A prominent type of method for inferring such structures is the employment of multilayer stochastic block models (SBMs). A common (but potentially inadequate) assumption of these models is the sampling of edges in different layers independently, conditioned on the community labels of the nodes. In this paper, we relax this assumption of independence by incorporating edge correlations into an SBM-like model. We derive maximum-likelihood estimates of the key parameters of our model, and we propose a measure of layer correlation that reflects the similarity between connectivity patterns in different layers. Finally, we explain how to use correlated models for edge "prediction" (i.e., inference) in multilayer networks. By taking into account edge correlations, prediction accuracy improves both in synthetic networks and in a temporal network of shoppers who are connected to previously-purchased grocery products.
Submitted 8 September, 2020; v1 submitted 11 August, 2019;
originally announced August 2019.
-
Customer mobility and congestion in supermarkets
Authors:
Fabian Ying,
Alisdair O. G. Wallis,
Mariano Beguerisse-Díaz,
Mason A. Porter,
Sam D. Howison
Abstract:
The analysis and characterization of human mobility using population-level mobility models is important for numerous applications, ranging from the estimation of commuter flows in cities to modeling trade flows between countries. However, almost all of these applications have focused on large spatial scales, which typically range from intra-city scales to inter-country scales. In this paper, we investigate population-level human mobility models on a much smaller spatial scale by using them to estimate customer mobility flow between supermarket zones. We use anonymized, ordered customer-basket data to infer empirical mobility flow in supermarkets, and we apply variants of the gravity and intervening-opportunities models to fit this mobility flow and estimate the flow on unseen data. We find that a doubly-constrained gravity model and an extended radiation model (which is a type of intervening-opportunities model) can successfully estimate 65–70% of the flow inside supermarkets. Using a gravity model as a case study, we then investigate how to reduce congestion in supermarkets using mobility models. We model each supermarket zone as a queue, and we use a gravity model to identify store layouts with low congestion, which we measure either by the maximum number of visits to a zone or by the total mean queue size. We then use a simulated-annealing algorithm to find store layouts with lower congestion than a supermarket's original layout. In these optimized store layouts, we find that popular zones are often in the perimeter of a store. Our research gives insight both into how customers move in supermarkets and into how retailers can arrange stores to reduce congestion. It also provides a case study of human mobility on small spatial scales.
Submitted 26 September, 2019; v1 submitted 30 May, 2019;
originally announced May 2019.
-
Pull out all the stops: Textual analysis via punctuation sequences
Authors:
Alexandra N. M. Darmon,
Marya Bazzi,
Sam D. Howison,
Mason A. Porter
Abstract:
Whether enjoying the lucid prose of a favorite author or slogging through some other writer's cumbersome, heavy-set prattle (full of parentheses, em dashes, compound adjectives, and Oxford commas), readers will notice stylistic signatures not only in word choice and grammar, but also in punctuation itself. Indeed, visual sequences of punctuation from different authors produce marvelously different (and visually striking) sequences. Punctuation is a largely overlooked stylistic feature in "stylometry", the quantitative analysis of written text. In this paper, we examine punctuation sequences in a corpus of literary documents and ask the following questions: Are the properties of such sequences a distinctive feature of different authors? Is it possible to distinguish literary genres based on their punctuation sequences? Do the punctuation styles of authors evolve over time? Are we on to something interesting in trying to do stylometry without words, or are we full of sound and fury (signifying nothing)?
Submitted 16 January, 2020; v1 submitted 31 December, 2018;
originally announced January 2019.
-
Relating modularity maximization and stochastic block models in multilayer networks
Authors:
A. Roxana Pamfil,
Sam D. Howison,
Renaud Lambiotte,
Mason A. Porter
Abstract:
Characterizing large-scale organization in networks, including multilayer networks, is one of the most prominent topics in network science and is important for many applications. One type of mesoscale feature is community structure, in which sets of nodes are densely connected internally but sparsely connected to other dense sets of nodes. Two of the most popular approaches for community detection are to maximize an objective function called "modularity" and to perform statistical inference using stochastic block models. Generalizing work by Newman on monolayer networks (Physical Review E 94, 052315), we show in multilayer networks that maximizing modularity is equivalent, under certain conditions, to maximizing the posterior probability of community assignments under a suitably chosen stochastic block model. We derive versions of this equivalence for various types of multilayer structure, including temporal, multiplex, and multilevel networks. We consider cases in which the key parameters are constant, as well as ones in which they vary across layers; in the latter case, this yields a novel, layer-weighted version of the modularity function. Our results also help address a longstanding difficulty of multilayer modularity-maximization algorithms, which require the specification of two sets of tuning parameters that have been difficult to choose in practice. We show how to perform this parameter selection in a statistically-grounded way, and we demonstrate the effectiveness of our approach on both synthetic and empirical networks.
Submitted 6 December, 2018; v1 submitted 5 April, 2018;
originally announced April 2018.
-
The Role of Network Analysis in Industrial and Applied Mathematics
Authors:
Mason A. Porter,
Sam D. Howison
Abstract:
Many problems in industry (and in the social, natural, information, and medical sciences) involve discrete data and benefit from approaches from subjects such as network science, information theory, optimization, probability, and statistics. The study of networks is concerned explicitly with connectivity between different entities, and it has become very prominent in industrial settings, an importance that has intensified amidst the modern data deluge. In this commentary, we discuss the role of network analysis in industrial and applied mathematics, and we give several examples of network science in industry. We focus, in particular, on discussing a physical-applied-mathematics approach to the study of networks. We also discuss several of our own collaborations with industry on projects in network analysis.
Submitted 6 August, 2018; v1 submitted 20 March, 2017;
originally announced March 2017.
-
A Framework for the Construction of Generative Models for Mesoscale Structure in Multilayer Networks
Authors:
Marya Bazzi,
Lucas G. S. Jeub,
Alex Arenas,
Sam D. Howison,
Mason A. Porter
Abstract:
Multilayer networks allow one to represent diverse and coupled connectivity patterns (e.g., time-dependence, multiple subsystems, or both) that arise in many applications and which are difficult or awkward to incorporate into standard network representations. In the study of multilayer networks, it is important to investigate mesoscale (i.e., intermediate-scale) structures, such as dense sets of nodes known as communities, to discover network features that are not apparent at the microscale or the macroscale. The ill-defined nature of mesoscale structure and its ubiquity in empirical networks make it crucial to develop generative models that can produce the features that one encounters in empirical networks. Key purposes of such generative models include generating synthetic networks with empirical properties of interest, benchmarking mesoscale-detection methods and algorithms, and inferring structure in empirical multilayer networks. In this paper, we introduce a framework for the construction of generative models for mesoscale structures in multilayer networks. Our framework provides a standardized set of generative models, together with an associated set of principles from which they are derived, for studies of mesoscale structures in multilayer networks. It unifies and generalizes many existing models for mesoscale structures in fully-ordered (e.g., temporal) and unordered (e.g., multiplex) multilayer networks. One can also use it to construct generative models for mesoscale structures in partially-ordered multilayer networks (e.g., networks that are both temporal and multiplex). Our framework has the ability to produce many features of empirical multilayer networks, and it explicitly incorporates a user-specified dependency structure between layers.
Submitted 11 December, 2019; v1 submitted 22 August, 2016;
originally announced August 2016.
-
Community detection in temporal multilayer networks, with an application to correlation networks
Authors:
Marya Bazzi,
Mason A. Porter,
Stacy Williams,
Mark McDonald,
Daniel J. Fenn,
Sam D. Howison
Abstract:
Networks are a convenient way to represent complex systems of interacting entities. Many networks contain "communities" of nodes that are more densely connected to each other than to nodes in the rest of the network. In this paper, we investigate the detection of communities in temporal networks represented as multilayer networks. As a focal example, we study time-dependent financial-asset correlation networks. We first argue that the use of the "modularity" quality function (which is defined by comparing edge weights in an observed network to expected edge weights in a "null network") is application-dependent. We differentiate between "null networks" and "null models" in our discussion of modularity maximization, and we highlight that the same null network can correspond to different null models. We then investigate a multilayer modularity-maximization problem to identify communities in temporal networks. Our multilayer analysis only depends on the form of the maximization problem and not on the specific quality function that one chooses. We introduce a diagnostic to measure persistence of community structure in a multilayer network partition. We prove several results that describe how the multilayer maximization problem measures a trade-off between static community structure within layers and larger values of persistence across layers. We also discuss some computational issues that the popular "Louvain" heuristic faces with temporal multilayer networks and suggest ways to mitigate them.
Submitted 24 December, 2017; v1 submitted 30 December, 2014;
originally announced January 2015.