-
Higher-Order Patterns Reveal Causal Timescales of Complex Systems
Authors:
Luka V. Petrović,
Anatol Wegner,
Ingo Scholtes
Abstract:
The analysis of temporal networks depends heavily on the analysis of time-respecting paths. However, before we can model and analyze time-respecting paths, we have to infer the timescales at which temporal edges influence each other. In this work we introduce temporal path entropy, an information-theoretic measure of temporal networks, with the aim of detecting the timescales at which causal influences occur in temporal networks. The measure can be applied to a temporal network as a whole, or separately to each node. We find that the temporal path entropy has a non-trivial dependency on the causal timescales of synthetic and empirical temporal networks. Furthermore, we observe in both synthetic and empirical data that the temporal path entropy tends to decrease at timescales that correspond to the causal interactions. Our results imply that timescales relevant for the dynamics of complex systems can be detected in the temporal networks themselves, by measuring temporal path entropy. This is crucial for the analysis of temporal networks where inherent timescales are unavailable and hard to measure.
Submitted 27 January, 2023;
originally announced January 2023.
-
Bayesian Detection of Mesoscale Structures in Pathway Data on Graphs
Authors:
Luka V. Petrović,
Vincenzo Perri
Abstract:
Mesoscale structures are an integral part of the abstraction and analysis of complex systems. They reveal a node's function in the network, and facilitate our understanding of the network dynamics. For example, they can represent communities in social or citation networks, roles in corporate interactions, or core-periphery structures in transportation networks. We usually detect mesoscale structures under the assumption of independence of interactions. Still, in many cases, the interactions invalidate this assumption by occurring in a specific order. Such patterns emerge in pathway data; to capture them, we have to model the dependencies between interactions using higher-order network models. However, the detection of mesoscale structures in higher-order networks is still under-researched. In this work, we derive a Bayesian approach that simultaneously models the optimal partitioning of nodes in groups and the optimal higher-order network dynamics between the groups. In synthetic data we demonstrate that our method can recover both standard proximity-based communities and role-based groupings of nodes. In synthetic and real world data we show that it can compete with baseline techniques, while additionally providing interpretable abstractions of network dynamics.
Submitted 16 January, 2023;
originally announced January 2023.
-
Learning the Markov order of paths in a network
Authors:
Luka V. Petrović,
Ingo Scholtes
Abstract:
We study the problem of learning the Markov order in categorical sequences that represent paths in a network, i.e. sequences of variable lengths where transitions between states are constrained to a known graph. Such data pose challenges for standard Markov order detection methods and demand modelling techniques that explicitly account for the graph constraint. Adopting a multi-order modelling framework for paths, we develop a Bayesian learning technique that (i) more reliably detects the correct Markov order compared to a competing method based on the likelihood ratio test, (ii) requires considerably less data compared to methods using AIC or BIC, and (iii) is robust against partial knowledge of the underlying constraints. We further show that a recently published method that uses a likelihood ratio test has a tendency to overfit the true Markov order of paths, which is not the case for our Bayesian technique. Our method is important for data scientists analyzing patterns in categorical sequence data that are subject to (partially) known constraints, e.g. sequences with forbidden words, mobility trajectories and click stream data, or sequence data in bioinformatics. Addressing the key challenge of model selection, our work is further relevant for the growing body of research that emphasizes the need for higher-order models in network analysis.
Submitted 6 July, 2020;
originally announced July 2020.
-
Counting Causal Paths in Big Times Series Data on Networks
Authors:
Luka V. Petrovic,
Ingo Scholtes
Abstract:
Graph or network representations are an important foundation for data mining and machine learning tasks in relational data. Many tools of network analysis, like centrality measures, information ranking, or cluster detection, rest on the assumption that links capture direct influence, and that paths represent possible indirect influence. This assumption is invalidated in time-stamped network data capturing, e.g., dynamic social networks, biological sequences or financial transactions. In such data, for two time-stamped links (A,B) and (B,C), the chronological ordering and timing determine whether a causal path from node A via B to C exists. A number of works have shown that, for this reason, network analysis cannot be directly applied to time-stamped network data. Existing methods to address this issue require statistics on causal paths, which are computationally challenging to obtain for big data sets.
Addressing this problem, we develop an efficient algorithm to count causal paths in time-stamped network data. Applying it to empirical data, we show that our method is more efficient than a baseline method implemented in an open-source data analytics package. Our method works efficiently for different values of the maximum time difference between consecutive links of a causal path and supports streaming scenarios. With it, we close a gap that hinders an efficient analysis of big time series data on complex networks.
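To make the notion of a causal path concrete, the following is a minimal Python sketch that counts causal paths of length two by brute force. It is not the efficient algorithm described in the abstract. The function name `count_causal_paths_len2`, the `(source, target, timestamp)` edge format, and the rule that the second link must occur strictly after the first and at most `delta` time units later are assumptions made for illustration.

```python
from collections import defaultdict


def count_causal_paths_len2(edges, delta):
    """Count causal paths A -> B -> C in a list of time-stamped edges.

    edges: list of (source, target, timestamp) tuples (assumed format)
    delta: assumed maximum time difference between consecutive links
    """
    # Index outgoing links by their source node.
    out_links = defaultdict(list)
    for u, v, t in edges:
        out_links[u].append((v, t))

    counts = defaultdict(int)
    for a, b, t1 in edges:
        for c, t2 in out_links[b]:
            # Assumption: the second link must occur strictly after the
            # first one, and at most delta time units later.
            if 0 < t2 - t1 <= delta:
                counts[(a, b, c)] += 1
    return dict(counts)


# Hypothetical toy data, not taken from the paper.
edges = [
    ("A", "B", 1),
    ("B", "C", 3),
    ("B", "C", 10),
    ("B", "D", 2),
    ("C", "A", 4),
]
paths = count_causal_paths_len2(edges, delta=2)
```

With `delta=2`, the link (B,C) at time 10 does not continue the link (A,B) at time 1, because it arrives too late. This nested loop touches every pair of adjacent links, which is the kind of cost that becomes prohibitive on big data sets.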
Submitted 27 May, 2019;
originally announced May 2019.