Kairos: Efficient Temporal Graph Analytics
on a Single Machine

Joana M. F. da Trindade (MIT CSAIL), Julian Shun (MIT CSAIL), Samuel Madden (MIT CSAIL), and Nesime Tatbul (Intel Labs / MIT CSAIL)

Abstract.

Many important societal problems are naturally modeled as algorithms over temporal graphs. To date, however, most graph processing systems remain inefficient as they rely on distributed processing even for graphs that fit well within a commodity server’s available storage. In this paper, we introduce Kairos, a temporal graph analytics system that provides application developers a framework for efficiently implementing and executing algorithms over temporal graphs on a single machine. Specifically, Kairos relies on fork-join parallelism and a highly optimized parallel data structure as core primitives to maximize performance of graph processing tasks needed for temporal graph analytics. Furthermore, we introduce the notion of selective indexing and show how it can be used with an efficient index to speed up temporal queries. Our experiments on a 24-core server show that our algorithms obtain good parallel speedups, and are significantly faster than equivalent algorithms in existing temporal graph processing systems: up to 60x against a shared-memory approach, and several orders of magnitude when compared with distributed processing of graphs that fit within a single server.


1. Introduction

The growing demand for temporal graph applications has given rise to new challenges in temporal graph analytics. As an increasing number of real-world systems and processes can be modeled as temporal graphs, the need for effective analysis tools and techniques has become more pressing. These applications range from social networks and communication systems to transportation networks and biological systems, where understanding the temporal dynamics is crucial for uncovering meaningful insights and patterns (TemporalNetworks, 1, 2, 3, 4).

Temporal graphs offer a unique perspective as they can capture the dynamics of interactions and relationships over time, which non-temporal graphs are unable to provide. Furthermore, temporal graphs lend themselves to the exploration of time-ordered events and the impact of such sequences in the network, creating an opportunity for more nuanced insights. For instance, being able to track the chronology of friendship formations on a social network or communication events can enhance our understanding of behavioral patterns. Similarly, the ability to observe the evolution of transportation routes or the progression of a biological system over time can provide critical data for predictive modeling and decision-making.

Figure 1. An example temporal graph over vertices $\{a,b,c,d,e,f,g\}$. Each edge is associated with a time interval (start, end) denoting its validity.

However, the temporal dynamics inherent in these graphs also bring about new challenges. Existing graph frameworks and query systems frequently encounter difficulties when handling the graph processing tasks required in temporal graph analytics applications. There are three underlying reasons. First, many of these systems were primarily designed for traditional graph processing, and as a result, they are not well-equipped to handle the unique characteristics and requirements of temporal graphs. This shortcoming leads to suboptimal performance and a limited ability to fully leverage the available temporal information in the data. Second, some systems (ICM, 5, 6, 7) that specifically target temporal graph processing rely on Pregel-like distributed computation models. These models can be highly inefficient due to the message passing overhead across servers in a cluster, especially when the input graph fits comfortably within the memory resources of a single commodity machine. Finally, existing shared-memory temporal graph processing systems (TeGraph, 8) do not have these limitations, but still rely on expensive pre-processing of the graph that increases the original size of the dataset, a step which is not necessary for correctness of the target algorithms.

Temporal graph applications typically require querying small time slices of data. While existing systems can represent time as an attribute of nodes and edges, filtering by time necessitates either an expensive scan or the use of a range index, resulting in poor performance. Furthermore, traditional graph processing frameworks lack support for temporal algorithms, leading to complex and costly implementations as they require additional programming effort.

Temporal graphs also exhibit unique properties that are not necessarily considered in existing systems. For example, they tend to be large due to the increased number of attributes and edges, resulting from the preservation of interactions between vertices over time. Additionally, temporal graphs exhibit skewed data distributions, both in terms of degree distributions and their evolution over time (leskovec2005graphs, 9, 1). To address these challenges, temporal graph analytics systems must provide interactive response times for various use cases, such as operational decisions, contact tracing, and routing.

Our Approach and Contributions. In response to the challenges described above, we have developed Kairos, an in-memory temporal graph analytics system that leverages a highly-optimized parallel data structure to efficiently execute temporal algorithms over temporal graphs on multi-core machines. Kairos is designed to address the unique properties and requirements of temporal graphs, providing an efficient solution for processing large-scale temporal graphs. Specifically, it introduces programming primitives and APIs that organize temporal graphs by time, as well as a novel Temporal Graph Index designed for efficiently looking up data in specific time ranges. This approach enables efficient temporal graph algorithm implementations, treating temporal edge information as a first-class citizen in the model. The system also incorporates selective indexing, a query optimization technique that relies on a novel cost model to speed up query execution by choosing the best access method for neighbors of a given vertex at runtime.

We have implemented a number of parallel temporal graph algorithms for various application classes, including single-source shortest paths (earliest arrival, latest departure, fastest, and shortest duration), connectivity (temporal connected components), and centrality (temporal betweenness centrality). Our system provides significant performance improvements compared to existing state-of-the-art temporal graph processing systems. We make the following contributions:

I. We present TGER, a novel “time-first” data structure that acts as an index to enable efficient processing of temporal graph queries and algorithms.
II. We introduce selective indexing, a technique that speeds up query execution by choosing the best access method for retrieving neighbors of a given vertex at runtime.
III. We present efficient shared-memory parallel algorithm implementations for various temporal graph application classes, offering significant performance improvements compared to existing state-of-the-art temporal graph processing systems.
IV. Our results show substantial speedup compared to existing systems and provide insights into the performance characteristics and guarantees of our system.

2. Background

Category	Specific instantiation
Temporal Graph Algorithms
Temporal Minimal Paths	[Provenance] Tracking the origin and flow of information in different systems (Provenance1, 10, 11)
	[Indoor Routing] Temporal shortest paths that consider different obstacles for correct navigation (Indoor1, 12, 13)
	[Transportation] Route planning that considers real-time traffic conditions in transportation networks (Minimal1, 14)
	[Epidemiology] Identifying infection transmission paths in contact tracing (Epi1, 2)
Temporal Connectivity	[Social Networks] Analyzing community evolution and detecting temporal clusters (Clusters1, 15, 16, 17, 18)
Temporal Centrality	[Epidemiology] Identifying critical nodes in the spread of infections (Centrality1, 19)
Temporal Graph Queries
Time-constrained reachability	[Social Networks] Analyzing influence propagation and information cascades (InfoPropagation1, 20, 21, 22)
Temporal subgraph matching	[Bioinformatics] Detecting conserved patterns in dynamic biological networks (Motifs1, 23, 3)

Table 1. A Categorization of tasks commonly performed in temporal graph analytics.

2.1. Temporal Graph Data Model

A temporal graph is represented by the tuple $G=(V,E,T,\tau)$ :

• $V$ denotes a set of vertices.
• $E$ denotes a set of edges.
• $T=[0,1,...,t_{max}]\subseteq\mathbb{N}$ represents a discrete time domain.
• $\tau:V\times V\times T\times T\to\{False,True\}$ is a function that determines for each pair of vertices $u,v\in V$ , and each pair of timestamps $t_{start},t_{end}\in T$ where $t_{start}\leq t_{end}$ , whether $(u,v)\in E$ , i.e., whether the edge $(u,v)$ exists during the discrete time period from $t_{start}$ to $t_{end}$ .

In other words, each edge in a temporal graph is associated with a discrete time interval indicating its validity. For instance, in interaction networks, this time interval indicates the period during which two vertices have interacted.

A weighted temporal graph is represented by the tuple $G=(V,E,T,\tau,w)$ , where $w$ is a function that maps a temporal edge (as defined above) to a real value (its weight). The number of vertices in a temporal graph is $n_{v}=|V|$ , and the number of edges is $n_{e}=|E|$ . Vertices are assumed to be labeled from $0$ to $n_{v}-1$ . For undirected temporal graphs, we use $deg(v)$ to denote the number of edges incident to a vertex $v\in V$ . In directed temporal graphs, vertices contain both incoming and outgoing edges. We use $\textit{deg}^{+}(v)$ to denote the number of outgoing edges, and $\textit{deg}^{-}(v)$ to denote the number of incoming edges that a vertex $v$ has.

2.2. Allen’s Interval Algebra

We draw inspiration from Allen’s Interval Algebra (interval_algebra, 24) to specify the relationships that subsequent edges in the same path must have. The following subset of this algebra defines the validity of temporal paths:

• Succeeds: For two time intervals $A$ and $B$ , $B$ succeeds $A$ if and only if the end time of $A$ is smaller than or equal to the start time of $B$ , i.e., $\textit{end}(A)\leq\textit{start}(B)$ .
• Strictly succeeds: $B$ strictly succeeds $A$ if and only if the end time of $A$ is strictly smaller than the start time of $B$ , i.e., $\textit{end}(A)<\textit{start}(B)$ .
• Overlaps: $B$ overlaps $A$ if and only if the start time of $A$ is less than the start time of $B$ , and the end time of $A$ is less than the end time of $B$ , i.e., $\textit{start}(A)<\textit{start}(B)$ and $\textit{end}(A)<\textit{end}(B)$ .

We refer to this subset as ordering predicates in Kairos, and describe its application in Section 4.1.
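The three ordering predicates can be sketched as a small Python function. This is an illustration only: the `TemporalEdge` record, the enum member names, and the function name `ordering_predicate` are hypothetical and are not Kairos's actual API. The Overlaps case follows the prose definition above (interval $A$ starts before $B$ starts and ends before $B$ ends).

```python
from dataclasses import dataclass
from enum import Enum


@dataclass(frozen=True)
class TemporalEdge:
    """Hypothetical record for a temporal edge and its validity interval."""
    src: int
    dst: int
    start: int  # first time step at which the edge is valid
    end: int    # last time step at which the edge is valid


class OrderingPredicateType(Enum):
    SUCCEEDS = "succeeds"
    STRICTLY_SUCCEEDS = "strictly_succeeds"
    OVERLAPS = "overlaps"


def ordering_predicate(a, b, kind):
    """Return True if edge b relates to edge a as required by `kind`."""
    if kind is OrderingPredicateType.SUCCEEDS:
        return a.end <= b.start
    if kind is OrderingPredicateType.STRICTLY_SUCCEEDS:
        return a.end < b.start
    if kind is OrderingPredicateType.OVERLAPS:
        # Prose definition: a starts before b starts, and a ends before b ends.
        return a.start < b.start and a.end < b.end
    raise ValueError(f"unknown predicate type: {kind}")
```

For example, an edge valid over (1, 3) followed by an edge valid over (3, 5) satisfies Succeeds but not StrictlySucceeds, because the first interval ends exactly when the second one starts.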

2.3. Temporal Graph Analytics Tasks

In Section 1, we outlined several use cases for temporal graph analytics. Here, we provide a survey of applications of temporal graph analytics algorithms and queries from the literature, listed in Table 1. For each algorithm and query, we present an example application that relies on it as a core primitive for analysis. We focus on a core set of algorithms that address the most common use cases in temporal graph analytics and can serve as primitives for building more advanced analysis tasks. The primary categories for these algorithms are temporal paths, temporal connectivity, and temporal centrality.

Temporal Paths: A temporal path in graph $G$ is a path where every subsequent edge in the path must satisfy certain temporal constraints. Examples of minimal temporal paths include earliest arrival, latest departure, fastest path, and shortest path (TemporalPaths, 25, 26).

Temporal Connectivity: Temporal connectivity deals with the temporal version of connected components. This involves identifying sets of vertices that are connected through time.

Temporal Centrality: Temporal centrality measures the importance of a vertex in a temporal graph. An example of interest is temporal betweenness centrality, which quantifies how frequently a vertex appears on temporal shortest paths between pairs of other vertices in the graph.

2.4. The Compressed Sparse Row (CSR) Format

In the context of parallel graph processing, a common data structure used to store graph data is the Compressed Sparse Row (CSR) format. This format is largely preferred over other competing data structures as it allows for efficient storage and manipulation of sparse graphs, in which the majority of potential edges are absent (packed_CSR, 27).

The CSR representation of a graph is characterized by the use of three arrays: an adjacency array, an offset array, and a vertex array. The adjacency array serves as a storage mechanism for destination vertices of all edges in a graph. The edges are placed in a contiguous block of memory, where they are sorted by the source vertex for outgoing edges, and sorted by the destination vertex for incoming edges. This efficient organization supports the fast retrieval of adjacent vertices in graph traversal operations, as all vertices in the same 1-hop neighborhood are located contiguously in memory. The offset array then stores the indices of outgoing / incoming edges into the adjacency array. These indices mark the starting point of the adjacency list for each vertex, enabling quick access to the set of edges associated with any given vertex. Lastly, if there is no metadata associated to vertices, then the vertex id is implicit (i.e., offset[ $i$ ] contains the offset for vertex $i$ ), and no separate vertex ids array is needed, which is the case for Kairos.
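As a minimal sketch of the layout described above, the following Python code builds the offset and adjacency arrays for outgoing edges with a counting pass and a prefix sum. The function names are illustrative, and the sketch assumes one extra sentinel entry at the end of the offset array so that the neighbors of vertex $i$ are always `adjacency[offsets[i]:offsets[i + 1]]`; vertex ids are implicit, as in the text.

```python
def build_csr(num_vertices, edges):
    """Build (offsets, adjacency) for out-edges from (src, dst) pairs."""
    offsets = [0] * (num_vertices + 1)
    for src, _ in edges:
        offsets[src + 1] += 1           # count out-degree of each source
    for v in range(num_vertices):
        offsets[v + 1] += offsets[v]    # prefix sum turns counts into offsets

    adjacency = [0] * len(edges)
    cursor = offsets[:-1].copy()        # next free slot for each vertex
    for src, dst in edges:
        adjacency[cursor[src]] = dst
        cursor[src] += 1
    return offsets, adjacency


def out_neighbors(offsets, adjacency, v):
    """Return the contiguous slice holding the out-neighbors of v."""
    return adjacency[offsets[v]:offsets[v + 1]]
```

Because each neighborhood is one contiguous slice, a traversal touches consecutive memory locations, which is the property the text attributes to CSR.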

2.5. Shared-memory Graph Processing: Ligra

Ligra is a lightweight graph processing framework designed for shared-memory parallel systems (ligra, 28). It enables efficient parallel graph processing by employing a simple and flexible programming model, making it easy for developers to write high-performance algorithms for large-scale graphs.

The programming model of Ligra is centered around two primary operations: EdgeMap and VertexMap. These operations enable parallel traversal of the graph and are responsible for most of the computation in a Ligra-based algorithm.

EdgeMap is a higher-order function that takes as input a graph $G$ , a subset of vertices $V^{\prime}$ , and an edge function $f$ . It applies the edge function $f$ to all edges $(u,v)$ in the graph, where $u\in V^{\prime}$ and $v\in V$ . The edge function $f$ is responsible for implementing the logic of the specific graph algorithm and can perform various operations, such as updating vertex properties or computing edge weights. EdgeMap efficiently handles parallelism by processing edges in parallel, allowing for scalable performance on shared-memory systems.

VertexMap is another higher-order function that takes as input a graph $G$ , a subset of vertices $V^{\prime}$ , and a vertex function $g$ . It applies the vertex function $g$ to all vertices in $V^{\prime}$ . Similar to the edge function, the vertex function is responsible for implementing the algorithm-specific logic and can perform operations such as updating vertex properties or aggregating information from neighboring vertices. VertexMap also processes vertices in parallel, ensuring scalable performance.

By using these two core operations, Ligra enables developers to write graph algorithms that can efficiently exploit the parallelism offered by shared-memory systems. In the context of Kairos, we build upon the Ligra programming model and extend it to support temporal graph operations, as we will discuss in the following sections.
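To make the two operations concrete, here is a sequential Python sketch of a Ligra-style interface together with a breadth-first search written on top of it. It is not Ligra's implementation: Ligra is a parallel C++ framework, and the dictionary-based graph, the function names, and the convention that `edge_map` returns the set of target vertices for which the edge function returned true are simplifying assumptions made for illustration.

```python
def vertex_map(subset, g):
    """Apply g to every vertex in subset; keep those for which g is True."""
    return {v for v in subset if g(v)}


def edge_map(out_edges, subset, f):
    """Apply f(u, v) to every edge leaving `subset`.

    Returns the set of targets v for which f(u, v) returned True.
    """
    result = set()
    for u in subset:
        for v in out_edges.get(u, ()):
            if f(u, v):
                result.add(v)
    return result


def bfs_levels(out_edges, source):
    """Breadth-first search expressed as repeated edge_map calls."""
    level = {source: 0}
    frontier = {source}
    depth = 0
    while frontier:
        depth += 1

        def visit(u, v, depth=depth):
            if v in level:
                return False        # already reached in an earlier round
            level[v] = depth
            return True             # v joins the next frontier

        frontier = edge_map(out_edges, frontier, visit)
    return level
```

The algorithm-specific logic lives entirely in `visit`; the traversal loop only moves from one frontier to the next, which is the division of labor the programming model is built around.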

2.6. Problem Formulation

The primary challenge in designing a temporal graph analytics system is to efficiently process and analyze temporal graphs while taking into account the unique characteristics of such graphs, such as time-varying edges and vertices. The problem can be stated as follows: given a temporal graph $G=(V,E,T,\tau)$ , our goal is to efficiently support the execution of temporal graph analytics tasks, including temporal paths, temporal connectivity, and temporal centrality.

The example queries and algorithms discussed above require different strategies for selecting, at runtime, the appropriate set of vertices and temporal edges to be explored in the frontier being traversed by the graph processing engine. The efficiency of these strategies is crucial for the performance of temporal graph analytics algorithms. In this paper, we aim to address this challenge by introducing selective indexing and other design decisions that help significantly improve the performance of executing different tasks over an input temporal graph. To address this problem, we must fulfill the following desiderata:

(1) Efficient storage and indexing of temporal graph data, especially as it pertains to temporal edges.
(2) Design and implementation of efficient parallel algorithms for temporal graph analytics tasks, such as temporal paths, temporal connectivity, and temporal centrality.
(3) Effective runtime selection of the edges to be explored in the frontier to minimize the time and space complexity of the operations involved in the analytics tasks.
(4) A flexible and easy-to-use programming model that supports the implementation of various temporal graph analytics tasks and queries.

By addressing these aspects, our goal with Kairos is to provide an efficient temporal graph analytics system that can handle large-scale temporal graphs and deliver appropriate programming primitives for writing temporal graph applications.

3. Architectural Overview and Key Ideas

Kairos provides an API that is compatible with that of a state-of-the-art high-performance graph processing system (ligra, 28), and extends it to the temporal setting. The API provides methods to load graphs and perform operations such as traversal and filtering, and allows computations to be expressed in terms of an input temporal query.

In the subsequent sections, we delve deeper into the key ideas behind our approach, as outlined in Kairos’s architecture diagram (Figure 2). We introduce Kairos’s core data structures, the need to choose between different access methods, as well as the types of information needed to make this decision. First, we describe a highly specialized parallel data structure for indexing temporal edges (Section 3.1). Second, we introduce a novel approach, with an associated cost model, to selectively decide which subset of vertices is beneficial to index, as well as how to access each vertex at runtime (Section 3.2).

3.1. Temporal Graph Edge Registry (Key Idea A)

Temporal graphs introduce an additional layer of complexity in graph computation due to the addition of time as a variable. To process queries and algorithms over temporal graphs, one could consider a naive approach based on CSR (Section 2.4). With this approach, the entire list of edges associated with a given vertex is stored in the corresponding offset of the CSR, and then a parallel filter is applied to retrieve neighbors satisfying an input temporal predicate. This can become expensive, however, particularly in graphs with high-degree vertices, or when the output of this filtering is much smaller than the degree of the vertex. In both cases, a large number of edges that are not relevant to the input temporal predicate might need to be processed.

Therefore, the efficient processing of queries and temporal algorithms over temporal graphs demands specialized data structures for storing and retrieving vertex neighbors. To this end, we have designed and implemented TGER (Temporal Graph Edges Registry), a parallel data structure based on priority search trees (CompGeometryBook, 29) for effectively processing queries and temporal algorithms over temporal graphs. TGER lets Kairos treat temporal edge information as first-class citizens in the data model, enabling efficient temporal graph algorithm implementations. Moreover, TGER is storage-efficient (i.e., $O(m)$ space, where $m$ is the number of temporal edges stored in it) and can answer interval containment queries efficiently ( $O(\log m+k)$ work, where $k$ is the number of results (CompGeometryBook, 29)). Finally, TGER makes efficient use of computational resources, employing fork-join parallelism in its implementation of both construction and query operations. As we show in our experiments, TGER can provide a substantial advantage over the naive CSR-based approach.

3.2. Selective Indexing (Key Ideas B and C)

The efficient retrieval of vertices and edges of interest is crucial for the performance of algorithms and queries in large graphs. This is particularly true for large-scale real-world graphs, where data is often skewed, and a small number of vertices can account for most edges. The skewness is intensified in real-world temporal graphs, where data can also be unevenly distributed over time (e.g., seasonal data patterns, or growth in popularity in the case of social networks). Furthermore, the skew present in real-world temporal graphs is a well-studied phenomenon that can be attributed to inter-contact time distributions, burstiness, and circadian or weekly rhythms that are inherent in human activity (TemporalNetworks, 1). For this reason, different approaches have been proposed to generate temporal graphs that are closer to real-world skewed distributions (TACO_VLDB2022, 30, 31), and more recently to perform data imputation that can handle skew (AdaptingToSkew, 32).

To address this challenge, we introduce a new class of problems in query optimization for graph queries, as well as a unique technique for selectively indexing different parts of a graph relevant to query answering. As part of our approach, we present a novel cost model and related algorithms for determining the most suitable access method (e.g., index-based vs. scan-based) for graph traversal. Our algorithms take into account the characteristics of the underlying data (e.g., data skew) and workload (e.g., selectivity of temporal predicates present in the queries) to choose the most efficient access method (index vs. scan) for each vertex at runtime. Section 5 (Selective Indexing) provides a description of our technique.

Vertex Indexer. This component is responsible for deciding which vertices warrant a TGER index. First, the size of a vertex is defined as the size of its out-degree or in-degree neighborhood (i.e., how many outgoing or incoming edges it has). Based on a predefined vertex size threshold (heuristically obtained via experimental analysis), the indexer builds a TGER only for those vertices whose size meets this threshold. To quickly identify which vertices have a TGER, the Vertex Indexer maintains an in-memory sparse associative array where each entry maps a vertex id to the in-memory location of its corresponding TGER. At runtime, Kairos employs a cost model (Section 5) to evaluate whether it should access the corresponding TGER for that vertex, or opt for a linear scan using Temporal CSR (T-CSR, Section 4.2).
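The thresholding step of the Vertex Indexer can be sketched as follows. This is a hedged illustration, not Kairos's implementation: it considers out-degrees only, it uses a Python dictionary as the sparse associative array, it treats a vertex as indexable when its degree is greater than or equal to the threshold (the comparison used in Algorithm 1), and `build_index` is a hypothetical callback standing in for TGER construction.

```python
def select_vertices_to_index(offsets, threshold):
    """Return ids of vertices whose out-degree meets the size threshold.

    `offsets` is a CSR offset array with one trailing sentinel entry,
    so the out-degree of v is offsets[v + 1] - offsets[v].
    """
    num_vertices = len(offsets) - 1
    return [
        v for v in range(num_vertices)
        if offsets[v + 1] - offsets[v] >= threshold
    ]


def build_vertex_indexes(offsets, threshold, build_index):
    """Sparse map from vertex id to its index, for large vertices only.

    `build_index(v)` is a placeholder for building a per-vertex index.
    """
    return {
        v: build_index(v)
        for v in select_vertices_to_index(offsets, threshold)
    }
```

A lookup such as `indexes.get(v)` then returns `None` for the many small vertices, which can serve as the signal to fall back to a linear scan.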

Cardinality Estimator. We introduce a cost model for deciding at runtime which access method to use for retrieving a vertex’s neighborhood. As part of this cost model, a cardinality estimator plays a key role in determining whether a query is selective enough to warrant index-based retrieval of outgoing/incoming edges via TGER. We further describe it and its associated algorithms in Section 5.1 and Section 5.2.

4. Kairos Framework

This section describes the interface and implementation of the Kairos framework, with a focus on how it extends Ligra (ligra, 28) to the temporal setting. Table 2 summarizes its interface.

4.1. Interface

Kairos contains two key data structures: VertexSet and TemporalEdgeSet, which are used respectively to represent subsets of vertices and subsets of temporal edges. OrderingPredicate takes as input a pair of temporal edges and an OrderingPredicateType, and evaluates whether the given temporal edges conform to that predicate (Section 4.3). Using these as input, TemporalGraph constructs the temporal graph. VertexMap is equivalent to Ligra’s (Section 2.5), with the only difference that it takes a TemporalGraph as input. In contrast, TemporalEdgeMap extends Ligra’s EdgeMap to the temporal setting, and is described in more detail in Section 4.4.

Interface	Description
VertexSet	Represents a subset of vertices $V^{\prime}\subseteq V$ .
TemporalEdgeSet	Represents a subset of temporal edges $E^{\prime}\subseteq E$ .
OrderingPredicate( $A$ : temporaledge, $B$ : temporaledge, $T$ : OrderingPredicateType): bool	Evaluates the order between two temporal edges based on one of the three ordering predicate types: Succeeds, StrictlySucceeds, or Overlaps (Section 4.1).
TemporalGraph( $V$ : VertexSet, $E$ : TemporalEdgeSet, $P$ : OrderingPredicate)	Constructs a temporal graph using given input vertices, temporal edges, and ordering predicate.
VertexMap( $U$ : VertexSet, $G$ : TemporalGraph, $F$ : vertex $\to$ bool): VertexSet	Applies $F(u)$ for each $u\in U$ ; returns a VertexSet $\{u\in\ U\mid F(u)=\textit{true}\}$ .
TemporalEdgeMap( $U$ : TemporalEdgeSet, $G$ : TemporalGraph, $F$ : temporaledge $\to$ bool): TemporalEdgeSet	Applies $F(u)$ for each $u\in U$ ; returns a TemporalEdgeSet $\{u\in\ U\mid F(u)=\textit{true}\}$ .

Table 2. Core primitives from Kairos framework programming model interface.

4.2. T-CSR: Temporal CSR

The Temporal CSR (T-CSR) data structure is an extension of the traditional CSR representation (Section 2.4) designed to accommodate temporal graph data. It retains the core concepts of CSR, while incorporating additional arrays to store the start and end times associated with each temporal edge, as shown in Figure 3. Specifically, it extends the standard CSR representation by adding two more arrays, the start time array, and the end time array. As their name suggests, these arrays store the start and end times of each temporal edge in the graph, respectively. They are organized in the same order as the adjacency array, such that the start and end times of the temporal edge at position $i$ in the adjacency array can be found at position $i$ in the start / end time arrays.

With this extended representation, the T-CSR can store temporal graphs and support various temporal graph processing tasks. When used in conjunction with TemporalEdgeMap (Section 4.4), the additional start time and end time arrays enable the system to take the temporal information of the graph into account when executing queries and algorithms. The T-CSR representation retains the advantages of the traditional CSR, such as cache efficiency and fast access to adjacency lists. Furthermore, it can be easily incorporated into existing CSR-based graph processing systems with minimal modifications, as is the case for our extension of Ligra to the temporal setting. For these reasons, T-CSR is the default representation used to store the neighbors of most vertices in Kairos. A similar representation is also used in the Temporal GNN literature (TGL, 33) for sampling temporal edges.
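The following Python sketch shows one way the T-CSR layout and a scan-based temporal filter could look. It is an illustration under stated assumptions: the function names are hypothetical, the offset array carries one trailing sentinel entry, the scan is sequential rather than parallel, and the filter shown (edge interval fully inside a query window) is just one example of a temporal predicate.

```python
def build_tcsr(num_vertices, edges):
    """Build T-CSR arrays from (src, dst, start, end) tuples.

    Returns (offsets, adjacency, starts, ends); position i of starts/ends
    holds the interval of the edge stored at position i of adjacency.
    """
    offsets = [0] * (num_vertices + 1)
    for src, _, _, _ in edges:
        offsets[src + 1] += 1
    for v in range(num_vertices):
        offsets[v + 1] += offsets[v]

    m = len(edges)
    adjacency, starts, ends = [0] * m, [0] * m, [0] * m
    cursor = offsets[:-1].copy()
    for src, dst, start, end in edges:
        i = cursor[src]
        adjacency[i], starts[i], ends[i] = dst, start, end
        cursor[src] += 1
    return offsets, adjacency, starts, ends


def scan_neighbors(tcsr, v, lo, hi):
    """Linear scan over v's edges; keep those with lo <= start and end <= hi."""
    offsets, adjacency, starts, ends = tcsr
    return [
        (adjacency[i], starts[i], ends[i])
        for i in range(offsets[v], offsets[v + 1])
        if lo <= starts[i] and ends[i] <= hi
    ]
```

The scan always visits every edge of `v`, regardless of how few edges match, which is the cost that motivates a dedicated index for high-degree vertices.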

4.3. TGER: Temporal Graph Edges Registry

While T-CSR offers valuable improvements over traditional CSR for temporal graph data handling, its design is still geared towards adjacency-oriented queries and operations. For example, T-CSR does an excellent job when dealing with questions of “who is connected to whom”. However, it is less efficient when it comes to handling range-based temporal queries such as “who was connected to whom within a specific time window”. This is because in T-CSR, the temporal information is appended to the existing structure, which is space-optimized for traditional graphs (i.e., where edges do not have additional temporal information). As a result, each temporal query needs to scan over the entire adjacency list of a vertex and filter out the relevant edges based on their timestamps. While this operation can be performed in parallel, it can still be inefficient for large graphs, or for queries with small time windows (i.e., queries that are highly selective) relative to the total neighbors a vertex has.

To address this challenge, we propose TGER, an index data structure which accommodates temporal information (in the form of time intervals associated with edges) as a first-class citizen. TGER is a dual index, meaning it indexes both the start and end time attributes. Specifically, it combines a priority queue (by default over the start time attribute) and a binary search tree (BST) (by default over the end time attribute). This allows queries with predicates on either start or end time to be answered efficiently. It is heavily inspired by Priority Search Trees (PST) (CompGeometryBook, 29), a data structure traditionally used for storing intervals or 2D points in the context of computational geometry. In TGER, all queries are 3-sided (Figure 5), meaning that only one bound is specified for one of the dimensions. This allows temporal edges to be queried in $O(\log m+k)$ , where $m$ is the number of temporal edges stored in it and $k$ is the number of results from the query. Figure 3 shows a visual representation of TGER’s data layout, contrasting it with T-CSR, and Algorithm 1 shows the pseudocode for TGER’s parallel build operation.

Ordering Predicates and 3-sided queries. A key aspect of TGER is that the application can specify which of the two dimensions of the intervals should be mapped to the priority queue (or “heap”) axis, and which dimension should be mapped to the “BST” axis. In addition, TGER can also be configured as either a min-heap or a max-heap. Given this flexibility, both Succeeds and StrictlySucceeds ordering predicates can be translated to an equivalent 3-sided query by flipping the axes and/or using a max-heap. The Overlaps ordering predicate, however, requires both ends of the interval of two subsequent edges to be checked against each other, as can be seen in Figure 4. In that case, TemporalEdgeMap (Section 4.4) performs one additional query to match in-neighbors with corresponding out-neighbors.

Input: A temporal graph $G=(V,E)$
Output: If applicable, the TGER for each vertex in $G$
procedure IndexVertices($G$)
    parallel for each $v\in V$
        $d=\text{out-degree of }v$ // Omitted: in-neighbors processing.
        if $d\geq\text{min cutoff}$ then // (Section 5)
            $\text{out-index}[v]=$ BuildIndex($e\in\text{out-edges of }v$)
procedure BuildIndex($E$)
    sorted = parallel-sort($E$, ByStartTime())
    return BuildIndexRecurse(sorted, 0, sorted.size())
procedure BuildIndexRecurse(sorted, start, end)
    size = end - start
    if size == 0 then return nullptr
    if size == 1 then return new TGERNode(sorted[start])
    TGERNode* root = new TGERNode(sorted[start])
    start = start + 1
    mid-idx = index of point with median “end time” in sorted
    root->y-mid = sorted[mid-idx].end-time
    root->left = spawn BuildIndexRecurse(sorted, start, mid-idx)
    root->right = BuildIndexRecurse(sorted, mid-idx, end)
    sync
    return root

Algorithm 1 Pseudocode for TGER index parallel build.
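To illustrate the priority-search-tree idea behind TGER, the following Python sketch builds a min-heap on start time combined with a search tree on end time, and answers one kind of 3-sided query. It is not the Kairos implementation and it deliberately differs from Algorithm 1: it is sequential, it sorts edges by end time and selects the minimum-start edge at each node by a linear pass, and the `(start, end, neighbor)` tuple layout and function names are assumptions made for this example.

```python
class TGERNode:
    """One tree node: holds an edge plus the end-time split value."""
    __slots__ = ("edge", "y_mid", "left", "right")

    def __init__(self, edge):
        self.edge = edge      # (start, end, neighbor)
        self.y_mid = None     # end time separating left and right subtrees
        self.left = None
        self.right = None


def build_index(edges):
    """Build the tree from an iterable of (start, end, neighbor) tuples."""
    return _build(sorted(edges, key=lambda e: e[1]))  # sort by end time


def _build(by_end):
    if not by_end:
        return None
    # Heap axis: this node stores the edge with the smallest start time.
    i = min(range(len(by_end)), key=lambda j: by_end[j][0])
    node = TGERNode(by_end[i])
    rest = by_end[:i] + by_end[i + 1:]   # remains sorted by end time
    if rest:
        mid = len(rest) // 2
        node.y_mid = rest[mid][1]
        node.left = _build(rest[:mid])   # end times <= y_mid
        node.right = _build(rest[mid:])  # end times >= y_mid
    return node


def query(node, max_start, lo_end, hi_end, out=None):
    """3-sided query: start <= max_start and lo_end <= end <= hi_end."""
    if out is None:
        out = []
    # Heap property: if this node starts too late, so do all descendants.
    if node is None or node.edge[0] > max_start:
        return out
    if lo_end <= node.edge[1] <= hi_end:
        out.append(node.edge)
    if node.y_mid is not None:
        if lo_end <= node.y_mid:
            query(node.left, max_start, lo_end, hi_end, out)
        if hi_end >= node.y_mid:
            query(node.right, max_start, lo_end, hi_end, out)
    return out
```

The start-time bound prunes whole subtrees through the heap property, while the end-time range steers the descent through the search-tree axis. Swapping the roles of the two attributes, or negating the heap key to obtain a max-heap, yields the other 3-sided query shapes mentioned in the text.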

4.4. Temporal Edge Map

We introduce TemporalEdgeMap, a programming primitive inspired by Ligra’s EdgeMap (Section 2.5), and designed to efficiently handle temporal information during parallel processing of temporal graphs. The TemporalEdgeMap extends Ligra’s EdgeMap to the temporal setting by incorporating selective indexing (see Section 5.1) to dynamically switch between different scanning strategies (Temporal CSR scan vs. TGER) depending on the query. Moreover, the primary distinction between TemporalEdgeMap’s programming interface and that of Ligra’s EdgeMap is that it allows for mapping over temporal edges while specifying an ordering predicate (Section 4.1) as input. Combined with TGER, the TemporalEdgeMap acts as a parallel mapping function that enables temporal graph algorithms to efficiently retrieve only the edges that satisfy temporal predicates of interest, while still respecting user-defined constraints regarding the validity of temporal paths in their applications. In Algorithm 2, we demonstrate the use of TemporalEdgeMap to update vertex frontiers for a parallel implementation of the earliest-arrival temporal path algorithm (TemporalPaths, 25). This example illustrates how the TemporalEdgeMap can be used to express complex temporal graph processing logic in the context of parallel graph processing applications.

1: Input: A temporal graph $G=(V,E)$, a source vertex $x\in V$, and a time interval $[t_{a},t_{b}]$

2: Output: $t[V]$: the earliest-arrival time from $x$ to every vertex $v\in V$ within the query time interval $[t_{a},t_{b}]$

3: procedure Update($s,d,[t_{s},t_{e}]$)

4: if $t_{s}<t_{a}$ or $t_{e}>t_{b}$ then return 0

5: if $t_{s}<t[s]$ or $t_{e}\geq t[d]$ then return 0

6: return writeMin($t[d],t_{e}$) and CAS($\text{Visited}[d],0,1$)

7: procedure Cond($i$)

8: return ($\text{Visited}[i]==0$)

9: procedure EarliestArrival($G,x,[t_{a},t_{b}]$)

10: $t[x]=t_{a}$, $t[v]=\infty$ for all $v\in V\setminus\{x\}$

11: $\text{Visited}[v]=0$ for all $v\in V$

12: $\text{Frontier}=\{x\}$

13: while Size(Frontier) $\neq 0$

14: Frontier = TemporalEdgeMap($G,[t_{a},t_{b}],\text{Frontier}$, Update, Cond, OrdPred.StrictlySucceeds)

Algorithm 2 Pseudocode for Earliest Arrival in Kairos.
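The frontier logic of Algorithm 2 can be traced with a sequential Python sketch. It rests on several assumptions: the function name and the `(src, dst, t_start, t_end)` edge layout are hypothetical, `writeMin` and `CAS` are replaced by plain assignments, the source's arrival time is initialized to the start of the query interval, and a vertex re-enters the frontier whenever its arrival time improves instead of being gated by a `Visited` flag.

```python
import math


def earliest_arrival(edges, x, t_a, t_b):
    """edges: iterable of (src, dst, t_start, t_end) tuples."""
    out_edges = {}
    for s, d, ts, te in edges:
        out_edges.setdefault(s, []).append((d, ts, te))

    t = {x: t_a}        # arrival times; missing vertices count as infinity
    frontier = {x}
    while frontier:
        next_frontier = set()
        for s in frontier:
            for d, ts, te in out_edges.get(s, []):
                # The edge must lie inside the query interval.
                if ts < t_a or te > t_b:
                    continue
                # The edge must depart no earlier than our arrival at s,
                # and it must strictly improve the arrival time at d.
                if ts < t[s] or te >= t.get(d, math.inf):
                    continue
                t[d] = te                 # sequential stand-in for writeMin
                next_frontier.add(d)
        frontier = next_frontier
    return t


edges = [("a", "b", 1, 2), ("b", "c", 3, 4), ("a", "c", 1, 10), ("c", "d", 2, 3)]
print(earliest_arrival(edges, "a", 0, 10))  # {'a': 0, 'b': 2, 'c': 4}
```

Vertex `d` is never reached in this example because its only incoming edge departs at time 2, before the earliest arrival at `c`.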

5. Selective Indexing Optimization

In this section, we introduce the concept of selective indexing. As described in Section 3.2, the Vertex Indexer builds a TGER only for those vertices whose size (out- or in-degree) meets an experimentally obtained vertex size threshold (currently set to 2k edges). With selective indexing, we propose a novel cost model and related algorithms to determine the most appropriate access method for each vertex in a temporal graph during traversal. The unique aspect of the cost estimation problem we address is that it selectively assigns different access methods (e.g., index vs. full scan) for the same query at runtime. As an example, while a conventional relational query optimizer may select an in-memory hash index for all primary keys satisfying a PrimaryKey-ForeignKey join, our selective indexing approach considers the estimated selectivity for the query based on the value of each primary key, as well as the anticipated number of matches it has in the corresponding foreign key table.

Figure 6 presents the decision tree used by Kairos’s Vertex Indexer (Figure 2) to determine the optimal candidate data structure for accessing a vertex’s neighboring edges, given an input temporal query. The following sections provide a description of our selective indexing approach’s cost model and associated algorithms.

5.1. Cost Model

To define a cost model for deciding when it is beneficial to use TGER compared to a parallel scan over the T-CSR, we need to consider the following factors:

Vertex degree distribution. The distribution of the number of outgoing edges for each vertex is important because TGER is only built (and potentially accessed) for vertices with more than a certain number of outgoing (or incoming) edges. If a large portion of the vertices have a high degree, the custom index will be more frequently accessed and contribute more to the overall query performance. Furthermore, more memory will also be used to store index data.

Temporal edges distribution. This refers to the distribution of start times and end times present in these edges for a given vertex. These distributions indicate the expected number of matches for a given temporal predicate query interval. Even though TGER is balanced, skew in the distribution of start times can lead to more levels being traversed to retrieve all edges that satisfy a given input predicate.

Query workload. For a given input temporal predicate, Kairos needs to estimate the expected number of results for a specific vertex corresponding to that query.

In summary, the distribution of the start and end times of the temporal edges affects the selectivity of the queries. If the distribution is such that the queries have a high degree of selectivity (i.e., they filter out a significant portion of the data), using the TGER will be more efficient. Conversely, if the distribution is such that the queries have low selectivity (i.e., they return a large portion of the data), TGER’s performance advantage over the T-CSR access method is reduced.

Taking these factors into account, we define a cost model that estimates the time required for each method (index-based using TGER and T-CSR-based) to execute a given query. The model acts as a proxy for the estimated query processing time given what we know about the graph (e.g., cardinality of a vertex’s neighborhood), the query workload (e.g., estimated selectivity of input query), and characteristics of the data structures used for access (e.g., parallelism potential and asymptotic complexity). For the TGER-based access method, the cost model relies on the time complexity for PSTs, which is $O(\log m+k)$ , where $m$ is the number of elements stored in the data structure, and $k$ is the number of results matching the input query. A TGER is created for each vertex (Figure 6), so $m=deg(v)$ , where $v\in V$ , which yields

$$T_{v}=c\cdot[\log(deg(v))+k] \tag{1}$$

where $c$ acts as a constant factor representing the average cost of performing a single operation using TGER. The value for $c$ captures the parallelism potential when using TGER, and is derived experimentally.

For the T-CSR access method, Kairos needs to perform a parallel scan to select the edges that satisfy the input temporal predicate. Although this operation is highly parallelizable and cache-friendly, the asymptotic time complexity of the scan is still $O(deg(v))$ , as all edges for $v$ need to be scanned, with

$$S_{v}=c^{\prime}\cdot deg(v) \tag{2}$$

where $c^{\prime}$ is a constant factor representing the average cost of performing a single operation using the T-CSR. It plays a role similar to that of $c$ for TGER, and like $c$ , it is derived experimentally. To decide which method is more beneficial for a given query, we compare the estimated time costs of both methods using the cost model. If the estimated time cost for the TGER method ( $T_{v}$ ) is lower than the estimated time cost for the T-CSR method ( $S_{v}$ ), then it is more beneficial to use TGER. Our cost model parameterizes the cardinality estimator based on the factors described above, with

$$C_{v}=\begin{cases}T_{v}&\text{if }\beta\leq\theta_{\text{sel}}\\ S_{v}&\text{otherwise}\end{cases} \tag{3}$$

where

$C_{v}$ = the estimated cost of accessing a vertex’s neighbors
$T_{v}$ = the cost of querying for vertex $v$ ’s neighbors in TGER
$S_{v}$ = the cost of scanning vertex $v$ ’s neighbors in T-CSR that satisfy the input temporal predicate
$\beta$ = the selectivity of the input temporal predicate, defined as $k/m$
$\theta_{\text{sel}}$ = the selectivity threshold

For example, assuming a selectivity threshold $\theta_{\text{sel}}=0.3$ (i.e., queries retrieving at most 30% of the neighboring edges of a vertex $v$ ), if the estimated selectivity $\beta$ is less than $0.3$ (e.g., $0.2$ ), then Kairos chooses TGER as the access method for vertex $v$ given input query $q$ .
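Equations (1) through (3) can be sketched as a small Python helper. The function names, the default constants `c = c_prime = 1.0`, and the choice of a base-2 logarithm are all assumptions for illustration; in Kairos the constants are derived experimentally and their values are not given here.

```python
import math


def tger_cost(deg, k, c=1.0):
    """Equation (1): c * (log(deg) + k). The base-2 log is an assumption."""
    return c * (math.log2(deg) + k)


def tcsr_cost(deg, c_prime=1.0):
    """Equation (2): c' * deg."""
    return c_prime * deg


def choose_access_method(k_est, deg, theta_sel=0.3):
    """Equation (3): use TGER when beta = k / deg is at most theta_sel."""
    beta = k_est / deg
    if beta <= theta_sel:
        return "TGER", tger_cost(deg, k_est)
    return "T-CSR", tcsr_cost(deg)


# A vertex with 1000 edges and an estimated 200 matches has beta = 0.2.
print(choose_access_method(200, 1000)[0])  # TGER
# With an estimated 400 matches, beta = 0.4 exceeds the threshold.
print(choose_access_method(400, 1000)[0])  # T-CSR
```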

In theory, the choice of threshold primarily depends on the relationship between $k$ and $deg(v)$ . As $k$ approaches $deg(v)$ , the cost of using TGER converges to $O(deg(v))$ , rather than $O(\log(deg(v)))$ . In this case, it becomes more beneficial to use T-CSR due to its higher parallelism potential. Parallelism is higher for queries over T-CSR because they are implemented as highly parallelizable scans over a parallel array. On the other hand, TGER queries rely on divide-and-conquer recursive parallel operations over a data structure that resembles a BST in one dimension and a heap in another, leading to reduced parallelism as $k$ nears $deg(v)$ . In practice, however, we determine which threshold to use from experimental results, and find that 2k edges strikes a good balance between performance and estimator accuracy. By selecting an appropriate threshold, Kairos can decide at runtime whether to use the TGER index or T-CSR for accessing a vertex’s neighbors, thereby improving query performance compared to a baseline (temporal_ligra, 34) that only uses T-CSR.
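The dependence on $k$ and $deg(v)$ can be made concrete by equating Equations (1) and (2). This is a sketch derived only from those two cost expressions; the symbols $k^{*}$ and $\beta^{*}$ are introduced here for illustration and are not part of the cost model as stated.

$$
c\cdot[\log(deg(v))+k^{*}]=c^{\prime}\cdot deg(v)
\;\Longrightarrow\;
k^{*}=\frac{c^{\prime}}{c}\,deg(v)-\log(deg(v)),
\qquad
\beta^{*}=\frac{k^{*}}{deg(v)}=\frac{c^{\prime}}{c}-\frac{\log(deg(v))}{deg(v)}
$$

Under this reading, TGER is estimated to be cheaper whenever $k<k^{*}$ . For large $deg(v)$ the logarithmic term becomes negligible and $\beta^{*}$ approaches $c^{\prime}/c$ , so the break-even selectivity is governed by the two measured constants, which is consistent with choosing the threshold experimentally.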

5.2. Cardinality Estimation Algorithm

Our cardinality estimation algorithm aims to help determine the best access method for each vertex while taking into account the temporal predicates present in the query workload and the characteristics of the temporal graph being queried.

During the index construction phase (i.e., when building the TGER indices), Kairos creates a 2D density histogram for each vertex that meets the threshold for TGER index construction. The dimensions of the histogram are the start time and duration (end time $-$ start time) of the edges, each divided equally into $100$ buckets, for a total of $10000$ buckets that capture the temporal distribution of a vertex’s edges. At runtime, Kairos uses the histogram to estimate the density of a vertex’s edges that would satisfy the query’s temporal predicates. Depending on the estimated density and a selectivity threshold, Kairos selects the most efficient access method. If the estimated density is at or below the threshold (i.e., the query is selective, as in Equation 3), the query execution for that vertex employs the associated TGER index; otherwise, a parallel scan on the T-CSR is performed. This enables Kairos to adapt the access method according to the unique features of the temporal graph and the specific temporal predicates in the query, leading to improved query performance, as demonstrated in Section 6.
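A minimal Python sketch of such a histogram-based estimator follows. The details are assumptions rather than the Kairos implementation: the function names are hypothetical, the predicate is taken to be "start time at least `q_a` and end time at most `q_b`", and a bucket is counted as matching when its center satisfies the predicate.

```python
def build_histogram(edges, buckets=100):
    """edges: list of (start, end). Returns a 2D count grid plus its scaling."""
    starts = [s for s, _ in edges]
    durs = [e - s for s, e in edges]
    s_min, d_min = min(starts), min(durs)
    # Bucket widths; fall back to 1.0 when a dimension has a single value.
    s_w = (max(starts) - s_min) / buckets or 1.0
    d_w = (max(durs) - d_min) / buckets or 1.0
    counts = [[0] * buckets for _ in range(buckets)]
    for s, d in zip(starts, durs):
        i = min(int((s - s_min) / s_w), buckets - 1)
        j = min(int((d - d_min) / d_w), buckets - 1)
        counts[i][j] += 1
    return {"counts": counts, "s_min": s_min, "s_w": s_w,
            "d_min": d_min, "d_w": d_w, "n": len(edges)}


def estimate_selectivity(h, q_a, q_b):
    """Estimated fraction of edges with start >= q_a and end <= q_b."""
    hits = 0
    for i, row in enumerate(h["counts"]):
        s_mid = h["s_min"] + (i + 0.5) * h["s_w"]
        if s_mid < q_a:
            continue
        for j, cnt in enumerate(row):
            d_mid = h["d_min"] + (j + 0.5) * h["d_w"]
            if cnt and s_mid + d_mid <= q_b:
                hits += cnt
    return hits / h["n"]


# 1000 unit-length edges; the query covers the most recent 10% by start time.
hist = build_histogram([(i, i + 1) for i in range(1000)])
beta = estimate_selectivity(hist, 900, 1000)
print("TGER" if beta <= 0.2 else "T-CSR")
```

Estimating at bucket granularity keeps the runtime decision cheap, at the price of some error for predicates whose boundaries cut through densely populated buckets.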

6. Experimental Results

In this section, we present the results of an experimental evaluation of Kairos’s query performance and scalability.

Setup. We use a 2nd Generation Intel® Xeon® Scalable Processor (Cascade Lake) system with 24 physical (48 virtual) cores on each of its two NUMA nodes. The socket has a total of 192GiB of DDR4 2666 MHz RAM. Our programs use Cilk Plus (cilkplus, 35) and are compiled with OpenCilk’s (opencilk, 36) clang version 14 and the -O3 flag. All of the parallel speedup numbers that we report are based on the running time on 24 cores (single socket) without hyper-threading compared to the running time on a single thread.

Datasets. Table 3 shows the number of vertices ( $|V|$ ), number of edges ( $|E|$ ), maximum out-degree ( $\max_{v\in V}\textit{deg}^{+}(v)$ ), maximum in-degree ( $\max_{v\in V}\textit{deg}^{-}(v)$ ), and average degree ( $\textit{avg}_{v\in V}\textit{deg}(v)$ ) for each of the datasets used. The synthetic dataset comprises a temporal graph in which vertices are log-normally distributed, the inter-arrival times of start times follow a Poisson distribution, and the edge durations follow a uniform distribution. If the temporal edges in a dataset only have start times, then the end time is sampled from a uniform distribution, similar to what is done in (TemporalPaths, 25, 26).

Name	$|V|$	$|E|$	$\max_{v\in V}\textit{deg}^{+}(v)$	$\max_{v\in V}\textit{deg}^{-}(v)$	$\textit{avg}_{v\in V}\textit{deg}(v)$
bitcoin (TemporalMotifsSampling, 37)	$4.80\times 10^{7}$	$1.13\times 10^{8}$	$2.66\times 10^{6}$	$2.53\times 10^{6}$	$4$
netflow (Dataset_netflow, 38)	$3.72\times 10^{8}$	$1.20\times 10^{9}$	$5.78\times 10^{5}$	$1.82\times 10^{8}$	$12$
reddit-reply (TemporalMotifsSampling, 37)	$1.17\times 10^{7}$	$6.46\times 10^{8}$	$3.92\times 10^{5}$	$9.94\times 10^{5}$	$110$
stackoverflow (Dataset_stackoverflow, 39)	$6.02\times 10^{6}$	$6.34\times 10^{7}$	$1.01\times 10^{5}$	$93143$	$20$
transportation (Dataset_transportation, 40)	$41794$	$7.93\times 10^{7}$	$50881$	$50625$	$3798$
twitter-cache (Dataset_twittercache, 41)	$94226$	$8.55\times 10^{8}$	$5.21\times 10^{6}$	$2.31\times 10^{5}$	$36328$
synthetic	$10^{7}$	$10^{9}$	$5.65\times 10^{7}$	$3.08\times 10^{7}$	$99$

Table 3. Temporal graphs used for evaluation

6.1. Scalability

Table 4 shows the sequential and parallel running times of our algorithms, as well as their parallel speedup. Runtimes are reported as an average of ten runs. Except for T. BC, T. CC, k-core, and PageRank, all algorithm runtimes use a single source vertex. For algorithms requiring a single source, we selected the top 100 vertices based on their out-degree, leading to 100 runs in a single execution. For PageRank, the reported runtimes cover 100 iterations. T. BC uses the number of temporal S. Duration paths when calculating centrality for each vertex. The algorithms E. Arrival, L. Departure, Fastest, and S. Duration inherently expect a start and end time in their original definitions. For BC, BFS, CC, $k$ -core, and PageRank, we have adapted the original algorithms to accept a start and end time as input. For every algorithm, we set the start time to align with the 95th percentile of the latest start times in the dataset. The end time is set to the maximum value, representing the 100th percentile.

Overall, the algorithms obtain good parallel speedup, with a maximum of 22.6x and a mean speedup of 8.7x. The lower speedups are generally observed on the smaller datasets, as they are too small to benefit from parallel processing. However, the larger speedups are not always observed on the largest dataset. Rather, they seem to be influenced by a number of factors, including how skewed the vertex degree distribution is, as well as the fraction of edges matching the input temporal predicate. In other words, while we use the same query interval size for all experiments, the edges matching that input query predicate (i.e., its selectivity) may not be evenly distributed over all vertices, and thus do not offer the same parallelism potential. Furthermore, some co-routines in Kairos are only parallelized if the number of iterations meets a heuristic threshold (e.g., at least 1000 iterations in parallel cilk_for loops), which again may be influenced by skew in the underlying vertex-degree and inter-arrival time distributions for a given dataset.

Figure 7 shows the running time vs. number of threads for all of the minimal temporal path algorithms over a smaller synthetically generated dataset (500M edges). We see good parallel scalability for all of these algorithms, with speedups ranging from 5x to 8x, though we see little benefit when going from 16 to 24 cores.

6.2. Impact of Selective Indexing

Figure 9 shows a comparison against a Temporal Ligra baseline (temporal_ligra, 34), i.e., T-CSR is used for all vertices and no selective indexing is present, as described in Section 4.2. As far as we know, this is the fastest available shared-memory implementation of graph processing algorithms (temporal and traditional) over temporal graphs. The algorithms are configured in the same manner as in Section 6.1, and here we show the running times for a subset comprised mostly of temporal minimal paths, and over one small (reddit-reply), and two large datasets (twitter-cache and synthetic). Our results show that the selective indexing approach does reasonably well, with up to 8x improvement in some cases. As expected, the highly selective queries are the ones with the most improvement. Furthermore, between 10% and 20% selectivity, the T-CSR approach starts being more advantageous.

6.3. Microbenchmark: TGER query runtimes

Figure 8 shows the time it takes to run a query over a single TGER of various sizes (synthetically generated 1M, 10M, and 100M edges). We note that each TGER in Kairos is associated with a single vertex. In other words, the running times reported in this experiment should be interpreted as the amount of time it takes to retrieve edges of interest from a single vertex that has been indexed with TGER. As in the experiments above, input queries are sized to match a portion of the most recent edges (by start time) in the input data (in this case, a vertex’s neighboring edges). The results show that it takes less than 125 milliseconds to retrieve roughly 10% of a TGER with 100M edges.

6.4. Comparison to Alternatives

Distributed memory. Although a direct comparison with Tink and ICM is challenging due to their distributed memory design, we offer a comparison based on the runtimes presented in their papers (Tink, 6, 5). Using 8 cores, Tink processes the E. Arrival algorithm for just one source vertex in 37s on a synthetic graph of 25 million edges. In contrast, Kairos, with the same number of cores, processes the same algorithm for 100 source vertices (equivalent to 100 runs) on a synthetic dataset of 500M edges in less than 10s, as shown in Figure 7. Put simply, for E. Arrival (the sole algorithm for which Tink reports runtimes), Kairos handles data that is 20 times larger, addresses 100 times more source vertices, and completes the task in less than a third of the time. We could not locate total runtimes for ICM in its publication. However, in its “weak scaling” analysis, it reports nearly 500 seconds to run E. Arrival on a single source vertex for a graph segment with 100M edges, using 8 cores. Kairos, in contrast, requires under 10 seconds for 100 vertices on a 500M-edge dataset (Figure 7). In essence, when compared with ICM, Kairos runs 50 times faster and handles 100 times more algorithm executions, while processing a dataset that is 5 times larger.

Shared memory. At the time of writing, the only other temporal graph analytics system in this category is TeGraph (TeGraph, 8). We contacted the authors, who shared with us an implementation of shortest paths, as well as Delicious, one of the datasets they used for evaluation. This dataset has around 300M edges and 34M vertices. Unfortunately, we were not able to reproduce the results in (TeGraph, 8), which report around one second as the total runtime for executing shortest path on this dataset using 100 source vertices. Instead, we find that their implementation of shortest path takes around three seconds to process a single vertex, regardless of which vertex is provided as input. Extrapolating from this result, we get around 300 seconds to process 100 source vertices, as opposed to the one second reported in their paper. Furthermore, the results for their “OnePass” baseline (which is closer to our approach) are significantly slower than Kairos. Specifically, for the same Delicious dataset, their “OnePass” baseline takes close to 90 seconds for 100 vertices using 16 threads on an 8-core machine. Kairos, on the other hand, takes around five seconds using 8 cores, which is an 18x speedup compared to their “OnePass” baseline, and 60x compared to our local results from their implementation.
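The ratios quoted in the two comparisons above can be recomputed from the reported figures. Because the Kairos runtimes are stated as upper bounds (under 10 s) or approximations (around 5 s), the speed ratios below are approximate.

$$
\begin{aligned}
\text{Tink:}\quad & \frac{500\text{M edges}}{25\text{M edges}}=20, \qquad \frac{37\ \text{s}}{10\ \text{s}}=3.7 \\
\text{ICM:}\quad & \frac{500\text{M edges}}{100\text{M edges}}=5, \qquad \frac{500\ \text{s}}{10\ \text{s}}=50 \\
\text{TeGraph:}\quad & \frac{90\ \text{s}}{5\ \text{s}}=18, \qquad \frac{3\ \text{s}\times 100}{5\ \text{s}}=60
\end{aligned}
$$

In the Tink and ICM rows, the competing system's runtime covers a single source vertex while the Kairos runtime covers 100, so the per-source gap is roughly 100 times larger than the runtime ratio alone suggests.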

	bitcoin			netflow			reddit-reply			stackoverflow			transportation			twitter-cache			synthetic
Application	$T_{1}$	$T_{24}$	SU	$T_{1}$	$T_{24}$	SU	$T_{1}$	$T_{24}$	SU	$T_{1}$	$T_{24}$	SU	$T_{1}$	$T_{24}$	SU	$T_{1}$	$T_{24}$	SU	$T_{1}$	$T_{24}$	SU
E. Arrival	74.2	7.87	9.4	124.8	21.6	5.8	3.89	.76	5.1	17.1	1.99	8.6	58.8	6.49	9.1	246.9	16.11	15.3	440	31.7	13.9
L. Departure	56.4	6.63	8.5	125.2	22	5.7	5.1	.76	3.9	11.2	1.53	7.3	45.4	5.22	8.7	214.2	15.15	14.1	391	30.6	12.8
Fastest	56.3	7.73	7.3	185.6	31.6	5.9	5.81	1.11	5.2	9.16	1.32	6.9	38.7	4.73	8.2	222.6	12.84	17.4	311	23.1	13.5
S. Duration	56.6	7.76	7.3	187	32.2	5.8	5.84	1.07	5.5	9.22	1.33	6.9	19.7	2.7	7.3	223.5	12.84	17.4	285	21.6	13.2
T. BFS	8.5	1.5	5.7	97.2	16.5	5.9	1.9	.4	4.8	.8	.2	4	.009	.002	4.5	9.8	1.4	7	19.6	3.2	6.2
T. CC	5.65	.59	9.6	24.4	2.2	11.1	3.99	.2	19.9	1.85	.11	16.8	6.83	.49	13.9	10.2	.71	14.4	5.41	4.57	1.2
T. k-core	13.7	1.22	11.4	38.9	3.6	10.8	32.4	4.2	7.7	2.2	.4	5.5	.02	.01	2	2.7	.9	3	2	.5	4
T. BC	10.8	.57	18.95	8.04	1.26	6.38	.168	.032	5.25	.085	.014	6.07	.005	.003	1.67	2.84	.692	4.1	4.57	3.76	1.22
T. PageRank	65.3	5.23	12.5	35.8	2.6	13.8	11.3	.50	22.6	8.51	.42	20.3	2.6	.21	12.4	17.22	1.2	14.4	11	4.62	2.4

Table 4. Running times (in seconds) of single-threaded ( $T_{1}$ ), 24-core no hyper-threading ( $T_{24}$ ), and parallel speedup as single-thread time divided by 24-core time (SU).
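The SU column follows directly from the two runtime columns. As a quick check, the Python snippet below recomputes it for the E. Arrival row of Table 4; the `speedup` helper and the dictionary layout are illustrative only.

```python
def speedup(t1, t24):
    """Parallel speedup: single-thread time divided by 24-core time."""
    return t1 / t24


# (T_1, T_24) pairs for the E. Arrival row of Table 4, in seconds.
e_arrival = {
    "bitcoin": (74.2, 7.87),
    "netflow": (124.8, 21.6),
    "reddit-reply": (3.89, 0.76),
    "stackoverflow": (17.1, 1.99),
    "transportation": (58.8, 6.49),
    "twitter-cache": (246.9, 16.11),
    "synthetic": (440, 31.7),
}

for name, (t1, t24) in e_arrival.items():
    print(f"{name}: {speedup(t1, t24):.1f}x")
```

The same check is a convenient way to spot transposed $T_{1}$ and $T_{24}$ entries, since a swapped pair yields a speedup below 1.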

6.5. Selective Indexing: Estimator Accuracy

In Section 5.1, we proposed a novel approach for selectively indexing a subset of vertices in a temporal graph, as well as a cost model for deciding which access method to use for each vertex at runtime. In this section, we present an experiment assessing the accuracy of the cardinality estimator, a key component of our proposed cost model. For this evaluation, we use a selectivity threshold of 20% (the “q is selective?” decision in Figure 6). As in the experiments described above, we vary the size of the input query interval to purposefully match a percentage of the most recent edges (by start time) in the dataset. To measure accuracy, we define true positives as “should use TGER, and did” and true negatives as “should not use TGER, and did not”. Specifically, “should” here compares the decision made from the estimated selectivity against an oracle with the actual selectivity of the query. Furthermore, we only evaluate this decision for vertices that are large enough to be indexed with a TGER (otherwise, no decision needs to be made, as T-CSR is used 100% of the time), and vary the minimum cutoff for indexing from 1k to 8k edges (doubling at each step). We find that for all datasets, the accuracy of this decision stays consistently above 90% for input query intervals sized under 1%, and above 95% for all other input query interval sizes we assessed (2 through 5%, 10%, and 20%). Furthermore, the accuracy also increases as a function of the cutoff size. As expected, this increase is largely due to the cardinality estimator’s 2D density histogram (Section 5.2) having more samples in this case.
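The accuracy measure described above can be written down in a few lines of Python. This is a sketch under stated assumptions: the function name is hypothetical, and the "use TGER" rule is taken to be "selectivity at most the threshold", applied to the estimate for the actual decision and to the true selectivity for the oracle.

```python
def decision_accuracy(estimated, actual, theta_sel=0.2):
    """Fraction of vertices where the estimate-based choice matches the oracle.

    estimated, actual: per-vertex selectivities in [0, 1], in the same order.
    """
    correct = 0
    for est, act in zip(estimated, actual):
        used_tger = est <= theta_sel      # decision made by the estimator
        should_use = act <= theta_sel     # decision an oracle would make
        correct += used_tger == should_use  # true positive or true negative
    return correct / len(estimated)


# Made-up selectivities for four indexed vertices (illustrative values only).
est = [0.05, 0.15, 0.25, 0.30]
act = [0.04, 0.22, 0.21, 0.10]
print(decision_accuracy(est, act))  # 0.5
```

Only the side of the threshold matters in this measure, so an estimate can be numerically far from the true selectivity and still count as correct.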

7. Related Work

Temporal Graph Analytics. While the authors of (TemporalPaths, 25, 26) have been the first to propose one-pass parallel versions of temporal graph algorithms, as far as we know (TeGraph, 8) is the only other shared-memory temporal graph analytics system. Previous systems for temporal graph analytics have primarily relied on distributed message passing, often building upon programming paradigms like Pregel or stateful data stream processing frameworks such as Apache Flink (ICM, 5, 42, 6, 7). The large number of messages exchanged when relying on these programming paradigms imposes considerable overhead on graph processing. This becomes particularly noticeable in the case of graphs that fit within available memory of current commodity servers, as shown by the speedups we get when compared against these systems.

Time-evolving Graph Engines. There has been significant work on developing frameworks for processing graphs that evolve over time (Chronos, 43, 44, 45, 46, 47). While these systems allow processing of snapshots, streaming graphs, or dynamic graphs, for the most part they do not support temporal graphs. Rather, timestamps associated with edges or vertices are treated as graph updates in these systems. For this reason, we consider these orthogonal.

Range Query Data Structures. There has been extensive work on data structures to handle range queries. Traditional range query data structures often used in relational systems include quad-trees, R-trees, and kd-trees. Specifically aimed at interval data, such as that present in temporal edges, are interval trees, segment trees, and priority search trees (CompGeometryBook, 29). All three of these data structures have a query time complexity of $O(\log n+k)$ , where $k$ is the number of results for the range query. Where they differ is in space complexity, which is highest for segment trees at $O(n\log n)$ , as well as in properties relating to how they can be queried (e.g., stabbing vs. 3-sided queries). The index for storing temporal edges that we introduce in this paper is a highly-optimized and specialized version of a priority search tree.

GNNs. Graph Neural Networks (GNNs) have emerged as a powerful framework for learning representations of graph-structured data, tackling problems in various domains such as social networks, molecular biology, and recommendation systems (DGL, 48, 33, 49). GNNs rely on the underlying graph structure and the attributes of nodes and edges to learn complex patterns and make predictions, with a recent increased focus on temporal graphs (TGL, 33, 4). By employing efficient graph analytics algorithms, these systems can compute a wide range of graph properties, such as centrality measures, graph motifs, or community structures, which can then be used as additional input features for GNNs. In the context of GNNs, most related to our work is the version of Temporal CSR used for sampling temporal edges in (TGL, 33). While it does not account for edge end times, as far as we know this is the only other work that extends CSR to the temporal setting.

8. Conclusions

We presented Kairos, a temporal graph analytics system that provides application developers a framework for efficiently executing temporal algorithms over temporal graphs. Specifically, it employs TGER, a highly-optimized parallel data structure, as an efficient index for temporal graph processing. Kairos is built atop Ligra (ligra, 28), a state-of-the-art parallel graph processing system. With TGER, Ligra’s vertex-centric computational model takes advantage of the locality that naturally occurs in temporal graphs and in queries over such graphs. We show in our experiments that, using Kairos, a number of minimal temporal path algorithms are up to 8x faster than an already competitive baseline (temporal_ligra, 34), which we also provide. When compared with an alternative shared-memory system, it achieves up to 60x speedups.

Acknowledgements.

This research is supported by DOE Early Career Award #DE-SC0018947, NSF CAREER Award #CCF-1845763, and by Intel as part of the MIT Data Systems and AI Lab (DSAIL) at MIT. Joana M. F. da Trindade was partially supported by an Alfred P. Sloan UCEM PhD Fellowship, and a Microsoft Research PhD Fellowship.

References

(1) Petter Holme and Jari Saramäki “Temporal networks” In Physics Reports 519, 2012, pp. 97–125 URL: http://arxiv.longhoe.net/abs/1108.1780
(2) Mincheng Wu et al. “Use of temporal contact graphs to understand the evolution of COVID-19 through contact tracing data” In Nature Commun. Phys 5, 2022
(3) Renaud Lambiotte, Martin Rosvall and Ingo Scholtes “From networks to optimal higher-order models of complex systems” In Nature Phys. 15.4, 2019, pp. 313–320
(4) Shenyang Huang et al. “Temporal Graph Benchmark for Machine Learning on Temporal Graphs”, 2023 arXiv:2307.01026 [cs.LG]
(5) S. Gandhi and Y. Simmhan “An interval-centric model for distributed computing over temporal graphs” In ICDE, 2020, pp. 1129–1140
(6) Wouter Lightenberg, Yulong Pei, George Fletcher and Mykola Pechenizkiy “Tink: A Temporal Graph Analytics Library for Apache Flink” In WWW, 2018, pp. 71–72
(7) Christopher Rost et al. “Distributed temporal graph analytics with GRADOOP” In PVDLB, 2022, pp. 375–401
(8) Chengying Huan et al. “TeGraph: A Novel General-Purpose Temporal Graph Computing Engine” In ICDE, 2022, pp. 578–592
(9) Jure Leskovec, Jon Kleinberg and Christos Faloutsos “Graphs over time: densification laws, shrinking diameters and possible explanations” In KDD, 2005, pp. 177–187
(10) Benjamin Erb, Dominik Meißner, Jakob Pietron and Frank Kargl “Chronograph: A Distributed Processing Platform for Online and Batch Computations on Event-Sourced Graphs” In DEBS, 2017, pp. 78–87
(11) Qi Wang et al. “You Are What You Do: Hunting Stealthy Malware via Data Provenance Analysis.” In NDSS, 2020
(12) Chrysovalantis Anastasiou et al. “ASTRO: Reducing COVID-19 Exposure through Contact Prediction and Avoidance” In ACM Trans. Spatial Algorithms Syst. 8.2, 2022, pp. 1–31
(13) Tiantian Liu et al. “Towards Crowd-Aware Indoor Path Planning” In PVLDB, 2021, pp. 1365–1377
(14) Lei Li, Sibo Wang and Xiaofang Zhou “Time-Dependent Hop Labeling on Road Network” In ICDE, 2019, pp. 902–913
(15) Hongchao Qin et al. “Mining Bursting Core in Large Temporal Graphs” In PVLDB, 2022, pp. 3911–3923
(16) Junyong Yang et al. “Scalable Time-Range k-Core Query on Temporal Graphs (Full Version)” In PVLDB, 2023, pp. 1168–1180
(17) Michael Yu et al. “On querying historical k-cores” In PVLDB, 2021, pp. 2033–2045
(18) Rong-Hua Li et al. “Persistent community search in temporal networks” In ICDE, 2018, pp. 797–808
(19) Martino Ciaperoni et al. “Relevance of temporal cores for epidemic spread in temporal networks” In Scientific reports 10.1 Nature Publishing Group UK London, 2020, pp. 12529
(20) Manuel Gomez Rodriguez, Jure Leskovec, David Balduzzi and Bernhard Schölkopf “Uncovering the structure and temporal dynamics of information propagation” In Network Science 2.1 Cambridge University Press, 2014, pp. 26–65
(21) Daniele Notarmuzi et al. “Universality, criticality and complexity of information propagation in social media” In Nature communications 13.1 Nature Publishing Group UK London, 2022, pp. 1308
(22) Tianming Zhang et al. “Efficient distributed reachability querying of massive temporal graphs” In PVLDB, 2019, pp. 871–896
(23) Ashwin Paranjape, Austin R Benson and Jure Leskovec “Motifs in Temporal Networks” In WSDM, 2017, pp. 601–610
(24) James F Allen “Maintaining knowledge about temporal intervals” In Commun. ACM 26.11, 1983, pp. 832–843
(25) Huanhuan Wu et al. “Path Problems in Temporal Graphs” In PVLDB, 2014, pp. 721–732
(26) H. Wu et al. “Efficient Algorithms for Temporal Path Computation” In TKDE 28.11, 2016, pp. 2927–2942
(27) Brian Wheatman and Helen Xu “Packed Compressed Sparse Row: A Dynamic Graph Representation” In HPEC, 2018, pp. 1–7
(28) Julian Shun and Guy E. Blelloch “Ligra: A Lightweight Graph Processing Framework for Shared Memory” In PPoPP, 2013, pp. 135–146
(29) Mark de Berg, Otfried Cheong, Marc van Kreveld and Mark Overmars “Computational Geometry: Algorithms and Applications” Springer-Verlag TELOS, 2008
(30) Wenfei Fan et al. “Towards Event Prediction in Temporal Graphs” In PVLDB, 2022, pp. 1861–1874
(31) Zhihao Wen and Yuan Fang “TREND: TempoRal Event and Node Dynamics for Graph Representation Learning” In WWW, 2022, pp. 1159–1169
(32) Bin Han and Bill Howe “Adapting to Skew: Imputing Spatiotemporal Urban Data with 3D Partial Convolutions and Biased Masking”, 2023 arXiv:2301.04233 [cs.CV]
(33) Hongkuan Zhou et al. “TGL: A General Framework for Temporal GNN Training on Billion-Scale Graphs” In PVLDB, 2022, pp. 1572–1580
(34) “Temporal Ligra” URL: https://github.com/jshun/ligra/tree/temporal
(35) Charles E Leiserson “The Cilk++ concurrency platform” In DAC, 2009, pp. 522–527
(36) Tao B Schardl and I-Ting Angelina Lee “OpenCilk: A Modular and Extensible Software Infrastructure for Fast Task-Parallel Code” In PPoPP, 2023, pp. 189–203
(37) Paul Liu, Austin R. Benson and Moses Charikar “Sampling methods for counting temporal motifs” In WSDM, 2019
(38) Melissa J.M. Turcotte, Alexander D. Kent and Curtis Hash “Unified Host and Network Data Set” In Data Science for Cyber-Security, 2018, pp. 1–22
(39) “SNAP StackOverflow dataset” URL: https://snap.stanford.edu/data/sx-stackoverflow.html
(40) Rainer Kujala et al. “A collection of public transport network data sets for 25 cities” In Nature Sci. Data 5, 2018
(41) Juncheng Yang, Yao Yue and K.V. Rashmi “A large scale analysis of hundreds of in-memory cache clusters at Twitter” In OSDI, 2020, pp. 191–208
(42) Shriram Ramesh, Animesh Baranawal and Yogesh Simmhan “A Distributed Path Query Engine for Temporal Property Graphs” Preprint, In CoRR abs/2002.03274, 2020 URL: https://arxiv.longhoe.net/abs/2002.03274
(43) Wentao Han et al. “Chronos: A Graph Engine for Temporal Graph Analysis” In EuroSys, 2014, pp. 1:1–1:14
(44) Youshan Miao et al. “ImmortalGraph: A System for Storage and Analysis of Temporal Graphs” In TOS 11.3, 2015, pp. 14:1–14:34
(45) Raymond Cheng et al. “Kineograph: taking the pulse of a fast-changing and connected world” In EuroSys, 2012, pp. 85–98
(46) Aapo Kyrola, Guy E Blelloch and Carlos Guestrin “GraphChi: Large-scale graph computation on just a PC” In OSDI, 2012, pp. 31–46
(47) Anand Padmanabha Iyer, Li Erran Li, Tathagata Das and Ion Stoica “Time-evolving Graph Processing at Scale” In GRADES, 2016, pp. 5:1–5:6
(48) Minjie Yu Wang “Deep Graph Library: towards efficient and scalable deep learning on graphs” In ICLR, 2019
(49) Antonio Longa et al. “Graph Neural Networks for Temporal Graphs: State of the Art, Open Challenges, and Opportunities” In Transactions on Machine Learning Research, 2023 URL: https://openreview.net/forum?id=pHCdMat0gI
