Empowering Interdisciplinary Insights with Dynamic Graph Embedding Trajectories

Yiqiao Jin¹, Andrew Zhao¹, Yeon-Chang Lee²,
Meng Ye³, Ajay Divakaran³, Srijan Kumar¹
¹Georgia Institute of Technology,
²Ulsan National Institute of Science and Technology (UNIST),
³SRI International
{y**328,srijan}@gatech.edu

Abstract

We developed DyGETViz, a novel framework for effectively visualizing dynamic graphs (DGs) that are ubiquitous across diverse real-world systems. This framework leverages recent advancements in discrete-time dynamic graph (DTDG) models to adeptly handle the temporal dynamics inherent in dynamic graphs. DyGETViz effectively captures both micro- and macro-level structural shifts within these graphs, offering a robust method for representing complex and massive dynamic graphs. The application of DyGETViz extends to a diverse array of domains, including ethology, epidemiology, finance, genetics, linguistics, communication studies, social studies, and international relations. Through its implementation, DyGETViz has revealed or confirmed various critical insights. These include the diversity of content sharing patterns and the degree of specialization within online communities, the chronological evolution of lexicons across decades, and the distinct trajectories exhibited by aging-related and non-related genes. Importantly, DyGETViz enhances the accessibility of scientific findings to non-domain experts by simplifying the complexities of dynamic graphs. Our framework is released as an open-source Python package for use across diverse disciplines. Our work not only addresses the ongoing challenges in visualizing and analyzing DTDG models but also establishes a foundational framework for future investigations into dynamic graph representation and analysis across various disciplines.

1 Introduction

Background

Dynamic graphs (DGs) are ubiquitous data structures present in various real-world evolving systems, such as social networks [1], linguistics [2], international relations [3], and computational finance [4]. Representing these dynamic graphs efficiently has become a crucial challenge due to their massive sizes and ever-changing nature. One compelling approach to tackle this challenge is discrete-time dynamic graph (DTDG) models [5, 6, 7], which represent a dynamic graph as a series of snapshots, each containing the nodes and edges that co-occur at particular timestamps. Despite the effectiveness of DTDG models in a wide range of graph-oriented tasks such as link prediction, node classification, and edge regression, these models usually remain opaque to researchers in terms of interpretability. The high-dimensional representations generated by these models make it difficult for users to extract and understand the intrinsic value from dynamic graphs. Currently, researchers often manually analyze the dynamic graph data, as there are no specialized tools to support this process [8, 9]. However, manual analysis of enormous dynamic graphs covering multiple timestamps can be overwhelming, and the continuously evolving nature of these graphs makes it challenging to intuitively capture both micro-level and macro-level structural shifts. For instance, in the study of international relations, aside from predicting graph attributes like future bilateral trade volumes, it is vital to understand micro-level changes such as a country’s alliance network, trade relations, and conflict dynamics, as well as macro-level trends such as the stability of the global economy amidst wars and financial crises, as inherently reflected by the high-dimensional node embeddings obtained from DTDG models.

In this case, visualization becomes a powerful tool with an intuitive and user-friendly interface for analyzing the dynamic graph embeddings of DTDG models, as it enables researchers to gain both micro-level understandings, such as predicting node states and future trajectories, and macro-level analysis, such as forecasting emerging turning points in geopolitical events. With an effective visualization framework, researchers can gain insights, identify patterns, detect anomalies, and effectively communicate their findings to both domain experts and the general public, which would be challenging to achieve solely through manual analysis.

Challenges

Developing a visualization framework for dynamic graph embedding trajectories requires addressing the unique characteristics and challenges of DGs. The first challenge is the constant addition and deletion of nodes in DTDGs. As nodes are continuously added or removed, accurately inferring dynamic embedding trajectories for new nodes and effectively incorporating them into the visualization becomes crucial yet challenging. More specifically, these additions and removals create a dynamic landscape in which the proximity between nodes is in constant flux, further complicating the visualization process. The second challenge arises from the persistent evolution of node embeddings over time. Conventional visualization techniques [10, 11, 12] often rely on non-parametric methods, which are limited when projecting new data points onto an existing visualization space [13]. When such techniques are applied to each snapshot of the DTDG independently, the visualization layout changes completely from one snapshot to the next, disrupting continuity and hindering a coherent representation of embedding trajectories over time [13]. As a result, researchers may fail to observe valuable patterns in the DTDG. Addressing this challenge is crucial for providing researchers with a clear and consistent understanding of the dynamic graph’s behavior and evolution over time.
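The continuity problem can be made concrete with a small sketch. It fits a single linear (PCA) map on embeddings pooled across snapshots and reuses that map for every timestamp, so all positions share one coordinate system and a newly added node can be placed without refitting. This only illustrates why a shared, parametric mapping helps. It is not the projection used by DyGETViz (described in Supp. B), and the function names, array shapes, and random data are assumptions made for the example.

```python
import numpy as np

def fit_shared_projection(reference, dim=2):
    """Fit one linear (PCA) map on a reference set of embeddings."""
    mean = reference.mean(axis=0)
    # Rows of vt are principal directions, sorted by explained variance.
    _, _, vt = np.linalg.svd(reference - mean, full_matrices=False)
    return mean, vt[:dim].T

def project(snapshot, mean, components):
    """Map any snapshot, including unseen nodes, into the shared 2-D space."""
    return (snapshot - mean) @ components

rng = np.random.default_rng(0)
# Hypothetical data: 3 snapshots of 50 nodes with 16-dimensional embeddings.
snapshots = [rng.normal(size=(50, 16)) for _ in range(3)]
mean, components = fit_shared_projection(np.vstack(snapshots))

# trajectory[t, i] is the 2-D position of node i at timestamp t.
trajectory = np.stack([project(s, mean, components) for s in snapshots])

# A node added later is placed without refitting or moving existing points.
new_node = rng.normal(size=(1, 16))
new_position = project(new_node, mean, components)
```

Because the map is fixed after fitting, projecting the same embedding always gives the same position, which is the property a per-snapshot non-parametric layout lacks.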

This Work

In this work, we formally define the novel problem of dynamic graph embedding trajectories visualization to enable the analysis of discrete-time dynamic graph models. We propose DyGETViz, a novel framework for Dynamic Graph Embedding Trajectory Visualization, to address the above challenges. DyGETViz leverages recent developments in dynamic graph neural networks (GNNs) [7, 14, 15] and offers two key functionalities: visualization and analytics. The visualization module employs principles from dynamic GNNs to map high-dimensional node embeddings into lower-dimensional representations, and uses a flexible and computationally efficient approach to project each node’s state at each timestamp onto the visualization, which is potentially scalable to datasets spanning multiple timestamps. The analytics module quantifies structural shifts in DTDGs at both the micro and macro levels. For micro-level analysis, it uses two similarity measures, the Jaccard index [16] and Rank-biased Overlap (RBO) [17, 18], to quantify the changes in the local topology of each node between adjacent timestamps. For macro-level analysis, it uses a novel metric, normalized average ranking change (NARC), as well as the absolute volumes of embedding movements, to assess the changes in global topology. These comprehensive analytics enable researchers to gain insights into both fine-grained and large-scale changes in dynamic graphs, empowering investigations across various domains. The versatility and applicability of DyGETViz are demonstrated by our analysis of nine datasets introduced in Supp. A, spanning different graph sizes and domains, including ethology, epidemiology, finance, genetics, linguistics, communication studies, social studies, and international relations.
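The two micro-level measures can be sketched as follows for a single node whose nearest neighbors in the embedding space are ranked at two adjacent timestamps. This is a minimal illustration under stated assumptions: the function names, the persistence parameter `p = 0.9`, and the example neighbor lists are hypothetical, and the RBO shown is the plain truncated sum, which may differ from the exact variant defined in Supplementary Sec. B.3.

```python
def jaccard_at_k(prev_neighbors, curr_neighbors, k):
    """Jaccard index between the top-k neighbor sets at two timestamps."""
    a, b = set(prev_neighbors[:k]), set(curr_neighbors[:k])
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)

def rbo_truncated(prev_neighbors, curr_neighbors, p=0.9):
    """Truncated rank-biased overlap: (1 - p) * sum_d p^(d-1) * A_d,
    where A_d is the size of the overlap of the two depth-d prefixes
    divided by d. Identical lists of length n score 1 - p**n, not 1."""
    depth = min(len(prev_neighbors), len(curr_neighbors))
    seen_prev, seen_curr = set(), set()
    score = 0.0
    for d in range(1, depth + 1):
        seen_prev.add(prev_neighbors[d - 1])
        seen_curr.add(curr_neighbors[d - 1])
        agreement = len(seen_prev & seen_curr) / d
        score += (p ** (d - 1)) * agreement
    return (1 - p) * score

# Hypothetical ranked neighbors of one node at timestamps t-1 and t.
prev = ["DEU", "FRA", "GBR", "ITA", "JPN"]
curr = ["DEU", "GBR", "FRA", "CAN", "JPN"]

jaccard_5 = jaccard_at_k(prev, curr, k=5)  # 4 shared out of 6 distinct
rbo = rbo_truncated(prev, curr, p=0.9)
```

The Jaccard index ignores ordering within the top-k set, whereas RBO weights agreement near the top of the ranking more heavily, so the two measures respond differently to a reshuffle of the same neighbors.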

We provide complete technical details for DyGETViz in Supp. B. Our proposed Python package is available on GitHub, and the visualizations for all datasets are available on our website. All the code and datasets have been made publicly available.

2 Results

2.1 Reddit Community Graphs Reveal Content Specialization, Content Diversity, and Echo Chambers

Figure 1: Visualization of Reddit online communities. Each gray node in the background represents an online community (“subreddit”). The trajectories of five groups of subreddits are displayed, including a. gaming, b. sports, c. video-sharing, d. politics, and e. music. Text in the background indicates the topics that characterize each subreddit cluster. Different video-sharing communities (c.) manifest diverse levels of specialization, where communities with a narrow focus of video promotion demonstrate less mobility than general-purpose communities. DyGETViz captures a major event in r/The_Donald – its shutdown.

Online users often form communities around shared interests, beliefs, ethnicity, and geographical locations [1]. A deeper understanding of these community dynamics on platforms like Reddit, which is structured into thousands of interest-specific “subreddits”, is crucial for analyzing how user groups interact, share content, and influence one another over time. This study presents an analysis of subreddit trajectories across various topics, including gaming, sports, videos, politics, and music, with a focus on content specialization and the phenomenon of echo chambers, as shown in Fig. 1. Each subreddit’s trajectory is highlighted in a distinct color. To derive the graph embeddings, we train the model on the bipartite graph consisting of videos and subreddits, where an edge with timestamp $t$ exists between a video and a subreddit if the video is shared in the subreddit at $t$. In the resulting graph, two nodes are close in the embedding space if they share similar videos. Each subreddit’s trajectory within the visualized graph embeddings indicates the level of content homogeneity or diversity.
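The bipartite construction described above can be sketched as follows. The record layout, the monthly timestamp granularity, the video IDs, and the helper names are assumptions chosen for illustration. The actual preprocessing is described in the supplementary material.

```python
from collections import defaultdict

def build_snapshots(shares):
    """Group (video, subreddit, timestamp) records into per-timestamp edge sets."""
    snapshots = defaultdict(set)
    for video, subreddit, t in shares:
        snapshots[t].add((video, subreddit))
    # Return snapshots ordered by timestamp.
    return dict(sorted(snapshots.items()))

def shared_videos(snapshot, sub_a, sub_b):
    """Videos posted in both subreddits within one snapshot."""
    videos_a = {v for v, s in snapshot if s == sub_a}
    videos_b = {v for v, s in snapshot if s == sub_b}
    return videos_a & videos_b

# Hypothetical records; video IDs and months are made up for illustration.
shares = [
    ("vid1", "r/nba", "2019-01"),
    ("vid1", "r/nfl", "2019-01"),
    ("vid2", "r/nba", "2019-01"),
    ("vid3", "r/videos", "2019-02"),
    ("vid1", "r/sports", "2019-02"),
]
snapshots = build_snapshots(shares)
overlap = shared_videos(snapshots["2019-01"], "r/nba", "r/nfl")
```

Subreddits that repeatedly share the same videos within a snapshot, such as the two in `overlap`, are the ones a model trained on this graph would be expected to place close together.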

Specialization in Content Sharing Across Video-Related Subreddits

The trajectories of video-sharing subreddits (Fig. 1c) demonstrate diverse levels of specialization. Subreddits with a narrow focus on promoting YouTube videos and small channels, such as r/GetMoreViewsYT, r/YouTube_startups, r/AdvertiseYourVideos, r/SmallYoutubers, and r/YouTubeSubscribeBoost, move within a confined region, illustrating a high degree of content homogeneity as users spread the same videos across multiple subreddits for better visibility. In contrast, general subreddits like r/videos display a greater diversity of content, as shown by their more expansive trajectories. These findings are supported by the numeric values of $\mathrm{Jaccard}_{100}$ (Fig. S3), where the overlap between the nearest neighbors of each video-related subreddit in the embedding space at adjacent timestamps is high for video-promotion subreddits. Details for the metrics are in Supplementary Sec. B.3.

Diversity and Overlap in Sports-Related Subreddits

On the other hand, for sports-related communities (Fig. 1b), subreddits with specialized topics, such as r/nba (subreddit for the National Basketball Association), r/nfl (subreddit for the National Football League), r/MMA (subreddit for mixed martial arts), and r/SquaredCircle (subreddit for professional wrestling), demonstrate similar levels of movement to more general subreddits like r/sports. Notably, r/nba has a large overlap with r/nfl, indicating that these two subreddits share similar audiences, posts, and content-sharing patterns. Both the NBA and the NFL feature team-based sports, high-profile athletes, and strategies, and both have regular seasons followed by playoff rounds that culminate in a championship event. In terms of content sharing, many videos feature athletes or moments that have transcended their respective sports and gained widespread popularity, and are appreciated by fans of both basketball (NBA) and American football (NFL). Compared to subreddits focused on videos, sports-related subreddits display more variability among their neighboring subreddits in the embedding space, as evidenced by the $\mathrm{Jaccard}_{100}$ index values averaged on an annual basis (Table S4).

Trajectories of Political Subreddits Reveal Echo Chamber and Major Events

The phenomenon of echo chambers within online social networks, wherein users experience reinforcement of their ideologies through repeated interactions with like-minded peers and a narrow spectrum of information, presents a significant challenge to discourse diversity [19, 20, 21]. This pattern is notably pervasive on platforms like Reddit, where close-knit communities form around specific ideologies or interests. A pertinent example is r/WayOfTheBern, an unofficial subreddit established by Bernie Sanders’ supporters following his loss in the 2016 primary election [22]. Initially intended as a space for political discourse divergent from the mainstream Democratic Party narrative, this community has been scrutinized for its alignment and user overlap with right-leaning communities, suggesting a complex web of ideological positioning that transcends conventional political boundaries [23, 22]. The embedding trajectories in Fig. 1d reveal substantial connections between r/WayOfTheBern and r/The_Donald, another subreddit, now banned, known for sharing misinformation and controversial content [22, 24]. These communities demonstrate converging paths that deviate from more generalized political forums like r/politics. Notably, the trajectory of r/The_Donald diverges sharply around March 2020 and terminates at a position markedly distinct from its typical region. This coincides with the outbreak of the COVID-19 pandemic in the United States, the implementation of quarantine measures in major US cities, and Reddit’s decision to place r/The_Donald in “Restricted mode,” which prevented most users from creating new posts [25]. Such a confluence of events indicates a notable divergence and deterioration characterized by the proliferation of toxic discourse within the community.

The existence and perpetuation of echo chambers underscore the complex challenges of online social networks in fostering balanced and open discourse. They not only facilitate the entrenchment of partisan beliefs by insulating users from contrary viewpoints but also serve as fertile grounds for the spread of misinformation. The observed patterns and trajectories within these communities highlight the urgent need for strategies aimed at early detection and mitigation of echo chambers, ensuring a more diverse and accurate exchange of information within these digital ecosystems.

2.2 Linguistic Reflections and Shifts in Societal Perceptions Through Lexicon Graphs

Semantic shifts in word meanings often reveal socio-cultural changes over time, although the rates of semantic change vary significantly across words [2]. By leveraging word embeddings, DyGETViz effectively tracks the dynamics of lexical connotations over time. Our study uses the skip-gram with negative sampling (SGNS) embeddings [2] trained on the Google N-Gram [26] dataset. The conventional SGNS approach considers a fixed window of context words around the target word, and thus may not fully capture the contextual meaning of a word, especially in intricate linguistic contexts characterized by long-range dependencies or when dealing with semantically similar words that rarely co-occur within the local context. To overcome this issue, we construct a new temporal graph. For each timestamp $t$, we compute the cosine similarity between each pair of word embeddings $\mathbf{v}_{i}^{t}$ and $\mathbf{v}_{j}^{t}$, and then connect each word to its $k$ nearest neighbors with the highest cosine similarity. A new set of temporal embeddings is then trained on this graph. This method facilitates the extraction of high-order semantic associations between words that may not typically co-occur within the same local context, thus overcoming the limitation of the original SGNS embeddings. Empirically, we experimented with $k\in\{5,10,20,50,100\}$ and found that $k=20$ yields the most meaningful semantic associations between words.
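The graph construction for a single timestamp can be sketched as below. This is a brute-force illustration, not the authors’ implementation: the function names and the three-dimensional toy vectors are hypothetical, edges are treated as directed (the text does not say whether the graph is symmetrized), and a real vocabulary would call for a vectorized or approximate nearest-neighbor search.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def knn_edges(embeddings, k):
    """Connect each word to its k most cosine-similar words at one timestamp.

    `embeddings` maps word -> vector; returns directed (word, neighbor) edges.
    """
    edges = set()
    for word, vec in embeddings.items():
        scored = [(cosine(vec, other_vec), other)
                  for other, other_vec in embeddings.items() if other != word]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        edges.update((word, other) for _, other in scored[:k])
    return edges

# Toy vectors for one timestamp; the values are made up for illustration.
embeddings_t = {
    "forest": [0.9, 0.1, 0.0],
    "grassland": [0.8, 0.2, 0.1],
    "team": [0.0, 0.9, 0.3],
    "mobilization": [0.1, 0.8, 0.4],
}
edges_t = knn_edges(embeddings_t, k=1)
```

Repeating this for every timestamp yields one snapshot per period, and the sequence of snapshots forms the temporal graph on which the new embeddings are trained.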

Tracing the Evolution of Socio-Economic Language in Environmental Discourse from 1950s to 1990s

To gain insights into the evolving socio-economic discourse concerning environmental concerns, we delve into the semantic trajectories of words associated with environmental protection using the HistWords-CN dataset (Fig. 2a). Starting from the 1950s, we traced diverse interpretations of words such as “environment,” which initially carried connotations related to the working environment, as indicated by their proximity to words like “team,” “mobilization,” and “state-of-the-art.” However, as we move into the 1980s and 1990s, we observe a convergence of these terms toward the region occupied by ecological-environment-related words, such as “forest,” “grassland,” “carbon dioxide,” and “nature.” This reflects the ever-growing discourse on ecology and the escalating importance attached to environmental protection. Notably, despite this convergence, the term “save” deviated from this trajectory due to its diverse meanings related to cost-saving, rent, thrift, and value. Our model thus provides an intricate understanding of the evolution of environmental discourse over time.

2.3 Evolving Language and LGBTQ+ Acceptance: A Lexical Analysis of Societal Shifts

The power of language lies in its ability to both reflect and shape societal attitudes. In this context, we explore the linguistic landscape surrounding homosexuality, recognizing its historical significance as a mirror for societal changes.

The term “Gay” Experienced Significant Lexical Evolution in the 1970s.

During the 1970s, the LGBTQ+ rights movement in the United States experienced a transformative period characterized by increased visibility and activism [27]. However, prevailing societal attitudes during this era remained heavily influenced by traditional values and social norms, often stigmatizing homosexuality [28]. As depicted in Fig. 2, we observe a remarkable lexical shift associated with the term “gay” from the 1970s to the 1990s. The word gradually transitions away from its original connotations of happiness and fortune towards homosexuality, aligning with its etymological evolution [29].

Additionally, Table S3 provides a comprehensive view of the top five words associated with each term in the embedding space over time. The words “happy” and “delighted” retain their consistent meanings across the years, serving as constants in the lexical landscape. However, the term “gay,” once widely employed to convey happiness before the 1960s, underwent a profound transformation when it came to refer to homosexuality in the 1970s and acquired proximity to negative words such as “forlorn” and “ugly.” This lexical shift reflects the societal struggle to grapple with evolving perceptions of homosexuality. Furthermore, LGBT-related words, including “gay,” “homosexual,” and “lesbian,” exhibit strong associations with “clubs” and “dance” during the 1970s and 1980s. This phenomenon corresponds to the development of a distinctive LGBTQ+ culture and language during this era. Bars and dance clubs emerged as vital meeting places for the LGBTQ+ community, providing safe spaces for socialization, self-expression, and the formation of supportive networks [30]. It is crucial to acknowledge that the portrayal of LGBTQ+ characters and issues in popular culture largely perpetuated negative stereotypes during the examined period. This further entrenched negative attitudes within the general public, making societal acceptance and understanding a complex and arduous journey [31, 32].

By meticulously tracing these linguistic transformations and contextualizing them within historical and societal frameworks, our study contributes to a deeper understanding of the intricate relationship between language, societal attitudes, and the ongoing struggle for LGBTQ+ acceptance.

2.4 Unveiling Global Trade Dynamics: Insights from UN Comtrade Export Data

In the field of economics, understanding, modeling, and predicting international trade plays a crucial role in helping economists and policymakers navigate the challenges and opportunities arising from globalization, such as financial crises. In this study, we analyze international trade dynamics using export data from the United Nations Commodity Trade Statistics Database [33]. To capture the economic status and trading partnerships of countries, we perform linear regression on the logarithmic values of a country’s gross exports and the bilateral trade volumes, and employ the joint training objective with $\lambda_{1}=\lambda_{2}=0.1$ in Equation 1.

The resulting visualization in Fig. 3a offers a comprehensive representation of the international trade landscape. Advanced Economies, as classified by the International Monetary Fund (IMF; see https://www.imf.org/en/Publications/WEO/weo-database/2023/April/groups-and-aggregates), form distinct clusters located primarily in the upper right region, while countries with lower trade volumes and those positioned on the periphery of international trade form separate clusters in the left and lower regions. In addition, Fig. 3b provides a clearer illustration of the distinct visual partitions among the three country groups defined by the IMF [35]: Major Advanced Economies (MAE), which the IMF defines as the G7 countries (Canada, France, Germany, Italy, Japan, the UK, and the USA); Other Advanced Economies (OAE); and Emerging and Developing Economies (EDE). This spatial arrangement reflects the different degrees of trade engagement of each country within the global trade network.

Dynamic Graph Embedding Trajectories of Individual Countries Reveal Development and Stability Patterns of Key Economies

In Fig. 3b, the trajectories of individual countries reveal distinct patterns of economic development and stability. The United States, the United Kingdom, and Germany (listed in UN Comtrade as a single sovereign state since 1991, following German reunification in October 1990) demonstrate relatively stable and consistent trading status throughout the examined period (1988–2022). On the other hand, China’s trajectory moves between MAE and OAE, reflecting its prolonged period of economic development characterized by comprehensive domestic reforms, the lifting of price controls, and the liberalization of trade policies [36, 37]. Russia exhibits significant movements between OAE and EDE. Its trajectory predominantly shifted towards the EDE region during the period 1992–1998, coinciding with a substantial 40% contraction in GDP [3]. Starting from the early 2000s, Russia moves towards the region occupied by OAEs, including the four middle-sized developed countries Switzerland, Belgium, Sweden, and the Netherlands, which indicates a period of economic recovery characterized by greater trade volumes. Despite its status as an MAE, Japan has experienced economic development with significant fluctuations. The country encountered unique obstacles such as the Japanese asset price bubble (1990–1992), whose impact lasted for more than a decade [38, 39]. We further use the Jaccard index [16] and Rank-biased Overlap (RBO) [17, 18] to quantify changes in each country’s local neighborhood over time. Detailed calculations of these metrics are in Supplementary Sec. B.3. As reflected in Fig. 3c, the RBO and $\mathrm{Jaccard}_{5}$ for Japan plummeted during this period compared to other countries, indicating a period of instability in its economic status.

Trade Resilience and Volatility during Global Economic Crises

From Fig. 3c, we observe three periods of significant fluctuations in RBO and $\mathrm{Jaccard}_{5}$ for most countries, indicating significant changes in their trading status. The first period is 1997–2003, which corresponds to the 1997 Asian Financial Crisis and the dot-com bubble, when investor confidence declined worldwide. Both events had global ripple effects. For most countries, the recovery from the financial crisis in 1998–1999 was rapid [40]. For example, China demonstrates quick movements towards and away from the EDE region (Fig. 3b) around 1998. The dot-com bubble, during which many large-scale Internet and communication companies failed and shut down, had a more far-reaching effect. As the epicenter of the bubble, the US experienced the most drastic fluctuation in its trading status, as shown by its decline in RBO and $\mathrm{Jaccard}_{5}$ [16] in Fig. 3c (see the appendix). Similarly, the Great Recession around 2008 and the COVID-19 pandemic also caused fluctuations in RBO and $\mathrm{Jaccard}_{5}$.

2.5 Dynamic Graph Analysis of Gene Expression Trajectories Reveals Key Patterns in Aging

Dynamic graphs are vital for identifying anomalous genes and genetic variations that significantly impact disease development [41] and the human aging process [4]. DyGETViz enables researchers to effectively pinpoint genes with unusual patterns or interactions, facilitating a deeper understanding of aging-related diseases, the genetic mechanisms underlying the aging process, and potential treatments.

Characterizing Structural and Temporal Differences in Gene Expression During Aging

We examine structural differences, neighbor distributions, and temporal dynamics between aging-related and non-aging-related genes using human gene expression data at 37 different ages, ranging from 20 to 99. The t-SNE projection in Fig. 4a shows that genes directly related to aging (red dots) have distinct distributions from normal genes (gray dots). We further analyze the trajectories of 10 aging-related and 10 non-aging-related genes, and plot their embedding trajectories in Fig. 4b. From a dynamic graph perspective, the aging-related genes are characterized by distinct embedding trajectories, which are mainly located on the left side of the plot. Such distinctions are reinforced by the kernel density estimation (KDE) plot in Fig. 4c.

Application of DyGETViz for Predicting Aging-Related Gene Behavior

We randomly select 6 genes commonly altered during the human aging process as identified in previous research [43]. These genes experience frequent changes due to their roles in cellular processes, although there is insufficient evidence linking them directly to aging. These genes are categorized as overexpressed (Genes 306, 1520, and 2212) and underexpressed (Genes 1281, 1277, and 1287); see https://genomics.senescence.info/genes/microarray.php. As shown in Fig. 4c, the orange trajectories representing overexpressed genes typically transition between regions associated with aging and non-aging, suggesting that these genes can potentially induce or accelerate the aging process, despite the absence of concrete evidence. Meanwhile, the purple trajectories representing underexpressed genes mostly remain within non-aging regions, suggesting that these genes are less likely to be involved in the aging process.

2.6 Challenges in Distinguishing Fraudulent and Legitimate Behaviors in Financial Networks

Dynamic graphs can be used in financial networks to detect and flag users engaged in fraudulent behaviors [4]. Accurate identification of fraudulent users can facilitate timely intervention and prevent financial loss. As shown in Fig. 4e, the distinction between fraudsters and normal users appears less pronounced, as both groups exhibit trajectories widely dispersed across the plot. These observations highlight the challenge of distinguishing between fraudulent and normal users. In real-world scenarios, fraudulent users possess a remarkable ability to camouflage their activities, often mirroring the behaviors of genuine users. This challenge is further exemplified in Fig. 4f, where the KDE plot depicts the convergence of their trajectories, underscoring the complexity of accurately identifying and differentiating fraudulent activities from legitimate ones.

2.7 Modeling Social Dynamics in Ant Colonies on Animal Activity Graphs

Animals exhibit intricate and efficient social organizations. For example, ant colonies demonstrate a well-defined organizational hierarchy and role differentiation among worker ants [9]. Roles within these societies include nurses, responsible for the care of the brood and the queen; cleaners, who ensure colony cleanliness and waste disposal; and foragers, tasked with acquiring food resources from outside the colony. Dynamic graph modeling is utilized to describe these behaviors and the evolution of social roles within animal groups. Our model provides a clear interpretation of the trajectories of role-based behaviors, as inferred from the embedding model.

Trajectories of Different Ant Roles Reveal Distinct Spatial Organizations

Fig. S5 illustrates these findings, showing that the movement patterns of nurses are generally restricted to areas near the queen, reflecting their frequent interactions. Conversely, foragers are typically found in remote areas, aligning with their external foraging activities and minimal contact with the queen. The spatial distribution and movements of these roles over time reveal distinct patterns: nurses and foragers maintain localized activity areas, whereas cleaners exhibit movement patterns intersecting with those of nurses due to their intermediary tasks.

Capturing Role Transitions in Ant Behaviors

DyGETViz captures the transition of individuals between roles, a phenomenon supported by existing literature [9]. For example, the trajectories of certain ants (e.g., Ant29 and Ant242) shift from nursing towards cleaning roles over time, indicating a natural progression as they age. This dynamic is effectively represented in our models, providing insight into the adaptive behaviors within ant colonies.

3 Conclusion and Future Works

In this work, we formally define the problem of dynamic graph embedding trajectories visualization, and introduce DyGETViz, a novel framework to effectively address the problem. Empirical evaluation on 9 real-world datasets demonstrates the broad application of DyGETViz and provides significant insights.

Looking forward, there are multiple promising directions for further research. An immediate area of interest is the development of more refined methodologies for assessing the quality and efficacy of the visualizations generated. This could involve the creation of metrics and evaluation protocols that better capture the utility and interpretability of visual outputs in practical scenarios. Additionally, it is imperative to investigate the potential of DyGETViz to be adapted or enhanced to support a wider array of visualization paradigms and representations. Such explorations could extend its relevance to other data types and structures beyond graphs, thereby accommodating the dynamic and diverse needs of modern data visualization.

Acknowledgment

This research/material is based upon work supported in part by NSF grants CNS-2154118, IIS-2027689, ITE-2137724, ITE-2230692, CNS-2239879, Defense Advanced Research Projects Agency (DARPA) under Agreement No. HR00112290102 (subcontract No. PO70745), and funding from Microsoft, Google, and Adobe Inc. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the position or policy of DARPA, DoD, SRI International, or NSF, and no official endorsement should be inferred. We thank the reviewers for their comments.

References

[1] Yiqiao Jin, Yeon-Chang Lee, Kartik Sharma, Meng Ye, Karan Sikka, Ajay Divakaran, and Srijan Kumar. Predicting information pathways across online communities. In KDD, 2023.
[2] William L Hamilton, Jure Leskovec, and Dan Jurafsky. Diachronic word embeddings reveal statistical laws of semantic change. In ACL, pages 1489–1501, 2016.
[3] Anderson Monken, Flora Haberkorn, Munisamy Gopinath, Laura Freeman, and Feras A Batarseh. Graph neural networks for modeling causality in international trade. In FLAIRS, volume 34, 2021.
[4] Xuanwen Huang, Yang Yang, Yang Wang, Chun** Wang, Zhisheng Zhang, Jiarong Xu, Lei Chen, and Michalis Vazirgiannis. Dgraph: A large-scale financial dataset for graph anomaly detection. NIPS, 35:22765–22777, 2022.
[5] Jiaxuan You, Tianyu Du, and Jure Leskovec. Roland: graph learning framework for dynamic graphs. In KDD, pages 2358–2366, 2022.
[6] Aldo Pareja, Giacomo Domeniconi, Jie Chen, Tengfei Ma, Toyotaro Suzumura, Hiroki Kanezashi, Tim Kaler, Tao Schardl, and Charles Leiserson. Evolvegcn: Evolving graph convolutional networks for dynamic graphs. In AAAI, volume 34, pages 5363–5370, 2020.
[7] Youngjoo Seo, Michaël Defferrard, Pierre Vandergheynst, and Xavier Bresson. Structured sequence modeling with graph convolutional recurrent networks. In ICONIP, pages 362–373. Springer, 2018.
[8] Siwei Li, Zhiyan Zhou, Anish Upadhayay, Omar Shaikh, Scott Freitas, Haekyu Park, Zijie J Wang, Susanta Routray, Matthew Hull, and Duen Horng Chau. Argo lite: Open-source interactive graph exploration and visualization in browsers. In CIKM, pages 3071–3076, 2020.
[9] Danielle P Mersch, Alessandro Crespi, and Laurent Keller. Tracking individuals shows spatial fidelity is a key regulator of ant social organization. Science, 340(6136):1090–1093, 2013.
[10] Harold Hotelling. Analysis of a complex of statistical variables into principal components. Journal of educational psychology, 24(6):417, 1933.
[11] Laurens Van der Maaten and Geoffrey Hinton. Visualizing data using t-sne. JMLR, 9(11), 2008.
[12] Nicola Pezzotti, Thomas Höllt, B Lelieveldt, Elmar Eisemann, and Anna Vilanova. Hierarchical stochastic neighbor embedding. In Computer Graphics Forum, volume 35, pages 21–30. Wiley Online Library, 2016.
[13] Sungtae An, Shenda Hong, and Jimeng Sun. Viva: semi-supervised visualization via variational autoencoders. In ICDM, pages 22–31. IEEE, 2020.
[14] **yin Chen, Xueke Wang, and Xuanheng Xu. Gc-lstm: Graph convolution embedded lstm for dynamic network link prediction. Applied Intelligence, pages 1–16, 2022.
[15] Aravind Sankar, Yanhong Wu, Liang Gou, Wei Zhang, and Hao Yang. Dysat: Deep neural representation learning on dynamic graphs via self-attention networks. In WSDM, pages 519–527, 2020.
[16] Paul Jaccard. The distribution of the flora in the alpine zone. 1. New phytologist, 11(2):37–50, 1912.
[17] William Webber, Alistair Moffat, and Justin Zobel. A similarity measure for indefinite rankings. TOIS, 28(4):1–38, 2010.
[18] Sejoon Oh, Berk Ustun, Julian McAuley, and Srijan Kumar. Rank list sensitivity of recommender systems to interaction perturbations. In CIKM, pages 1584–1594, 2022.
[19] Corrado Monti, Giuseppe Manco, Cigdem Aslay, and Francesco Bonchi. Learning ideological embeddings from information cascades. In CIKM, pages 1325–1334, 2021.
[20] Michela Del Vicario, Gianna Vivaldo, Alessandro Bessi, Fabiana Zollo, Antonio Scala, Guido Caldarelli, and Walter Quattrociocchi. Echo chambers: Emotional contagion and group polarization on facebook. Scientific reports, 6(1):37825, 2016.
[21] Matteo Cinelli, Gianmarco De Francisci Morales, Alessandro Galeazzi, Walter Quattrociocchi, and Michele Starnini. The echo chamber effect on social media. PNAS, 118(9):e2023301118, 2021.
[22] James Varney. Prominent pro-sanders subreddit wayofthebern aims to divide democrats, says social media analyst. The Washington Times, February 2019.
[23] Redditpedia Wiki. Subreddit statistics of user overlap, 2023.
[24] Marcus Mann, Diana Zulli, Jeremy Foote, Emily Ku, and Emily Primm. Unsorted significance: Examining potential pathways to extreme political beliefs and communities on reddit. Socius, 9:23780231231174823, 2023.
[25] Craig Timberg and Elizabeth Dwoskin. Reddit closes long-running forum supporting president trump after years of policy violations. The Washington Post, 2020.
[26] Yuri Lin, Jean-Baptiste Michel, Erez Aiden Lieberman, Jon Orwant, Will Brockman, and Slav Petrov. Syntactic annotations for the google books ngram corpus. In ACL, pages 169–174, 2012.
[27] Patrick Moore. Beyond shame: Reclaiming the abandoned history of radical gay sexuality. Beacon Press, 2004.
[28] Stephan Cohen. The Gay Liberation Youth Movement in New York: ‘An Army of Lovers Cannot Fail’. Routledge, 2007.
[29] Adam Jatowt and Kevin Duh. A framework for analyzing semantic change of words across time. In JCDL, pages 229–238. IEEE, 2014.
[30] Michael Anthony Lusby. Ghent gayland: A case study of the gay and lesbian community and media of norfolk, virginia. Master’s thesis, College of William & Mary, 2011.
[31] Lauren B McInroy and Shelley L Craig. Perspectives of lgbtq emerging adults on the depiction and impact of lgbtq media representation. Journal of youth studies, 20(1):32–46, 2017.
[32] Kevin L Nadal, Chassitty N Whitman, Lindsey S Davis, Tanya Erazo, and Kristin C Davidoff. Microaggressions toward lesbian, gay, bisexual, transgender, queer, and genderqueer people: A review of the literature. The journal of sex research, 53(4-5):488–508, 2016.
[33] UN Comtrade. The united nations commodity trade statistics database. https://comtrade.un.org/, 2010.
[34] Worldometer. Gdp by country (2017), 2023.
[35] IMF. Country composition of weo groups, 2023.
[36] John William Longworth, Colin G Brown, and Scott A Waldron. Beef in china: agribusiness opportunities and challenges. The China Journal, 2001.
[37] Justin Yifu Lin, Fang Cai, and Zhou Li. The China miracle: Development strategy and economic reform (Revised Edition). The Chinese University of Hong Kong Press, 2004.
[38] Thayer Watkins. Japan’s bubble economy, 1999.
[39] Kunio Okina, Masaaki Shirakawa, and Shigenori Shiratsuka. The asset price bubble and monetary policy: Japan’s experience in the late 1980s and the lessons. Monetary and Economic Studies (special edition), 19(2):395–450, 2001.
[40] Steven Radelet, Jeffrey D Sachs, Richard N Cooper, and Barry P Bosworth. The east asian financial crisis: diagnosis, remedies, prospects. Brookings papers on Economic activity, 1998(1):1–90, 1998.
[41] Leman Akoglu, Hanghang Tong, and Danai Koutra. Graph based anomaly detection and description: a survey. TKDE, 29:626–688, 2015.
[42] Qi Li, Khalique Newaz, and Tijana Milenković. Improved supervised prediction of aging-related genes via weighted dynamic network analysis. BMC bioinformatics, 22(1):1–26, 2021.
[43] Robi Tacutu, Thomas Craig, Arie Budovsky, Daniel Wuttke, Gilad Lehmann, Dmitri Taranukha, Joana Costa, Vadim E Fraifeld, and Joao Pedro De Magalhaes. Human ageing genomic resources: integrated databases and tools for the biology and genetics of ageing. Nucleic acids research, 41(D1):D1027–D1033, 2012.
[44] Bryan Klimt and Yiming Yang. Introducing the enron corpus. In CEAS, 2004.
[45] Benedek Rozemberczki, Paul Scherer, Oliver Kiss, Rik Sarkar, and Tamas Ferenci. Chickenpox cases in hungary: a benchmark dataset for spatiotemporal signal processing with graph neural networks. arXiv preprint arXiv:2102.08100, 2021.
[46] Khalique Newaz and Tijana Milenković. Inference of a dynamic aging-related biological subnetwork via network propagation. TCBB, 19(2):974–988, 2020.
[47] Terrence Szymanski. Temporal word analogies: Identifying lexical replacement with diachronic word embeddings. In ACL, pages 448–453, 2017.
[48] Anant Dadu, Vipul K Satone, Rachneet Kaur, Mathew J Koretsky, Hirotaka Iwaki, Yue A Qi, Daniel M Ramos, Brian Avants, Jacob Hesterman, Roger Gunn, et al. Application of aligned-umap to longitudinal biomedical studies. Patterns, 4(6), 2023.
[49] Leland McInnes, John Healy, Nathaniel Saul, and Lukas Großberger. Umap: Uniform manifold approximation and projection. JOSS, 3(29), 2018.
[50] Sam T Roweis and Lawrence K Saul. Nonlinear dimensionality reduction by locally linear embedding. science, 290(5500):2323–2326, 2000.
[51] Joshua B Tenenbaum, Vin de Silva, and John C Langford. A global geometric framework for nonlinear dimensionality reduction. science, 290(5500):2319–2323, 2000.
[52] C Spearman. The proof and measurement of association between two things. The American Journal of Psychology, 15(1):72–101, 1904.
[53] Maurice G Kendall. A new measure of rank correlation. Biometrika, 30(1/2):81–93, 1938.
[54] Yiqiao **, Qinlin Zhao, Yiyang Wang, Hao Chen, Kaijie Zhu, Yijia Xiao, and **dong Wang. Agentreview: Exploring peer review dynamics with llm agents. arXiv:2406.12708, 2024.
[55] Utkarsh Mahadeo Khaire and R. Dhanalakshmi. Stability of feature selection algorithm: A review. Journal of King Saud University - Computer and Information Sciences, 34(4):1060–1073, 2022.
[56] Petar Veličković, Guillem Cucurull, Arantxa Casanova, Adriana Romero, Pietro Liò, and Yoshua Bengio. Graph attention networks. In ICLR, 2018.
[57] Yiqiao **, Yunsheng Bai, Yanqiao Zhu, Yizhou Sun, and Wei Wang. Code recommendation for open source software developers. In Web Conference, 2023.
[58] Srijan Kumar, Xikun Zhang, and Jure Leskovec. Predicting dynamic embedding trajectory in temporal interaction networks. In KDD, pages 1269–1278, 2019.
[59] Yiqiao **, Xiting Wang, Ruichao Yang, Yizhou Sun, Wei Wang, Hao Liao, and Xing Xie. Towards fine-grained reasoning for fake news detection. In AAAI, volume 36, pages 5746–5754, 2022.
[60] Ruichao Yang, Xiting Wang, Yiqiao **, Chaozhuo Li, Jianxun Lian, and Xing Xie. Reinforcement subgraph reasoning for fake news detection. In KDD, pages 2253–2262, 2022.
[61] Benedek Rozemberczki, Paul Scherer, Yixuan He, George Panagopoulos, Alexander Riedel, Maria Astefanoaei, Oliver Kiss, Ferenc Beres, Guzmán López, Nicolas Collignon, et al. Pytorch geometric temporal: Spatiotemporal signal processing with neural machine learning models. In CIKM, pages 4564–4573, 2021.
[62] Warren S Torgerson. Multidimensional scaling: I. theory and method. Psychometrika, 17(4):401–419, 1952.
[63] Seongmin Lee, Sadia Afroz, Haekyu Park, Zijie J Wang, Omar Shaikh, Vibhor Sehqal, Ankit Peshin, and Duen Horng Chau. Explaining website reliability by visualizing hyperlink connectivity. In 2022 IEEE Visualization and Visual Analytics (VIS), pages 26–30. IEEE, 2022.
[64] Kevin Li, Haoyang Yang, Evan Montoya, Anish Upadhayay, Zhiyan Zhou, Jon Saad-Falcon, and Duen Horng Chau. Visual exploration of literature with argo scholar. In CIKM, pages 4912–4916, 2022.
[65] Victor Chomel, Nathanaël Cuvelle-Magar, Maziyar Panahi, and David Chavalarias. Polarization identification on multiple timescale using representation learning on temporal graphs in eulerian description. In NeurIPS 2022 Temporal Graph Learning Workshop, 2022.
[66] Jitesh Shetty and Jafar Adibi. Discovering important nodes through graph entropy the case of enron email database. In Proceedings of the 3rd international workshop on Link discovery, pages 74–81, 2005.
[67] Matthew W Seeger and Robert R Ulmer. Explaining enron: Communication and responsible leadership. Management Communication Quarterly, 17(1):58–84, 2003.
[68] Cees BM Van Riel and Charles J Fombrun. Essentials of corporate communication: Implementing practices for effective reputation management. Routledge, 2007.
[69] JoAnne Yates. Control through communication: The rise of system in American management, volume 6. JHU Press, 1993.
[70] Linjuan Rita Men. Strategic internal communication: Transformational leadership, communication channels, and employee satisfaction. Management communication quarterly, 28(2):264–284, 2014.
[71] Zoltán Kovács, Zsolt Jenő Farkas, Tamás Egedy, Attila Csaba Kondor, Balázs Szabó, József Lennert, Dorián Baka, and Balázs Kohán. Urban sprawl and land conversion in post-socialist cities: The case of metropolitan budapest. Cities, 92:71–81, 2019.
[72] Wadie Skaf, Arzu Tosayeva, and Dániel T Várkonyi. Towards automatic forecasting: Evaluation of time-series forecasting models for chickenpox cases estimation in hungary. In ISDA, pages 1–10. Springer, 2022.

A Dataset Introduction

Datasets.

We used 9 publicly available datasets spanning 8 different domains to demonstrate DyGETViz’s wide applicability across all of these subject areas. Table S1 provides the statistics of the nine datasets.

• Reddit [1] encompasses YouTube videos shared across 29,461 subreddits over a five-year period, from January 2018 to December 2022. The dataset forms a bipartite graph with each node representing a video or a subreddit. Each edge in the graph indicates a video being shared in a subreddit, and its weight is determined by the frequency of sharing.
• Enron [44] includes the email communication history of Enron Corporation from June 1999 to December 2001. Each node represents an employee and each edge represents an email between them.
• UN Comtrade (United Nations Comtrade database, https://comtradeplus.un.org/) [33] offers extensive global annual trade statistics. Our analysis focuses on the annual export data from 1988 to 2022. Nodes represent countries and edges represent the logarithmic values of the annual export volumes between countries.
• HistWords-EN [2] is derived from the diachronic word embeddings trained using SGNS (Skip-Gram with Negative Sampling) on the Google N-Gram dataset [26], whose corpus consists of English documents from the 1800s to the 1990s. Each node represents a word, and each edge represents word similarity. The detailed dataset construction process is described in Section 2.2.
• HistWords-CN [2] is trained in the same manner as HistWords-EN using SGNS vectors of Chinese words from the Google N-Gram dataset over the period from the 1950s to the 1990s.
• Chickenpox [45] features the weekly chickenpox cases in Hungary between January 2005 and January 2015. Nodes represent the counties, and edges are constructed based on geographical locations: an edge exists between two counties if they are adjacent. The training objective is to predict the number of weekly cases in each county.
• Ant [9] features ant behaviors over a 41-day period. Nodes represent ants, and edges represent interactions between two ants.
• DGraph [4] is a finance dataset for fraudster detection. Nodes represent Finvolution users, which fall under 3 categories: normal users, fraudsters, and background users (users who are not detection targets due to insufficient borrowing behaviors). An edge from one user to another means that the first user lists the other as an emergency contact. We randomly sampled a subgraph with 100,000 nodes.
• Aging [42] provides the human gene expression data at 37 ages spanning between 20 and 99 years. For each age, an aging-specific graph snapshot is constructed, in which nodes represent genes and edges represent interactions between genes. The edge weight represents the strength of the protein-protein interactions (PPIs) between two genes [46].

Table S1: Statistics of our datasets. “Interval” indicates the time interval for each snapshot. ‘\’ indicates that the snapshot interval is not constant.

Datasets	Domains	#Nodes	#Edges	#Snapshots	Interval
Reddit [1]	Social Studies	4,303,032	27,836,000	60	1 month
DGraph [4]	Finance	100,000	119,352	17	1 week
HistWords-EN [2]	Linguistics	100,000	14,539,140	20	10 years
HistWords-CN [2]	Linguistics	29,701	763,100	5	10 years
Aging [42]	Genetics	8,938	71,800	37	\
Enron [44]	Communication Studies	143	22,784	16	2 months
Ant [9]	Ethology	113	111,578	41	1 day
UN Comtrade [33]	International Relations	107	162,322	35	1 year
Chickenpox [45]	Epidemiology	20	102	517	1 week

B Method

In this section, we introduce our novel and computationally efficient framework DyGETViz for visualizing and analyzing dynamic graph embedding trajectories. Our framework effectively addresses the challenges associated with DGs mentioned in Section 1, including continuously evolving node embeddings and constant node addition and deletion. Figure S1 and Algorithm S1 describe the workflow and the pseudocode of DyGETViz, respectively.

B.1 Embedding Training

Given the sequence of graph snapshots $\{G^{t}\}$, DyGETViz first learns a DTDG model using the joint training objective $\mathcal{L}$, which is a linear combination of the link prediction loss $\mathcal{L}_{\text{link}}^{t}$, node-level loss $\mathcal{L}_{\text{node}}^{t}$, and edge-level loss $\mathcal{L}_{\text{edge}}^{t}$:

	$\displaystyle\mathcal{L}=\sum_{t\in[1,T]}\lambda_{1}\mathcal{L}_{\text{link}}^{t}+\lambda_{2}\mathcal{L}_{\text{node}}^{t}+\lambda_{3}\mathcal{L}_{\text{edge}}^{t}.$		(1)

Here, $\mathcal{L}_{\text{node}}^{t}$ (resp. $\mathcal{L}_{\text{edge}}^{t}$ ) can be defined as the mean squared error or cross-entropy loss between the predicted node (resp. edge) attributes and the ground-truth, depending on the problem formulation (e.g., linear regression or node/edge classification). $\lambda_{1},\lambda_{2},\lambda_{3}\in\mathbb{R}$ denote hyperparameters that control the weights of each loss term. This process generates temporal node embeddings $\{\mathbf{V}^{t}\}_{t=1}^{T}$ across $T$ timestamps (Line 3).
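As a rough illustration of Eq. (1), the sketch below combines per-snapshot loss values into the joint objective. It assumes that the sum over $t$ covers all three weighted terms and that the per-snapshot losses have already been computed by some DTDG model; the function names `joint_loss` and `mse` are hypothetical and are not part of the DyGETViz package.

```python
def mse(pred, target):
    """Mean squared error, one possible choice for the node- or edge-level loss."""
    return sum((p - y) ** 2 for p, y in zip(pred, target)) / len(pred)


def joint_loss(link_losses, node_losses, edge_losses, lam1=1.0, lam2=1.0, lam3=1.0):
    """Weighted sum of per-snapshot losses over all T snapshots (Eq. 1).

    Each argument is a list with one scalar loss per snapshot t = 1..T.
    lam1, lam2, lam3 play the role of the hyperparameters lambda_1..lambda_3.
    """
    if not (len(link_losses) == len(node_losses) == len(edge_losses)):
        raise ValueError("each loss list needs exactly one entry per snapshot")
    return sum(
        lam1 * l_link + lam2 * l_node + lam3 * l_edge
        for l_link, l_node, l_edge in zip(link_losses, node_losses, edge_losses)
    )


# Two snapshots, with the edge-level term switched off via lam3 = 0.
total = joint_loss([0.5, 0.25], [1.0, 2.0], [0.0, 0.0], lam1=1.0, lam2=0.5, lam3=0.0)
```

Setting a weight to zero simply drops the corresponding task, which is one way to handle datasets that lack node-level or edge-level labels.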

B.2 Embedding Visualization

Algorithm S1 DTDG embedding visualization.

$\{\mathbf{V}^{t}\}$ is the set of temporal embedding matrices, where $\mathbf{V}^{t}\in\mathbb{R}^{|V^{t}|\times d}$. $\mathbf{X}\in\mathbb{R}^{|V^{\prime}|\times d}$ is the static embedding matrix for the anchor nodes. $\operatorname{sim}(\cdot):\mathbb{R}^{d\times d}\rightarrow\mathbb{R}$ is a similarity measure. $\mathcal{N}(i,k,t)$ denotes the $k$ nearest neighbors of $v_{i}$ at time $t$ in the embedding space. $\operatorname{Agg}(\cdot)$ is an aggregation function. $\alpha$ is an interpolation factor. $\mathbf{Z}=\{\mathbf{z}_{i}\}_{i=1}^{|V^{\prime}|}\in\mathbb{R}^{|V^{\prime}|\times p}$ is the $p$-dimensional projection of nodes $v_{i}\in V^{\prime}$.

1: Input: $\{\mathcal{G}^{t}\}$
2: Output: Dynamic Graph Visualization $\mathcal{P}$
3: Train a DTDG model using objective $\mathcal{L}$ and derive $\{\mathbf{V}^{t}\}$ $\triangleright$ Discrete-Time Dynamic Graph Model Training
4: Compute $\mathbf{X}$ for $v_{i}\in V^{\prime}$
5: $\mathbf{Z}=f(\mathbf{X})$ $\triangleright$ Compute $p$-dimensional projection of $V^{\prime}$
6: Create $\mathcal{P}$ and project $\mathbf{Z}=\{\mathbf{z}_{i}\}_{i=1}^{|V^{\prime}|}$ for $v_{i}\in V^{\prime}$ onto $\mathcal{P}$
7: for $t\leftarrow 1,\ldots,T$ $\triangleright$ Cross-Time Alignment
8:   for $v_{i}\in V^{t}$
9:     for $v_{j}\in V^{\prime}\setminus\{v_{i}\}$
10:       $s_{ij}^{t}\leftarrow\operatorname{sim}(\mathbf{v}_{i}^{t},\mathbf{v}_{j}^{t})$ $\triangleright$ Embedding Similarity
11:     end for
12:     Compute $\mathcal{N}(i,k,t)$ according to $\{s_{ij}^{t}\}$
13:     $\mathbf{\hat{z}}_{i}^{t}\leftarrow\operatorname{Agg}(\{\mathbf{z}_{j}|v_{j}\in\mathcal{N}(i,k,t)\})$ $\triangleright$ Aggregation
14:     $\mathbf{z}_{i}^{t}=\begin{cases}\alpha\cdot\mathbf{z}_{i}+(1-\alpha)\cdot\mathbf{\hat{z}}_{i}^{t}&\text{if\ }v_{i}\in V^{\prime}\\ \mathbf{\hat{z}}_{i}^{t}&\text{otherwise}\end{cases}$ $\triangleright$ Interpolation
15:     Project $\mathbf{z}_{i}^{t}$ onto $\mathcal{P}$
16:   end for
17: end for

A major challenge in embedding trajectories visualization is cross-time alignment, as the DTDG embeddings from different snapshots reside in distinct embedding spaces and are not directly comparable with each other [47]. To address this challenge, we construct a uniform reference frame for the embedding projection of all snapshots using carefully selected anchor nodes $V^{\prime}$ . The anchor nodes are selected from the set of nodes present in $V^{0}$ to ensure meaningful cosine similarity computation in each snapshot. $\mathbf{X}$ , the node embeddings of $V^{\prime}$ , can be derived from a subset of any temporal embedding $\mathbf{V}^{t}$ trained on time $t$ (Line 4). DyGETViz is based on the assumption that the embeddings of nodes in $V^{\prime}$ do not undergo significant changes over time [48].

We then employ the projection function $f(\cdot)$ to derive the $p$ -dimensional representations $\mathbf{Z}$ (Line 5). The choice of $f(\cdot)$ provides flexibility, allowing various projection algorithms that preserve the node-node proximity in the embedding space such as Principal Component Analysis (PCA) [10], t-SNE [11], H-SNE [12], UMAP [49], locally linear embedding (LLE) [50], and Isomap [51] to be employed. This initial projection serves as a steady topological foundation that ensures consistency across all timestamps. As DyGETViz progresses through each timestamp $t$ , it updates the visual representations of each node $v_{i}\in V^{t}$ in $G^{t}$ , considering its new positions (Lines 8-11). To this end, we identify $v_{i}$ ’s $k$ nearest anchor nodes $v_{j}\in V^{\prime}$ based on the similarity between the temporal embeddings of $v_{i}$ and $v_{j}$ (Lines 10, 12). We then aggregate the visual representations of the neighboring anchor nodes $v_{j}$ to determine the new position for $v_{i}$ (Line 13). This method efficiently aligns nodes across different timestamps and allows for the inference of new nodes by aggregating information from anchor nodes. Therefore, DyGETViz can seamlessly incorporate new nodes into the visualization space, such as newly formed COVID-related online communities on social platforms during the COVID-19 pandemic, or the inclusion of new words into a vocabulary in diachronic linguistic analysis. To ensure coherence and smooth transitions between timestamps, the final node projection is obtained by interpolation, combining the aggregated projection $\mathbf{\hat{z}}_{i}^{t}$ with $v_{i}$ ’s static embedding $\mathbf{z}_{i}$ (Line 14).
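The cross-time alignment step (Lines 7-16 of Algorithm S1) can be sketched for a single snapshot as follows. This is a minimal illustration under stated assumptions, not the released implementation: the anchor projections $\mathbf{Z}$ are taken as already computed by some projection $f(\cdot)$, cosine similarity stands in for $\operatorname{sim}(\cdot)$, the mean stands in for $\operatorname{Agg}(\cdot)$, and the names `cosine` and `align_snapshot` are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two vectors (0.0 if either has zero norm)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u > 0 and norm_v > 0 else 0.0


def align_snapshot(emb_t, anchor_proj, k=2, alpha=0.5):
    """Place every node of one snapshot on the shared plane.

    emb_t:       dict node -> temporal embedding v_i^t (anchors included)
    anchor_proj: dict anchor node -> fixed p-dimensional projection z_j
    Returns a dict node -> z_i^t.
    """
    placed = {}
    for i, v_i in emb_t.items():
        # Lines 9-11: similarity of v_i to every anchor except itself.
        sims = [(cosine(v_i, emb_t[j]), j) for j in anchor_proj if j != i and j in emb_t]
        if not sims:
            raise ValueError("no anchor node is available in this snapshot")
        # Line 12: keep the k most similar anchors.
        nearest = [j for _, j in sorted(sims, key=lambda s: s[0], reverse=True)[:k]]
        # Line 13: aggregate the anchors' fixed projections (mean, by assumption).
        dims = len(anchor_proj[nearest[0]])
        z_hat = [sum(anchor_proj[j][c] for j in nearest) / len(nearest) for c in range(dims)]
        # Line 14: anchors are interpolated with their own static projection.
        if i in anchor_proj:
            placed[i] = [alpha * a + (1 - alpha) * b for a, b in zip(anchor_proj[i], z_hat)]
        else:
            placed[i] = z_hat
    return placed


anchor_proj = {"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [-1.0, 0.0]}
emb_t = {"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [-1.0, 0.0], "new": [1.0, 1.0]}
placed = align_snapshot(emb_t, anchor_proj, k=2, alpha=0.5)
```

In this toy snapshot the node `new` is not an anchor, so it is placed purely from its two most similar anchors, which mirrors how newly appearing nodes enter the visualization without retraining the projection.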

B.3 Analytics Module

We employ micro-level and macro-level measures to quantify the structural shifts in both local and global topology.

Measuring Micro-level Changes.

To quantify the micro-level changes in the local topology of each node, we employ two similarity measures: Jaccard index ( $\operatorname{Jaccard}_{n}$ ) [16] and Rank-biased Overlap (RBO) [17, 18]. The Jaccard index quantifies the agreement between the closest $n$ nodes of a given node $i$ at time $(t-1)$ and those at time $t$ in the embedding space. It is calculated as the intersection size between two sets divided by the size of their union.

	$\displaystyle\operatorname{Jaccard}_{n}(i,t)=\frac{|\mathcal{N}(i,n,t-1)\cap\mathcal{N}(i,n,t)|}{|\mathcal{N}(i,n,t-1)\cup\mathcal{N}(i,n,t)|},$		(2)

where $\mathcal{N}(i,n,t)$ indicates the closest $n$ nodes of $v_{i}$ sorted in ascending order based on their distance from $v_{i}$ at time $t$ . The resulting $\mathrm{Jaccard}_{n}$ ranges from 0 to 1 and is agnostic to the ordering of the top- $n$ nodes. A $\mathrm{Jaccard}_{n}$ close to 1 during the period $[t-1,t]$ indicates minimal changes in the node’s local topology in the embedding space.
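A small sketch of the Jaccard computation may help make the definition concrete. It assumes that the similarities of $v_{i}$ to all other nodes are available as a dictionary; the helper names `top_n_neighbors` and `jaccard_n` are hypothetical, and returning 1.0 for two empty neighbor sets is a convention chosen here, not one stated in the paper.

```python
def top_n_neighbors(sim_row, n):
    """Return the n most similar node ids, given a dict node -> similarity to v_i."""
    ranked = sorted(sim_row.items(), key=lambda kv: kv[1], reverse=True)
    return [node for node, _ in ranked[:n]]


def jaccard_n(prev_neighbors, curr_neighbors):
    """Jaccard index between the neighbor sets at time t-1 and time t."""
    a, b = set(prev_neighbors), set(curr_neighbors)
    union = a | b
    # Assumed convention: two empty neighbor sets count as unchanged.
    return len(a & b) / len(union) if union else 1.0


prev = top_n_neighbors({"a": 0.9, "b": 0.8, "c": 0.7, "d": 0.1}, n=3)
curr = top_n_neighbors({"a": 0.2, "b": 0.9, "c": 0.8, "d": 0.7}, n=3)
score = jaccard_n(prev, curr)  # two shared neighbors out of four distinct ones
```

Because the neighbor lists are converted to sets, reordering the same top-$n$ nodes leaves the score unchanged.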

As a complementary measure, Rank-biased Overlap (RBO) considers the absolute ranking of nodes. RBO gradually incorporates lower-ranked nodes while also accounting for the top-ranked ones.

	$\displaystyle\operatorname{RBO}(i,m,t)=(1-p)\sum_{d=1}^{m}p^{d-1}\frac{|\mathcal{N}(i,d,t-1)\cap\mathcal{N}(i,d,t)|}{d},$		(3)

where $m$ represents the maximum depth of the ranked list considered, and $p\in[0,1]$ is the damping factor that determines the weight assigned to the top of the list. A smaller value of $p$ concentrates the weight on the top-ranked nodes, whereas a value closer to 1 spreads the weight more evenly over lower ranks [17]. In our experiments, we set $p$ to 0.9. The RBO metric ranges from 0 to 1, with a higher value indicating greater similarity in the node ordering between the two lists. Intuitively, if a node’s RBO is close to 1 during the period $(t-1)$ to $t$, the node’s global topology in the input DG has undergone minimal changes.
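The following sketch computes a depth-truncated RBO between two ranked neighbor lists. It assumes the standard prefix-agreement form of RBO from [17], in which the overlap at depth $d$ is measured between the top-$d$ prefixes of both lists; the function name `rbo_truncated` is hypothetical.

```python
def rbo_truncated(prev_ranked, curr_ranked, m, p=0.9):
    """Rank-biased overlap of two ranked lists, truncated at depth m.

    prev_ranked, curr_ranked: lists of node ids, most similar first.
    p: weighting parameter in [0, 1]; the weight of depth d is p**(d-1).
    """
    seen_prev, seen_curr = set(), set()
    score = 0.0
    for d in range(1, m + 1):
        # Grow both prefixes by one element (if the list is long enough).
        if d <= len(prev_ranked):
            seen_prev.add(prev_ranked[d - 1])
        if d <= len(curr_ranked):
            seen_curr.add(curr_ranked[d - 1])
        agreement = len(seen_prev & seen_curr) / d  # overlap of the top-d prefixes
        score += p ** (d - 1) * agreement
    return (1 - p) * score


base = ["a", "b", "c"]
swap_bottom = rbo_truncated(base, ["a", "c", "b"], m=3)  # change near the bottom
swap_top = rbo_truncated(base, ["b", "a", "c"], m=3)     # change at the top
```

A swap at the top of the list lowers the score more than a swap near the bottom, which is the top-weighted behavior the measure is chosen for. Note that with a finite depth $m$, two identical lists score $1-p^{m}$ rather than exactly 1 under this truncated form.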

It is worth noting that alternative ranking evaluation measures, such as Spearman’s rank correlation coefficient [52] and Kendall’s tau [53, 54], exist. However, these measures do not explicitly differentiate the importance of the ranks at different positions in the list and are sensitive to small perturbations of rankings, particularly towards the middle of the list [55]. To demonstrate this, Supp. Table S4 shows the distribution of average cosine similarity for all nodes in the four datasets, HistWords-CN, Reddit, Ant, and DGraph. We observe that the cosine similarity usually plateaus in the middle range, suggesting a large number of nodes with highly similar cosine similarity.

Consequently, Spearman’s and Kendall’s measures cannot accurately reflect the extent to which the local neighbors of a node have changed. Moreover, these measures mainly target conjoint rankings [17], where both lists consist of the same set of items, which makes them less suitable when the node sets of adjacent snapshots differ because new nodes are constantly being added. In contrast, RBO and the Jaccard index are more responsive to changes in the top portion of two ranked lists and can be applied to indefinite ranking scenarios. This aligns well with our objectives, as we emphasize the importance of the top-$n$ nodes for assessing changes in the local neighbors of each node in the visualization.

Measuring Macro-level Changes.

To assess the changes in global topology, we introduce a novel metric called Normalized Average Rank Change (NARC), which builds upon the Average Rank Change (ARC) metric:

	$\displaystyle\operatorname{ARC}(i,t)=\frac{1}{N^{t}}\sum_{j=1}^{N^{t}}\|r_{ij}^{t}-r_{ij}^{t-1}\|,$		(4)
	$\displaystyle\operatorname{NARC}=\frac{1}{T}\sum_{t=1}^{T}\frac{1}{N^{t}-1}\sum_{i=1}^{N^{t}}\operatorname{ARC}(i,t),$		(5)

where $N^{t}=|V^{t}\cap V^{t-1}|$ represents the number of nodes jointly present in both time $(t-1)$ and $t$ . $\operatorname{ARC}(i,t)$ measures the changes of a node $i$ ’s nearest neighbors in the period $[t-1,t]$ , where a greater $\operatorname{ARC}(i,t)$ indicates a larger change in $i$ ’s topology. The NARC metric is an aggregated metric across all nodes and timestamps. By normalizing each $\operatorname{ARC}(i,t)$ by a factor of $N^{t}-1$ , we make the NARC metric comparable across datasets with different sizes. The NARC metric provides a comprehensive assessment of the changes in the global topology across all nodes and timestamps, offering valuable insights into the dynamic nature of the evolving network.
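To make Eqs. (4) and (5) concrete, the sketch below computes ARC and NARC from precomputed rank tables. It rests on an assumption that the text does not spell out: $r_{ij}^{t}$ is taken to be the rank position of node $v_{j}$ when all nodes are sorted by similarity to $v_{i}$ at time $t$, with $v_{i}$ itself at rank 0. The outer average is taken over the consecutive snapshot pairs that share at least two nodes. The function names `arc` and `narc` are hypothetical.

```python
def arc(rank_prev, rank_curr, shared):
    """Average absolute rank change of node i's neighbors over the shared nodes (Eq. 4)."""
    return sum(abs(rank_curr[j] - rank_prev[j]) for j in shared) / len(shared)


def narc(rank_tables):
    """Normalized average rank change across consecutive snapshots (Eq. 5).

    rank_tables: list over time; each entry is a dict i -> (dict j -> rank of j
    relative to i). Ranks are assumed to be defined for every shared node.
    """
    per_step = []
    for prev, curr in zip(rank_tables, rank_tables[1:]):
        shared = prev.keys() & curr.keys()  # nodes present at both t-1 and t
        n_t = len(shared)
        if n_t < 2:
            continue  # normalization by (N^t - 1) is undefined for fewer than 2 nodes
        total = sum(arc(prev[i], curr[i], shared) for i in shared)
        per_step.append(total / (n_t - 1))
    return sum(per_step) / len(per_step) if per_step else 0.0


t0 = {
    "a": {"a": 0, "b": 1, "c": 2},
    "b": {"b": 0, "a": 1, "c": 2},
    "c": {"c": 0, "a": 1, "b": 2},
}
# At the next snapshot only node a's view changes: b and c swap ranks.
t1 = {
    "a": {"a": 0, "b": 2, "c": 1},
    "b": {"b": 0, "a": 1, "c": 2},
    "c": {"c": 0, "a": 1, "b": 2},
}
value = narc([t0, t1])
```

In practice the rank tables would be recomputed on the shared node set of each snapshot pair, so that ranks at $(t-1)$ and $t$ refer to the same candidate nodes.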

To measure the absolute movements of node embeddings in the embedding space over time, we use the L1 and L2 distances between the embeddings of each node $v_{i}$ in adjacent timestamps:

	$\displaystyle\operatorname{L}_{p}=\frac{1}{T-1}\sum_{t=1}^{T-1}\frac{1}{N^{t}}\sum_{i=1}^{N^{t}}\|\mathbf{h}_{i}^{t}-\mathbf{h}_{i}^{t-1}\|_{p},\quad p\in\{1,2\},$		(6)

where $\mathbf{h}_{i}^{t}$ is an embedding at time $t$. Here, we consider $\mathbf{h}_{i}^{t}$ to be one of $\mathbf{v}_{i}^{t}$, $\mathbf{\tilde{v}}_{i}^{t}$, and $\mathbf{z}_{i}^{t}$, where $\mathbf{v}_{i}^{t}$ is the original embedding, $\mathbf{\tilde{v}}_{i}^{t}=\mathbf{v}_{i}^{t}/\|\mathbf{v}_{i}^{t}\|$ is the normalized embedding, and $\mathbf{z}_{i}^{t}$ is the projected embedding.
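The movement measure in Eq. (6) can be sketched as below. The code assumes that each snapshot is given as a dictionary from node id to embedding vector, and that the per-snapshot mean is taken over the nodes present in both adjacent snapshots; `lp_distance`, `normalize`, and `mean_movement` are hypothetical helper names.

```python
import math


def lp_distance(u, v, p):
    """L1 or L2 distance between two vectors."""
    if p == 1:
        return sum(abs(a - b) for a, b in zip(u, v))
    if p == 2:
        return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))
    raise ValueError("p must be 1 or 2")


def normalize(v):
    """Scale a vector to unit L2 norm (returned unchanged if its norm is zero)."""
    norm = math.sqrt(sum(a * a for a in v))
    return [a / norm for a in v] if norm > 0 else list(v)


def mean_movement(snapshots, p=2):
    """Average displacement of shared nodes between adjacent snapshots."""
    per_step = []
    for prev, curr in zip(snapshots, snapshots[1:]):
        shared = prev.keys() & curr.keys()
        if not shared:
            continue
        step = sum(lp_distance(curr[i], prev[i], p) for i in shared) / len(shared)
        per_step.append(step)
    return sum(per_step) / len(per_step) if per_step else 0.0


snapshots = [
    {"a": [0.0, 0.0], "b": [1.0, 1.0]},
    {"a": [3.0, 4.0], "b": [1.0, 1.0], "c": [9.0, 9.0]},  # c is new and is ignored
]
l2 = mean_movement(snapshots, p=2)
l1 = mean_movement(snapshots, p=1)
```

Passing raw, normalized (via `normalize`), or projected vectors to `mean_movement` corresponds to the three choices of $\mathbf{h}_{i}^{t}$ discussed above.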

Finally, we extend the RBO metric to a macro-level version, which is called macro-level RBO, as follows:

	$\displaystyle\operatorname{RBO}_{\text{macro}}(m)=\frac{1}{T}\sum_{t=1}^{T}\frac{1}{N^{t}}\sum_{i=1}^{N^{t}}\operatorname{RBO}(i,m,t).$		(7)

C Related Works

C.1 Graph Neural Network

Graph neural networks [56] have emerged as a powerful framework for modeling complex relationships in graph-structured data. In particular, dynamic graph models, which capture temporal dynamics in evolving systems, have been successfully applied in analyzing various domains such as communication networks [15], transaction networks [57], social networks [58, 1, 59, 60], disease control [61], and international trade [3].

C.2 Visualization

Visualization is a popular approach for model analytics due to its user-friendly and intuitive nature, which allows researchers and analysts to easily comprehend complex temporal relationships. Techniques such as Principal Component Analysis (PCA) [10], t-Distributed Stochastic Neighbor Embedding (t-SNE) [11], Multidimensional Scaling (MDS) [62], and Uniform Manifold Approximation and Projection (UMAP) [49] have been widely used to represent high-dimensional data in a lower-dimensional space while preserving the structural relationships of the original data. Despite these advancements, there is still a need for visualization techniques that can effectively capture and represent the dynamics of evolving graph data over an extended period of time. Although researchers have explored visualization techniques for graphs, existing works usually focus on static graphs [63, 64] or consecutive graph snapshots [65], which limits their ability to showcase the trajectory of node embeddings over time. This limitation hinders the comprehensive understanding of how nodes evolve and interact within the graph structure.

Table S2: Notations used in this paper

Notation	Description
$G$	A static graph
$G^{t}$	A graph snapshot at timestamp $t$
$V,E$	Sets of nodes and edges
$V^{t},E^{t}$	Sets of nodes and edges in each snapshot $G^{t}$
$V^{\prime}$	Set of anchor nodes
$v_{i}$	A node $i$
$\mathcal{N}(i,k,t)$	$v_{i}$’s list of $k$ nearest neighbors in $\mathbf{V}^{t}$, sorted in descending order of similarity
$\mathbf{v}_{i}^{t}$	Temporal node embedding of $v_{i}$ at timestamp $t$
$\mathbf{V}^{t}$	Temporal node embeddings for $V^{t}$ at timestamp $t$
$\mathbf{X}$	Anchor embeddings
$\mathcal{L}$	Training objective
$\alpha,\lambda_{1},\lambda_{2},\lambda_{3}$	Hyperparameters

Table S3: Evolution of word associations from the 1950s to the 1990s. The term “gay” initially carried connotations of happiness and fortune, but its positivity declined during the 1970s as its association with homosexuality became more prevalent.

Word	1950	1960	1970	1980	1990
happy	happier, fortunate, glad, lucky, delighted	happier, pleasant, lucky, loved, delighted	glad, happier, fortunate, longed, delighted	glad, delighted, happier, pleasant, fortunate	glad, happier, delighted, eager, lucky
delighted	glad, surprised, astonished, pleased, gratified	gratified, surprised, astonished, glad, amused	glad, surprised, astonished, gratified, pleased	glad, surprised, happy, amused, astonished	surprised, glad, pleased, happy, astonished
gay	charming, lovely, beautiful, elegant, bright	elegant, charming, cheerful, lovely, witty	charming, cheerful, clubs, forlorn, ugly	boys, clubs, lovers, charming, men	men, victims, violence, lesbian, bisexual
homosexual	sex, intimacy, prostitution, males, females	sex, cruelties, immoral, notorious, scandalous	sex, unmarried, gender, adultery, immoral	women, unmarried, gay, immoral, sex	gay, males, immoral, illegal, sex
lesbian	\	\	vehemently, clubs, gang, gay, dance	feminist, gay, sexuality, identities, women	gay, women, female, victims, violence

Table S4: Average $\operatorname{Jaccard}_{100}$ of subreddits in each year.

Subreddit	2018	2019	2020	2021	2022
videos	0.118	0.211	0.206	0.258	0.256
YouTubeSubscribeBoost	0.244	0.389	0.296	0.295	0.316
YouTube_startups	0.233	0.339	0.289	0.341	0.312
AdvertiseYourVideos	0.287	0.349	0.292	0.313	0.332
SmallYoutubers	0.267	0.356	0.304	0.306	0.354
GetMoreViewsYT	0.284	0.311	0.269	0.280	0.305
gaming	0.263	0.280	0.285	0.272	0.249
videogames	0.303	0.275	0.324	0.322	0.292
pcgaming	0.258	0.236	0.277	0.266	0.223
YouTubeGamers	0.279	0.364	0.291	0.343	0.397
PromoteGamingVideos	0.317	0.423	0.336	0.383	0.407
gamingvids	0.318	0.329	0.295	0.296	0.300
GlobalOffensive	0.071	0.111	0.162	0.093	0.080
apexlegends	N/A	0.114	0.110	0.099	0.073
leagueoflegends	0.069	0.044	0.057	0.062	0.038
Minecraft	0.040	0.115	0.136	0.143	0.072

Subreddit	2018	2019	2020	2021	2022
kpop	0.066	0.082	0.129	0.164	0.151
popheads	0.170	0.160	0.181	0.221	0.195
indieheads	0.221	0.258	0.263	0.278	0.274
Music	0.187	0.224	0.291	0.339	0.331
hiphopheads	0.232	0.296	0.308	0.333	0.276
listentothis	0.234	0.325	0.359	0.374	0.338
hiphop	0.266	0.363	0.424	0.442	0.312
rap	0.282	0.413	0.421	0.417	0.296
sports	0.090	0.111	0.101	0.101	0.084
nba	0.121	0.109	0.089	0.082	0.076
nfl	0.077	0.112	0.061	0.049	0.050
MMA	0.103	0.086	0.090	0.088	0.070
SquaredCircle	0.065	0.051	0.058	0.085	0.086
politics	0.384	0.378	0.423	0.313	0.290
The_Donald	0.331	0.375	0.188	N/A	N/A
WayOfTheBern	0.385	0.405	0.447	0.441	0.308

D Additional Experimental Results

D.1 Enron: Email Communication

Enron Corporation, founded by Kenneth Lay in 1985, was a prominent energy company until its notorious collapse due to institutionalized, systematic accounting fraud [44, 66]. In February 2001, Jeffrey Skilling became Enron’s CEO, initiating a period characterized by aggressive and intricate accounting practices [67]. Skilling resigned in August 2001, a few months before Enron’s downfall. Studies show that Enron’s collapse can be attributed to a failure of responsible communication [67]. Lay and Skilling were only partially aware of the financial misconduct of their subordinates. In this study, we investigate the email communication network from June 1999 to December 2001 to shed light on the internal communication patterns that contributed to Enron’s failure. Our visualization highlights in particular the trajectories of the two CEOs, Kenneth Lay and Jeffrey Skilling.

In Fig. S6b, the trajectories of Lay and Skilling indicate relatively static communication communities, confined within a small range that mainly involves a limited number of vice presidents and managers. These patterns reflect the shortcomings in two-way communication documented in previous studies [66, 67]: the failure to deliver honest, ethical messages to employees and the lack of awareness regarding company operations. Meanwhile, ordinary employees such as Geir Solberg, Kay Mann, and Scott Neal have specific job functions that confine their communication to their respective teams, resulting in relatively fixed communication patterns with a limited number of partners and trajectories with less variability.

In contrast, managers and presidents play crucial roles in facilitating communication between upper management and ordinary employees [68, 69, 70], resulting in more diverse interactions with individuals at different levels within the organization. Previous analyses identified the top three influential nodes in the Enron dataset as Louise Kitchen (President), Mike Grigsby (Manager), and Greg Whalley (President) according to graph entropy [66]. Their trajectories in the visualization are broader and more varied, involving individuals in different positions across different time periods. This reflects their extensive responsibilities and significant roles in managing the organization.

Moreover, as shown in Fig. S6a, the decline in RBO and $\operatorname{Jaccard}_{3}$ across all employees during the CEO transition from Lay to Skilling (January to March 2001) highlights the impact of leadership changes on communication dynamics within the organization. DyGETViz uncovers differences in communication patterns among employees, offering a novel perspective on organizational structure and dynamics that can inform future research and organizational management.
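
To make the RBO values discussed above more concrete, the following minimal sketch computes a truncated rank-biased overlap between two ranked neighbor lists, for example an employee's nearest neighbors in two consecutive snapshots. It assumes the standard rank-biased overlap formulation, evaluated only down to the length of the shorter list. The function name `rbo_truncated`, the persistence parameter default `p=0.9`, and the toy lists are assumptions for illustration and are not taken from the paper or the DyGETViz package.

```python
def rbo_truncated(ranked_a, ranked_b, p=0.9):
    """Truncated rank-biased overlap of two ranked lists without duplicates.

    At each depth d, the agreement is the fraction of items shared by the
    top-d prefixes; agreements are weighted geometrically by p ** (d - 1),
    so differences near the top of the lists matter most.
    """
    depth = min(len(ranked_a), len(ranked_b))
    seen_a, seen_b = set(), set()
    score = 0.0
    for d in range(1, depth + 1):
        seen_a.add(ranked_a[d - 1])
        seen_b.add(ranked_b[d - 1])
        agreement = len(seen_a & seen_b) / d
        score += p ** (d - 1) * agreement
    return (1 - p) * score


# Hypothetical ranked neighbor lists for one node in two snapshots.
before = ["a", "b", "c"]
after = ["b", "a", "d"]
same = rbo_truncated(before, before, p=0.5)
shifted = rbo_truncated(before, after, p=0.5)
```

Because the sum is truncated, two identical lists of length $k$ score $1 - p^{k}$ rather than exactly 1. Unlike $\operatorname{Jaccard}_{k}$, the score is sensitive to reordering within the top-$k$ list, which is why the two metrics can be read together.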

D.2 Dynamic Temporal and Spatial Modeling of Chickenpox Spread in Hungary

In epidemiological forecasting, dynamic graph models can potentially enhance our understanding and prediction of disease spread. By incorporating temporal and spatial dynamics, these models can capture the intricate interplay between population density, geography, and mobility patterns, all of which play critical roles in disease transmission. However, the successful application of DTDG models in epidemiological forecasting relies not only on the accuracy and robustness of the models, but also on our ability to interpret and understand how they operate. Visualization techniques are crucial for this purpose.

In this study, we use the Hungary Chickenpox dataset [45], which includes the weekly chickenpox cases in Hungarian counties and the capital Budapest between 2005 and 2015. From the trajectories in Fig. S7a, we found that the capital city Budapest stands out as the node with the most movement due to its high population. As the second most populous county in the country, Pest exhibits a trajectory that significantly overlaps with that of Budapest at each snapshot, indicating a high seasonality in the number of cases. This overlap can be attributed to factors such as geographical proximity and suburbanization in the metropolitan area of Budapest, which caused considerable population movements between the two regions. According to a census in 2011 [71], nearly 60% of commuters living in the suburban zone of Pest work in Budapest. Such population overlap can facilitate the spread of diseases. Bács-Kiskun, a county bordering Pest, moves towards Pest in winter, especially in mid-December, but away from it in summer, indicating a periodicity between the winter surge and the summer decline [45, 72]. In contrast, counties such as Tolna, Vas, Zala, and Heves, which are among the five least populated counties, form their own clusters with movements limited to the lower left of the plot, indicating a lower susceptibility to diseases due to smaller populations and fewer demographic movements.