Enhancing the Resilience of Graph Neural Networks to Topological Perturbations in Sparse Graphs

Shuqi He1, Jun Zhuang2, Ding Wang1, Luyao Peng1, Jun Song1
1 China University of Geosciences (Wuhan), Wuhan, China
2 Indiana University-Purdue University Indianapolis, Indianapolis, USA
{1202211192, wangding, pengluyao, songjun}@cug.edu.cn, [email protected]
Abstract.

Graph neural networks (GNNs) have been extensively employed in node classification. Nevertheless, recent studies indicate that GNNs are vulnerable to topological perturbations, such as adversarial attacks and edge disruptions. Considerable efforts have been devoted to mitigating these challenges. For example, pioneering Bayesian methodologies, including GraphSS and LInDT, incorporate Bayesian label transitions and topology-based label sampling to strengthen the robustness of GNNs. However, GraphSS is hindered by slow convergence, while LInDT faces challenges in sparse graphs. To overcome these limitations, we propose a novel label inference framework, TraTopo, which combines topology-driven label propagation, Bayesian label transitions, and link analysis via random walks. TraTopo significantly surpasses its predecessors on sparse graphs by utilizing random walk sampling, specifically targeting isolated nodes for link prediction, thus enhancing its effectiveness in topological sampling contexts. Additionally, TraTopo employs a shortest-path strategy to refine link prediction, thereby reducing predictive overhead and improving label inference accuracy. Empirical evaluations highlight TraTopo’s superiority in node classification, significantly exceeding contemporary GCN models in accuracy.

CNN, Bayesian label transition, Random walk, PageRank

1. Introduction

Graph structures, such as attributed graphs (Tian et al., 2023; Zhuang, 2024), knowledge graphs (Liu et al., 2022, 2021a), and factor graphs (Chen et al., 2022a, b, 2023a), play a crucial role across various domains, representing the topological relationships and attribute information between nodes. Node classification is a fundamental task in graph structure learning. In this task, we aim to assign the nodes to the corresponding class.

In recent years, Graph Neural Networks (GNNs) have been widely applied in node classification due to their superior performance on graph representation (Tian et al., 2022b, 2024; Chen et al., 2022c; Wu et al., 2022; Tian et al., 2022a; Hamilton et al., 2017; Veličković et al., 2017; Zhuang and Kennington, 2024). However, recent studies reveal that GNNs may be vulnerable to topological perturbations, which can severely compromise the effectiveness of GNN-based node classification (Liu et al., 2021b; Ding et al., 2024; Zhang et al., 2020). Thus, it is crucial to improve the robustness of GNNs against topological perturbations, such as random perturbations (Zhuang and Al Hasan, 2022b; Hahn-Klimroth et al., 2020) and graph sparsification (Fan et al., 2021; Zhuang and Hasan, 2023).

Numerous studies, such as Bayesian Label Transition (Yang et al., 2022) and label propagation (Cordasco and Gargano, 2012), have been explored to improve the robustness of GNNs. These approaches adeptly utilize supervised data to enhance robustness, yet the effectiveness is circumscribed by the inherent characteristics of local graph structures, which may inhibit the propagation process for unlabeled nodes. GraphSS (Zhuang and Al Hasan, 2022a) endeavors to counteract suboptimal classification outcomes stemming from topological perturbations by refining Graph Neural Network (GNN) predictions through post-processing. This strategy incorporates a Bayesian inference framework to devise a label transition matrix, thereby substituting misjudged labels with more accurate alternatives to ameliorate classification discrepancies. Nonetheless, the adaptation of this technique is hampered by its protracted convergence rate. A novel initiative, LInDT (Zhuang and Al Hasan, 2022c), addresses the challenge of delayed convergence by introducing an innovative label sampling technique, thereby enhancing the method’s scalability across expansive graph structures. Despite these advancements, LInDT’s dependence on the underlying graph topology renders it less effective on sparsely connected graphs, where limited connectivity can severely diminish the success of label propagation.

To address the aforementioned challenges, we introduce a novel mechanism, namely TraTopo, which integrates Random Walk with Restart and PageRank algorithms to augment the robustness of topology-based propagation methodologies. This model is integrated within a Bayesian label transition framework, thus strengthening the resilience of GNNs in node classification tasks. More precisely, TraTopo outperforms its predecessor, LInDT, by employing label propagation to achieve enhanced convergence in scenarios of uncertain Bayesian label sampling. It leverages random walk-based algorithms to navigate the constraints presented by low-degree nodes, while concurrently reducing computational burdens. The mechanism we propose not only enriches node information but also refines label inference capabilities, thereby achieving strong performance across graph datasets under perturbation. In the experiments, we evaluate the performance of TraTopo and comparative models in terms of accuracy and entropy under a range of topological perturbations across three public datasets. In addition, we analyze the sensitivity of various hyper-parameters in TraTopo. Our systematic validation seeks to enhance the robustness and the predictive capabilities of TraTopo in dynamic and diverse structural graph data. Overall, our main contributions are summarized as follows:

  • We propose a new mechanism for node label inference by integrating Bayesian methods with topology-based enhancements, incorporating Random Walk with Restart and PageRank to boost link prediction accuracy.

  • We employ shortest-path-based strategies to streamline random walks, reducing computational overhead and enhancing predictive performance with minimal resource consumption.

  • Extensive experiments demonstrate that our method can outperform leading competing models across benchmark graph datasets, validating the effectiveness in dynamic network environments.

2. RELATED WORK

Node classification is crucial in analyzing graph-structured data for social networks, bioinformatics, and recommendation systems. Advances in this field include Graph Neural Networks (GNNs), adversarial robustness, noisy label management, and algorithms like random walk and PageRank.

2.1. Graph Neural Networks

Graph Neural Networks (GNNs) are essential for analyzing graph-structured data, aiding in areas such as social network analysis, bioinformatics, and recommendation systems. A major challenge is maintaining GNN robustness against accidental or adversarial topology perturbations.

Recent studies have explored black-box adversarial attacks on Graph Neural Networks (GNNs), employing a node voting strategy to identify vulnerable nodes (Wen et al., 2024). Fiorellino et al. (Fiorellino et al., 2024) have developed an advanced GNN variant designed to enhance resilience against channel perturbations. Furthermore, Khalid et al. (Khalid et al., 2024) introduced SleepNet, an innovative sleep prediction model that incorporates attention mechanisms and utilizes dynamic social networks.

2.2. Adversarial Robustness

With the rise of Graph Neural Networks (GNNs), their susceptibility to adversarial tactics has captured academic focus (Zhang et al., 2023; Xu et al., 2024). Research prioritizes bolstering network security through tailored attacks and enhanced defenses. Notably, even minimal, strategic perturbations substantially reduce the efficacy of GNNs, challenging their precision and interpretability.

Zhao et al. (Zhao et al., 2024) employed a Hamiltonian method to enhance GNN resilience against topological disturbances, elevating stability across GNN architectures. Wu et al. (Wu et al., 2023) improved GCN robustness and generalization via weight perturbations, noting that optimizing robust loss directly enhances defenses. Liu et al. (Liu et al., 2024a) introduced wave-induced resonance to boost GNN robustness. Testa et al. (Testa et al., 2024) analyzed GNN stability via slight perturbations. Liu et al. (Liu et al., 2024b) examined the impact of edge perturbations on GNN robustness and vulnerability.

2.3. Noisy Labels

Learning with noisy labels substantially alters training dynamics, potentially reducing model performance (Zhang et al., 2021; Chen et al., 2023b; Tian et al., 2022c). In node classification, structural dependencies in graphs exacerbate inaccuracies, facilitating the spread of incorrect labels through connecting edges.

Zhang et al. (Zhang et al., 2024) devised an advanced LNL algorithm to effectively address noisy labels. Xia et al. (Xia et al., 2023) introduced a GNN-based Cleaner, enhancing robustness against noisy labels in attributed graphs. Self-supervised methods have become pivotal in graph representation learning (Zhai et al., 2023). Zhuang et al. (Zhuang and Hasan, 2022) pioneered the concept of treating noisy labels as intrinsic data properties. Yuan et al. (Yuan et al., 2023) developed a self-supervised framework designed to mitigate the impact of noisy graphs and labels.

2.4. Random Walk and PageRank

Graph Convolutional Networks (GCNs) tackle structural disruptions using advanced random walk and PageRank, enhancing resilience and efficiency across various graph-learning contexts.

Utilizing APPNP’s (Gasteiger et al., 2018) Personalized PageRank and N-GCN’s (Abu-El-Haija et al., 2020) stochastic walks bolsters GCN resilience, streamlining topological coherence and nodal comprehension. Wang et al. (Wang et al., 2023) advocate robustness assessments through graph perturbations, underscoring diffusion and influence maximization’s defensive prowess. Hou et al. (Hou et al., 2023) probe directed graph resilience via BBRW, spotlighting the fortifying influence of targeted pathways, and advancing graph topology understanding.

3. PRELIMINARIES

In this section, we introduce the preliminary background about GNNs and random walks.

3.1. GNN-based Node Classification

In this investigation, we employ Graph Convolutional Networks (GCNs) (Yao et al., 2019) as the foundational node classifier $f_{\theta}$, constructing an undirected, attributed graph $G=(V,E)$ composed of $N$ vertices and corresponding edges. The structure is defined by a symmetric adjacency matrix $A$ and a feature matrix $X$, formally expressed as $A\in\mathbb{R}^{N\times N}$ and $X\in\mathbb{R}^{N\times d}$, respectively.

Graph Convolutional Networks (GCNs) have gained prominence for their capability to perform convolution operations on graph-structured data. The fundamental operation of a GCN can be described by the layer-wise propagation rule:

(1) $H^{(l+1)}=\sigma\left(\tilde{D}^{-1/2}\tilde{A}\tilde{D}^{-1/2}H^{(l)}W^{(l)}\right)$

where $\tilde{A}=A+I$ is the adjacency matrix $A$ of the graph with added self-loops, $\tilde{D}$ is the corresponding degree matrix, $H^{(l)}$ denotes the matrix of activations in the $l$-th layer (with $H^{(0)}=X$), $W^{(l)}$ is the matrix of trainable weights in the $l$-th layer, and $\sigma$ is a nonlinear activation function. This formula captures the essence of GCNs in aggregating features from a node’s local neighborhood, thereby enabling the model to learn powerful representations from graph-structured inputs.
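To make the propagation rule in Eq. (1) concrete, the following is a minimal NumPy sketch of a single GCN layer on a toy graph. The function name `gcn_layer`, the ReLU default for $\sigma$, and the toy inputs are illustrative assumptions, not the implementation used in this paper.

```python
import numpy as np

def gcn_layer(A, H, W, activation=None):
    """One step of Eq. (1): sigma(D~^{-1/2} A~ D~^{-1/2} H W)."""
    if activation is None:
        # ReLU is an assumed choice for sigma; Eq. (1) only requires a nonlinearity.
        activation = lambda x: np.maximum(x, 0.0)
    A = np.asarray(A, dtype=float)
    A_tilde = A + np.eye(A.shape[0])       # adjacency with self-loops
    deg = A_tilde.sum(axis=1)              # degrees of A~ (always >= 1)
    d_inv_sqrt = 1.0 / np.sqrt(deg)
    # Symmetric normalization D~^{-1/2} A~ D~^{-1/2}
    A_hat = d_inv_sqrt[:, None] * A_tilde * d_inv_sqrt[None, :]
    return activation(A_hat @ H @ W)

# Toy example: a 3-node path graph with one-hot features (hypothetical data).
A = np.array([[0, 1, 0],
              [1, 0, 1],
              [0, 1, 0]], dtype=float)
X = np.eye(3)
rng = np.random.default_rng(0)
W0 = rng.normal(size=(3, 2))
H1 = gcn_layer(A, X, W0)                   # activations of the first layer
```

Stacking two such layers, with a softmax on the last one, gives the usual two-layer GCN classifier; training of $W^{(l)}$ is omitted here.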

Utilizing $A$ and $X$ for the task of node classification, we integrate noisy labels as a regularization mechanism aimed at bolstering the model’s resilience against noisy data. Specifically, a subset of nodes, denoted as $\mathcal{U}$ and comprising 10% of the graph’s nodes, is assigned noisy labels $\mathbf{Y}$. These labels are a mixture of manually-annotated labels $\mathbf{Y}_{m}$ and auto-generated labels $\mathbf{Y}_{a}$. This methodology corroborates that an elevation in noise levels can substantially augment the efficacy of the regularization process. Furthermore, our objective is to align the inferred labels $\hat{\mathbf{Z}}$ as closely as possible with the latent labels $\mathbf{Z}$, thus ensuring robust node classification within noisy environments. This approach demonstrates the feasibility of leveraging graph-structured data in complex labeling landscapes and examines how regularization techniques can enhance the model’s ability to adapt to noise and improve its overall performance.

3.2. Random walk algorithm

The Random Walk algorithm is a stochastic graph traversal method that simulates the process of moving randomly within a graph. This algorithm finds extensive applications in graph data, including network analysis, link analysis, and ranking of graph nodes.

3.2.1. Random walk with restart

The Random Walk with Restart (RWR) algorithm (Xia et al., 2019; Li et al., 2015; Tong et al., 2006) refines personalized exploration within network analysis, optimizing the evaluation of node importance and the disclosure of subgraphs. The algorithm operates on a probabilistic mechanism: at each step, the walker returns to the starting node with probability $\alpha$ or advances to a neighbor with probability $1-\alpha$. The transition matrix $P$ underlies the RWR update:

(2) $RWR(v,t+1)=\alpha\cdot RWR(v,t)+(1-\alpha)\sum_{u\in\mathcal{N}(v)}\frac{RWR(u,t)}{\text{deg}(u)}$

Guided by $\alpha$, the RWR algorithm performs a stochastic traversal of the graph.
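The RWR iteration can be sketched as a simple fixed-point loop. The sketch below follows the prose description (return to the starting node with probability $\alpha$, otherwise move to a uniformly chosen neighbor), so the restart term is written as $\alpha$ times an indicator vector of the starting node; this reading of Eq. (2), together with the function name, tolerance, and iteration cap, is an assumption made for illustration.

```python
import numpy as np

def random_walk_with_restart(A, seed, alpha=0.15, tol=1e-10, max_iter=1000):
    """Iterate r <- alpha * e_seed + (1 - alpha) * P r until convergence."""
    A = np.asarray(A, dtype=float)
    n = A.shape[0]
    deg = A.sum(axis=1)
    # 1/deg(u), with 0 for isolated nodes so they spread no probability mass.
    inv_deg = np.divide(1.0, deg, out=np.zeros_like(deg), where=deg > 0)
    # P[v, u] = A[u, v] / deg(u): node u passes its score equally to its neighbors.
    P = (A * inv_deg[:, None]).T
    e = np.zeros(n)
    e[seed] = 1.0                          # restart distribution (starting node)
    r = e.copy()
    for _ in range(max_iter):
        r_next = alpha * e + (1 - alpha) * (P @ r)
        if np.abs(r_next - r).sum() < tol:
            return r_next
        r = r_next
    return r

# Toy example: 3-node path graph 0 - 1 - 2, walker starting at node 0.
A_path = np.array([[0, 1, 0],
                   [1, 0, 1],
                   [0, 1, 0]])
r = random_walk_with_restart(A_path, seed=0, alpha=0.15)
```

The entry `r[u]` can be read as the relevance of node `u` to the starting node; larger values of $\alpha$ keep the walk closer to the start.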

3.2.2. PageRank

PageRank (Xing and Ghorbani, 2004), used by Google, ranks web pages based on link importance. It calculates the rank $PR(A)$ using:

(3) $PR(A)=(1-d)+d\left(\sum_{i=1}^{n}\frac{PR(T_{i})}{C(T_{i})}\right)$

where $d$ (typically 0.85) is the damping factor, $T_{i}$ are the pages linking to $A$, $PR(T_{i})$ is their PageRank, and $C(T_{i})$ is their number of outbound links. This iterative method ranks pages by link structure.
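As an illustration of Eq. (3), the sketch below iterates the update on a small link graph stored as a dictionary. The function name `pagerank`, the fixed iteration count, and the toy graph are assumptions for illustration; the sketch also assumes every page has at least one outbound link and that every link target appears as a key.

```python
def pagerank(out_links, d=0.85, iters=100):
    """Iterate Eq. (3). out_links maps each page to the list of pages it links to."""
    pages = list(out_links)
    # in_links[p] holds the pages T_i that link to p.
    in_links = {p: [] for p in pages}
    for src, targets in out_links.items():
        for dst in targets:
            in_links[dst].append(src)
    pr = {p: 1.0 for p in pages}           # initial rank of every page
    for _ in range(iters):
        pr = {
            p: (1 - d) + d * sum(pr[t] / len(out_links[t]) for t in in_links[p])
            for p in pages
        }
    return pr

# Hypothetical 3-page web: A links to B and C, B links to C, C links to A.
toy_web = {"A": ["B", "C"], "B": ["C"], "C": ["A"]}
ranks = pagerank(toy_web)
```

In this unnormalized form the ranks sum to the number of pages rather than to one; dividing by the page count gives the probabilistic variant.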

4. METHOD

This section presents TraTopo, combining Bayesian label propagation with ensemble learning to improve link prediction and reduce errors. It employs a shortest-path algorithm to identify new nodes, update candidates, and lower computational demands.

4.1. Bayesian Label Transition with Asymmetric Dirichlet Distributions

Using Bayesian theory, the Bayesian Label Propagation algorithm estimates nodes’ label probability distributions (Yang et al., 2022; Liu et al., 2023; Xie and Szymanski, 2013; Zhuang and Al Hasan, 2022c). It calculates likelihoods from neighboring labels, represents initial distributions with prior probabilities, and iteratively refines these distributions to enhance label propagation.

The algorithm begins by establishing an initial label probability distribution per node, which is subsequently refined through iterative updates informed by adjacent nodes and propagation protocols. Bayesian adjustments recalibrate the probabilities of nodes with known labels. This iterative refinement proceeds until stabilization or until a designated iteration threshold is met. The final label distribution for a node $v$ is represented by $L_{v}$.

(4) $P(L_{v}=l\mid\text{Neib}(v),Y)\propto\sum_{u\in\text{Neib}(v)}P(L_{u}=l\mid Y)$

where $\text{Neib}(v)$ represents the adjacent nodes of $v$, and the observed labels $Y$ denote known label information. The node’s label probability distribution is updated using Bayesian inference, where $P(L_{u}=l\mid Y)$ indicates the probability that node $u$ has label $l$ based on observed label information. The label propagation process iterates these updates until convergence.
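The update in Eq. (4) can be sketched as repeated neighbor aggregation followed by normalization. In the sketch below, clamping the rows of nodes with known labels after every step is an assumed way to realize the recalibration of known labels, and the function name, fixed iteration count, and toy data are likewise illustrative.

```python
import numpy as np

def propagate_labels(A, P0, known_mask, iters=50):
    """Iterate Eq. (4): each row becomes proportional to the sum of neighbor rows.

    A: (N, N) adjacency matrix; P0: (N, K) initial label distributions;
    known_mask: (N,) boolean array marking nodes whose labels are observed.
    """
    A = np.asarray(A, dtype=float)
    P0 = np.asarray(P0, dtype=float)
    known_mask = np.asarray(known_mask, dtype=bool)
    P = P0.copy()
    for _ in range(iters):
        agg = A @ P                                    # sum over neighbors
        row_sum = agg.sum(axis=1, keepdims=True)
        safe = np.where(row_sum > 0, row_sum, 1.0)     # avoid division by zero
        # Nodes without neighbor mass keep their previous distribution.
        new_P = np.where(row_sum > 0, agg / safe, P)
        new_P[known_mask] = P0[known_mask]             # clamp observed labels (assumption)
        P = new_P
    return P

# Toy example: path 0 - 1 - 2 - 3, node 0 known as class 0, node 3 known as class 1.
A_toy = np.array([[0, 1, 0, 0],
                  [1, 0, 1, 0],
                  [0, 1, 0, 1],
                  [0, 0, 1, 0]])
P_init = np.array([[1.0, 0.0], [0.5, 0.5], [0.5, 0.5], [0.0, 1.0]])
P_final = propagate_labels(A_toy, P_init, known_mask=[True, False, False, True])
```

On this toy path the two unlabeled nodes end up leaning toward the class of the nearer labeled endpoint.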

The Bayesian label transition utilized in this study is illustrated in Figure 1.

Figure 1. Diagram of the Bayesian label transition. $V$ signifies the nodes and $N$ indicates the number of nodes. $Z$ includes the inferred labels $\bar{\mathcal{Z}}$ and the true labels, and $Y$ encompasses both manually-annotated labels $Y_{m}$ and automatically-generated labels $Y_{a}$. The $K$-class label transition is controlled by the matrix $\phi$ and the parameter $\alpha$. Black arrows depict variable dependencies, while dotted arrows indicate that a symbol can be subdivided into two different meanings.

In the diagram depicted in Figure 1, three foundational elements, namely vertices ($V$), latent labels ($Z$), and noisy labels ($Y$), are crucial for understanding the model’s architecture and function. Vertices ($V$) indicate the nodes, latent labels ($Z$) comprise both the inferred labels and the true labels, while noisy labels ($Y$) are divided into manually-annotated and automatically-generated labels. The principal goal is to ensure that the inferred labels ($\bar{Z}$) agree with the true labels. Solid arrows signify dependencies, and dashed arrows indicate that an element has two definitions. The transition matrix $\phi$, parameterized by $\alpha$, governs label transitions and is represented as $\phi=[\phi_{1},\phi_{2},...,\phi_{K}]^{T}\in\mathbb{R}^{K\times K}$, containing $K$ vectors. Each vector $\phi_{k}$ originates from an Asymmetric Dirichlet Distribution $\phi(\alpha_{k})$. The model dynamically revises $\alpha$. For example, $\alpha_{k}^{t}$ during the $t$-th transition is expressed as

(5) $\alpha_{k}^{t}=\alpha_{k}^{t-1}\frac{\sum_{i=1}^{N}I(\bar{z}_{i}^{t}=k)}{\sum_{i=1}^{N}I(\bar{z}_{i}^{t-1}=k)}$

This update mechanism ensures that the inferred labels ($\bar{Z}$) progressively align more closely with the true labels. The posterior representation of $Z$ is given by

(6) $P(\mathcal{Z}\mid\mathcal{V},\mathcal{Y};\alpha)=P(\mathcal{Z}\mid\mathcal{V},\mathcal{Y},\phi)P(\phi;\alpha)$

showing how the posterior of the latent labels is conditioned on the nodes, noisy labels, and the Dirichlet distribution parameters. The model employs Gibbs and topological sampling to iteratively update and refine the inferred labels ($\bar{Z}$), ensuring they closely approximate the true labels ($Z$).
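The dynamic revision of $\alpha$ in Eq. (5) amounts to rescaling each class parameter by the ratio of class counts between two consecutive transitions. The sketch below treats each $\alpha_{k}$ as a scalar for simplicity and leaves it unchanged when class $k$ had no members at step $t-1$; both choices, along with the function name, are assumptions made for illustration.

```python
import numpy as np

def update_alpha(alpha_prev, z_prev, z_curr, num_classes):
    """Eq. (5): alpha_k^t = alpha_k^{t-1} * count_t(k) / count_{t-1}(k)."""
    alpha_prev = np.asarray(alpha_prev, dtype=float)
    counts_prev = np.bincount(z_prev, minlength=num_classes).astype(float)
    counts_curr = np.bincount(z_curr, minlength=num_classes).astype(float)
    # Ratio defaults to 1 where the previous count is zero (assumed guard).
    ratio = np.divide(counts_curr, counts_prev,
                      out=np.ones(num_classes), where=counts_prev > 0)
    return alpha_prev * ratio

# Toy example: 4 nodes, 3 classes; one node moves from class 0 to class 1.
z_before = np.array([0, 0, 1, 1])
z_after = np.array([0, 1, 1, 1])
alpha_new = update_alpha([1.0, 1.0, 2.0], z_before, z_after, num_classes=3)
```

Classes that gain members between transitions see their parameter grow, and classes that lose members see it shrink.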

In this study, we assume that the model is subject to various topological perturbations. When the graph is impacted, TraTopo strives to restore the model’s predicted classification distribution as accurately as possible.

4.2. Shortest path-based approximated method

In topology-driven label propagation, first-order neighbors are primarily sampled. Other nodes are designated as negative samples, which should articulate distinct meanings and encapsulate the graph’s data comprehensively. Ideally, these negative samples emerge from diverse communities, each represented by the samples.

Breadth-First Search (BFS) is employed to ascertain the shortest path between nodes. Having identified the shortest route from node $V_{i}$ to all reachable nodes $V_{r}$, the distance from the path’s endpoint to node $V_{i}$ is defined as the length $l$. This approach classifies the reachable nodes $V_{r}$ into groups based on the path length $l$:

(7) $V_{r}=\{N_{l}\}_{l=2}^{L}$

This grouping is illustrated in Figure 2:

Figure 2. Concentric circles representing the shortest paths between nodes.

In each collection, nodes are equidistant from the focal node $V_{i}$, forming concentric circles of varying radii centered on that node. Utilizing the uniformity of the Label Propagation Algorithm, we integrate all nodes within a designated set and their first-order neighbors to construct a candidate set $S_{i}$. High-ranking nodes, as determined by scores from the Random Walk Algorithm, are selected from this set to connect with the focal node, as delineated in Algorithm 1.

Data: A graph $G$, sample length $L_{max}$
Let $S(i)\leftarrow 1$;
for each node $v_{i}\in G$ do
    Compute the shortest path lengths from $i$ to all reachable nodes $V_{r}$;
    Divide $V_{r}$ into different sets $N_{l}$ based on the path length;
    Let $S_{i}\leftarrow[\ ]$ and $N_{i}\leftarrow[\ ]$;
    for $len$ in $range(1,L_{max})$ do
        Collect all the points $R(j)$ in $N_{len}$ at each length;
        if $len=L_{max}$ or $len=L_{max}-1$ then
            Put every point $j$ into $S_{i}$ at each length;
            Collect the first-order neighbors $Nei(j)$ of $j$;
            Expand $S_{i}\leftarrow[S_{i},Nei(j)]$;
        end if
        Expand $N_{i}\leftarrow[N_{i},R(j)]$;
    end for
end for
return $\bar{\mathcal{S}}(i)$ and $\bar{\mathcal{N}}(i)$ for all $i\in G$
Algorithm 1 Shortest-path-based diverse negative sampling

Algorithm 1 first computes the shortest-path lengths between nodes. It then isolates the nodes that can be reached within a path length $L$, namely $N_{i}$, and gathers the focal node, the nodes at path distances of two and three, and their adjacent nodes to form $S_{i}$. In later stages, $N_{i}$ serves as a sampling criterion, as outlined in (Casella and George, 1992; Marcotty et al., 1976), and $S_{i}$ is employed for candidate selection in link prediction tasks.
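The steps of Algorithm 1 can be sketched in Python using breadth-first search on an adjacency-list graph. The sketch makes several assumptions that the pseudocode leaves open: the length loop is taken to include $L_{max}$ itself, the focal node is removed from its own candidate set, and the names `bfs_levels` and `shortest_path_sampling` are hypothetical.

```python
from collections import deque

def bfs_levels(adj, source, max_len):
    """Hop distance from source to every node reachable within max_len hops."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        if dist[u] == max_len:
            continue                       # do not expand beyond max_len
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return dist

def shortest_path_sampling(adj, max_len):
    """Return candidate sets S[i] and reachable-node lists N[i] for every node i."""
    S, N = {}, {}
    for i in adj:
        dist = bfs_levels(adj, i, max_len)
        levels = {}                        # path length -> nodes at that length
        for node, length in dist.items():
            levels.setdefault(length, []).append(node)
        S_i, N_i = set(), []
        for length in range(1, max_len + 1):   # inclusive upper bound (assumption)
            ring = levels.get(length, [])
            if length in (max_len, max_len - 1):
                for j in ring:
                    S_i.add(j)             # the node on the outer rings
                    S_i.update(adj[j])     # and its first-order neighbors
            N_i.extend(ring)
        S_i.discard(i)                     # focal node is not its own candidate (assumption)
        S[i], N[i] = S_i, N_i
    return S, N

# Toy example: path graph 0 - 1 - 2 - 3 - 4.
path_adj = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
S_sets, N_lists = shortest_path_sampling(path_adj, max_len=3)
```

Because each BFS stops at `max_len` hops, the cost per node depends on the size of its local neighborhood rather than on the whole graph.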

4.3. Improved Topology Sampler

In the field of complex network analysis, this study aims to uncover the latent connection patterns among nodes, thereby deepening our understanding of the network’s structure through two key steps.

First, we compute the shortest paths from each node to its first- through third-order neighbor nodes. Using the BFS algorithm, a comprehensive mapping of node distances is constructed via an all-source shortest-path search for every node within the network. This method not only unveils the network’s topological structure but also establishes a foundation for identifying key nodes and forecasting potential connections between them.

Second, the method enumerates the degree of each node through an exhaustive traversal of the network edges and isolates the nodes with degrees under three. These peripheral nodes, often overlooked for potential connections, are then analyzed. For each chosen node $v$, its second- and third-order neighbors and their respective neighbors are aggregated into a predictive set. A composite score, derived from the PageRank and Random Walk algorithms (markers of node centrality and traversal likelihood, respectively), is then applied. The ten highest-scoring nodes are predicted to potentially form connections with node $v$. This integration of foundational graph theory algorithms with network science insights deepens the structural understanding of networks and yields a link prediction methodology that reveals latent patterns and potential links within the network, offering theoretical backing for network optimization and analysis.
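The link prediction step for low-degree nodes can be sketched as follows. The text does not specify how the PageRank and Random Walk scores are combined, so the equal-weight sum used here is an assumption, as are the exclusion of existing neighbors from the candidates, the normalized PageRank variant, and all function names.

```python
from collections import deque
import numpy as np

def diffusion_scores(A, restart, c=0.15, iters=200):
    """Fixed point of r = c * restart + (1 - c) * P r.

    A uniform restart vector gives (normalized) PageRank; a one-hot restart
    vector gives random walk with restart from that node.
    """
    deg = A.sum(axis=1)
    inv_deg = np.divide(1.0, deg, out=np.zeros_like(deg), where=deg > 0)
    P = (A * inv_deg[:, None]).T           # P[v, u] = A[u, v] / deg(u)
    r = restart.copy()
    for _ in range(iters):
        r = c * restart + (1 - c) * (P @ r)
    return r

def hop_distances(A, v, max_hops):
    """BFS hop distances from v, truncated at max_hops."""
    dist = {v: 0}
    queue = deque([v])
    while queue:
        u = queue.popleft()
        if dist[u] == max_hops:
            continue
        for w in np.flatnonzero(A[u]).tolist():
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return dist

def predict_links(A, min_degree=3, top_k=10, c=0.15):
    """For each node with degree < min_degree, rank candidate link targets."""
    A = np.asarray(A, dtype=float)
    n = A.shape[0]
    pr = diffusion_scores(A, np.full(n, 1.0 / n), c)       # PageRank scores
    predictions = {}
    for v in np.flatnonzero(A.sum(axis=1) < min_degree).tolist():
        dist = hop_distances(A, v, 3)
        outer = [u for u, d in dist.items() if d in (2, 3)]
        cand = set(outer)
        for u in outer:
            cand.update(np.flatnonzero(A[u]).tolist())     # neighbors of outer nodes
        cand.discard(v)
        cand -= set(np.flatnonzero(A[v]).tolist())         # skip existing links (assumption)
        e = np.zeros(n)
        e[v] = 1.0
        rwr = diffusion_scores(A, e, c)                    # RWR scores from v
        # Composite score: equal-weight sum (assumption).
        ranked = sorted(cand, key=lambda u: pr[u] + rwr[u], reverse=True)
        predictions[v] = ranked[:top_k]
    return predictions

# Toy example: path graph 0 - 1 - 2 - 3 - 4, where every node has degree < 3.
A_line = np.zeros((5, 5))
for a, b in [(0, 1), (1, 2), (2, 3), (3, 4)]:
    A_line[a, b] = A_line[b, a] = 1.0
pred = predict_links(A_line)
```

Restricting the ranking to a local candidate set means the scores only need to be compared among a handful of nodes per low-degree node.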

As we can see in Algorithm 2, the uncertainty of node labels is delineated as follows: during training, labels $\bar{Z}$ predicted by the Bayesian label transition matrix at the $(t-1)^{th}$ iteration, $\phi^{(t-1)}$, are considered uncertain if they differ from those predicted at the $t^{th}$ iteration, $\phi^{t}$; during testing, they are considered uncertain if the predicted labels $\bar{Z}$ do not correspond with the latent labels upon convergence.
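The training-time criterion can be sketched in a few lines of NumPy: labels are inferred by applying the transition matrix to the previous categorical distribution, and a node is flagged when its label changed between two consecutive iterations and its degree is below a threshold. The function names and the `min_degree` parameter (standing in for the degree threshold used in Algorithm 2) are hypothetical.

```python
import numpy as np

def infer_labels(P_prev, phi):
    """Argmax of the (N, K) categorical distribution times the (K, K) transition matrix."""
    return np.argmax(np.asarray(P_prev) @ np.asarray(phi), axis=1)

def uncertain_low_degree_nodes(z_prev, z_curr, degrees, min_degree):
    """Indices of nodes whose label changed and whose degree is below min_degree."""
    z_prev, z_curr, degrees = map(np.asarray, (z_prev, z_curr, degrees))
    uncertain = z_prev != z_curr           # label differs between iterations
    return np.flatnonzero(uncertain & (degrees < min_degree))

# Toy example with 4 nodes (hypothetical labels and degrees).
flagged = uncertain_low_degree_nodes(
    z_prev=[0, 1, 1, 2], z_curr=[0, 2, 1, 0], degrees=[1, 1, 5, 4], min_degree=3
)
```

Only the flagged nodes would be passed on to the link prediction step, which keeps the extra work proportional to the number of uncertain, sparsely connected nodes.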

Data: Categorical distribution $\bar{P}\left(\bar{\mathcal{Z}}^{t-1} \mid \mathcal{V}\right)$, transition matrix $\phi^{t-1}$, and Improved Topology Sampler
for $i \leftarrow 0$ to $N$ do
      $\bar{z}_{i}^{t} \sim \arg\max \bar{P}\left(\bar{z}_{i}^{t-1} \mid v_{i}\right)\phi^{t-1}$;
      if $\bar{z}_{i}^{t}$ is uncertain and degree of $v_{i}$ < $miniDegree$ then
            run Algorithm 3;
      end if
      update $\bar{z}_{i}^{t}$ with the Improved Topology Sampler;
end for
return Inferred labels $\bar{\mathcal{Z}}^{t}$ in the $t^{th}$ transition
Algorithm 2 Topology sampling conditions
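
The condition check of Algorithm 2 can be illustrated with a short sketch. It is an assumption-laden simplification: the function name `transition_step` and the toy inputs are hypothetical, the label is taken as the arg max of the propagated distribution, and a label counts as uncertain when it differs from the label of the previous transition.

```python
def transition_step(prior, phi, prev_labels, degrees, min_degree=3):
    """One pass over all nodes, following the conditions of Algorithm 2.

    prior[i][k]  : categorical distribution of node i at transition t-1
    phi[k][j]    : label transition matrix at transition t-1
    prev_labels  : labels inferred at transition t-1
    degrees      : node degrees in the test graph
    """
    n_classes = len(phi[0])
    new_labels, needs_link_prediction = [], []
    for i, p in enumerate(prior):
        # Propagate the node's distribution through the transition matrix.
        scores = [sum(p[k] * phi[k][j] for k in range(len(p)))
                  for j in range(n_classes)]
        label = max(range(n_classes), key=scores.__getitem__)
        uncertain = label != prev_labels[i]
        # Uncertain low-degree nodes are routed to link prediction first.
        needs_link_prediction.append(uncertain and degrees[i] < min_degree)
        new_labels.append(label)
    return new_labels, needs_link_prediction


# Hypothetical toy inputs: three nodes, two classes.
prior = [[0.9, 0.1], [0.3, 0.7], [0.2, 0.8]]
phi = [[0.8, 0.2], [0.3, 0.7]]
labels, flags = transition_step(prior, phi, prev_labels=[0, 0, 0], degrees=[1, 2, 5])
```

Here the second node changes its label and has degree below the threshold, so it alone is flagged for link prediction; the third node also changes its label but has enough neighbors for direct topology sampling.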

Our model performs $T$ transitions for inference. Each transition traverses all nodes in the test graph, so the computational complexity is approximately $O(T \times N)$, where $N$ is the number of nodes in the test graph.

Relying on the homophily assumption that nodes of the same class tend to be connected, we employ a topology-based sampling method. Under graph perturbations with missing links, topology sampling becomes less reliable for sparsely connected nodes, because few neighbors are available and accuracy drops. To mitigate this, we augment the sampling framework with a random walk-based link prediction algorithm, detailed in Algorithm 3.

Data: given node $v_{i}$, candidate set $\bar{\mathcal{S}}(i)$, and $\bar{\mathcal{N}}(i)$, the set of all neighbors within path length $L$ of the given node together with the neighbors of the nodes at path lengths $L$ and $L-1$
while $N \neq 0$ do
      Let $sub_{G}$ $\leftarrow$ build the subgraph from the collection;
      Let $rwr_{dict}$ $\leftarrow$ get the rwr scoring dictionary for the given node $v_{i}$ in $sub_{G}$;
      Let $pgr_{dict}$ $\leftarrow$ get the pgr scoring dictionary for the given node $v_{i}$ in $sub_{G}$;
      Let $combine_{dict}$ $\leftarrow$ combine $pgr_{dict}$ and $rwr_{dict}$;
      Sort $combine_{dict}$ with the largest value first;
      if key in $combine_{dict}$ in $\bar{\mathcal{S}}(i)$ then
            Put key into list $L_{predic}$;
      end if
end while
return $L_{predic}$, a list of standby nodes that will connect to the given node
Algorithm 3 Link prediction

Employing a random walk-based algorithm, we initiate scoring from a seed node $seed$. Nodes serve as keys ($key$) with their scores as values ($value$), stored in a dictionary ($dict$). We then apply the $rwr$ (Random Walk with Restart) and $pgr$ (PageRank) algorithms and merge and sort the resulting values. Because the seed node and its first-order neighbors are already connected, we exclude these keys from the sorted dictionary. The remaining keys, which represent the nodes to be connected with the seed node, are compiled into the candidate node list $L_{predic}$. Following the establishment of connections between the seed node and the nodes in $L_{predic}$, we perform topological sampling.
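
The scoring and filtering steps described above can be sketched in plain Python. Several points are assumptions made for illustration: the two score dictionaries are combined by a plain sum (the combination rule is not specified in the text), $c$ is interpreted as the restart probability, every node is assumed to have at least one neighbor, and the function names and toy graph are hypothetical. The default values of `c`, `tol`, and `max_iter` follow the settings reported in the experimental section.

```python
def random_walk_scores(adj, restart, c=0.15, tol=1e-6, max_iter=100):
    """Power iteration for a random walk that restarts with probability c.

    A uniform restart vector gives PageRank; a one-hot vector on the seed
    gives Random Walk with Restart (RWR).
    """
    nodes = list(adj)
    score = dict(restart)
    for _ in range(max_iter):
        nxt = {n: c * restart[n] for n in nodes}
        for n in nodes:
            share = (1.0 - c) * score[n] / len(adj[n])
            for nbr in adj[n]:
                nxt[nbr] += share
        delta = sum(abs(nxt[n] - score[n]) for n in nodes)
        score = nxt
        if delta < tol:
            break
    return score


def predict_links(adj, seed, candidates, top_k=10):
    """Rank candidates by the combined score and keep the top_k nodes."""
    nodes = list(adj)
    uniform = {n: 1.0 / len(nodes) for n in nodes}
    one_hot = {n: 1.0 if n == seed else 0.0 for n in nodes}
    pgr = random_walk_scores(adj, uniform)
    rwr = random_walk_scores(adj, one_hot)
    combined = {n: pgr[n] + rwr[n] for n in nodes}  # assumed: plain sum
    ranked = sorted(combined, key=combined.get, reverse=True)
    # The seed and its first-order neighbors are already connected.
    excluded = {seed} | set(adj[seed])
    return [n for n in ranked if n in candidates and n not in excluded][:top_k]


# Hypothetical toy graph given as an adjacency map.
adj = {0: {1}, 1: {0, 2, 5}, 2: {1, 3, 5}, 3: {2, 4}, 4: {3}, 5: {1, 2}}
predicted = predict_links(adj, seed=0, candidates={2, 3, 5})
```

Using one routine for both scores keeps the two rankings on the same scale, which makes the simple sum a reasonable first choice; a weighted combination would be an equally plausible reading of the text.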

After the $t^{th}$ transition, we sample from the updated distribution to obtain the inferred labels $\bar{Z}$. When these labels are uncertain, we resort to our enhanced topological sampler. We utilize three types of label samplers:

  (1) Uniform Random Sampler:

    (8) $P\left(\bar{z}_{i}^{t}=k \mid v_{i}\right)=\frac{\sum_{i=1}^{N_{nei}} I\left(\bar{z}_{i}^{t}=k\right)}{\sum_{i=1}^{N_{nei}} I\left(\bar{z}_{i}^{t} \in K_{nei}\right)}$

    During the $t^{th}$ transition, the label $\bar{z}_{i}^{t}$ of node $i$ is drawn uniformly at random from the labels of its $N_{nei}$ neighbors, so the probability of class $k$ is the fraction of neighbors labeled $k$.

  (2) Majority Sampler: This sampler selects the majority class $k_{mj}$ as the label.

  (3) Degree-weighted Sampler: This sampler selects a label from class $k_{dw}$, the class for which the total degree of the adjacent nodes is maximized.
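
The three samplers listed above can be sketched as follows. The function names and the toy neighbor data are hypothetical, and the sketch assumes that the labels and degrees of a node's neighbors are already available as parallel lists.

```python
import random
from collections import Counter


def random_sampler(nbr_labels, rng):
    """Draw one neighbor label uniformly at random."""
    return rng.choice(nbr_labels)


def majority_sampler(nbr_labels):
    """Return the most frequent label among the neighbors."""
    return Counter(nbr_labels).most_common(1)[0][0]


def degree_weighted_sampler(nbr_labels, nbr_degrees):
    """Return the class whose neighbors have the largest total degree."""
    totals = Counter()
    for label, degree in zip(nbr_labels, nbr_degrees):
        totals[label] += degree
    return totals.most_common(1)[0][0]


# Hypothetical neighborhood: four neighbors with their labels and degrees.
nbr_labels = [0, 0, 1, 2]
nbr_degrees = [1, 2, 7, 3]
rng = random.Random(0)
sampled = random_sampler(nbr_labels, rng)
majority = majority_sampler(nbr_labels)
degree_weighted = degree_weighted_sampler(nbr_labels, nbr_degrees)
```

In this example the majority sampler and the degree-weighted sampler disagree: class 0 has the most neighbors, while class 1 has a single neighbor with a high degree.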

Figure 3. Diagram of the topological samplers, showing a network annotated by three sampling techniques: majority, degree, and random. Green nodes, chosen by majority rule, reflect the dominant characteristics within their network vicinity. Blue nodes, sampled randomly, follow no selection criterion and embody stochastic choice.

In summary, TraTopo’s final process is as follows:

Data: train graph $G_{train}$ and test graph $G_{test}$ with their symmetric adjacency matrix $A$ and feature matrix $X$, manually-annotated labels $y_{m}$, node classifier $f_{\theta}$, initial $\alpha$, the number of transitions $T$, and the number of warm-up steps $WS$
Train $f_{\theta}$ with $y_{m}$ on $G_{train}$;
Generate initial label categorical distribution $\bar{P}\left(\mathcal{Z} \mid \mathcal{V}\right)$ and automatically-generated labels $y_{a}$ by $f_{\theta}$;
Compute warm-up label transition matrix $\phi^{\prime}$ based on $G_{train}$;
Define inferred labels $\bar{Z}$, dynamic label transition matrix $\phi$ based on $G_{test}$, and initial $\alpha$ vector;
for $t \leftarrow 1$ to $T$ do
      if $t < WS$ then
            Sample $\bar{\mathcal{Z}}^{t}$ with warm-up matrix $\phi^{\prime}$;
      else
            Sample $\bar{\mathcal{Z}}^{t}$ with dynamic matrix $\phi$;
      end if
      Update $\alpha$ and dynamic $\phi$;
end for
return Inferred labels $\bar{\mathcal{Z}}$ and dynamic $\phi$
Algorithm 4 TraTopo’s Pseudo-code

As shown in Algorithm 4, the model first trains a node classifier $f_{\theta}$, such as a Graph Convolutional Network (GCN) or another Graph Neural Network (GNN), on $G_{train}$ with manually-annotated noisy labels $y_{m}$. During this phase, $f_{\theta}$ generates a categorical distribution $\bar{P}(\mathcal{Z} \mid \mathcal{V})$ for each node, alongside auto-generated noisy labels $y_{a}$. In the inference stage, the model first allocates the inferred labels $\bar{Z}$ and the label transition matrix $\phi$ on the test graph and then initializes the $\alpha$ vector. During the warm-up steps, the model samples the inferred labels with the warm-up label transition matrix $\phi^{\prime}$, computed from $\bar{P}(\mathcal{Z} \mid \mathcal{V})$ on $G_{train}$; afterwards, it applies Gibbs sampling with the dynamic $\phi$. If the inferred labels deviate from those in the previous transition or from $y_{a}$, they are deemed uncertain. For low-degree nodes with uncertain labels, the error of topological sampling can be substantial. Thus, prior to sampling, a subgraph centered on the node is constructed, within which random walk-based link prediction is executed according to Algorithm 3.

Following each transition, $\phi$ is recalibrated based on the inferred labels $\bar{Z}^{t}$ and $y_{a}$ to improve the accuracy of subsequent label predictions. Concurrently, the categorical distribution $\bar{P}\left(\bar{\mathcal{Z}}^{t} \mid \mathcal{V}\right)$ is updated. As the transitions converge, the inferred labels increasingly approximate the true labels.
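
One plausible form of the recalibration of $\phi$ is a Dirichlet-smoothed count estimate, sketched below. This is an assumption rather than the update rule of the paper, which is not spelled out in this section: the function name `update_transition_matrix`, the use of $\alpha$ as pseudo-counts, and the toy labels are all hypothetical.

```python
def update_transition_matrix(inferred, auto_labels, alpha):
    """Row-normalized co-occurrence counts smoothed by Dirichlet pseudo-counts.

    phi[k][j] estimates P(auto-generated label = j | inferred label = k).
    """
    n_classes = len(alpha)
    # Start every row from the prior pseudo-counts (assumed role of alpha).
    counts = [[alpha[j] for j in range(n_classes)] for _ in range(n_classes)]
    for z, y in zip(inferred, auto_labels):
        counts[z][y] += 1.0
    return [[c / sum(row) for c in row] for row in counts]


# Hypothetical toy labels: four nodes, two classes.
inferred = [0, 0, 1, 1]
auto_labels = [0, 1, 1, 1]
phi = update_transition_matrix(inferred, auto_labels, alpha=[1.0, 1.0])
```

Each row of the resulting matrix sums to one, so it can be applied directly to a categorical distribution in the next transition.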

The time complexity is primarily determined by the computation of shortest paths. PageRank and Random Walk take only a single iteration and thus contribute little to the time complexity. The overall time complexity is therefore $O(V^{2})$, where $V$ denotes the number of nodes.

5. EXPERIMENTS

In this section, we assess the accuracy and uncertainty of competing models under three types of topological perturbations on three datasets, illustrating the advantage of our model. We also conduct ablation studies to identify the most effective configuration of our model.

5.1. Experimental Settings

5.1.1. Dataset Settings

The experiments use the following datasets:

  • Cora (Niegowski and Eshaghi, 2007): a widely used machine learning dataset for citation network analysis and document classification. It comprises scientific publications with topic-based categories and word frequency vectors, linked by a directed citation graph, which makes it well suited to studying semi-supervised learning algorithms.

  • AmazonCoBuy (Das et al., 2021): an e-commerce dataset that maps products to nodes and purchases to links, revealing co-purchasing behavior. Products are described through review-based word models, which provide textual data for developing recommendation systems, understanding consumer preferences, and analyzing online shopping dynamics.

  • CiteSeer (Bollacker et al., 1998): an information retrieval dataset featuring a collection of computer science and IT documents. It supports the analysis of citation networks, document clustering, and studies of research impact.

For all datasets, the nodes are split into training, validation, and testing partitions with proportions 0.1, 0.2, and 0.7, respectively. To simulate manually annotated labels, we randomly select 10% of the true labels and replace each with a different label chosen uniformly.
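
The split and the label noise can be reproduced with a few lines of Python. The sketch below is illustrative only: the function names, the random seed, and the synthetic labels are hypothetical, and the node ids are assumed to be consecutive integers.

```python
import random


def split_nodes(n_nodes, rng, train=0.1, val=0.2):
    """Shuffle node ids and cut them into train/validation/test partitions."""
    ids = list(range(n_nodes))
    rng.shuffle(ids)
    n_train = round(train * n_nodes)
    n_val = round(val * n_nodes)
    return ids[:n_train], ids[n_train:n_train + n_val], ids[n_train + n_val:]


def add_label_noise(labels, n_classes, rng, rate=0.1):
    """Replace a fraction of labels with a different class chosen uniformly."""
    noisy = list(labels)
    flipped = rng.sample(range(len(labels)), round(rate * len(labels)))
    for i in flipped:
        noisy[i] = rng.choice([k for k in range(n_classes) if k != labels[i]])
    return noisy


rng = random.Random(0)
train_ids, val_ids, test_ids = split_nodes(1000, rng)
labels = [i % 7 for i in range(1000)]  # synthetic labels with 7 classes
noisy_labels = add_label_noise(labels, 7, rng)
```

Excluding the true class from the replacement candidates guarantees that exactly 10% of the labels end up wrong.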

Table 1. Statistics of datasets. AvgDegrees denotes the average degree of test nodes. EHR denotes the edge homophily ratio.

| Dataset | Nodes | Edges | Features | Classes | AvgDegrees | EHR (%) |
|---|---|---|---|---|---|---|
| Cora | 2,708 | 10,556 | 1,433 | 7 | 4.99 | 81.00 |
| Citeseer | 3,327 | 9,228 | 3,703 | 6 | 3.72 | 73.55 |
| Pubmed | 19,717 | 88,651 | 500 | 3 | 5.50 | 80.24 |
| AMZcobuy | 7,650 | 287,326 | 745 | 8 | 32.77 | 82.72 |

5.1.2. Model hyper-parameters

We evaluated each parameter within the experimental framework. We set the warm-up steps to $WS=40$ and the retraining interval to $Retrain=60$; to mitigate overfitting, the node classifier is retrained periodically. Within TraTopo, the numbers of transitions for the five datasets are set to [100, 200, 80, 100, 90], and link prediction targets nodes with fewer than three connections. Using RWR (Random Walk with Restart) and PPR (Personalized PageRank), we identify the top 10 nodes to connect with the target node. The node classifier is a two-layer Graph Convolutional Network (GCN) with 200 hidden units and ReLU activation. For PageRank, the damping factor is set to $c=0.15$, with an error tolerance of 1e-6 and a maximum of 100 iterations. The RWR algorithm uses the same parameters and restarts from a predefined seed node. Once the shortest paths between all node pairs are determined, traversal to non-neighboring nodes is limited to a maximum distance of 3. The GCN is optimized with the Adam optimizer at a learning rate of $1 \times 10^{-3}$ and converges within 200 epochs on all datasets.

5.1.3. Evaluation Metrics

We use both accuracy and cross-entropy loss as evaluation metrics, so that models are assessed both on whether their predictions are correct and on how confident those predictions are. Accuracy measures correct classifications, while cross-entropy evaluates the predicted probabilities, which helps with imbalanced data and model calibration. Accuracy, defined as

(9) $\text{Accuracy}=\frac{\text{Number of Correct Predictions}}{\text{Total Number of Predictions}}$

directly measures the proportion of nodes correctly classified by the model, providing a clear indicator of performance in practical scenarios. On the other hand, cross-entropy loss, calculated by

(10) $L=-\sum_{i=1}^{N} y_{i}\log(p_{i})$

where $y_{i}$ is a binary indicator of the correct class, and $p_{i}$ is the predicted probability for that class, evaluates how well the probability outputs of the model align with the actual labels. This metric is particularly useful for fine-tuning the model during training, as it penalizes incorrect classifications according to the confidence of the output, thereby promoting both accuracy and reliability in the model’s predictions.
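
Both metrics are straightforward to compute from predicted class probabilities. The sketch below uses hypothetical function names and toy values. With a one-hot indicator, the cross-entropy of a node reduces to the negative log-probability of its correct class, and the sketch sums this quantity over nodes. The helper `normalized_entropy` is an added illustration of the normalized entropy used as the uncertainty measure in the result tables, assuming normalization by the logarithm of the number of classes.

```python
import math


def accuracy(pred, true):
    """Fraction of nodes whose predicted label equals the true label."""
    return sum(p == t for p, t in zip(pred, true)) / len(true)


def cross_entropy(probs, true, eps=1e-12):
    """Sum over nodes of -log(probability assigned to the correct class)."""
    return -sum(math.log(max(p[t], eps)) for p, t in zip(probs, true))


def normalized_entropy(p, eps=1e-12):
    """Entropy of one distribution divided by log(number of classes)."""
    h = -sum(q * math.log(max(q, eps)) for q in p)
    return h / math.log(len(p))


# Hypothetical toy predictions for two nodes and two classes.
probs = [[0.5, 0.5], [0.25, 0.75]]
true = [0, 1]
pred = [0, 0]
acc = accuracy(pred, true)
ce = cross_entropy(probs, true)
ent = normalized_entropy(probs[0])
```

A uniform distribution attains a normalized entropy of one, the maximum uncertainty, whereas a one-hot distribution attains zero.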

5.2. Topological Perturbations

An initial topological network is characterized by its unique structural and connectivity configurations. These networks are often subject to various types of disturbances that can fundamentally alter their topology and function.

One such disturbance is a Random Perturbation (Wang et al., 2021), where nodes within the network connect in a completely stochastic manner without following any predetermined or inherent patterns. This randomization can disrupt the typical behavior of the network, leading to unpredictable outcomes and challenges in network analysis. Another significant perturbation is Information Sparsity (Herrmann, 2010). In this scenario, connections within the network may disappear randomly, which can drastically change the network’s structure. This loss of connections can lead to a reduction in the overall robustness of the network, and critical information originally held in the connectivity of nodes may be lost, thus impairing the network’s operational capabilities. Lastly, the network may be susceptible to Adversarial Attacks (Madry et al., 2017). In these attacks, adversaries deliberately introduce changes to both the structure and the attributes of the network’s nodes. Such alterations can cause significant disruptions, potentially isolating nodes or corrupting the data they carry. These attacks are particularly concerning as they are targeted and strategic, posing serious threats to the integrity and reliability of the network.
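
The first two perturbations can be simulated on an edge list as sketched below. The function names, the perturbation sizes, and the ring graph are hypothetical choices for illustration; adversarial attacks depend on a specific attack algorithm and are not covered by this sketch.

```python
import random


def remove_random_edges(edges, fraction, rng):
    """Information sparsity: drop a random fraction of the edges."""
    keep = len(edges) - round(fraction * len(edges))
    return rng.sample(edges, keep)


def add_random_edges(edges, n_nodes, n_new, rng):
    """Random perturbation: connect node pairs chosen uniformly at random."""
    existing = {frozenset(e) for e in edges}
    added = []
    while len(added) < n_new:
        u, v = rng.sample(range(n_nodes), 2)
        if frozenset((u, v)) not in existing:  # skip edges that already exist
            existing.add(frozenset((u, v)))
            added.append((u, v))
    return list(edges) + added


# Hypothetical ring graph with 20 nodes.
rng = random.Random(0)
edges = [(i, (i + 1) % 20) for i in range(20)]
sparse_edges = remove_random_edges(edges, 0.3, rng)
noisy_edges = add_random_edges(edges, 20, 5, rng)
```

Storing undirected edges as frozensets prevents the same pair from being inserted twice in opposite directions.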

5.3. Competing Methods

Table 2. Comparison between competing methods and our model under the random perturbations scenario

| Method | Cora Acc. | Cora Ent. | Citeseer Acc. | Citeseer Ent. | AmazonCoBuy Acc. | AmazonCoBuy Ent. |
|---|---|---|---|---|---|---|
| GNN-SVD | 50.42 | 93.02 | 31.66 | 95.20 | 70.12 | 93.42 |
| DropEdge | 67.86 | 95.28 | 46.68 | 96.34 | 63.41 | 96.26 |
| GRAND | 52.33 | 94.98 | 35.02 | 95.34 | 40.23 | 96.21 |
| ProGNN | 52.64 | 92.07 | 36.18 | 96.33 | 45.28 | 98.65 |
| GDC | 71.18 | 85.78 | 43.15 | 93.73 | 45.58 | 98.18 |
| TraTopo | 79.64 | 21.98 | 52.79 | 10.63 | 92.34 | 11.33 |

Each of the competing methods analyzed in this study has its own strengths in enhancing graph neural network performance. GNN-SVD (Entezari et al., 2020) leverages classical Singular Value Decomposition to refine the graph representation and improve node classification accuracy. DropEdge (Rong et al., 2019) reduces overfitting by randomly removing edges during training, which augments the data and moderates message propagation, improving the model’s generalization. GRAND (Feng et al., 2020) employs a random propagation strategy together with consistency regularization, which improves both the stability and the precision of predictions on graph data. ProGNN (Jin et al., 2020) learns from perturbed graphs to build robust Graph Neural Network models, improving resistance to interference and performance under adversarial attacks. Finally, GDC (Hasanzadeh et al., 2020) provides a unified framework for adaptive connection sampling that generalizes stochastic regularization methods, improving the network’s learning ability and predictive performance.

Under random perturbation, Table 2 shows that TraTopo achieves high accuracy and low uncertainty on the Cora, Citeseer, and AmazonCoBuy datasets, indicating robust and consistent handling of random disturbances across varied scenarios. DropEdge, which randomly removes edges during training, excels on larger graphs by reducing the likelihood of overfitting and smoothing the feature representations, thus enhancing generalization. However, its performance can be restricted on smaller datasets, where each edge is crucial for maintaining structural integrity and for feature learning. GDC is particularly effective on simply structured graphs, where it can accurately capture node interdependencies. Nevertheless, it tends to overfit on more complex or noisy datasets, which reduces performance stability as the model captures noise as features. GNN-SVD, which uses singular value decomposition to denoise the graph structure, suits datasets where the underlying graph structure is relatively clear and the main challenge is noise in the connectivity. However, it may not perform as well in scenarios that involve complex interactions or where the graph structure itself carries nuances critical to the learning task. Overall, TraTopo consistently outperforms these competitors across all three datasets, which supports its effectiveness in handling both structural nuances of the graph and stochastic perturbations. This makes it a reliable choice in settings where data integrity and robustness are paramount.

5.4. Baseline models and comparison result

Table 3. Examination of our model on top of GCN under three scenarios of topological perturbations across three datasets. The table compares the accuracy and entropy of the original model, the TraTopo model, and the LInDT model on different datasets under three topological perturbations.

| Scenario | Method | Cora Acc. | Cora Ent. | Citeseer Acc. | Citeseer Ent. | AmazonCoBuy Acc. | AmazonCoBuy Ent. |
|---|---|---|---|---|---|---|---|
| Random perturbation | original | 47.89% | 16.68% | 24.03% | 10.81% | 89.35% | 6.90% |
| | LInDT | 76.32% | 11.89% | 60.52% | 16.61% | 92.34% | 29.61% |
| | rwr | 78.42% | 10.75% | 63.52% | 55.64% | 92.34% | 14.58% |
| | pgr | 77.89% | 10.73% | 51.07% | 9.62% | 92.34% | 13.52% |
| | combine | 78.95% | 11.53% | 52.79% | 10.63% | 92.34% | 11.33% |
| Information sparse | original | 70.09% | 30.78% | 62.60% | 79.01% | 90.83% | 7.92% |
| | LInDT | 79.54% | 22.10% | 68.87% | 50.48% | 91.47% | 12.50% |
| | rwr | 79.54% | 20.76% | 68.87% | 43.07% | 91.50% | 10.98% |
| | pgr | 79.54% | 20.76% | 68.87% | 42.89% | 91.50% | 11.27% |
| | combine | 79.64% | 21.98% | 68.87% | 42.53% | 91.50% | 10.66% |
| Adversarial attacks | original | 61.11% | 10.38% | 29.73% | 10.88% | 82.35% | 12.72% |
| | LInDT | 77.22% | 6.83% | 68.02% | 19.66% | 85.71% | 13.39% |
| | rwr | 77.78% | 5.94% | 67.57% | 18.42% | 85.71% | 12.80% |
| | pgr | 77.78% | 5.94% | 67.57% | 18.55% | 85.71% | 12.76% |
| | combine | 77.78% | 5.87% | 68.92% | 14.83% | 85.71% | 12.60% |

Table 3 reports an evaluation of the LInDT model, Graph Convolutional Networks (GCN), and the TraTopo model in terms of accuracy and uncertainty, together with a comparison of link prediction algorithms. We assess the classification accuracy and the average normalized entropy of the affected nodes; the results indicate that the combined technique generally achieves high accuracy with low uncertainty. Notably, using the rwr or pgr algorithm alone proves superior in certain settings, owing to their different mechanisms. The rwr algorithm prioritizes proximity to the seed node and the structure of adjacent nodes, capturing local interactions and subtle structural nuances. Conversely, the pgr algorithm determines node importance from the link structure, emphasizing connectivity on a global scale and offering a macroscopic view of node interrelations. Combining the two augments the predictive capacity of the LInDT model and provides a mechanism for handling both local and global structure, improving performance beyond the original design.

Moreover, on the Cora graph the improvement becomes more pronounced when the graph is sparse. LInDT, which aims to improve the robustness of Graph Neural Networks (GNNs) under topological perturbations, has a key shortcoming on sparse graphs: the effectiveness of its topology-based sampler, which is designed to boost node classification accuracy, diminishes significantly on extremely sparse graphs where many links and node features are missing or highly sparsified.

In summary, Table 3 shows that our topological strategies, particularly when combined with these algorithms, improve the performance of the LInDT model and offer a substantial advantage over traditional methods.

5.5. Model Parameter Selection

To obtain the most effective parameters, we rank the candidate node list with Random Walk with Restart (RWR) and Personalized PageRank (PPR) and connect the top 10 nodes to the target node, while varying the degree threshold below which link prediction is applied.

Table 4. Analysis of Link Prediction Parameters

| Configuration | Acc. (%) | Ent. (%) |
|---|---|---|
| Degree < 3, Nei = 10 | 79.64 | 21.98 |
| Degree < 4, Nei = 10 | 79.64 | 22.01 |
| Degree < 5, Nei = 10 | 79.64 | 22.04 |
| Degree < 7, Nei = 10 | 79.64 | 22.07 |
| Original | 79.54 | 22.10 |

Table 4 shows that, for TraTopo evaluated on Cora, applying link prediction to nodes with degree less than three yields the lowest entropy. Compared with the original model, all four parameter settings improve both accuracy and uncertainty. However, accuracy remains unchanged as the degree threshold increases, indicating that distant non-neighbor nodes become irrelevant and that the benefit stabilizes at a distance of three. We therefore adopt the configuration Degree < 3, Nei = 10 as the most effective setting for the model.

5.6. Limitations and Future Directions

Our model’s ability to infer labels depends critically on a well-specified prior distribution, whose accuracy is vital for performance. Even a minor change to the prior, whether intentional or incidental, can alter the inference results. This sensitivity calls for continual tuning of the model. In future work, we plan to implement an adaptive learning strategy in which the model dynamically adjusts its prior based on newly gathered data, improving its adaptability to data fluctuations and the precision of its results. This strategy aims to yield a more robust model that sustains its accuracy as the data evolves.

6. Conclusion

This work aims to improve the robustness of graph neural network (GNN) models against topological perturbations. We introduce TraTopo, which combines Bayesian label inference, link prediction via random walks, and label propagation, together with a shortest-path technique for generating negative sample sets for nodes that significantly reduces the computational burden. Our experiments show that TraTopo is more resilient than competing methods to random disruptions, missing data, and adversarial attacks on three datasets, achieving the lowest entropy and the highest node classification accuracy among the methods compared.

Appendix A Implementation

A.0.1. Hardware and Software

We conduct experiments on a server with the following configuration: Python 3.8.18 and PyTorch (torch) 2.0.1+cu118 on Ubuntu 22.04.3, with an NVIDIA Corporation TU102 [GeForce RTX 2080 Ti] GPU.

Table 5. Hyper-parameters of DropEdge in this study
                  Cora     Citeseer   AMZcobuy
Hidden units      128      128        256
Dropout rate      0.8      0.8        0.5
Learning rate     0.01     0.009      0.01
Weight decay      0.005    0.001      0.01
Use BN            ×        ×          ✓

Table 6. Hyper-parameters of GRAND in this study
                           Cora     Citeseer   AMZcobuy
Propagation step           8        2          5
Data augmentation times    4        2          3
CR loss coefficient        1.0      0.7        0.9
Sharpening temperature     0.5      0.3        0.4
Learning rate              0.01     0.01       0.2
Early stopping patience    200      200        100
Input dropout              0.5      0.0        0.6
Hidden dropout             0.5      0.2        0.5
Use BN                     ×        ×          ✓

A.0.2. Hyper-parameters of Competing Methods

To ensure reproducibility, we report the hyper-parameters of the competing methods, all of which are trained with the Adam optimizer:

  • GNN-SVD (Entezari et al., 2020): Uses 15 singular values and 16 hidden units, with a dropout rate of 0.5 to reduce overfitting, and is trained for 300 epochs. On sparse graph datasets, it improves prediction accuracy by approximately 12% compared to baseline models.

  • DropEdge (Rong et al., 2019): Built on a GCN with a single base block layer, this model randomly drops edges to prevent over-smoothing during longer training cycles. It is trained for 300 epochs and improves graph classification performance by up to 15%. Detailed parameter settings are given in Table 5.

  • GRAND (Feng et al., 2020): Trained for 200 epochs with 32 hidden units, a node dropout rate of 0.5, and an L2 weight decay of $5\times 10^{-4}$. It improves node classification accuracy by 18% in dynamic graph analysis. Additional settings are given in Table 6.

  • ProGNN (Jin et al., 2020): Tunes the parameters $\alpha$, $\beta$, $\gamma$, and $\lambda$, and uses 16 hidden units with a dropout rate of 0.5, a learning rate of 0.01, and a weight decay of $5\times 10^{-4}$. Trained for 100 epochs, ProGNN improves structure learning on corrupted graphs, increasing robustness by 20%.

  • GDC (Hasanzadeh et al., 2020): Comprises two blocks and four layers with 32 hidden units and a dropout rate of 0.5; the learning rate and the weight decay are both $5\times 10^{-3}$. Trained for 400 epochs, GDC improves classification performance by 22% in noisy environments.

  • LInDT (Zhuang and Al Hasan, 2022c): Uses a two-layer GCN with 200 hidden units and ReLU activation, trained for 200 epochs at a learning rate of $1\times 10^{-3}$. LInDT detects and mitigates label noise, achieving a 25% increase in accuracy in challenging scenarios.
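For readers who want to reuse these settings, the values listed above can be collected in a small configuration mapping. The sketch below is only a convenience transcription: the dictionary name `COMPETING_METHODS`, the key names, and the helper `describe` are hypothetical and do not correspond to any released code. Only values stated in the list are included; per-dataset settings for DropEdge and GRAND remain in Tables 5 and 6.

```python
# Hyper-parameters transcribed from the list above (key names are illustrative).
COMPETING_METHODS = {
    "GNN-SVD": {"singular_values": 15, "hidden_units": 16,
                "dropout": 0.5, "epochs": 300},
    "DropEdge": {"base_block_layers": 1, "epochs": 300},   # see Table 5
    "GRAND": {"hidden_units": 32, "node_dropout": 0.5,
              "weight_decay": 5e-4, "epochs": 200},        # see Table 6
    "ProGNN": {"hidden_units": 16, "dropout": 0.5,
               "learning_rate": 0.01, "weight_decay": 5e-4, "epochs": 100},
    "GDC": {"blocks": 2, "layers": 4, "hidden_units": 32, "dropout": 0.5,
            "learning_rate": 5e-3, "weight_decay": 5e-3, "epochs": 400},
    "LInDT": {"gcn_layers": 2, "hidden_units": 200, "activation": "relu",
              "learning_rate": 1e-3, "epochs": 200},
}

# All competing methods are trained with Adam, as stated in the text.
OPTIMIZER = "adam"


def describe(method):
    """Return a one-line, sorted summary of a method's settings."""
    params = COMPETING_METHODS[method]
    body = ", ".join(f"{k}={v}" for k, v in sorted(params.items()))
    return f"{method} ({OPTIMIZER}): {body}"


for name in COMPETING_METHODS:
    print(describe(name))
```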

References

  • Abu-El-Haija et al. (2020) Sami Abu-El-Haija, Amol Kapoor, Bryan Perozzi, and Joonseok Lee. 2020. N-gcn: Multi-scale graph convolution for semi-supervised node classification. In uncertainty in artificial intelligence. PMLR, 841–851.
  • Bollacker et al. (1998) Kurt D Bollacker, Steve Lawrence, and C Lee Giles. 1998. CiteSeer: An autonomous web agent for automatic retrieval and identification of interesting publications. In Proceedings of the second international conference on Autonomous agents. 116–123.
  • Casella and George (1992) George Casella and Edward I George. 1992. Explaining the Gibbs sampler. The American Statistician 46, 3 (1992), 167–174.
  • Chen et al. (2022a) Jiayu Chen, Jingdi Chen, Tian Lan, and Vaneet Aggarwal. 2022a. Multi-agent covering option discovery based on kronecker product of factor graphs. IEEE Transactions on Artificial Intelligence (2022).
  • Chen et al. (2022b) Jiayu Chen, Jingdi Chen, Tian Lan, and Vaneet Aggarwal. 2022b. Multi-agent Covering Option Discovery through Kronecker Product of Factor Graphs. In AAMAS. 1572–1574.
  • Chen et al. (2022c) Jiayu Chen, Jingdi Chen, Tian Lan, and Vaneet Aggarwal. 2022c. Scalable multi-agent covering option discovery based on kronecker graphs. Advances in Neural Information Processing Systems 35 (2022), 30406–30418.
  • Chen et al. (2023a) Jiayu Chen, Jingdi Chen, Tian Lan, and Vaneet Aggarwal. 2023a. Learning Multiagent Options for Tabular Reinforcement Learning using Factor Graphs. IEEE Transactions on Artificial Intelligence 4, 5 (Oct. 2023), 1141–1153. https://doi.org/10.1109/tai.2022.3195818
  • Chen et al. (2023b) Shuyi Chen, Kaize Ding, and Shixiang Zhu. 2023b. Uncertainty-Aware Robust Learning on Noisy Graphs. arXiv preprint arXiv:2306.08210 (2023).
  • Cordasco and Gargano (2012) Gennaro Cordasco and Luisa Gargano. 2012. Label propagation algorithm: a semi-synchronous approach. International Journal of Social Network Mining 1, 1 (2012), 3–26.
  • Das et al. (2021) Rangan Das, Bikram Boote, Saumik Bhattacharya, and Ujjwal Maulik. 2021. Multipath graph convolutional neural networks. arXiv preprint arXiv:2105.01510 (2021).
  • Ding et al. (2024) Kaize Ding, Elnaz Nouri, Guoqing Zheng, Huan Liu, and Ryen White. 2024. Toward robust graph semi-supervised learning against extreme data scarcity. IEEE Transactions on Neural Networks and Learning Systems (2024).
  • Entezari et al. (2020) Negin Entezari, Saba A Al-Sayouri, Amirali Darvishzadeh, and Evangelos E Papalexakis. 2020. All you need is low (rank) defending against adversarial attacks on graphs. In Proceedings of the 13th international conference on web search and data mining. 169–177.
  • Fan et al. (2021) Feng-Lei Fan, Dayang Wang, Hengtao Guo, Qikui Zhu, Pingkun Yan, Ge Wang, and Hengyong Yu. 2021. On a sparse shortcut topology of artificial neural networks. IEEE Transactions on Artificial Intelligence 3, 4 (2021), 595–608.
  • Feng et al. (2020) Wenzheng Feng, Jie Zhang, Yuxiao Dong, Yu Han, Huanbo Luan, Qian Xu, Qiang Yang, Evgeny Kharlamov, and Jie Tang. 2020. Graph random neural networks for semi-supervised learning on graphs. Advances in neural information processing systems 33 (2020), 22092–22103.
  • Fiorellino et al. (2024) Simone Fiorellino, Claudio Battiloro, and Paolo Di Lorenzo. 2024. Topological Neural Networks over the Air. In ICASSP 2024-2024 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). IEEE, 12986–12990.
  • Gasteiger et al. (2018) Johannes Gasteiger, Aleksandar Bojchevski, and Stephan Günnemann. 2018. Predict then propagate: Graph neural networks meet personalized pagerank. arXiv preprint arXiv:1810.05997 (2018).
  • Hahn-Klimroth et al. (2020) Max Hahn-Klimroth, Giulia S Maesaka, Yannick Mogge, Samuel Mohr, and Olaf Parczyk. 2020. Random perturbation of sparse graphs. arXiv preprint arXiv:2004.04672 (2020).
  • Hamilton et al. (2017) Will Hamilton, Zhitao Ying, and Jure Leskovec. 2017. Inductive representation learning on large graphs. Advances in neural information processing systems 30 (2017).
  • Hasanzadeh et al. (2020) Arman Hasanzadeh, Ehsan Hajiramezanali, Shahin Boluki, Mingyuan Zhou, Nick Duffield, Krishna Narayanan, and Xiaoning Qian. 2020. Bayesian graph neural networks with adaptive connection sampling. In International conference on machine learning. PMLR, 4094–4104.
  • Herrmann (2010) Felix J Herrmann. 2010. Randomized sampling and sparsity: Getting more information from fewer samples. Geophysics 75, 6 (2010), WB173–WB187.
  • Hou et al. (2023) Zhichao Hou, Xitong Zhang, Wei Wang, Charu C Aggarwal, and Xiaorui Liu. 2023. Can Directed Graph Neural Networks be Adversarially Robust? arXiv preprint arXiv:2306.02002 (2023).
  • Jin et al. (2020) Wei Jin, Yao Ma, Xiaorui Liu, Xianfeng Tang, Suhang Wang, and Jiliang Tang. 2020. Graph structure learning for robust graph neural networks. In Proceedings of the 26th ACM SIGKDD international conference on knowledge discovery & data mining. 66–74.
  • Khalid et al. (2024) Maryam Khalid, Elizabeth B Klerman, Andrew W McHill, Andrew JK Phillips, and Akane Sano. 2024. SleepNet: Attention-Enhanced Robust Sleep Prediction using Dynamic Social Networks. Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies 8, 1 (2024), 1–34.
  • Li et al. (2015) Rong-Hua Li, Jeffrey Xu Yu, Lu Qin, Rui Mao, and Tan Jin. 2015. On random walk based graph sampling. In 2015 IEEE 31st international conference on data engineering. IEEE, 927–938.
  • Liu et al. (2024a) Ao Liu, Wenshan Li, Tao Li, Beibei Li, Hanyuan Huang, and Pan Zhou. 2024a. Towards Inductive Robustness: Distilling and Fostering Wave-Induced Resonance in Transductive GCNs against Graph Adversarial Attacks. In Proceedings of the AAAI Conference on Artificial Intelligence, Vol. 38. 13855–13863.
  • Liu et al. (2021a) Lihui Liu, Boxin Du, Heng Ji, ChengXiang Zhai, and Hanghang Tong. 2021a. Neural-answering logical queries on knowledge graphs. In Proceedings of the 27th ACM SIGKDD conference on knowledge discovery & data mining. 1087–1097.
  • Liu et al. (2022) Lihui Liu, Boxin Du, Jiejun Xu, Yinglong Xia, and Hanghang Tong. 2022. Joint knowledge graph completion and question answering. In Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining. 1098–1108.
  • Liu et al. (2021b) Shiwei Liu, Tim Van der Lee, Anil Yaman, Zahra Atashgahi, Davide Ferraro, Ghada Sokar, Mykola Pechenizkiy, and Decebal Constantin Mocanu. 2021b. Topological insights into sparse neural networks. In Machine Learning and Knowledge Discovery in Databases: European Conference, ECML PKDD 2020, Ghent, Belgium, September 14–18, 2020, Proceedings, Part III. Springer, 279–294.
  • Liu et al. (2024b) Xin Liu, Yuxiang Zhang, Meng Wu, Mingyu Yan, Kun He, Wei Yan, Shirui Pan, Xiaochun Ye, and Dongrui Fan. 2024b. Revisiting Edge Perturbation for Graph Neural Network in Graph Data Augmentation and Attack. arXiv preprint arXiv:2403.07943 (2024).
  • Liu et al. (2023) Yang Liu, Hao Cheng, and Kun Zhang. 2023. Identifiability of label noise transition matrix. In International Conference on Machine Learning. PMLR, 21475–21496.
  • Madry et al. (2017) Aleksander Madry, Aleksandar Makelov, Ludwig Schmidt, Dimitris Tsipras, and Adrian Vladu. 2017. Towards deep learning models resistant to adversarial attacks. arXiv preprint arXiv:1706.06083 (2017).
  • Marcotty et al. (1976) Michael Marcotty, Henry Ledgard, and Gregor V Bochmann. 1976. A sampler of formal definitions. ACM Computing Surveys (CSUR) 8, 2 (1976), 191–276.
  • Niegowski and Eshaghi (2007) Damian Niegowski and S Eshaghi. 2007. The CorA family: structure and function revisited. Cellular and molecular life sciences 64 (2007), 2564–2574.
  • Rong et al. (2019) Yu Rong, Wenbing Huang, Tingyang Xu, and Junzhou Huang. 2019. Dropedge: Towards deep graph convolutional networks on node classification. arXiv preprint arXiv:1907.10903 (2019).
  • Testa et al. (2024) Lucia Testa, Claudio Battiloro, Stefania Sardellitti, and Sergio Barbarossa. 2024. Stability of Graph Convolutional Neural Networks through the lens of small perturbation analysis. In ICASSP 2024-2024 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). IEEE, 6865–6869.
  • Tian et al. (2023) Yijun Tian, Kaiwen Dong, Chunhui Zhang, Chuxu Zhang, and Nitesh V Chawla. 2023. Heterogeneous graph masked autoencoders. In Proceedings of the AAAI Conference on Artificial Intelligence, Vol. 37. 9997–10005.
  • Tian et al. (2024) Yijun Tian, Huan Song, Zichen Wang, Haozhu Wang, Ziqing Hu, Fang Wang, Nitesh V Chawla, and Panpan Xu. 2024. Graph neural prompting with large language models. In Proceedings of the AAAI Conference on Artificial Intelligence, Vol. 38. 19080–19088.
  • Tian et al. (2022a) Yijun Tian, Chuxu Zhang, Zhichun Guo, Chao Huang, Ronald Metoyer, and Nitesh V Chawla. 2022a. RecipeRec: A heterogeneous graph learning model for recipe recommendation. arXiv preprint arXiv:2205.14005 (2022).
  • Tian et al. (2022b) Yijun Tian, Chuxu Zhang, Zhichun Guo, Xiangliang Zhang, and Nitesh Chawla. 2022b. Learning MLPs on graphs: A unified view of effectiveness, robustness, and efficiency. In The Eleventh International Conference on Learning Representations.
  • Tian et al. (2022c) Yijun Tian, Chuxu Zhang, Zhichun Guo, Xiangliang Zhang, and Nitesh V Chawla. 2022c. Nosmog: Learning noise-robust and structure-aware mlps on graphs. arXiv preprint arXiv:2208.10010 (2022).
  • Tong et al. (2006) Hanghang Tong, Christos Faloutsos, and Jia-Yu Pan. 2006. Fast random walk with restart and its applications. In Sixth international conference on data mining (ICDM’06). IEEE, 613–622.
  • Veličković et al. (2017) Petar Veličković, Guillem Cucurull, Arantxa Casanova, Adriana Romero, Pietro Lio, and Yoshua Bengio. 2017. Graph attention networks. arXiv preprint arXiv:1710.10903 (2017).
  • Wang et al. (2021) Binghui Wang, Jinyuan Jia, Xiaoyu Cao, and Neil Zhenqiang Gong. 2021. Certified robustness of graph neural networks against adversarial structural perturbation. In Proceedings of the 27th ACM SIGKDD Conference on Knowledge Discovery & Data Mining. 1645–1653.
  • Wang et al. (2023) Yexin Wang, Zhi Yang, Junqi Liu, Wentao Zhang, and Bin Cui. 2023. Scapin: Scalable Graph Structure Perturbation by Augmented Influence Maximization. Proceedings of the ACM on Management of Data 1, 2 (2023), 1–21.
  • Wen et al. (2024) Liangliang Wen, Jiye Liang, Kaixuan Yao, and Zhiqiang Wang. 2024. Black-Box Adversarial Attack on Graph Neural Networks With Node Voting Mechanism. IEEE Transactions on Knowledge and Data Engineering (2024).
  • Wu et al. (2022) Shiwen Wu, Fei Sun, Wentao Zhang, Xu Xie, and Bin Cui. 2022. Graph neural networks in recommender systems: a survey. Comput. Surveys 55, 5 (2022), 1–37.
  • Wu et al. (2023) Yihan Wu, Aleksandar Bojchevski, and Heng Huang. 2023. Adversarial weight perturbation improves generalization in graph neural networks. In Proceedings of the AAAI Conference on Artificial Intelligence, Vol. 37. 10417–10425.
  • Xia et al. (2019) Feng Xia, Jiaying Liu, Hansong Nie, Yonghao Fu, Liangtian Wan, and Xiangjie Kong. 2019. Random walks: A review of algorithms and applications. IEEE Transactions on Emerging Topics in Computational Intelligence 4, 2 (2019), 95–107.
  • Xia et al. (2023) Jun Xia, Haitao Lin, Yongjie Xu, Cheng Tan, Lirong Wu, Siyuan Li, and Stan Z Li. 2023. Gnn cleaner: Label cleaner for graph structured data. IEEE Transactions on Knowledge and Data Engineering (2023).
  • Xie and Szymanski (2013) Jierui Xie and Boleslaw K Szymanski. 2013. Labelrank: A stabilized label propagation algorithm for community detection in networks. In 2013 IEEE 2nd Network Science Workshop (NSW). IEEE, 138–143.
  • Xing and Ghorbani (2004) Wenpu Xing and Ali Ghorbani. 2004. Weighted pagerank algorithm. In Proceedings. Second Annual Conference on Communication Networks and Services Research, 2004. IEEE, 305–314.
  • Xu et al. (2024) Xilie Xu, Jingfeng Zhang, Feng Liu, Masashi Sugiyama, and Mohan S Kankanhalli. 2024. Enhancing adversarial contrastive learning via adversarial invariant regularization. Advances in Neural Information Processing Systems 36 (2024).
  • Yang et al. (2022) Shuo Yang, Erkun Yang, Bo Han, Yang Liu, Min Xu, Gang Niu, and Tongliang Liu. 2022. Estimating instance-dependent bayes-label transition matrix using a deep neural network. In International Conference on Machine Learning. PMLR, 25302–25312.
  • Yao et al. (2019) Liang Yao, Chengsheng Mao, and Yuan Luo. 2019. Graph convolutional networks for text classification. In Proceedings of the AAAI conference on artificial intelligence, Vol. 33. 7370–7377.
  • Yuan et al. (2023) Jinliang Yuan, Hualei Yu, Meng Cao, Jianqing Song, Junyuan Xie, and Chongjun Wang. 2023. Self-supervised robust Graph Neural Networks against noisy graphs and noisy labels. Applied Intelligence 53, 21 (2023), 25154–25170.
  • Zhai et al. (2023) Runtian Zhai, Bingbin Liu, Andrej Risteski, Zico Kolter, and Pradeep Ravikumar. 2023. Understanding Augmentation-based Self-Supervised Representation Learning via RKHS Approximation. arXiv preprint arXiv:2306.00788 (2023).
  • Zhang et al. (2023) Jingfeng Zhang, Bo Song, Bo Han, Lei Liu, Gang Niu, and Masashi Sugiyama. 2023. Assessing Vulnerabilities of Adversarial Learning Algorithm through Poisoning Attacks. arXiv preprint arXiv:2305.00399 (2023).
  • Zhang et al. (2024) Jingfeng Zhang, Bo Song, Haohan Wang, Bo Han, Tongliang Liu, Lei Liu, and Masashi Sugiyama. 2024. BadLabel: A Robust Perspective on Evaluating and Enhancing Label-Noise Learning. IEEE Transactions on Pattern Analysis and Machine Intelligence (2024).
  • Zhang et al. (2021) Jingfeng Zhang, Xilie Xu, Bo Han, Tongliang Liu, Gang Niu, Lizhen Cui, and Masashi Sugiyama. 2021. NoiLin: Improving adversarial training and correcting stereotype of noisy labels. arXiv preprint arXiv:2105.14676 (2021).
  • Zhang et al. (2020) Xiaozhu Zhang, Dirk Witthaut, Marc Timme, et al. 2020. Topological determinants of perturbation spreading in networks. Physical Review Letters 125, 21 (2020), 218301.
  • Zhao et al. (2024) Kai Zhao, Qiyu Kang, Yang Song, Rui She, Sijie Wang, and Wee Peng Tay. 2024. Adversarial robustness in graph neural networks: A Hamiltonian approach. Advances in Neural Information Processing Systems 36 (2024).
  • Zhuang (2024) Jun Zhuang. 2024. Robust Data-centric Graph Structure Learning for Text Classification. In Companion Proceedings of the ACM on Web Conference 2024. 1486–1495.
  • Zhuang and Al Hasan (2022a) Jun Zhuang and Mohammad Al Hasan. 2022a. Defending graph convolutional networks against dynamic graph perturbations via bayesian self-supervision. In Proceedings of the AAAI Conference on Artificial Intelligence, Vol. 36. 4405–4413.
  • Zhuang and Al Hasan (2022b) Jun Zhuang and Mohammad Al Hasan. 2022b. Deperturbation of online social networks via bayesian label transition. In Proceedings of the 2022 SIAM International Conference on Data Mining (SDM). SIAM, 603–611.
  • Zhuang and Al Hasan (2022c) Jun Zhuang and Mohammad Al Hasan. 2022c. Robust node classification on graphs: Jointly from bayesian label transition and topology-based label propagation. In Proceedings of the 31st ACM International Conference on Information & Knowledge Management. 2795–2805.
  • Zhuang and Hasan (2022) Jun Zhuang and Mohammad Al Hasan. 2022. How does bayesian noisy self-supervision defend graph convolutional networks? Neural Processing Letters 54, 4 (2022), 2997–3018.
  • Zhuang and Hasan (2023) Jun Zhuang and Mohammad Al Hasan. 2023. Robust Node Representation Learning via Graph Variational Diffusion Networks. arXiv preprint arXiv:2312.10903 (2023).
  • Zhuang and Kennington (2024) Jun Zhuang and Casey Kennington. 2024. Understanding Survey Paper Taxonomy about Large Language Models via Graph Representation Learning. arXiv preprint arXiv:2402.10409 (2024).