Enhancing the Resilience of Graph Neural Networks to Topological Perturbations in Sparse Graphs

Shuqi He¹, Jun Zhuang², Ding Wang¹, Luyao Peng¹, Jun Song¹
¹ China University of Geosciences (Wuhan), Wuhan, China
² Indiana University-Purdue University Indianapolis, Indianapolis, USA
{1202211192, wangding, pengluyao, songjun}@cug.edu.cn, [email protected]

Abstract.

Graph neural networks (GNNs) have been extensively employed in node classification. Nevertheless, recent studies indicate that GNNs are vulnerable to topological perturbations, such as adversarial attacks and edge disruptions. Considerable efforts have been devoted to mitigating these challenges. For example, pioneering Bayesian methodologies, including GraphSS and LInDT, incorporate Bayesian label transitions and topology-based label sampling to strengthen the robustness of GNNs. However, GraphSS is hindered by slow convergence, while LInDT faces challenges in sparse graphs. To overcome these limitations, we propose a novel label inference framework, TraTopo, which combines topology-driven label propagation, Bayesian label transitions, and link analysis via random walks. TraTopo significantly surpasses its predecessors on sparse graphs by utilizing random walk sampling, specifically targeting isolated nodes for link prediction, thus enhancing its effectiveness in topological sampling contexts. Additionally, TraTopo employs a shortest-path strategy to refine link prediction, thereby reducing predictive overhead and improving label inference accuracy. Empirical evaluations highlight TraTopo’s superiority in node classification, significantly exceeding contemporary GCN models in accuracy.

CNN, Bayesian label transition, Random walk, PageRank

1. Introduction

Graph structures, such as attributed graphs (Tian et al., 2023; Zhuang, 2024), knowledge graphs (Liu et al., 2022, 2021a), and factor graphs (Chen et al., 2022a, b, 2023a), play a crucial role across various domains, representing the topological relationships and attribute information between nodes. Node classification is a fundamental task in graph structure learning. In this task, we aim to assign the nodes to the corresponding class.

In recent years, Graph Neural Networks (GNNs) have been widely applied in node classification due to their superior performance on graph representation (Tian et al., 2022b, 2024; Chen et al., 2022c; Wu et al., 2022; Tian et al., 2022a; Hamilton et al., 2017; Veličković et al., 2017; Zhuang and Kennington, 2024). However, recent studies reveal that GNNs may be vulnerable to topological perturbations, which can severely compromise the effectiveness of GNN-based node classification (Liu et al., 2021b; Ding et al., 2024; Zhang et al., 2020). Thus, it is crucial to improve the robustness of GNNs against topological perturbations, such as random perturbations (Zhuang and Al Hasan, 2022b; Hahn-Klimroth et al., 2020) and graph sparsification (Fan et al., 2021; Zhuang and Hasan, 2023).

Numerous studies, such as Bayesian Label Transition (Yang et al., 2022) and label propagation (Cordasco and Gargano, 2012), have been explored to improve the robustness of GNNs. These approaches adeptly utilize supervised data to enhance robustness, yet the effectiveness is circumscribed by the inherent characteristics of local graph structures, which may inhibit the propagation process for unlabeled nodes. GraphSS (Zhuang and Al Hasan, 2022a) endeavors to counteract suboptimal classification outcomes stemming from topological perturbations by refining Graph Neural Network (GNN) predictions through post-processing. This strategy incorporates a Bayesian inference framework to devise a label transition matrix, thereby substituting misjudged labels with more accurate alternatives to ameliorate classification discrepancies. Nonetheless, the adaptation of this technique is hampered by its protracted convergence rate. A novel initiative, LInDT (Zhuang and Al Hasan, 2022c), addresses the challenge of delayed convergence by introducing an innovative label sampling technique, thereby enhancing the method’s scalability across expansive graph structures. Despite these advancements, LInDT’s dependence on the underlying graph topology renders it less effective on sparsely connected graphs, where limited connectivity can severely diminish the success of label propagation.

To address the aforementioned challenges, we introduce a novel mechanism, namely TraTopo, which integrates Random Walk with Restart and PageRank algorithms to augment the robustness of topology-based propagation methodologies. This model is integrated within a Bayesian label transition framework, thus strengthening the resilience of GNNs in node classification tasks. More precisely, TraTopo outperforms its predecessor, LInDT, by employing label propagation to achieve enhanced convergence in scenarios of uncertain Bayesian label sampling. It leverages random walk-based algorithms to navigate the constraints presented by low-degree nodes, while also reducing computational burdens. The mechanism we propose not only enriches node information but also refines label inference, and it performs well across graph datasets under perturbation. In the experiments, we evaluate the performance of TraTopo and comparative models in terms of accuracy and entropy under a range of topological perturbations across three public datasets. We also analyze the sensitivity of various hyper-parameters in TraTopo. Our systematic validation seeks to enhance the robustness and the predictive capabilities of TraTopo in dynamic and diverse structural graph data. Overall, our main contributions are summarized as follows:

• We propose a new mechanism for node label inference by integrating Bayesian methods with topology-based enhancements, incorporating Random Walk with Restart and PageRank to boost link prediction accuracy.

• We employ shortest-path-based strategies to streamline random walks, reducing computational overhead and enhancing predictive performance with minimal resource consumption.

• Extensive experiments demonstrate that our method can outperform leading competing models across benchmark graph datasets, validating the effectiveness in dynamic network environments.

2. RELATED WORK

Node classification is crucial in analyzing graph-structured data for social networks, bioinformatics, and recommendation systems. Advances in this field include Graph Neural Networks (GNNs), adversarial robustness, noisy label management, and algorithms like random walk and PageRank.

2.1. Graph Neural Networks

Graph Neural Networks (GNNs) are essential for analyzing graph-structured data, aiding in areas such as social network analysis, bioinformatics, and recommendation systems. A major challenge is maintaining GNN robustness against accidental or adversarial topology perturbations.

Recent studies have explored black-box adversarial attacks on Graph Neural Networks (GNNs), employing a node voting strategy to identify vulnerable nodes (Wen et al., 2024). Fiorellino et al. (Fiorellino et al., 2024) have developed an advanced GNN variant designed to enhance resilience against channel perturbations. Furthermore, Khalid et al. (Khalid et al., 2024) introduced SleepNet, an innovative sleep prediction model that incorporates attention mechanisms and utilizes dynamic social networks.

2.2. Adversarial Robustness

With the rise of Graph Neural Networks (GNNs), their susceptibility to adversarial tactics has captured academic focus (Zhang et al., 2023; Xu et al., 2024). Research prioritizes bolstering network security through tailored attacks and enhanced defenses. Notably, even minimal, strategic perturbations substantially reduce the efficacy of GNNs, challenging their precision and interpretability.

Zhao et al. (Zhao et al., 2024) employed a Hamiltonian method to enhance GNN resilience against topological disturbances, elevating stability across GNN architectures. Wu et al. (Wu et al., 2023) improved GCN robustness and generalization via weight perturbations, noting that optimizing robust loss directly enhances defenses. Liu et al. (Liu et al., 2024a) introduced wave-induced resonance to boost GNN robustness. Testa et al. (Testa et al., 2024) analyzed GNN stability via slight perturbations. Liu et al. (Liu et al., 2024b) examined the impact of edge perturbations on GNN robustness and vulnerability.

2.3. Noisy Labels

Learning with noisy labels substantially alters training dynamics, potentially reducing model performance (Zhang et al., 2021; Chen et al., 2023b; Tian et al., 2022c). In node classification, structural dependencies in graphs exacerbate inaccuracies, facilitating the spread of incorrect labels through connecting edges.

Zhang et al. (Zhang et al., 2024) devised an advanced LNL algorithm to effectively address noisy labels. Xia et al. (Xia et al., 2023) introduced a GNN-based Cleaner, enhancing robustness against noisy labels in attributed graphs. Self-supervised methods have become pivotal in graph representation learning (Zhai et al., 2023). Zhuang et al. (Zhuang and Hasan, 2022) pioneered the concept of treating noisy labels as intrinsic data properties. Yuan et al. (Yuan et al., 2023) developed a self-supervised framework designed to mitigate the impact of noisy graphs and labels.

2.4. Random Walk and PageRank

Graph Convolutional Networks (GCNs) tackle structural disruptions using advanced random walk and PageRank, enhancing resilience and efficiency across various graph-learning contexts.

Utilizing APPNP’s (Gasteiger et al., 2018) Personalized PageRank and N-GCN’s (Abu-El-Haija et al., 2020) stochastic walks bolsters GCN resilience, streamlining topological coherence and nodal comprehension. Wang et al. (Wang et al., 2023) advocate robustness assessments through graph perturbations, underscoring diffusion and influence maximization’s defensive prowess. Hou et al. (Hou et al., 2023) probe directed graph resilience via BBRW, spotlighting the fortifying influence of targeted pathways, and advancing graph topology understanding.

3. PRELIMINARIES

In this section, we introduce the preliminary background about GNNs and random walks.

3.1. GNN-based Node Classification

In this investigation, we employ Graph Convolutional Networks (GCNs) (Yao et al., 2019) as the foundational node classifier $f_{\theta}$ , constructing an undirected, attributed graph $G=(V,E)$ composed of $N$ vertices and corresponding edges. The structure is defined by a symmetric adjacency matrix $A$ and a feature matrix $X$ , formally expressed as $A\in\mathbb{R}^{N\times N}$ and $X\in\mathbb{R}^{N\times d}$ , respectively.

Graph Convolutional Networks (GCNs) have gained prominence for their capability to perform convolution operations on graph-structured data. The fundamental operation of a GCN can be described by the layer-wise propagation rule:

(1)

H^{(l+1)}=\sigma\left(\tilde{D}^{-1/2}\tilde{A}\tilde{D}^{-1/2}H^{(l)}W^{(l)}\right)

where $\tilde{A}=A+I$ is the adjacency matrix $A$ of the graph with added self-loops, $\tilde{D}$ is the corresponding degree matrix, $H^{(l)}$ denotes the matrix of activations in the $l$ -th layer (with $H^{(0)}=X$ ), $W^{(l)}$ is the matrix of trainable weights in the $l$ -th layer, and $\sigma$ is a nonlinear activation function. This formula captures the essence of GCNs in aggregating features from a node’s local neighborhood, thereby enabling the model to learn powerful representations from graph-structured inputs.
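The propagation rule in Eq. (1) can be illustrated with a minimal sketch. It uses plain Python lists rather than a tensor library, takes ReLU as the nonlinearity $\sigma$, and assumes a dense 0/1 adjacency matrix; the helper names (`matmul`, `normalized_adjacency`, `gcn_layer`) and the toy matrices are illustrative assumptions, not part of the original implementation.

```python
import math

def matmul(a, b):
    """Multiply two dense matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def normalized_adjacency(adj):
    """Return D~^{-1/2} (A + I) D~^{-1/2} for a dense 0/1 adjacency matrix."""
    n = len(adj)
    a_tilde = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)] for i in range(n)]
    deg = [sum(row) for row in a_tilde]  # self-loops guarantee every degree is >= 1
    return [[a_tilde[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)] for i in range(n)]

def gcn_layer(adj, h, w):
    """One propagation step of Eq. (1), with ReLU standing in for sigma."""
    z = matmul(matmul(normalized_adjacency(adj), h), w)
    return [[max(0.0, v) for v in row] for row in z]

# Toy example (hypothetical values): a 3-node path graph, 2 input features, 2 hidden units.
A = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
X = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
W = [[0.5, -0.5], [0.25, 0.75]]
H1 = gcn_layer(A, X, W)  # activations of the first layer, one row per node
```

Stacking two such layers, with a softmax in place of the final ReLU, would give the kind of two-layer GCN classifier used as $f_{\theta}$; a practical implementation would use sparse matrices instead of dense lists.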

Utilizing $A$ and $X$ for the task of node classification, we integrate noisy labels as a sophisticated regularization mechanism aimed at bolstering the model’s resilience against data imbued with noise. Specifically, a subset of nodes, denoted as $\mathcal{U}$ and comprising 10% of the graph’s total, is assigned noisy labels $\mathbf{Y}$ . These labels are an amalgamation of manually-annotated labels $\mathbf{Y}_{m}$ and auto-generated labels $\mathbf{Y}_{a}$ . This methodology corroborates that an elevation in noise levels can substantially augment the efficacy of the regularization process. Furthermore, our scholarly objective is to meticulously align the inferred labels $\hat{\mathbf{Z}}$ as closely as possible with the latent labels $\mathbf{Z}$ , thus ensuring robust node classification within noisy environments. This approach not only demonstrates the feasibility of effectively leveraging graph-structured data in complex labeling landscapes but also delves into how advanced regularization techniques can significantly enhance the model’s ability to adapt to noise and improve its overall performance.

3.2. Random walk algorithm

The Random Walk algorithm is a stochastic graph traversal method that simulates the process of moving randomly within a graph. This algorithm finds extensive applications in graph data, including network analysis, link analysis, and ranking of graph nodes.

3.2.1. Random walk with restart

The Random Walk with Restart (RWR) algorithm (Xia et al., 2019; Li et al., 2015; Tong et al., 2006) refines personalized exploration within network analysis, optimizing the evaluation of node importance and the disclosure of subgraphs. The algorithm operates on a probabilistic mechanism: at each step, the walker returns to the starting node with probability $\alpha$ or advances to a neighbor with probability $1-\alpha$ . The transition matrix $P$ underlies the RWR update:

(2)

RWR(v,t+1)=\alpha\cdot RWR(v,t)+(1-\alpha)\sum_{u\in\mathcal{N}(v)}\frac{RWR(u,t)}{\text{deg}(u)}

Guided by $\alpha$ , the RWR algorithm performs a stochastic traversal of the graph.
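As a hedged illustration, the sketch below iterates the standard form of RWR, in which the restart term places mass $\alpha$ on the seed node, following the textual description above; this reading of the restart term in Eq. (2) is an assumption. The function name, the adjacency-dictionary input, and the default values (which mirror the restart value, tolerance, and iteration cap reported in Section 5.1.2) are illustrative choices.

```python
def random_walk_with_restart(neighbors, seed, alpha=0.15, tol=1e-6, max_iter=100):
    """Iterate r <- alpha * e_seed + (1 - alpha) * P r on an undirected graph.

    neighbors: dict mapping each node to a list of adjacent nodes.
    Returns a dict of visiting scores relative to `seed`.
    """
    scores = {v: 0.0 for v in neighbors}
    scores[seed] = 1.0
    for _ in range(max_iter):
        new_scores = {}
        for v in neighbors:
            # Mass arriving at v from each neighbor u, split evenly over u's edges.
            inflow = sum(scores[u] / len(neighbors[u]) for u in neighbors[v])
            restart = alpha if v == seed else 0.0
            new_scores[v] = restart + (1.0 - alpha) * inflow
        delta = sum(abs(new_scores[v] - scores[v]) for v in neighbors)
        scores = new_scores
        if delta < tol:
            break
    return scores

# Toy example (hypothetical graph): a path a - b - c, scored from seed "a".
toy_graph = {"a": ["b"], "b": ["a", "c"], "c": ["b"]}
toy_scores = random_walk_with_restart(toy_graph, "a")
```

Nodes closer to the seed receive higher scores, which is the property exploited when ranking candidate nodes for link prediction.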

3.2.2. PageRank

PageRank (Xing and Ghorbani, 2004), used by Google, ranks web pages based on link importance. It calculates the rank $PR(A)$ using:

(3)

PR(A)=(1-d)+d\left(\sum_{i=1}^{n}\frac{PR(T_{i})}{C(T_{i})}\right)

where $d$ (typically 0.85) is the damping factor, $T_{i}$ are the pages linking to $A$ , $PR(T_{i})$ is their PageRank, and $C(T_{i})$ is their number of outbound links. This iterative method ranks pages by link structure.
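The iteration behind Eq. (3) can be sketched as follows. The function name, the dictionary-based graph representation, the stopping rule, and the handling of pages without outbound links are assumptions made for illustration only.

```python
def pagerank(out_links, d=0.85, tol=1e-6, max_iter=100):
    """Iterate Eq. (3): PR(A) = (1 - d) + d * sum(PR(T_i) / C(T_i)).

    out_links: dict mapping each page to the list of pages it links to.
    Every linked page is assumed to appear as a key of `out_links`.
    """
    ranks = {page: 1.0 for page in out_links}
    for _ in range(max_iter):
        new_ranks = {page: 1.0 - d for page in out_links}
        for source, targets in out_links.items():
            if not targets:
                continue  # assumption: pages without outbound links pass on no rank
            share = d * ranks[source] / len(targets)
            for target in targets:
                new_ranks[target] += share
        delta = max(abs(new_ranks[p] - ranks[p]) for p in out_links)
        ranks = new_ranks
        if delta < tol:
            break
    return ranks

# Toy example (hypothetical graph): pages a and b both link to c, and c links back to a.
toy_ranks = pagerank({"a": ["c"], "b": ["c"], "c": ["a"]})
```

In this toy graph, page `c` collects rank from two sources and ends up ranked above `a`, which in turn ranks above `b`, a page with no inbound links.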

4. METHOD

This section presents TraTopo, combining Bayesian label propagation with ensemble learning to improve link prediction and reduce errors. It employs a shortest-path algorithm to identify new nodes, update candidates, and lower computational demands.

4.1. Bayesian Label Transition with Asymmetric Dirichlet Distributions

Using Bayesian theory, the Bayesian Label Propagation algorithm estimates nodes’ label probability distributions (Yang et al., 2022; Liu et al., 2023; Xie and Szymanski, 2013; Zhuang and Al Hasan, 2022c). It calculates likelihoods from neighboring labels, represents initial distributions with prior probabilities, and iteratively refines these distributions to enhance label propagation.

The algorithm initializes by establishing an initial label probability distribution per node, subsequently refined through iterative updates informed by adjacent nodes and propagation protocols. Bayesian adjustments recalibrate the probabilities of nodes with known labels. This iterative refinement proceeds until stabilization or a designated iteration threshold is met. The final label distribution for a node $v$ is represented by $L_{v}$ .

(4)

P(L_{v}=l\mid\text{Neib}(v),Y)\propto\sum_{u\in\text{Neib}(v)}P(L_{u}=l\mid Y)

where $\text{Neib}(v)$ denotes the nodes adjacent to $v$ and $Y$ denotes the observed (known) labels. $P(L_{u}=l\mid Y)$ is the probability that node $u$ has label $l$ given the observed labels. Each node’s label distribution is updated by Bayesian inference, and label propagation iterates these updates until convergence.
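A minimal sketch of the update in Eq. (4) is given below. It assumes that nodes with observed labels are clamped to a one-hot distribution, that updates are synchronous, and that a fixed number of sweeps stands in for a convergence test; the function name and data layout are hypothetical.

```python
def propagate_labels(neighbors, label_probs, observed, num_iters=10):
    """Repeatedly apply Eq. (4): P(L_v = l) is proportional to the sum of P(L_u = l) over neighbors u.

    neighbors:   dict node -> list of adjacent nodes.
    label_probs: dict node -> list of K prior probabilities (same keys as `neighbors`).
    observed:    dict node -> known class index; these nodes stay clamped (an assumption).
    """
    probs = {v: list(p) for v, p in label_probs.items()}
    num_classes = len(next(iter(probs.values())))
    for v, k in observed.items():
        probs[v] = [1.0 if c == k else 0.0 for c in range(num_classes)]
    for _ in range(num_iters):
        updated = {}
        for v in neighbors:
            if v in observed:
                updated[v] = probs[v]
                continue
            totals = [sum(probs[u][c] for u in neighbors[v]) for c in range(num_classes)]
            norm = sum(totals)
            # Isolated nodes (norm == 0) keep their current distribution.
            updated[v] = [t / norm for t in totals] if norm > 0 else probs[v]
        probs = updated
    return probs
```

The isolated-node branch shows why sparse graphs are problematic for purely topology-driven propagation: a node without neighbors never receives label information.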

The Bayesian label transition utilized in this study is illustrated in Figure 1.

Figure 1. Diagram of the Bayesian label transition. $V$ denotes the nodes and $N$ the number of nodes. $Z$ covers both the inferred labels $\bar{\mathcal{Z}}$ and the true labels, and $Y$ covers both the manually-annotated labels $Y_{m}$ and the automatically-generated labels $Y_{a}$ . The $K$ -class label transition is controlled by the matrix $\phi$ and the parameter $\alpha$ . Black arrows depict variable dependencies, while dotted arrows indicate that a symbol is subdivided into two meanings.

In Figure 1, three elements are central to the model’s architecture and function: vertices ( $V$ ), latent labels ( $Z$ ), and noisy labels ( $Y$ ). Vertices ( $V$ ) denote the nodes, latent labels ( $Z$ ) comprise both the labels inferred during transitions and the true labels, and noisy labels ( $Y$ ) comprise manually-annotated and automatically-generated labels. The principal goal is for the inferred labels ( $\bar{Z}$ ) to match the true labels as closely as possible. Solid arrows signify dependencies, and dashed arrows indicate that an element has two definitions. The label transition matrix, parameterized by $\alpha$ , is represented as $\phi=[\phi_{1},\phi_{2},...,\phi_{K}]^{T}\in\mathbb{R}^{K\times K}$ and contains $K$ vectors. Each vector $\phi_{k}$ is drawn from an Asymmetric Dirichlet Distribution $\phi(\alpha_{k})$ . The model dynamically revises $\alpha$ . For example, $\alpha_{k}^{t}$ during the $t$ -th transition is expressed as

(5)

\alpha_{k}^{t}=\alpha_{k}^{t-1}\frac{\sum_{i=1}^{N}I(\bar{z}_{i}^{t}=k)}{\sum_{i=1}^{N}I(\bar{z}_{i}^{t-1}=k)}

This update mechanism ensures that the inferred labels ( $\bar{Z}$ ) progressively align more closely with the true labels. The posterior representation of $Z$ is given by

(6)

P(\mathcal{Z}\mid\mathcal{V},\mathcal{Y};\alpha)=P(\mathcal{Z}\mid\mathcal{V},\mathcal{Y},\phi)P(\phi;\alpha)

showing how the posterior of the latent labels is conditioned on the nodes, noisy labels, and the Dirichlet distribution parameters. The model employs Gibbs and topological sampling to iteratively update and refine the inferred labels ( $\bar{Z}$ ), ensuring they closely approximate the true labels ( $Z$ ).
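The update of $\alpha$ in Eq. (5) rescales each $\alpha_{k}$ by the ratio of class-$k$ counts between consecutive transitions. The sketch below illustrates this; the function name is hypothetical, and keeping the previous value when a class had no members at transition $t-1$ is an assumption, since Eq. (5) is undefined in that case.

```python
def update_alpha(alpha_prev, labels_prev, labels_curr):
    """Apply Eq. (5) to every class k.

    alpha_prev:  list of K Dirichlet parameters from transition t-1.
    labels_prev: inferred labels at transition t-1 (class indices).
    labels_curr: inferred labels at transition t (class indices).
    """
    alpha_next = []
    for k, alpha_k in enumerate(alpha_prev):
        count_prev = sum(1 for z in labels_prev if z == k)
        count_curr = sum(1 for z in labels_curr if z == k)
        if count_prev == 0:
            # Assumption: the ratio is undefined, so the previous value is kept.
            alpha_next.append(alpha_k)
        else:
            alpha_next.append(alpha_k * count_curr / count_prev)
    return alpha_next
```

For instance, if class 0 grows from two to three inferred members while class 1 shrinks from two to one, `update_alpha([1.0, 1.0], [0, 0, 1, 1], [0, 0, 0, 1])` scales the two parameters by 3/2 and 1/2 respectively.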

In this study, we assume that the model is subject to various topological perturbations. When the graph is impacted, TraTopo strives to restore the model’s predicted classification distribution as accurately as possible.

4.2. Shortest path-based approximated method

In topology-driven label propagation, first-order neighbors are primarily sampled. Other nodes are designated as negative samples, which should articulate distinct meanings and encapsulate the graph’s data comprehensively. Ideally, these negative samples emerge from diverse communities, each represented by the samples.

Breadth-First Search (BFS) is employed to find the shortest path between nodes. Once the shortest route from node $V_{i}$ to every reachable node $V_{r}$ has been identified, the distance from the path’s endpoint to node $V_{i}$ is defined as the length $l$ . Reachable nodes $V_{r}$ are then grouped by path length $l$ :

(7)

V_{r}=\{N_{l}\}_{l=2}^{L}

As shown in Figure 2, the nodes in each set are equidistant from the focal node $V_{i}$ , forming concentric circles of varying radii centered on that node. Utilizing the uniformity of the Label Propagation Algorithm, we integrate all nodes within a designated set and their first-order neighbors to construct a candidate set $S_{i}$ . High-ranking nodes, as determined by scores from the Random Walk Algorithm, are selected from this set to connect with the focal node, as delineated in Algorithm 1.

Data: A graph $G$ , sample length $L_{max}$

Let $S(i)\leftarrow 1$ ;
for each node $v_{i}$ do
    Compute the shortest path lengths from $i$ to all reachable nodes $V_{r}$ ;
    Divide $V_{r}$ into different sets $N_{l}$ based on the path length;
    Let $S_{i}\leftarrow[\ ]$ and $N_{i}\leftarrow[\ ]$ ;
    for $len\ in\ range(1,L_{max})$ do
        Collect all the points $R(j)$ in $N_{len}$ at each length;
        if $len\ =\ L_{max}\ or\ len\ =\ L_{max}-1$ then
            Put all the points $j$ into $S_{j}$ at each length;
            Collect first-order neighbors $Nei(j)$ of $j$ ;
            Expand $S_{i}\leftarrow[S_{i},Nei(j)]$ ;
        end if
        Expand $N_{i}\leftarrow[N_{i},R(j)]$ ;
    end for
end for
return ${\bar{\mathcal{S}}}({i})$ and ${\bar{\mathcal{N}}}({i})$ for all $i\in G$

Algorithm 1 Shortest-path-based diverse negative sampling

Algorithm 1 first computes the shortest-path lengths between nodes. It then isolates the nodes that can be reached within a path length $L$ , yielding $N_{i}$ , and gathers the nodes at path distances of two and three together with their adjacent nodes to form $S_{i}$ . In later stages, $N_{i}$ serves as a sampling criterion, as outlined in (Casella and George, 1992; Marcotty et al., 1976), and $S_{i}$ is employed for candidate selection in link prediction tasks.
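The steps of Algorithm 1 for a single focal node can be sketched with a breadth-first search. This is an illustrative reading, not the authors' code: the function names are hypothetical, the graph is assumed to be an unweighted adjacency dictionary, and removing the focal node itself from the candidate set is an added assumption.

```python
from collections import deque

def shortest_path_sets(neighbors, source, max_len):
    """Group nodes reachable from `source` by BFS distance, up to `max_len` hops."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        if dist[u] == max_len:
            continue  # do not expand beyond the sampling radius
        for w in neighbors[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    layers = {}
    for node, d in dist.items():
        if d > 0:
            layers.setdefault(d, []).append(node)
    return layers

def candidate_and_neighbor_sets(neighbors, source, max_len):
    """Build S_i (link-prediction candidates) and N_i (sampled nodes) for one focal node."""
    layers = shortest_path_sets(neighbors, source, max_len)
    candidates, sampled = set(), set()
    for length, nodes in layers.items():
        sampled.update(nodes)
        if length in (max_len, max_len - 1):
            # The two outermost rings contribute themselves and their first-order neighbors.
            for j in nodes:
                candidates.add(j)
                candidates.update(neighbors[j])
    candidates.discard(source)  # assumption: the focal node is never its own candidate
    return candidates, sampled
```

On a path graph 0-1-2-3-4 with focal node 0 and `max_len=3`, the rings at distances two and three are nodes 2 and 3, and their neighbors pull node 4 into the candidate set even though it lies outside the sampling radius.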

4.3. Improved Topology Sampler

In the field of complex network analysis, this study aims to uncover the latent connection patterns among nodes, thereby deepening our understanding of the network’s structure through two key steps.

First, the shortest paths from each node to its first- through third-order neighbors are computed. Using the BFS algorithm, a comprehensive mapping of node distances is constructed via an all-source shortest-path search over every node in the network. This step not only unveils the network’s topological structure but also establishes a foundation for identifying key nodes and forecasting potential connections between them.

Second, the degree of each node is obtained through an exhaustive traversal of network edges, and the nodes with degrees under three are isolated. These peripheral nodes, often overlooked for potential connections, are analyzed. For each chosen node $v$ , its second- and third-order neighbors and their respective neighbors are aggregated into a predictive set. A composite score, derived from the PageRank and Random Walk algorithms (markers of node centrality and traversal likelihood), is then applied. The ten highest-scoring nodes are predicted to potentially form connections with node $v$ . Combining these standard graph algorithms reveals latent patterns and potential links within the network and forms the basis of our link prediction method.

As shown in Algorithm 2, the uncertainty of node labels is defined as follows. During training, labels $\bar{Z}$ predicted with the Bayesian label transition matrix $\phi^{(t-1)}$ at the $(t-1)^{th}$ transition are considered uncertain if they differ from those predicted with $\phi^{t}$ at the $t^{th}$ transition. During testing, predicted labels $\bar{Z}$ are considered uncertain if they do not correspond with the latent labels upon convergence.

Data: Categorical distribution $\bar{P}\left(\bar{\mathcal{Z}}^{t-1}\mid\mathcal{V}\right)$ , transition matrix $\phi^{t-1}$ , and Improved Topology Sampler

for $i\leftarrow 0\ to\ N$ do
    $\bar{z}_{i}^{t}\sim\arg\max\bar{P}\left(\bar{z}_{i}^{t-1}\mid v_{i}\right)\phi^{t-1}$ ;
    if $\bar{z}_{i}^{t}\ is\ uncertain\ and\ degree\ of\ v_{i}\ <\ miniDegree$ then
        run Algorithm 3;
    end if
    update $\bar{z}_{i}^{t}$ with the Improved Topology Sampler;
end for
return Inferred labels $\bar{\mathcal{Z}}^{t}$ in the $t^{th}$ transition

Algorithm 2 Topology sampling conditions

Our model executes $T$ iterative transitions for inference. Each transition entails a complete traversal of all nodes in the test graph, so the computational complexity is approximately $O(T\cdot N)$ , where $N$ is the number of nodes in the test graph.
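The condition checked in Algorithm 2 can be sketched as a simple filter over nodes. The sketch covers only the training-time notion of uncertainty (a label that changed between consecutive transitions); the function names are hypothetical, and the default threshold of three follows the degree cut-off stated in this section.

```python
def is_uncertain(label_prev, label_curr):
    """Training-time rule: a label is uncertain if it changed between transitions."""
    return label_prev != label_curr

def nodes_needing_link_prediction(labels_prev, labels_curr, degrees, min_degree=3):
    """Return indices of nodes whose label is uncertain and whose degree is below `min_degree`."""
    return [
        i
        for i, (prev, curr) in enumerate(zip(labels_prev, labels_curr))
        if is_uncertain(prev, curr) and degrees[i] < min_degree
    ]
```

Only the nodes returned by this filter would trigger the link prediction step; all other nodes go straight to the topology sampler, which keeps the extra cost confined to sparse regions of the graph.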

Leveraging the homophily hypothesis that nodes within the same class tend to be interconnected, we employ a topology-based sampling method. Under graph perturbations with missing links, topology sampling is less viable for sparsely connected nodes, which offer few neighbors to sample from and therefore lower accuracy. To mitigate this, our methodology integrates a random walk-based link prediction algorithm into the sampling framework. Algorithm 3 details the procedure.

Data: given node $v_{i}$ , candidate set ${\bar{\mathcal{S}}}({i})$ , and ${\bar{\mathcal{N}}}({i})$ , the set of all neighbors within the path length $L$ of the given node together with the neighbors of nodes at path lengths $L$ and $L-1$

while $N\neq 0$ do
    Let $sub_{G}\leftarrow$ build the subgraph from the collection;
    Let $rwr_{dict}\leftarrow$ the rwr scoring dictionary for the given node $v_{i}$ on $sub_{G}$ ;
    Let $pgr_{dict}\leftarrow$ the pgr scoring dictionary for the given node $v_{i}$ on $sub_{G}$ ;
    Let $combine_{dict}\leftarrow$ combine $pgr_{dict}$ and $rwr_{dict}$ ;
    Sort $combine_{dict}$ with the largest value first;
    if $key\ in\ combine_{dict}\ in\ \mathcal{S}(i)$ then
        Put $key$ into list $L_{predic}$ ;
    end if
end while
return $L_{predic}$ , the list of standby nodes that will connect to the given node

Algorithm 3 Link prediction

Employing a random walk-based algorithm, we initiate scoring from a seed node $seed$ . Nodes serve as keys ( $key$ ) with their scores as values ( $value$ ), stored in a dictionary ( $dict$ ). We then apply the $rwr$ (Random Walk with Restart) and $pgr$ (PageRank) algorithms and merge and sort the resulting values. Because the seed node and its first-order neighbors are already connected, we exclude these keys from the sorted dictionary. The remaining keys, representing nodes to be connected with the seed node, are compiled into a list, yielding the candidate node list $L_{predic}$ .
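The merging and filtering step of Algorithm 3 can be sketched as follows. The text does not specify how the two score dictionaries are merged, so simple addition is assumed here; the function name and argument layout are likewise hypothetical, and the default of ten predicted nodes follows the setting described in this section.

```python
def predict_links(seed, neighbors, candidates, rwr_scores, pgr_scores, top_k=10):
    """Rank candidate nodes for new links to `seed` from two score dictionaries.

    neighbors:  dict node -> list of adjacent nodes (used to skip existing links).
    candidates: the candidate set S_i for the seed node.
    rwr_scores, pgr_scores: dict node -> score, computed on the local subgraph.
    """
    # Assumption: the two scores are merged by simple addition.
    combined = {
        node: rwr_scores.get(node, 0.0) + pgr_scores.get(node, 0.0)
        for node in set(rwr_scores) | set(pgr_scores)
    }
    already_linked = set(neighbors[seed]) | {seed}
    ranked = sorted(combined.items(), key=lambda item: item[1], reverse=True)
    predicted = [
        node for node, _ in ranked
        if node in candidates and node not in already_linked
    ]
    return predicted[:top_k]
```

Because RWR and PageRank scores live on different scales, a practical implementation might normalize each dictionary before merging; that choice is left open here.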

Following the establishment of connections between the seed node and the nodes in $L_{predic}$ , we perform topological sampling.

After the $t^{th}$ transition, we sample nodes from the updated distribution to obtain inferred labels $\bar{Z}$ . In cases of uncertainty with these labels, we resort to our enhanced topological model for sampling. We utilize three types of label samplers:

(1) Uniform Random Sampler:

(8)

P\left(\bar{z}_{i}^{t}=k\mid v_{i}\right)=\frac{\sum_{i=1}^{N_{nei}}I(\bar{z}_{i}^{t}=k)}{\sum_{i=1}^{N_{nei}}I(\bar{z}_{i}^{t}\in K_{nei})}

During the $t^{th}$ transition, the probability of node $i$ ’s label $\bar{z}_{i}^{t}$ belonging to class $k$ is uniform.

(2) Activity-based Sampling: This sampler selects the majority class $k_{mj}$ as the label.

(3) Degree-based Sampling: The degree-weighted sampler selects a label from class $k_{dw}$ , ensuring that the total degree of adjacent nodes in $k_{dw}$ is maximized.
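The three samplers can be sketched as small functions over the labels (and degrees) of a node's neighbors. The function names and list-based inputs are illustrative assumptions; in particular, drawing one neighbor uniformly at random is used to realize the frequency-proportional probability of Eq. (8).

```python
import random
from collections import Counter

def uniform_random_sampler(neighbor_labels, rng=random):
    """Pick one neighbor uniformly, so label k is drawn with its frequency among neighbors (Eq. 8)."""
    return rng.choice(neighbor_labels)

def majority_sampler(neighbor_labels):
    """Return the most frequent neighbor label k_mj."""
    return Counter(neighbor_labels).most_common(1)[0][0]

def degree_weighted_sampler(neighbor_labels, neighbor_degrees):
    """Return the label k_dw whose neighbors have the largest total degree."""
    totals = Counter()
    for label, degree in zip(neighbor_labels, neighbor_degrees):
        totals[label] += degree
    return totals.most_common(1)[0][0]
```

All three samplers need at least one neighbor to draw from, which is why link prediction is run first for low-degree nodes with uncertain labels.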

In summary, TraTopo’s final process is as follows:

Data: train graph $G_{train}$ and test graph $G_{test}$ with their symmetric adjacency matrix $A$ and feature matrix $X$ , manually-annotated labels $y_{m}$ , node classifier $f_{\theta}$ , initial $\alpha$ , the number of transitions $T$ , and the number of warm-up steps $WS$

Train $f_{\theta}$ with $y_{m}$ on $G_{train}$ ;
Generate initial label categorical distribution $\bar{P}\left(\mathcal{Z}\mid\mathcal{V}\right)$ and automatically-generated labels $y_{a}$ by $f_{\theta}$ ;
Compute warm-up label transition matrix $\phi^{\prime}$ based on $G_{train}$ ;
Define inferred labels $\bar{Z}$ , dynamic label transition matrix $\phi$ based on $G_{test}$ , and initial $\alpha$ vector;
for $t\leftarrow 1\ \text{to}\ T$ do
    if $t<WS$ then
        Sample $\bar{\mathcal{Z}}^{t}$ with warm-up matrix $\phi^{\prime}$ ;
    else
        Sample $\bar{\mathcal{Z}}^{t}$ with dynamic matrix $\phi$ ;
    end if
    Update $\alpha$ and dynamic $\phi$ ;
end for
return Inferred labels $\bar{\mathcal{Z}}$ and dynamic $\phi$

Algorithm 4 TraTopo’s Pseudo-code

As shown in Algorithm 4, the model first trains a node classifier $f_{\theta}$ , such as a Graph Neural Network (GNN) or Graph Convolutional Network (GCN), on $G_{train}$ with manually-annotated noisy labels $y_{m}$ . During this phase, $f_{\theta}$ generates a classification distribution $\bar{P}(\mathcal{Z}\mid\mathcal{V})$ for each node, alongside auto-generated noisy labels $y_{a}$ . In the inference stage, the model first allocates the inferred labels $\bar{Z}$ and the label transition matrix $\phi$ on the test graph and then initializes an $\alpha$ vector. During the warm-up steps, the model samples inferred labels using the warm-up label transition matrix $\phi^{\prime}$ computed from $\bar{P}(\mathcal{Z}\mid\mathcal{V})$ on $G_{train}$ ; afterwards, it employs Gibbs sampling with the dynamic $\phi$ . If the inferred labels deviate from those in the previous transition or from $y_{a}$ , they are deemed uncertain. For low-degree nodes with uncertain labels, errors from topological sampling could be substantial. Thus, prior to sampling, a subgraph centered on such a node is constructed, within which link prediction based on random walks is executed according to Algorithm 3.

Following each transition, $\phi$ is recalibrated based on the inferred labels $\bar{Z}^{t}$ and $y_{a}$ to enhance the accuracy of future label predictions. Concurrently, the classification distribution $\bar{P}\left(\bar{\mathcal{Z}}^{t}\mid\mathcal{V}\right)$ is updated. As transitions converge, inferred labels increasingly approximate the true labels.
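One plausible way to recalibrate $\phi$ from $\bar{Z}^{t}$ and $y_{a}$ is to count label co-occurrences and smooth them with Dirichlet pseudo-counts. The sketch below is an assumption about the estimator, not a statement of the authors' exact update: the function name is hypothetical, and using $\alpha_{k}$ as a single pseudo-count for every entry of row $k$ is a simplification.

```python
def estimate_transition_matrix(inferred, auto_labels, alpha, num_classes):
    """Estimate a K x K row-stochastic matrix phi from label co-occurrences.

    phi[k][j] approximates P(y_a = j | z = k). alpha[k] is used as a
    per-row Dirichlet pseudo-count (an assumption of this sketch) and
    must be positive so that every row has a non-zero total.
    """
    counts = [[alpha[k]] * num_classes for k in range(num_classes)]
    for z, y in zip(inferred, auto_labels):
        counts[z][y] += 1.0
    phi = []
    for row in counts:
        total = sum(row)
        phi.append([c / total for c in row])
    return phi
```

With positive pseudo-counts, every row remains a valid probability distribution even for classes that no node is currently assigned to.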

The time complexity is dominated by the computation of shortest paths. PageRank and Random Walk take only a single iteration and thus contribute little to the overall cost. The overall time complexity is therefore $O(|V|^{2})$ .

5. EXPERIMENTS

In this section, we assess the accuracy and uncertainty of several competing models under three types of topological perturbations on three distinct datasets, illustrating the advantage of our model. We also conduct ablation studies on our model to identify its most effective configuration.

5.1. Experimental Settings

5.1.1. Dataset Settings

The experiments use the following datasets:

• Cora (Niegowski and Eshaghi, 2007): Cora is a widely used dataset in machine learning, known for its application in citation network analysis and document classification. It comprises scientific publications with topic-based categorization and word frequency vectors, linked by a directed citation graph, which makes it valuable for studying academic research patterns and semi-supervised learning algorithms.

• AmazonCoBuy (Das et al., 2021): AmazonCoBuy is an e-commerce dataset that maps product nodes and purchase links to reveal co-purchasing behaviors. Its nodes are described through review-based word models, providing rich textual data for developing recommendation systems, understanding consumer preferences, and analyzing online shopping dynamics.

• CiteSeer (Bollacker et al., 1998): CiteSeer is a standard dataset in information retrieval, featuring a comprehensive collection of computer science and IT documents. It facilitates the analysis of citation networks and document clustering, offering a structured repository that supports studies of citation and research impact.

For all datasets, the training, validation, and testing partitions contain proportions of 0.1, 0.2, and 0.7 of all nodes, respectively. To simulate manually annotated labels, we randomly replace 10% of the true labels with other labels chosen uniformly.
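The split and the label-noise injection described above can be sketched as follows. The function names, the fixed random seeds, and the use of integer truncation for partition sizes are illustrative assumptions.

```python
import random

def split_nodes(num_nodes, train_frac=0.1, val_frac=0.2, seed=0):
    """Shuffle node indices and split them into train/validation/test partitions."""
    rng = random.Random(seed)
    order = list(range(num_nodes))
    rng.shuffle(order)
    n_train = int(train_frac * num_nodes)
    n_val = int(val_frac * num_nodes)
    return order[:n_train], order[n_train:n_train + n_val], order[n_train + n_val:]

def inject_label_noise(labels, num_classes, noise_rate=0.1, seed=0):
    """Replace a `noise_rate` fraction of labels with a different class chosen uniformly."""
    rng = random.Random(seed)
    noisy = list(labels)
    flip = rng.sample(range(len(labels)), int(noise_rate * len(labels)))
    for i in flip:
        # Exclude the true class so that every selected label actually changes.
        others = [k for k in range(num_classes) if k != labels[i]]
        noisy[i] = rng.choice(others)
    return noisy
```

Excluding the true class when drawing the replacement matches the phrase "other labels" in the text; the remaining test partition receives whatever nodes are left after the first two slices.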

Table 1. Statistics of datasets, AvgDegrees denotes the average degree of test nodes. EHR denotes the edge homophily ratio.

Dataset	Nodes	Edges	Features	Classes	AvgDegrees	EHR(%)
Cora	2,708	10,556	1,433	7	4.99	81.00
Citeseer	3,327	9,228	3,703	6	3.72	73.55
Pubmed	19,717	88,651	500	3	5.50	80.24
AMZcobuy	7,650	287,326	745	8	32.77	82.72

5.1.2. Model hyper-parameters

We evaluated each hyper-parameter within the experimental framework. We set the warm-up steps to $WS=40$ and the retraining interval to $Retrain=60$. To mitigate overfitting, the node classifiers were retrained bi-decadally. Within TraTopo, the numbers of transition states for five datasets were set to [100, 200, 80, 100, 90], and link prediction was restricted to nodes with fewer than three connections. Using RWR (Random Walk with Restart) and PPR (Personalized PageRank), we identified the top 10 nodes to connect to each target node. The node classifier is a two-layer Graph Convolutional Network (GCN) with 200 hidden units and ReLU activation. For PageRank, the damping factor is set to $c=0.15$, with an error tolerance of $10^{-6}$ and a maximum of 100 iterations. RWR uses the same parameters, with the walk restarting at a predefined seed node. Once the shortest paths between all node pairs are determined, the search for non-neighboring nodes is limited to a maximum distance of 3. The GCN is trained with the Adam optimizer at a learning rate of $1\times 10^{-3}$ and converges within 200 epochs on all datasets.

5.1.3. Evaluation Metrics

We use both accuracy and cross-entropy loss as evaluation metrics for node classification, so that models are assessed both on whether their predictions are correct and on how confident those predictions are. Accuracy measures correct classifications, while cross-entropy evaluates the predicted probabilities, which helps with imbalanced data and with model calibration. Accuracy, defined as

(9)

\text{Accuracy}=\frac{\text{Number of Correct Predictions}}{\text{Total Number of Predictions}}

directly measures the proportion of nodes correctly classified by the model, providing a clear indicator of performance in practical scenarios. On the other hand, cross-entropy loss, calculated by

(10)

L=-\sum_{i=1}^{N}y_{i}\log(p_{i})

where $y_{i}$ is a binary indicator of the correct class, and $p_{i}$ is the predicted probability for that class, evaluates how well the probability outputs of the model align with the actual labels. This metric is particularly advantageous for fine-tuning the model during training, as it penalizes incorrect classifications based on the output’s confidence, thereby ensuring both accuracy and reliability in the model’s predictive capabilities.
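As a small worked example of Equations (9) and (10), the sketch below computes both metrics for four nodes and three classes. The probabilities and labels are made up for illustration, and the helper names are assumptions; because $y_{i}$ is a one-hot indicator, the sum in Equation (10) reduces to the negative log-probability of the true class of each node.

```python
import math

def accuracy(pred_labels, true_labels):
    """Fraction of nodes whose predicted label equals the true label (Eq. 9)."""
    correct = sum(p == t for p, t in zip(pred_labels, true_labels))
    return correct / len(true_labels)

def cross_entropy(probs, true_labels, eps=1e-12):
    """Sum over nodes of -log(probability assigned to the true class) (Eq. 10)."""
    return -sum(math.log(max(p[t], eps)) for p, t in zip(probs, true_labels))

# Made-up class probabilities for four nodes and three classes.
probs = [[0.7, 0.2, 0.1],
         [0.1, 0.8, 0.1],
         [0.3, 0.3, 0.4],
         [0.5, 0.4, 0.1]]
true_labels = [0, 1, 2, 1]
pred_labels = [max(range(len(p)), key=p.__getitem__) for p in probs]

acc = accuracy(pred_labels, true_labels)      # 3 of 4 nodes are correct
loss = cross_entropy(probs, true_labels)      # -(ln 0.7 + ln 0.8 + ln 0.4 + ln 0.4)
print(acc, round(loss, 4))
```

The third node is classified correctly but with probability 0.4 only, so it contributes as much to the loss as the misclassified fourth node; this is the sense in which cross-entropy reflects confidence as well as correctness.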

5.2. Topological Perturbations

An initial topological network is characterized by its unique structural and connectivity configurations. These networks are often subject to various types of disturbances that can fundamentally alter their topology and function.

One such disturbance is a Random Perturbation (Wang et al., 2021), where nodes within the network connect in a completely stochastic manner without following any predetermined or inherent patterns. This randomization can disrupt the typical behavior of the network, leading to unpredictable outcomes and challenges in network analysis. Another significant perturbation is Information Sparsity (Herrmann, 2010). In this scenario, connections within the network may disappear randomly, which can drastically change the network’s structure. This loss of connections can lead to a reduction in the overall robustness of the network, and critical information originally held in the connectivity of nodes may be lost, thus impairing the network’s operational capabilities. Lastly, the network may be susceptible to Adversarial Attacks (Madry et al., 2017). In these attacks, adversaries deliberately introduce changes to both the structure and the attributes of the network’s nodes. Such alterations can cause significant disruptions, potentially isolating nodes or corrupting the data they carry. These attacks are particularly concerning as they are targeted and strategic, posing serious threats to the integrity and reliability of the network.
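The first two perturbation types can be simulated with a few lines of code. The sketch below is a simplified illustration, assuming an undirected graph stored as a set of `(u, v)` pairs with `u < v`; the function names, the number of inserted edges, and the drop rate are hypothetical, since the text does not state the perturbation rates. Adversarial attacks require an attack model and are not covered here.

```python
import random

def add_random_edges(edges, num_nodes, num_new, seed=0):
    """Random perturbation: insert `num_new` edges between uniformly chosen node pairs."""
    rng = random.Random(seed)
    original = {(min(u, v), max(u, v)) for u, v in edges}
    max_edges = num_nodes * (num_nodes - 1) // 2
    target = min(len(original) + num_new, max_edges)
    perturbed = set(original)
    while len(perturbed) < target:
        u, v = rng.sample(range(num_nodes), 2)  # two distinct nodes
        perturbed.add((min(u, v), max(u, v)))
    return perturbed

def drop_random_edges(edges, drop_rate, seed=0):
    """Information sparsity: delete a `drop_rate` fraction of edges at random."""
    rng = random.Random(seed)
    original = sorted({(min(u, v), max(u, v)) for u, v in edges})
    num_keep = len(original) - int(drop_rate * len(original))
    return set(rng.sample(original, num_keep))

# Toy graph: a ring of 10 nodes.
ring = [(i, (i + 1) % 10) for i in range(10)]
ring_set = {(min(u, v), max(u, v)) for u, v in ring}
noisy = add_random_edges(ring, num_nodes=10, num_new=5)
sparse = drop_random_edges(ring, drop_rate=0.5)
print(len(ring_set), len(noisy), len(sparse))
```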

5.3. Competing Methods

Table 2. Comparison between competing methods and our model under the random perturbations scenario

	Cora		Citeseer		AmazonCoBuy
	Acc.	Ent.	Acc.	Ent.	Acc.	Ent.
GNN-SVD	50.42	93.02	31.66	95.20	70.12	93.42
DropEdge	67.86	95.28	46.68	96.34	63.41	96.26
GRAND	52.33	94.98	35.02	95.34	40.23	96.21
ProGNN	52.64	92.07	36.18	96.33	45.28	98.65
GDC	71.18	85.78	43.15	93.73	45.58	98.18
TraTopo	79.64	21.98	52.79	10.63	92.34	11.33

The competing methods analyzed in this study each have distinct strengths. GNN-SVD (Entezari et al., 2020) leverages classical Singular Value Decomposition to obtain a denoised graph representation, improving node classification accuracy. DropEdge (Rong et al., 2019) reduces overfitting by randomly removing edges during training, which augments the data and moderates message propagation, thereby improving generalization. GRAND (Feng et al., 2020) employs a random propagation strategy along with consistency regularization, which improves both the stability and the precision of predictions on graph data. ProGNN (Jin et al., 2020) learns a clean graph structure from perturbed graphs to build robust GNN models, improving performance under adversarial attacks. Finally, GDC (Hasanzadeh et al., 2020) provides a unified framework for adaptive connection sampling that generalizes stochastic regularization methods, improving the network's learning ability and predictive performance.

Under random perturbation, Table 2 shows that TraTopo achieves the highest accuracy and the lowest uncertainty on the Cora, Citeseer, and AmazonCoBuy datasets, indicating robust and consistent handling of random disturbances. DropEdge, which randomly removes edges during training, excels on larger graphs by reducing the likelihood of overfitting and smoothing the feature representations, thus enhancing generalization. However, its performance can be restricted on smaller datasets, where each edge becomes crucial for maintaining structural integrity and for feature learning. The Graph Diffusion Convolution (GDC) model, which incorporates a diffusion process into graph convolutions, is particularly effective for simply structured graphs where the diffusion can accurately capture node interdependencies. Nevertheless, it tends to overfit on more complex or noisy datasets, where the model captures noise as features and its performance becomes less stable. GNN-SVD, which uses singular value decomposition to denoise the graph structure, is suited to datasets where the underlying graph structure is relatively clear and the main challenge is noise in the connectivity. However, it may not perform as well in scenarios involving complex interactions or where the graph structure itself carries nuances critical to the learning task. Overall, TraTopo consistently outperforms these competitors across all three datasets, which makes it a reliable choice in settings where data integrity and robustness are paramount.

5.4. Baseline models and comparison result

Table 3. Evaluation of our model on top of GCN under three scenarios of topological perturbations across three datasets. The table compares the accuracy and entropy of the original model, the LInDT model, and the TraTopo model (rwr, pgr, and combine variants) on each dataset under the three topological perturbations.

	Cora		Citeseer		AmazonCoBuy
Scenario	Acc.	Ent.	Acc.	Ent.	Acc.	Ent.
Random perturbation
original	47.89%	16.68%	24.03%	10.81%	89.35%	6.90%
LInDT	76.32%	11.89%	60.52%	16.61%	92.34%	29.61%
rwr	78.42%	10.75%	63.52%	55.64%	92.34%	14.58%
pgr	77.89%	10.73%	51.07%	9.62%	92.34%	13.52%
combine	78.95%	11.53%	52.79%	10.63%	92.34%	11.33%
Information sparsity
original	70.09%	30.78%	62.60%	79.01%	90.83%	7.92%
LInDT	79.54%	22.10%	68.87%	50.48%	91.47%	12.50%
rwr	79.54%	20.76%	68.87%	43.07%	91.50%	10.98%
pgr	79.54%	20.76%	68.87%	42.89%	91.50%	11.27%
combine	79.64%	21.98%	68.87%	42.53%	91.50%	10.66%
Adversarial attacks
original	61.11%	10.38%	29.73%	10.88%	82.35%	12.72%
LInDT	77.22%	6.83%	68.02%	19.66%	85.71%	13.39%
rwr	77.78%	5.94%	67.57%	18.42%	85.71%	12.80%
pgr	77.78%	5.94%	67.57%	18.55%	85.71%	12.76%
combine	77.78%	5.87%	68.92%	14.83%	85.71%	12.60%

Referring to Table 3, we evaluated GCN, the LInDT model, and the TraTopo model in terms of accuracy and uncertainty, and examined the contribution of each link prediction algorithm. We report the classification accuracy and the average normalized entropy of the affected nodes, confirming that the combined techniques are effective at achieving high accuracy and low uncertainty. Notably, using rwr or pgr alone proved superior in certain settings because of their different mechanisms. The rwr algorithm prioritizes proximity to the target node and the structure of adjacent nodes, capturing local interactions and subtle structural nuances. Conversely, the pgr algorithm determines node significance through the link structure, emphasizing connectivity on a global scale and providing a macroscopic view of node interrelations. Combining the two augments the predictive capacity of the LInDT model and provides a mechanism for handling both local and global structure, thereby improving performance beyond the original design.

Moreover, this test was conducted on the Cora graph, where the improvements become more pronounced when the graph is sparse. The reason is that LInDT, which aims to improve the robustness of Graph Neural Networks (GNNs) under topological perturbations, has a key shortcoming on sparse graphs: the effectiveness of its topology-based sampler, which is designed to boost node classification accuracy, diminishes significantly on extremely sparse graphs where many links and node features are missing or highly sparsified.

In summary, Table 3 shows that our topological strategies, particularly when combined with these algorithms, significantly improve the performance of the LInDT model and offer a substantial advantage over traditional methods.
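For readers who wish to reproduce the uncertainty column, the following sketch shows one common way to compute an average normalized entropy from predicted class probabilities. The definition used here, Shannon entropy divided by $\log K$ for $K$ classes, is an assumption, as the text does not spell out the normalization; the function names and example probabilities are likewise illustrative.

```python
import math

def normalized_entropy(p, eps=1e-12):
    """Shannon entropy of a probability vector, divided by log(K) so it lies in [0, 1]."""
    h = -sum(q * math.log(q) for q in p if q > eps)
    return h / math.log(len(p))

def average_normalized_entropy(probs):
    """Mean normalized entropy over a list of per-node probability vectors."""
    return sum(normalized_entropy(p) for p in probs) / len(probs)

confident = [1.0, 0.0, 0.0, 0.0]      # no uncertainty
uniform = [0.25, 0.25, 0.25, 0.25]    # maximal uncertainty
avg = average_normalized_entropy([confident, uniform])
print(normalized_entropy(confident), normalized_entropy(uniform), avg)
```

Under this definition, a value near 0% means the classifier commits to a single class for almost every node, while a value near 100% means its predictions are close to uniform.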

5.5. Model Parameter Selection

To obtain the most effective parameters, we rank candidate nodes with Random Walk with Restart (RWR) and Personalized PageRank (PPR) and connect the top 10 ranked nodes to the target node, while varying the degree threshold below which a node is selected for link prediction.

Table 4. Analysis of Link Prediction Parameters

Configuration	Acc. (%)	Ent. (%)
Degree $<$ 3, Nei = 10	79.64	21.98
Degree $<$ 4, Nei = 10	79.64	22.01
Degree $<$ 5, Nei = 10	79.64	22.04
Degree $<$ 7, Nei = 10	79.64	22.07
Original	79.54	22.10

Table 4 shows that, for TraTopo evaluated on Cora, restricting link prediction to nodes with degree less than three yields the lowest entropy. Compared with the original model, all four parameter settings improve both accuracy and uncertainty. However, accuracy remains unchanged as the degree threshold increases, indicating that distant non-neighbor nodes become irrelevant and that the benefit stabilizes at a distance of three. We therefore adopt Degree $<$ 3 with Nei = 10 as the configuration of our model.
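The interplay between the degree threshold and the distance limit of three can be illustrated with a breadth-first search. This is a hedged sketch, not the authors' procedure: it assumes that candidates for a low-degree node are the non-neighbors whose shortest-path distance is between 2 and 3, and the names `nodes_within` and `candidate_pool` as well as the toy path graph are hypothetical.

```python
from collections import deque

def nodes_within(adj, source, max_dist=3):
    """Breadth-first search returning {node: shortest-path distance} up to `max_dist` hops."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        if dist[u] == max_dist:
            continue  # do not expand beyond the distance limit
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

def candidate_pool(adj, degree_threshold=3, max_dist=3):
    """For each node with degree < threshold, list non-neighbours within `max_dist` hops."""
    pool = {}
    for v in adj:
        if len(adj[v]) < degree_threshold:
            dist = nodes_within(adj, v, max_dist)
            pool[v] = sorted(u for u, d in dist.items() if d >= 2)
    return pool

# Toy path graph 0-1-2-3-4-5.
adj = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3, 5], 5: [4]}
pool = candidate_pool(adj)
print(pool[0], pool[2])
```

Limiting the search depth keeps the candidate set small for each selected node, which is one way to read the reduced overhead attributed to the shortest-path strategy.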

5.6. Limitations and Future Directions

Our model's label inference depends critically on a well-specified prior distribution, and the accuracy of this prior is vital to the model's performance. Even a minor change to the prior, whether intentional or incidental, can shift the inference results. This sensitivity calls for continual tuning of the model. In future work, we plan to adopt an adaptive learning strategy in which the model dynamically adjusts its prior based on newly gathered data, improving its adaptability to data fluctuations and the precision of its results.

6. Conclusion

This work aims to improve the robustness of Graph Neural Network (GNN) models under topological perturbations. We introduce TraTopo, which combines Bayesian label inference, link prediction via random walks, and label propagation, together with a shortest-path technique for generating negative sample sets for nodes that significantly reduces the computational burden. Our empirical analyses demonstrate that TraTopo outperforms competing methods in resilience to random perturbations, information sparsity, and adversarial attacks across three datasets, maintaining low entropy and achieving the highest accuracy among the compared methods in node classification.

Appendix A Implementation

A.0.1. Hardware and Software

We conduct the experiments on a server with the following configuration: Python 3.8.18 and torch 2.0.1+cu118 on Ubuntu 22.04.3, with an NVIDIA Corporation TU102 [GeForce RTX 1080 Ti] GPU.

Table 5. Hyper-parameters of DropEdge in this study

	Cora	Citeseer	AMZcobuy
Hidden units	128	128	256
Dropout rate	0.8	0.8	0.5
Learning rate	0.01	0.009	0.01
Weight decay	0.005	0.001	0.01
Use BN	$\times$	$\times$	$\checkmark$

Table 6. Hyper-parameters of GRAND in this study

	Cora	Citeseer	AMZcobuy
Propagation step	8	2	5
Data augmentation times	4	2	3
CR loss coefficient	1.0	0.7	0.9
Sharpening temperature	0.5	0.3	0.4
Learning rate	0.01	0.01	0.2
Early stopping patience	200	200	100
Input dropout	0.5	0.0	0.6
Hidden dropout	0.5	0.2	0.5
Use BN	$\times$	$\times$	$\checkmark$

A.0.2. Hyper-parameters of Competing Methods

To ensure reproducibility, we transparently report the hyper-parameters of our competitive models, all of which employ the Adam optimizer for training:

•

GNN-SVD (Entezari et al., 2020): Employs a sophisticated architecture incorporating 15 singular values and 16 hidden units, achieving a notable reduction in overfitting through a 0.5 dropout rate. This model has demonstrated superior performance in sparse graph datasets, enhancing prediction accuracy by approximately 12% compared to baseline models over a training span of 300 epochs.
•

DropEdge (Rong et al., 2019): Based on a foundational GCN structure with a single base block layer, this model introduces random edge dropping to prevent over-smoothing during longer training cycles. Achieving an improvement in graph classification tasks by up to 15%, it underscores the efficacy of its approach across 300 training epochs. Detailed parameter settings are available in Table 5.
•

GRAND (Feng et al., 2020): Trained for 200 epochs, this model integrates 32 hidden units and employs a node dropout rate of 0.5, coupled with an L2 weight decay of $5\times 10^{-4}$ . It has excelled in dynamic graph analysis, improving node classification accuracy by 18%. Additional specifications are outlined in Table 6.
•

ProGNN (Jin et al., 2020): Configures critical parameters such as $\alpha$, $\beta$, $\gamma$, and $\lambda$ to optimize performance, alongside 16 hidden units and a dropout rate of 0.5. With a learning rate of 0.01 and a weight decay of $5\times 10^{-4}$, ProGNN has enhanced structural learning on corrupted graphs, improving robustness by 20% over a 100-epoch training period.
•

GDC (Hasanzadeh et al., 2020): Comprising two blocks and four layers, and featuring 32 hidden units with a dropout rate of 0.5, this model employs a learning rate and weight decay of $5\times 10^{-3}$ . GDC has proven its mettle by boosting classification performance by 22% in noisy environments over 400 epochs, illustrating its adaptability and strength.
•

LInDT (Zhuang and Al Hasan, 2022c): Uses a two-layer GCN architecture with 200 hidden units and a ReLU activation function, trained with the Adam optimizer at a learning rate of $1\times 10^{-3}$. LInDT specializes in detecting and mitigating label noise in datasets, thereby achieving a 25% increase in accuracy in challenging scenarios within 200 training epochs.

References

Abu-El-Haija et al. (2020) Sami Abu-El-Haija, Amol Kapoor, Bryan Perozzi, and Joonseok Lee. 2020. N-gcn: Multi-scale graph convolution for semi-supervised node classification. In uncertainty in artificial intelligence. PMLR, 841–851.
Bollacker et al. (1998) Kurt D Bollacker, Steve Lawrence, and C Lee Giles. 1998. CiteSeer: An autonomous web agent for automatic retrieval and identification of interesting publications. In Proceedings of the second international conference on Autonomous agents. 116–123.
Casella and George (1992) George Casella and Edward I George. 1992. Explaining the Gibbs sampler. The American Statistician 46, 3 (1992), 167–174.
Chen et al. (2022a) Jiayu Chen, Jingdi Chen, Tian Lan, and Vaneet Aggarwal. 2022a. Multi-agent covering option discovery based on kronecker product of factor graphs. IEEE Transactions on Artificial Intelligence (2022).
Chen et al. (2022b) Jiayu Chen, Jingdi Chen, Tian Lan, and Vaneet Aggarwal. 2022b. Multi-agent Covering Option Discovery through Kronecker Product of Factor Graphs. In AAMAS. 1572–1574.
Chen et al. (2022c) Jiayu Chen, Jingdi Chen, Tian Lan, and Vaneet Aggarwal. 2022c. Scalable multi-agent covering option discovery based on kronecker graphs. Advances in Neural Information Processing Systems 35 (2022), 30406–30418.
Chen et al. (2023a) Jiayu Chen, Jingdi Chen, Tian Lan, and Vaneet Aggarwal. 2023a. Learning Multiagent Options for Tabular Reinforcement Learning using Factor Graphs. IEEE Transactions on Artificial Intelligence 4, 5 (Oct. 2023), 1141–1153. https://doi.org/10.1109/tai.2022.3195818
Chen et al. (2023b) Shuyi Chen, Kaize Ding, and Shixiang Zhu. 2023b. Uncertainty-Aware Robust Learning on Noisy Graphs. arXiv preprint arXiv:2306.08210 (2023).
Cordasco and Gargano (2012) Gennaro Cordasco and Luisa Gargano. 2012. Label propagation algorithm: a semi-synchronous approach. International Journal of Social Network Mining 1, 1 (2012), 3–26.
Das et al. (2021) Rangan Das, Bikram Boote, Saumik Bhattacharya, and Ujjwal Maulik. 2021. Multipath graph convolutional neural networks. arXiv preprint arXiv:2105.01510 (2021).
Ding et al. (2024) Kaize Ding, Elnaz Nouri, Guoqing Zheng, Huan Liu, and Ryen White. 2024. Toward robust graph semi-supervised learning against extreme data scarcity. IEEE Transactions on Neural Networks and Learning Systems (2024).
Entezari et al. (2020) Negin Entezari, Saba A Al-Sayouri, Amirali Darvishzadeh, and Evangelos E Papalexakis. 2020. All you need is low (rank) defending against adversarial attacks on graphs. In Proceedings of the 13th international conference on web search and data mining. 169–177.
Fan et al. (2021) Feng-Lei Fan, Dayang Wang, Hengtao Guo, Qikui Zhu, Pingkun Yan, Ge Wang, and Hengyong Yu. 2021. On a sparse shortcut topology of artificial neural networks. IEEE Transactions on Artificial Intelligence 3, 4 (2021), 595–608.
Feng et al. (2020) Wenzheng Feng, Jie Zhang, Yuxiao Dong, Yu Han, Huanbo Luan, Qian Xu, Qiang Yang, Evgeny Kharlamov, and Jie Tang. 2020. Graph random neural networks for semi-supervised learning on graphs. Advances in neural information processing systems 33 (2020), 22092–22103.
Fiorellino et al. (2024) Simone Fiorellino, Claudio Battiloro, and Paolo Di Lorenzo. 2024. Topological Neural Networks over the Air. In ICASSP 2024-2024 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). IEEE, 12986–12990.
Gasteiger et al. (2018) Johannes Gasteiger, Aleksandar Bojchevski, and Stephan Günnemann. 2018. Predict then propagate: Graph neural networks meet personalized pagerank. arXiv preprint arXiv:1810.05997 (2018).
Hahn-Klimroth et al. (2020) Max Hahn-Klimroth, Giulia S Maesaka, Yannick Mogge, Samuel Mohr, and Olaf Parczyk. 2020. Random perturbation of sparse graphs. arXiv preprint arXiv:2004.04672 (2020).
Hamilton et al. (2017) Will Hamilton, Zhitao Ying, and Jure Leskovec. 2017. Inductive representation learning on large graphs. Advances in neural information processing systems 30 (2017).
Hasanzadeh et al. (2020) Arman Hasanzadeh, Ehsan Hajiramezanali, Shahin Boluki, Mingyuan Zhou, Nick Duffield, Krishna Narayanan, and Xiaoning Qian. 2020. Bayesian graph neural networks with adaptive connection sampling. In International conference on machine learning. PMLR, 4094–4104.
Herrmann (2010) Felix J Herrmann. 2010. Randomized sampling and sparsity: Getting more information from fewer samples. Geophysics 75, 6 (2010), WB173–WB187.
Hou et al. (2023) Zhichao Hou, Xitong Zhang, Wei Wang, Charu C Aggarwal, and Xiaorui Liu. 2023. Can Directed Graph Neural Networks be Adversarially Robust? arXiv preprint arXiv:2306.02002 (2023).
Jin et al. (2020) Wei Jin, Yao Ma, Xiaorui Liu, Xianfeng Tang, Suhang Wang, and Jiliang Tang. 2020. Graph structure learning for robust graph neural networks. In Proceedings of the 26th ACM SIGKDD international conference on knowledge discovery & data mining. 66–74.
Khalid et al. (2024) Maryam Khalid, Elizabeth B Klerman, Andrew W McHill, Andrew JK Phillips, and Akane Sano. 2024. SleepNet: Attention-Enhanced Robust Sleep Prediction using Dynamic Social Networks. Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies 8, 1 (2024), 1–34.
Li et al. (2015) Rong-Hua Li, Jeffrey Xu Yu, Lu Qin, Rui Mao, and Tan Jin. 2015. On random walk based graph sampling. In 2015 IEEE 31st international conference on data engineering. IEEE, 927–938.
Liu et al. (2024a) Ao Liu, Wenshan Li, Tao Li, Beibei Li, Hanyuan Huang, and Pan Zhou. 2024a. Towards Inductive Robustness: Distilling and Fostering Wave-Induced Resonance in Transductive GCNs against Graph Adversarial Attacks. In Proceedings of the AAAI Conference on Artificial Intelligence, Vol. 38. 13855–13863.
Liu et al. (2021a) Lihui Liu, Boxin Du, Heng Ji, ChengXiang Zhai, and Hanghang Tong. 2021a. Neural-answering logical queries on knowledge graphs. In Proceedings of the 27th ACM SIGKDD conference on knowledge discovery & data mining. 1087–1097.
Liu et al. (2022) Lihui Liu, Boxin Du, Jiejun Xu, Yinglong Xia, and Hanghang Tong. 2022. Joint knowledge graph completion and question answering. In Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining. 1098–1108.
Liu et al. (2021b) Shiwei Liu, Tim Van der Lee, Anil Yaman, Zahra Atashgahi, Davide Ferraro, Ghada Sokar, Mykola Pechenizkiy, and Decebal Constantin Mocanu. 2021b. Topological insights into sparse neural networks. In Machine Learning and Knowledge Discovery in Databases: European Conference, ECML PKDD 2020, Ghent, Belgium, September 14–18, 2020, Proceedings, Part III. Springer, 279–294.
Liu et al. (2024b) Xin Liu, Yuxiang Zhang, Meng Wu, Mingyu Yan, Kun He, Wei Yan, Shirui Pan, Xiaochun Ye, and Dongrui Fan. 2024b. Revisiting Edge Perturbation for Graph Neural Network in Graph Data Augmentation and Attack. arXiv preprint arXiv:2403.07943 (2024).
Liu et al. (2023) Yang Liu, Hao Cheng, and Kun Zhang. 2023. Identifiability of label noise transition matrix. In International Conference on Machine Learning. PMLR, 21475–21496.
Madry et al. (2017) Aleksander Madry, Aleksandar Makelov, Ludwig Schmidt, Dimitris Tsipras, and Adrian Vladu. 2017. Towards deep learning models resistant to adversarial attacks. arXiv preprint arXiv:1706.06083 (2017).
Marcotty et al. (1976) Michael Marcotty, Henry Ledgard, and Gregor V Bochmann. 1976. A sampler of formal definitions. ACM Computing Surveys (CSUR) 8, 2 (1976), 191–276.
Niegowski and Eshaghi (2007) Damian Niegowski and S Eshaghi. 2007. The CorA family: structure and function revisited. Cellular and molecular life sciences 64 (2007), 2564–2574.
Rong et al. (2019) Yu Rong, Wenbing Huang, Tingyang Xu, and Junzhou Huang. 2019. Dropedge: Towards deep graph convolutional networks on node classification. arXiv preprint arXiv:1907.10903 (2019).
Testa et al. (2024) Lucia Testa, Claudio Battiloro, Stefania Sardellitti, and Sergio Barbarossa. 2024. Stability of Graph Convolutional Neural Networks through the lens of small perturbation analysis. In ICASSP 2024-2024 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). IEEE, 6865–6869.
Tian et al. (2023) Yijun Tian, Kaiwen Dong, Chunhui Zhang, Chuxu Zhang, and Nitesh V Chawla. 2023. Heterogeneous graph masked autoencoders. In Proceedings of the AAAI Conference on Artificial Intelligence, Vol. 37. 9997–10005.
Tian et al. (2024) Yijun Tian, Huan Song, Zichen Wang, Haozhu Wang, Ziqing Hu, Fang Wang, Nitesh V Chawla, and Panpan Xu. 2024. Graph neural prompting with large language models. In Proceedings of the AAAI Conference on Artificial Intelligence, Vol. 38. 19080–19088.
Tian et al. (2022a) Yijun Tian, Chuxu Zhang, Zhichun Guo, Chao Huang, Ronald Metoyer, and Nitesh V Chawla. 2022a. RecipeRec: A heterogeneous graph learning model for recipe recommendation. arXiv preprint arXiv:2205.14005 (2022).
Tian et al. (2022b) Yijun Tian, Chuxu Zhang, Zhichun Guo, Xiangliang Zhang, and Nitesh Chawla. 2022b. Learning MLPs on graphs: A unified view of effectiveness, robustness, and efficiency. In The Eleventh International Conference on Learning Representations.
Tian et al. (2022c) Yijun Tian, Chuxu Zhang, Zhichun Guo, Xiangliang Zhang, and Nitesh V Chawla. 2022c. Nosmog: Learning noise-robust and structure-aware mlps on graphs. arXiv preprint arXiv:2208.10010 (2022).
Tong et al. (2006) Hanghang Tong, Christos Faloutsos, and Jia-Yu Pan. 2006. Fast random walk with restart and its applications. In Sixth international conference on data mining (ICDM’06). IEEE, 613–622.
Veličković et al. (2017) Petar Veličković, Guillem Cucurull, Arantxa Casanova, Adriana Romero, Pietro Lio, and Yoshua Bengio. 2017. Graph attention networks. arXiv preprint arXiv:1710.10903 (2017).
Wang et al. (2021) Binghui Wang, Jinyuan Jia, Xiaoyu Cao, and Neil Zhenqiang Gong. 2021. Certified robustness of graph neural networks against adversarial structural perturbation. In Proceedings of the 27th ACM SIGKDD Conference on Knowledge Discovery & Data Mining. 1645–1653.
Wang et al. (2023) Yexin Wang, Zhi Yang, Junqi Liu, Wentao Zhang, and Bin Cui. 2023. Scapin: Scalable Graph Structure Perturbation by Augmented Influence Maximization. Proceedings of the ACM on Management of Data 1, 2 (2023), 1–21.
Wen et al. (2024) Liangliang Wen, Jiye Liang, Kaixuan Yao, and Zhiqiang Wang. 2024. Black-Box Adversarial Attack on Graph Neural Networks With Node Voting Mechanism. IEEE Transactions on Knowledge and Data Engineering (2024).
Wu et al. (2022) Shiwen Wu, Fei Sun, Wentao Zhang, Xu Xie, and Bin Cui. 2022. Graph neural networks in recommender systems: a survey. Comput. Surveys 55, 5 (2022), 1–37.
Wu et al. (2023) Yihan Wu, Aleksandar Bojchevski, and Heng Huang. 2023. Adversarial weight perturbation improves generalization in graph neural networks. In Proceedings of the AAAI Conference on Artificial Intelligence, Vol. 37. 10417–10425.
Xia et al. (2019) Feng Xia, Jiaying Liu, Hansong Nie, Yonghao Fu, Liangtian Wan, and Xiangjie Kong. 2019. Random walks: A review of algorithms and applications. IEEE Transactions on Emerging Topics in Computational Intelligence 4, 2 (2019), 95–107.
Xia et al. (2023) Jun Xia, Haitao Lin, Yongjie Xu, Cheng Tan, Lirong Wu, Siyuan Li, and Stan Z Li. 2023. Gnn cleaner: Label cleaner for graph structured data. IEEE Transactions on Knowledge and Data Engineering (2023).
Xie and Szymanski (2013) Jierui Xie and Boleslaw K Szymanski. 2013. Labelrank: A stabilized label propagation algorithm for community detection in networks. In 2013 IEEE 2nd Network Science Workshop (NSW). IEEE, 138–143.
Xing and Ghorbani (2004) Wenpu Xing and Ali Ghorbani. 2004. Weighted pagerank algorithm. In Proceedings. Second Annual Conference on Communication Networks and Services Research, 2004. IEEE, 305–314.
Xu et al. (2024) Xilie Xu, Jingfeng Zhang, Feng Liu, Masashi Sugiyama, and Mohan S Kankanhalli. 2024. Enhancing adversarial contrastive learning via adversarial invariant regularization. Advances in Neural Information Processing Systems 36 (2024).
Yang et al. (2022) Shuo Yang, Erkun Yang, Bo Han, Yang Liu, Min Xu, Gang Niu, and Tongliang Liu. 2022. Estimating instance-dependent bayes-label transition matrix using a deep neural network. In International Conference on Machine Learning. PMLR, 25302–25312.
Yao et al. (2019) Liang Yao, Chengsheng Mao, and Yuan Luo. 2019. Graph convolutional networks for text classification. In Proceedings of the AAAI conference on artificial intelligence, Vol. 33. 7370–7377.
Yuan et al. (2023) Jinliang Yuan, Hualei Yu, Meng Cao, Jianqing Song, Junyuan Xie, and Chongjun Wang. 2023. Self-supervised robust Graph Neural Networks against noisy graphs and noisy labels. Applied Intelligence 53, 21 (2023), 25154–25170.
Zhai et al. (2023) Runtian Zhai, Bingbin Liu, Andrej Risteski, Zico Kolter, and Pradeep Ravikumar. 2023. Understanding Augmentation-based Self-Supervised Representation Learning via RKHS Approximation. arXiv preprint arXiv:2306.00788 (2023).
Zhang et al. (2023) Jingfeng Zhang, Bo Song, Bo Han, Lei Liu, Gang Niu, and Masashi Sugiyama. 2023. Assessing Vulnerabilities of Adversarial Learning Algorithm through Poisoning Attacks. arXiv preprint arXiv:2305.00399 (2023).
Zhang et al. (2024) Jingfeng Zhang, Bo Song, Haohan Wang, Bo Han, Tongliang Liu, Lei Liu, and Masashi Sugiyama. 2024. BadLabel: A Robust Perspective on Evaluating and Enhancing Label-Noise Learning. IEEE Transactions on Pattern Analysis and Machine Intelligence (2024).
Zhang et al. (2021) Jingfeng Zhang, Xilie Xu, Bo Han, Tongliang Liu, Gang Niu, Lizhen Cui, and Masashi Sugiyama. 2021. NoiLin: Improving adversarial training and correcting stereotype of noisy labels. arXiv preprint arXiv:2105.14676 (2021).
Zhang et al. (2020) Xiaozhu Zhang, Dirk Witthaut, Marc Timme, et al. 2020. Topological determinants of perturbation spreading in networks. Physical Review Letters 125, 21 (2020), 218301.
Zhao et al. (2024) Kai Zhao, Qiyu Kang, Yang Song, Rui She, Sijie Wang, and Wee Peng Tay. 2024. Adversarial robustness in graph neural networks: A Hamiltonian approach. Advances in Neural Information Processing Systems 36 (2024).
Zhuang (2024) Jun Zhuang. 2024. Robust Data-centric Graph Structure Learning for Text Classification. In Companion Proceedings of the ACM on Web Conference 2024. 1486–1495.
Zhuang and Al Hasan (2022a) Jun Zhuang and Mohammad Al Hasan. 2022a. Defending graph convolutional networks against dynamic graph perturbations via bayesian self-supervision. In Proceedings of the AAAI Conference on Artificial Intelligence, Vol. 36. 4405–4413.
Zhuang and Al Hasan (2022b) Jun Zhuang and Mohammad Al Hasan. 2022b. Deperturbation of online social networks via bayesian label transition. In Proceedings of the 2022 SIAM International Conference on Data Mining (SDM). SIAM, 603–611.
Zhuang and Al Hasan (2022c) Jun Zhuang and Mohammad Al Hasan. 2022c. Robust node classification on graphs: Jointly from bayesian label transition and topology-based label propagation. In Proceedings of the 31st ACM International Conference on Information & Knowledge Management. 2795–2805.
Zhuang and Hasan (2022) Jun Zhuang and Mohammad Al Hasan. 2022. How does bayesian noisy self-supervision defend graph convolutional networks? Neural Processing Letters 54, 4 (2022), 2997–3018.
Zhuang and Hasan (2023) Jun Zhuang and Mohammad Al Hasan. 2023. Robust Node Representation Learning via Graph Variational Diffusion Networks. arXiv preprint arXiv:2312.10903 (2023).
Zhuang and Kennington (2024) Jun Zhuang and Casey Kennington. 2024. Understanding Survey Paper Taxonomy about Large Language Models via Graph Representation Learning. arXiv preprint arXiv:2402.10409 (2024).