Distance Recomputator and Topology Reconstructor for Graph Neural Networks

Dong Liu1, Meng Jiang2
Abstract

Graph Neural Networks (GNNs) have gained prominence in semi-supervised learning for graph representation due to their ability to capture intricate node relationships. A recent trend is k-hop structure learning for GNNs. While GAMLP [ZYS+22] trains an MLP layer for each k-hop domain, ImprovingTE [YWZL23] enhances this approach by injecting contextualized substructure information to use the k-hop structure effectively. However, these k-hop approaches depend heavily on sampling performance, which limits the upper bound of accuracy and makes the outcome unstable. To address this limitation, inspired by the “coreset selection” idea [GZB22], we develop a novel approach that facilitates k-hop structure sampling and message passing, extending the reach and depth of information flow within the graph. To tackle the challenges of mislabeling and inaccuracies in datasets, we introduce two models: the “Distance Recomputator” and the “Topology Reconstructor.” The Distance Recomputator recalibrates the distances between nodes, thereby refining node representations and interactions in a more accurate and context-aware manner. Complementing this, the Topology Reconstructor dynamically adjusts local graph structures, enhancing the model’s adaptability to complex and evolving graph topologies. Our experimental results indicate significant performance enhancements of these models over existing benchmarks.

1University of Wisconsin-Madison,

2University of Notre Dame

1 Introduction

Graph Neural Networks (GNNs) have emerged as a powerful tool in the realm of machine learning, adept at capturing the complex relationships inherent in graph-structured data. From social network analysis to molecular structure interpretation, GNNs have demonstrated remarkable versatility. However, their ability to model dependencies and interactions within graph structures is fundamentally constrained by the methods used to compute and represent node relationships. In addition, local structure (k-hop) learning with GNNs has become increasingly popular, as in [WYHL21]. Traditional GNN architectures, while effective, often struggle to efficiently encode the dynamic and intricate topologies of real-world graphs. This limitation motivates the need for more adaptive and robust models that can better capture the nuances of graph data.

Current GNN models predominantly rely on static node representations and fixed neighborhood aggregation schemes, leading to suboptimal performance in scenarios where graph topology is not only complex but also dynamic. Additionally, the computation of node distances in large graphs often incurs significant computational overhead, thereby limiting scalability. These challenges are accentuated in applications involving large-scale graphs or rapidly evolving network structures, such as in communication networks or dynamic social graphs. The limitations in handling varying node distances and adapting to topological changes prompt the exploration of more flexible and efficient approaches.

To address these challenges, we introduce two novel models: the “Distance Recomputator” and the “Topology Reconstructor.” The Distance Recomputator is designed to efficiently recalibrate node distances within a specified k-hop domain, leveraging a dynamic encoding scheme that adapts to changes in node proximity and graph density. This allows for a more nuanced representation of node relationships, enhancing the accuracy of dependency modeling in the network. Complementing this, the Topology Reconstructor dynamically adjusts the local network topology, enabling the model to respond to structural changes in the graph. By integrating these models into standard GNN frameworks, we propose a solution that not only addresses the static nature of traditional models but also introduces a level of adaptability hitherto unseen in GNN architectures.

Our experimental evaluation of these models demonstrates a marked improvement in performance across a range of benchmark datasets, particularly in tasks involving dynamic or large-scale graphs. The Distance Recomputator shows enhanced efficiency in recalculating node distances, leading to more accurate node representations and predictions. Similarly, the Topology Reconstructor proves effective in adapting to topological changes, thereby maintaining model robustness. Complementing these empirical results, we provide a comprehensive theoretical analysis that elucidates the mechanisms by which these models achieve superior performance. This analysis not only validates our experimental findings but also contributes to a deeper understanding of the underlying principles governing effective GNN design.

2 Motivation

2.1 Topology Imbalance Problem

“Not every link is as useful as the topology suggests.” The topology of a graph does not always accurately reflect the significance of each connection [LLC+23]. Consider a social network: an account labeled “math” might follow one food-maker and ten mathematical educators. In this context, the link to the food-maker is less relevant for label prediction tasks. Current GNN models indiscriminately propagate, aggregate, and update information from all 1-hop neighbors, including less relevant connections like the food-maker. This issue is the ‘Topology Imbalance Problem,’ where certain links (e.g., those to math educators) are more informative and relevant than others (e.g., the food-maker link).

Although models like Graph Attention Networks (GAT) assign an importance score to each link, they rely on a synchronous aggregator, treating all nodes within the same hop identically during propagation and update phases. To address this imbalance, our paper introduces the ’Distance Recomputator’ model, which recalculates node distances to better reflect the relevance of links. Complementarily, we propose an ’Asynchronous Aggregator’ that enables nodes to be aggregated based on these recalculated distances, allowing for more nuanced and context-sensitive information processing.

2.2 The Shortcoming of Synchronous Aggregator

Synchronous aggregators in Graph Neural Networks process all neighbors in a single aggregation step. This approach can inadvertently amplify the impact of less relevant or erroneous links, potentially degrading the model’s performance. It fails to differentiate between the varying levels of relevance among neighboring nodes, treating all connections as equally significant during the aggregation process. Our proposed asynchronous aggregator aims to mitigate this shortcoming by allowing for selective, relevance-based aggregation of neighborhood information, thereby enhancing the model’s accuracy and robustness against irrelevant or misleading connections.

Some existing work studies asynchronous processing in GNNs. AEGNN [SGS22] designs update rules that restrict the recomputation of network activations to the nodes affected by each new event, while Gated Graph Sequence Neural Networks [LTBZ17] use gated recurrent units to extend GNNs to sequence outputs and thereby realize asynchronous processing. Although these methods provide asynchronous processing in message passing, they do not offer a solution at the topology level.

2.3 Heat Diffusion in Graph Neural Networks

Heat diffusion in GNNs is inspired by the physical phenomenon of heat transmission, where heat (information) dissipates as it moves away from the heat source [TDKF16]. Our work also explores the concept of heat diffusion in the context of GNNs, particularly in k-hop message passing. We simulate thermodynamic properties, where information intensity decreases progressively during propagation. By applying this concept to GNNs, we introduce a novel method to control information flow, ensuring that the propagation of data mimics natural attenuation over distance. This method not only provides a more realistic approach to information dissemination in networks but also helps reduce noise and improve the signal-to-noise ratio in the message-passing process. More details about this approach can be found in our supplementary document, “Heat Diffusion in GNNs”.

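To make the attenuation idea concrete, the following minimal sketch computes per-hop weights that shrink with hop distance. It assumes the heat-kernel coefficients $e^{-t} t^{k} / k!$ common in the graph diffusion literature; the text above does not specify a weighting, so the function name `heat_kernel_weights` and the diffusion time `t` are illustrative assumptions rather than the authors' formulation.

```python
import math

def heat_kernel_weights(max_hop, t=0.5):
    """Normalized weights for hops 0..max_hop.

    Assumed form: exp(-t) * t**k / k!  (heat-kernel coefficients).
    For t <= 1 the weights are non-increasing in the hop index k.
    """
    raw = [math.exp(-t) * t**k / math.factorial(k) for k in range(max_hop + 1)]
    total = sum(raw)
    return [w / total for w in raw]

# Weights for the center node (hop 0) and its 1-, 2-, and 3-hop neighbors.
weights = heat_kernel_weights(max_hop=3, t=0.5)
```

With `t` at or below 1, farther hops contribute less, which mirrors the decay described above. A larger `t` would shift weight toward more distant hops.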

2.4 Paper Contribution

  • K-hop Diffusion Message Passing Strategy: We develop a message passing framework enabling vertices to learn k-hop information at each propagation stage. Post-propagation, node representations and inter-node distances are recalibrated, reflecting the acquired k-hop information. This layer-by-layer information dissemination, incorporating heat diffusion within the k-hop domain, refines node characteristics and relational metrics continuously.

  • Distance Recomputator: Our model employs an attention mechanism to recalculate node distances, considering both k-hop topology and vertex features. This recalibration allows for accurate, context-aware node relationship representations within the graph.

  • Topology Reconstructor: Introducing the first k-hop topology reconstruction method, our model leverages k-hop topology from sampling to perform reconstruction based on computed “similarity distances.” Nodes exceeding a similarity distance threshold are repositioned to optimize network configurations for enhanced learning outcomes.
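The threshold-based reconstruction in the last item can be illustrated with a small sketch. It is only one possible reading, under loud assumptions: cosine distance stands in for the "similarity distance", and the thresholds `drop_above` and `link_below` are hypothetical names. The learned, attention-based distance described above is not reproduced here.

```python
import math

def cosine_distance(x, y):
    """1 - cosine similarity of two non-zero vectors."""
    dot = sum(a * b for a, b in zip(x, y))
    norm = math.sqrt(sum(a * a for a in x)) * math.sqrt(sum(b * b for b in y))
    return 1.0 - dot / norm

def reconstruct_topology(edges, candidates, features, drop_above, link_below):
    """Drop edges whose distance exceeds drop_above and add candidate
    pairs (for example, sampled k-hop pairs) closer than link_below."""
    def dist(pair):
        u, v = pair
        return cosine_distance(features[u], features[v])

    kept = {p for p in edges if dist(p) <= drop_above}
    added = {p for p in candidates if p not in edges and dist(p) < link_below}
    return kept | added

# Toy version of the social-network example: the food-maker link is dissimilar.
features = {
    "math_account": [1.0, 0.0],
    "math_teacher": [0.9, 0.1],
    "food_maker": [0.0, 1.0],
    "math_blog": [1.0, 0.1],
}
edges = {("math_account", "math_teacher"), ("food_maker", "math_account")}
candidates = {("math_account", "math_blog")}  # a hypothetical 2-hop pair
new_edges = reconstruct_topology(edges, candidates, features,
                                 drop_above=0.5, link_below=0.1)
```

In this toy graph the dissimilar food-maker edge is removed and the nearby 2-hop pair is connected directly.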

3 Related Work and Innovation

3.1 K-hop Message Passing

The prevailing K-Hop models in graph neural networks predominantly emphasize K-time message passing with 1-hop propagation schemes. Existing models lack the capability for k-hop message passing with k-hop propagation in a single iteration, or for one-time k-hop sampling methods. For instance, the enhancing multi-hop connectivity model reevaluates connectivity in the k-hop neighborhood by sampling k-hop neighbors and reweighting high-order connections, rewarding highly-related nodes and penalizing less-correlated ones [LJZ+22]. Another model, KP-GNN, incorporates peripheral embeddings to enrich representation learning at each layer [FCL+22]. Our model breaks new ground by efficiently sampling k-hop neighborhood information and realizing k-hop message passing within this domain. By exploring k-hop neighborhoods in each propagation cycle, our framework captures more locality information than conventional 1-hop methods, thereby enhancing label prediction accuracy for the central node with a comprehensive k-hop perspective.

3.2 Diffusion Propagation in Graph Neural Networks

In our study, we introduce a novel model that diverges from traditional diffusion-based approaches like the Diffusion Decent Network. A classic model in this realm is MAGNA [WYHL21], which facilitates information descent through each propagation stage. In MAGNA, vertex and edge embeddings are trained to recompute distances for both one-hop neighbors and n-hop ($n \leq k$) neighbors, considering all possible i-hop ($i \leq k$) paths. In contrast, “Diffusion Improves Graph Learning” [KWG19] approximates the diffusion equation with an infinite series, enhancing operational speed on large graphs. Our k-hop message passing model simulates heat diffusion in the k-hop domain of graph neural networks, offering a more advanced approach to information propagation and distribution.

3.3 Graph Imbalancing Problems

The issue of label and topology imbalance in graph neural networks has been relatively underexplored. Zhao et al. [ZLZW22] proposed adjusting edge weights to mitigate topology imbalance effects without altering the graph structure. In contrast, our research introduces two innovative models to address this challenge comprehensively. The first, GKHDDRA, utilizes hop jumping to adjust the graph’s topology. The second, GDRA, fine-tunes the dataset by discarding irrelevant edges and forming new connections between strongly similar nodes, reshaping the graph’s topology to achieve a more balanced and representative structure.

4 Basic Methodology on Graph Neural Networks

Consider an undirected graph $\mathcal{G}=(\mathcal{V},\mathcal{E})$ with node set $\mathcal{V}$ and edge set $\mathcal{E}$. The adjacency matrix $A \in R^{N\times N}$ describes the existence of connections among nodes. Every node $v_i \in \mathcal{V}$ has an associated feature vector $x_i \in R^{d\times 1}$, so the whole feature matrix can be represented as $X=[x_1,x_2,...,x_N]^T$. For a semi-supervised node classification task on a graph, the labels $Y_L$ are only available for a subset of the nodes (the training set), so the goal is to predict the labels of the remaining nodes.

In GNN tasks, the topology information and node features are combined to learn the representation vector of a node for node-level tasks. Modern GNNs aggregate the information from nodes in the neighborhood and update the representation of each node by a message-passing scheme. After k iterations of aggregation, the representation of a node captures the structural information within its k-hop neighborhood. Formally, the layer-wise aggregation of a GNN is given by

$$a_v^{(k)} = \mathrm{Aggregate}^{(k)}\left(\{h_u^{(k-1)} : u \in \mathcal{N}(v)\}\right), \quad h_v^{(k)} = \mathrm{Combine}^{(k)}\left(h_v^{(k-1)}, a_v^{(k)}\right) \tag{1}$$

where $h_v^{(k)}$ is the feature vector of node $v$ at the $k^{th}$ layer. We initialize $h_v^{(0)} = x_v$, and $\mathcal{N}(v)$ is the set of nodes connected to $v$. Existing work focuses on improving the 1-hop $\mathrm{Aggregate}^{(k)}(\cdot)$ and $\mathrm{Combine}^{(k)}(\cdot)$ operations. For example, GCN employs a convolution operation with the following rule

$$H^{(l+1)} = \sigma\left(\hat{A} H^{(l)} \theta^{(l)}\right) \tag{2}$$

where $H^{(l)}$ are the output features of the $l^{th}$ layer. We initialize $H^{(0)} = X$; $\hat{A} = \hat{D}^{-\frac{1}{2}}(A+I)\hat{D}^{-\frac{1}{2}}$ is the symmetrically normalized adjacency matrix with self-loops, where $\hat{D}$ is the degree matrix of $A+I$, and $\theta^{(l)}$ is the weight matrix of the $l^{th}$ layer. GNN models stack GNN layers to explore representations over a larger domain. Taking a two-layer GCN model as an example, the representation can be calculated as $Z = \mathrm{softmax}(\hat{A}\,\sigma(\hat{A} X \theta^{(0)})\,\theta^{(1)})$. For a node classification task with labeled nodes corresponding to $Y_L$, the objective function is

$$L = -\frac{1}{|\mathcal{Y}_{\mathcal{L}}|} \sum_{v_i \in \mathcal{Y}_{\mathcal{L}}} y_i \log(z_i) \tag{3}$$
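As a concrete illustration of the GCN rule, the two-layer form, and the objective in Eq. (3), the sketch below runs a forward pass on a three-node path graph using only the Python standard library. The choice of ReLU for $\sigma$, the toy weights, and all function names are assumptions made for this example; $y_i$ is treated as a one-hot label, so $y_i \log(z_i)$ reduces to the log-probability of the true class.

```python
import math

def matmul(a, b):
    """Dense matrix product of two lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def normalized_adjacency(adj):
    """Return D^{-1/2} (A + I) D^{-1/2}, with D the degree matrix of A + I."""
    n = len(adj)
    a_tilde = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
               for i in range(n)]
    deg = [sum(row) for row in a_tilde]
    return [[a_tilde[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)]
            for i in range(n)]

def relu(m):
    return [[max(0.0, x) for x in row] for row in m]

def softmax_rows(m):
    out = []
    for row in m:
        shift = max(row)  # subtract the row maximum for numerical stability
        exps = [math.exp(x - shift) for x in row]
        total = sum(exps)
        out.append([e / total for e in exps])
    return out

def gcn_forward(adj, x, theta0, theta1):
    """Z = softmax(A_hat * relu(A_hat * X * theta0) * theta1)."""
    a_hat = normalized_adjacency(adj)
    hidden = relu(matmul(matmul(a_hat, x), theta0))
    return softmax_rows(matmul(matmul(a_hat, hidden), theta1))

def cross_entropy(z, labels, labeled_nodes):
    """Mean negative log-probability of the true class over labeled nodes."""
    return -sum(math.log(z[i][labels[i]]) for i in labeled_nodes) / len(labeled_nodes)

# Path graph 0 - 1 - 2 with 2-dimensional features and 2 classes.
A = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
X = [[1.0, 0.0], [0.5, 0.5], [0.0, 1.0]]
theta0 = [[1.0, -1.0], [-1.0, 1.0]]
theta1 = [[1.0, 0.0], [0.0, 1.0]]
Z = gcn_forward(A, X, theta0, theta1)
loss = cross_entropy(Z, labels={0: 0, 2: 1}, labeled_nodes=[0, 2])
```

Only nodes 0 and 2 are labeled here, so the loss averages over those two rows of `Z` while node 1 is left for prediction, as in the semi-supervised setting described above.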

4.1 The In-Depth Learning of Graph Neural Networks

Contemporary Graph Neural Network (GNN) models predominantly adopt a 1-hop depth message passing paradigm, where each propagation cycle explores information at a singular depth level. In contrast to GraphSAGE [HYL17], which samples along k-depth but aggregates at every depth, our approach aims to develop a model capable of both sampling and aggregating in a k-hop domain. This method is anticipated to yield a more comprehensive understanding of the local structural context within the network.

We propose an in-depth learning algorithm for k-hop sampling and message passing in GNNs, characterized by the following steps:

In the proposed Graph Neural Network model, we begin with a Static Preprocessing Step, in which a substantial number of neighbors are sampled for each node. This phase constructs a k-hop adjacency matrix for every node, capturing and storing the k-hop neighborhood information. This process is computationally efficient, with a complexity of $\mathcal{O}(\mathcal{V} \cdot \mathcal{K})$, where $\mathcal{V}$ represents the total number of vertices in the graph and $\mathcal{K}$ the hop depth. Following this, the model enters the Dynamic Update Phase during propagation and update cycles. In this phase, a smaller, more focused subset of k-hop neighbors for each node is dynamically sampled. This targeted approach refines the pre-processed k-hop neighborhood matrix, ensuring that the model remains up-to-date and relevant in representing the evolving graph structure. The computational complexity of this phase is maintained at $\mathcal{O}(\mathcal{K} \cdot \mathcal{R})$, with $\mathcal{R}$ indicating the limited number of vertices sampled in the k-hop domain during each propagation step. Together, these steps create a robust framework for in-depth learning in Graph Neural Networks, optimizing both the breadth and depth of neighborhood information processing.
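The two phases can be sketched as follows. This is an illustrative simplification under stated assumptions, not the authors' implementation: a breadth-first search collects the nodes at exactly k hops, uniform random sampling stands in for the sampler, and the names `khop_layers`, `static_sampling`, `dynamic_resample`, `per_hop`, and `refresh` are hypothetical. The sketch also makes no attempt to match the stated complexity bounds.

```python
import random
from collections import deque

def khop_layers(adj_list, source, max_hop):
    """BFS from source; return {k: nodes at exactly k hops} for k = 1..max_hop."""
    dist = {source: 0}
    queue = deque([source])
    layers = {k: set() for k in range(1, max_hop + 1)}
    while queue:
        u = queue.popleft()
        if dist[u] == max_hop:
            continue  # do not expand beyond the requested depth
        for v in adj_list[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                layers[dist[v]].add(v)
                queue.append(v)
    return layers

def static_sampling(adj_list, max_hop, per_hop, rng):
    """Preprocessing: store up to per_hop sampled neighbors per hop for every node."""
    store = {}
    for node in adj_list:
        layers = khop_layers(adj_list, node, max_hop)
        store[node] = {
            k: set(rng.sample(sorted(nodes), min(per_hop, len(nodes))))
            for k, nodes in layers.items()
        }
    return store

def dynamic_resample(store, adj_list, batch, max_hop, refresh, rng):
    """Update phase: merge a small fresh sample into the stored sets of batch nodes."""
    for node in batch:
        for k, nodes in khop_layers(adj_list, node, max_hop).items():
            fresh = rng.sample(sorted(nodes), min(refresh, len(nodes)))
            store[node][k].update(fresh)
    return store

# Path graph 0 - 1 - 2 - 3 - 4
adj_list = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
rng = random.Random(0)
store = static_sampling(adj_list, max_hop=2, per_hop=1, rng=rng)
store = dynamic_resample(store, adj_list, batch=[2], max_hop=2, refresh=1, rng=rng)
```

The stored structure is indexed by node and hop, so the update phase only touches entries for the nodes in the current batch.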

The k-hop attentive message passing strategy is underpinned by both dynamic and static sampling paths. This mechanism is grounded in the principles of graph diffusion networks, while the computation of distance reevaluation indicators is inspired by and derived from the architecture of the Graph Attention Network (GAT) [VCC+18]. This novel approach aims to harness the strengths of both static and dynamic sampling methodologies to enhance the depth and accuracy of message passing in GNNs.

Input: Graph $\mathcal{G}(\mathcal{V},\mathcal{E})$; depth $K$; sampling number per hop $N$; input features $Z$; adjacency matrix $A$
Output: NGH  // K-hop sampling storage, a 3D dictionary
NGH[0][:] $\leftarrow$ A
// Static sampling before computation
for n in $V$ do
    for $k = K ... 1$ do
        $SN = RS(N, Layer^{(k-1)})$
        for $v_i$ in layer(n, k-1) do
            $NGH[k][n] \leftarrow NGH[k][n] \cup GraphSAGE(v_i, SN[i])$
        end for
    end for
end for
// Dynamic re-sampling during computation
$\mathcal{B}^{K} \leftarrow \mathcal{B}$  // $\mathcal{B}$ denotes the current batch of nodes to be processed
for $k = K ... 1$ do
    $B^{k-1} \leftarrow B^{k}$
    for $u_i \in \mathcal{B}$ do
        $SN_s \leftarrow RS(u, N)$
        $NGH[k][n] \leftarrow NGH[k][n] \cup GraphSAGE(u_i, SN_s[i])$
        $z_{u_i} \leftarrow GKHDA(u_i, k, NGH[k][n])$
    end for
end for
Algorithm 1 Preprocess Static Sampling and Dynamic Resampling Strategies
Input: Graph $\mathcal{G}(\mathcal{V},\mathcal{E})$;
    input features: $x_v$;
    depth $K$;
    non-linearity: $\sigma$;
    differentiable aggregator functions $AGGREGATE_k$, $\forall k \in \{1,...,K\}$;
    neighborhood sampling functions $N_k : v \longrightarrow 2^{v}$, $\forall k \in \{1,...,K\}$;
    NGH $\leftarrow$ StaticSampling($\mathcal{G}$, K, Z, A);
    weight matrices $W^k$  // learnable matrix
Output: Vector representation $z_v$ for all $v \in \mathcal{B}$
$h_v^0 \leftarrow x_v$, $\forall v \in B$
for $k = 1 ... K$ do
    for $u \in B^k$ do
        $h^k_u \leftarrow \sigma(W^k \cdot (h_u^{k-1}, h_{N(u)}^k))$;
        $h^k_u \leftarrow h^k_u / \lVert h^k_u \rVert_2$;
    end for
end for
for $k = K ... 1$ do
    for $u \in NGH[u,k]$ do
        $h^k_{N(u)} \leftarrow \mathcal{AGGREGATE}_k\{\mathcal{GAT}(h^{k-1}_{u'}, W^k)\ \forall u' \in N_k(u)\}$;
    end for
end for
$z_u \leftarrow \mathcal{COMBINE}(h_u^i), \forall i \in range(1, K+1)$
Algorithm 2 K-hop Diffusion Attention Layer Implementation

In our Graph Neural Network framework, the implementation of the $\mathcal{COMBINE}$ function plays a pivotal role. It can be expressed mathematically as:

$$z_u \leftarrow W_i \cdot h_u^i, \quad \forall i \in \text{range}(1, K+1) \tag{4}$$

where $z_u$ represents the combined feature vector for a node $u$, and $h_u^i$ denotes the feature vector of the node at the $i$-th hop. $W_i$ is the weight matrix corresponding to the $i$-th hop, ensuring that features from different hops are weighted differently in the aggregation process.
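Eq. (4) does not state how the per-hop terms are merged into a single $z_u$. The minimal sketch below assumes they are summed, i.e. $z_u = \sum_{i=1}^{K} W_i h_u^i$. This summation, the toy weights, and the names `matvec` and `combine` are assumptions for illustration only.

```python
def matvec(w, h):
    """Multiply matrix w (list of rows) by vector h."""
    return [sum(w_rc * h_c for w_rc, h_c in zip(row, h)) for row in w]

def combine(hop_features, hop_weights):
    """Sum W_i . h_u^i over hops i = 1..K (assumed reading of Eq. 4)."""
    z = [0.0] * len(hop_weights[0])
    for w, h in zip(hop_weights, hop_features):
        z = [a + b for a, b in zip(z, matvec(w, h))]
    return z

h_u = [[1.0, 2.0], [0.5, 0.5]]        # features gathered at hop 1 and hop 2
W = [[[1.0, 0.0], [0.0, 1.0]],        # W_1: identity
     [[0.5, 0.0], [0.0, 0.5]]]        # W_2: halves the 2-hop contribution
z_u = combine(h_u, W)
```

Giving each hop its own matrix lets the model down-weight distant hops, as `W_2` does in this toy setup.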

Similarly, the $\mathcal{GAT}$ function, pivotal in our model for the attention mechanism, is formulated as:

$$\alpha_{ij} = \frac{\text{LeakyReLU}(\vec{a}^{T}[W\hat{h_i} \,||\, W\hat{h_j}])}{\sum_{k \in N_i} \text{LeakyReLU}(\vec{a}^{T}[W\hat{h_i} \,||\, W\hat{h_k}])} \tag{5}$$

Here, $\alpha_{ij}$ is the attention coefficient between nodes $i$ and $j$, calculated using the LeakyReLU activation function. $\vec{a}$ is the attention vector and $W$ is the weight matrix applied to the feature vectors $\hat{h_i}$ and $\hat{h_j}$ of the nodes $i$ and $j$. The attention mechanism effectively captures the importance of each neighbor’s features in the aggregation process.
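A small sketch of the attention computation follows. One deliberate difference should be noted: Eq. (5) as printed normalizes the LeakyReLU scores directly, whereas this sketch uses the softmax normalization of the original GAT formulation [VCC+18], which exponentiates the scores and keeps every coefficient positive. The negative slope of 0.2, the toy parameters, and the function names are assumptions for illustration.

```python
import math

def leaky_relu(x, slope=0.2):
    return x if x > 0 else slope * x

def matvec(w, h):
    """Multiply matrix w (list of rows) by vector h."""
    return [sum(w_rc * h_c for w_rc, h_c in zip(row, h)) for row in w]

def attention_coefficients(h_i, neighbor_feats, w, a, slope=0.2):
    """Softmax over LeakyReLU(a^T [W h_i || W h_j]) for each neighbor j."""
    wh_i = matvec(w, h_i)
    scores = []
    for h_j in neighbor_feats:
        concat = wh_i + matvec(w, h_j)  # list concatenation: [W h_i || W h_j]
        scores.append(leaky_relu(sum(x * y for x, y in zip(a, concat)), slope))
    shift = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

W = [[1.0, 0.0], [0.0, 1.0]]          # toy weight matrix (identity)
a = [1.0, 0.0, 1.0, 0.0]              # toy attention vector
h_center = [1.0, 0.0]
neighbors = [[1.0, 0.0], [0.0, 1.0]]  # first neighbor resembles the center node
alpha = attention_coefficients(h_center, neighbors, W, a)
```

In this toy setup the neighbor whose features resemble the center node receives the larger coefficient, which is the behavior the relevance-aware aggregation relies on.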

In this paper, we adopt a structure that considers both k-hop sampling and k-hop message passing within the graph diffusion network framework. Striking a balance between computational efficiency (time) and model performance, our approach involves static preprocessing for initial k-hop neighborhood sampling, followed by dynamic resampling in subsequent iterations. This methodology ensures that our model remains efficient while effectively capturing the complex dependencies in the graph structure across multiple hops.

4.2 Distance Recomputator and Topology Reconstructor Implementation

Input: Graph $\mathcal{G}(V,E)$;
  input features: $x_{v}$, $\forall v\in\mathcal{B}$;
  depth $K$; weight matrices $W^{k}$;
  non-linearity: $\sigma$;
  differentiable aggregator functions $\text{AGGREGATE}_{k}$, $\forall k\in\{1,...,K\}$;
  neighborhood sampling functions $N_{k}:v\longrightarrow 2^{v}$, $\forall k\in\{1,...,K\}$;
  recompute distance upper-bound: $\alpha$;
  recompute distance lower-bound: $\beta$;
  sample factor: $\gamma$
Init Global NGH = $[0]_{V\times(k\times V)}$
Init Global W = $[\delta_{ij}]_{k\times F}$ // learnable matrix
Output: Vector representation $z_{v}$ for all $v\in\mathcal{B}$
NGH = K-Hop-Sample($\mathcal{G}$)
for each propagation do
    NGH = NGH $\cup$ Resample($\mathcal{G}$, ResampleNum)
    NGH$_{\text{compute}}$ = Sample(NGH, $\gamma$)
    $Z_{v}$ = GKHDDRA($\mathcal{G}$, NGH$_{\text{compute}}$)
    DRM = mask($\mathcal{GAT}$($\mathcal{G}$, $Z_{v}$), NGH)
    for i = (K-1), …, 0 do
        DRM[i+1][:, :] $\leftarrow$ DRM[i+1][:, :] + DRM[i][DRM[i] > $\alpha$]
        DRM[i][:, :] $\leftarrow$ DRM[i][:, :] - DRM[i][DRM[i] > $\alpha$]
    end for
    for i = 1, …, K do
        DRM[i][:, :] $\leftarrow$ DRM[i][:, :] + DRM[i-1][DRM[i-1] < $\beta$]
        DRM[i-1][:, :] $\leftarrow$ DRM[i-1][:, :] - DRM[i-1][DRM[i-1] < $\beta$]
    end for
end for
Algorithm 3: Distance Recomputator and Asynchronous Aggregator Implementation
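
The two inner loops of Algorithm 3 can be read as threshold-based moves between hop levels. The sketch below is one possible NumPy reading, assuming DRM is stored as a dense array of shape (K+1, V, V) whose slice `drm[i]` holds the masked attention values of the node pairs currently assigned to hop i, and assuming the boolean index in the listing is meant as an element-wise mask. The function name `recompute_distances` and the dense layout are illustrative choices rather than a description of our released code.

```python
import numpy as np

def recompute_distances(drm, alpha, beta):
    """Apply the two threshold loops of Algorithm 3 to a copy of drm.

    drm: array of shape (K + 1, V, V); drm[i][u, v] is the attention value
    of the pair (u, v) currently assigned to hop i (0 where unassigned).
    """
    drm = drm.copy()
    K = drm.shape[0] - 1
    # First loop: entries above the upper bound alpha move from hop i to hop i + 1.
    for i in range(K - 1, -1, -1):
        moved = np.where(drm[i] > alpha, drm[i], 0.0)
        drm[i + 1] += moved
        drm[i] -= moved
    # Second loop: entries below the lower bound beta move from hop i - 1 to hop i.
    for i in range(1, K + 1):
        moved = np.where(drm[i - 1] < beta, drm[i - 1], 0.0)
        drm[i] += moved
        drm[i - 1] -= moved
    return drm

drm = np.zeros((3, 2, 2))  # K = 2 hops, V = 2 nodes
drm[0] = [[0.0, 0.9], [0.5, 0.0]]
drm[1] = [[0.05, 0.0], [0.0, 0.3]]
out = recompute_distances(drm, alpha=0.8, beta=0.1)
```

In this toy input the 0.9 entry moves from hop 0 to hop 1 and the 0.05 entry moves from hop 1 to hop 2. Because every move adds and subtracts the same values, the total attention mass is unchanged.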

The Distance Recomputator and Topology Reconstructor are designed to dynamically reconfigure graph topology and recalibrate node distances, with the aim of enhancing the representational capacity and performance of GNNs.

Initial Setup and K-Hop Sampling: The algorithm begins with a setup phase that prepares the graph $\mathcal{G}(V,E)$ with input features $x_{v}$ for all vertices $v$ in the batch $\mathcal{B}$. The model is configured with a depth $K$, a weight matrix $W^{k}$ for each depth, and a non-linearity $\sigma$. It also uses differentiable aggregator functions $\text{AGGREGATE}_{k}$ for each depth $k$, along with neighborhood sampling functions $N_{k}$. These components together form the foundation for k-hop neighborhood sampling, the first step in our model’s operation.
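
As a concrete illustration of the initial k-hop sampling step, the sketch below groups the nodes reachable from a source node by hop distance using breadth-first search. It assumes an adjacency-list dictionary and exhaustive (unsampled) expansion; `k_hop_neighbors` is a hypothetical helper, and the K-Hop-Sample routine of Algorithm 3 may differ in how it subsamples each hop.

```python
from collections import deque

def k_hop_neighbors(adj, source, K):
    """Return {hop: set of nodes at exactly that hop distance}, for hop = 1..K."""
    dist = {source: 0}
    hops = {k: set() for k in range(1, K + 1)}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        if dist[u] == K:  # do not expand beyond K hops
            continue
        for v in adj.get(u, ()):
            if v not in dist:
                dist[v] = dist[u] + 1
                hops[dist[v]].add(v)
                queue.append(v)
    return hops

# Path graph 0 - 1 - 2 - 3 with an extra edge 1 - 4
adj = {0: [1], 1: [0, 2, 4], 2: [1, 3], 3: [2], 4: [1]}
hops = k_hop_neighbors(adj, source=0, K=2)
print(hops)  # {1: {1}, 2: {2, 4}}
```

Node 3 is three hops from the source and is therefore excluded when K = 2.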

Dynamic Resampling and Recomputation: At the heart of our model lies a dynamic resampling process, which is executed in each propagation phase. This process adapts to the evolving graph structure by resampling a smaller subset of k-hop neighbors for each node. The resampled data, represented by NGH (Neighborhood Graph), is then refined through a series of computations to update the node representations effectively.
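
The resampling step of each propagation can be sketched for a single node as follows. The example assumes that the sample factor $\gamma$ is the fraction of the accumulated neighbor pool used for computation and that new neighbors are drawn uniformly at random; both are assumptions of this sketch, and `resample_step` is a hypothetical helper name.

```python
import random

def resample_step(ngh, candidates, resample_num, gamma, rng):
    """One propagation step for a single node's neighbour pool.

    ngh: neighbours sampled so far; candidates: k-hop neighbours that may be drawn;
    gamma: fraction of the pool used for computation (assumed meaning).
    """
    fresh = sorted(candidates - ngh)
    drawn = rng.sample(fresh, min(resample_num, len(fresh)))
    ngh = ngh | set(drawn)  # NGH = NGH ∪ Resample(G, ResampleNum)
    n_compute = min(len(ngh), max(1, round(gamma * len(ngh))))
    ngh_compute = set(rng.sample(sorted(ngh), n_compute))  # Sample(NGH, gamma)
    return ngh, ngh_compute

rng = random.Random(0)
ngh = {1, 2}
candidates = {1, 2, 3, 4, 5, 6}
ngh, ngh_compute = resample_step(ngh, candidates, resample_num=2, gamma=0.5, rng=rng)
print(len(ngh), len(ngh_compute))  # 4 2
```

The pool `ngh` only grows across propagations, while `ngh_compute` is redrawn each time, which keeps the per-step cost bounded by the sample factor.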

Distance Recomputator Mechanism: Central to our model is the Distance Recomputator (DRM), which recalculates the distances between nodes based on the recompute distance bounds $\alpha$ and $\beta$ and a sampling factor $\gamma$. Note that the bound $\alpha$ is distinct from the attention coefficient $\alpha_{ij}$ of Eq. (5). This mechanism adjusts the topological structure of the graph so that nodes are placed at hop levels that reflect their relational context within the graph.

Graph Attention Network (GAT) Integration: The model integrates the Graph Attention Network (GAT) to compute attention coefficients between nodes. This integration allows the model to weigh the importance of each neighbor’s features in the aggregation process, further refining the distance recomputation and neighborhood sampling.

Asynchronous Aggregation: Another key aspect of our model is the implementation of an asynchronous aggregation approach. Unlike traditional methods that aggregate information simultaneously across all nodes, our model allows for selective and time-staggered aggregation based on the dynamic resampling and recomputation results. This approach ensures a more nuanced and efficient processing of graph data, crucial for handling large-scale and complex networks.

Application and Efficiency: The proposed algorithm is not only theoretically robust but also demonstrates practical efficiency in handling GNN tasks. By balancing the computational demands (time complexity) and model performance, our approach offers a feasible solution for real-world applications requiring in-depth graph analysis and dynamic topology reconstruction.

In summary, our algorithm combines k-hop sampling with dynamic resampling and recomputation, all underpinned by an asynchronous aggregation strategy. This approach addresses key challenges in GNNs, such as handling complex graph topologies and efficiently updating node representations.

Figure 1: An instance of the distance computation and topology reconstruction

4.3 Experimental Analysis

Our experimental study evaluated the proposed models (GKHDA, GDRA, and GKHDDRA) alongside typical scalable Graph Neural Network (GNN) methods, including GCN, SGC, SSGC, and APPNP. The experiments concentrated on graph structure learning and were conducted on three benchmark datasets: Cora, Pubmed, and Citeseer.

Performance Overview: The results, presented in Tables 1 to 6 (Appendix A), compare our models with established GNN methods such as GAT, GraphSAGE, GCN, SGC, SSGC, and APPNP. Our models, particularly when integrated with existing GNN structures (GCN, SGC, SSGC, and APPNP), improve on the corresponding baseline in almost every setting; the only exceptions are APPNP+GDRA and APPNP+GKHDA on Citeseer.

Dataset-Specific Analysis:

  • Cora: In the Cora dataset, the highest performance was observed with the APPNP+GKHDDRA combination, achieving an accuracy of 84.6%. This result indicates the robustness of GKHDDRA when combined with APPNP’s propagation scheme, which effectively leverages long-range dependencies in the graph.

  • Pubmed: The SGC+GKHDDRA combination outperformed other models with an accuracy of 82.5%. This suggests that the simplified linear propagation of SGC, coupled with the hop-wise learning capability of GKHDDRA, is particularly effective for the Pubmed dataset’s topology.

  • Citeseer: For Citeseer, the SSGC+GKHDDRA combination achieved the highest accuracy at 77.6% (Table 5), compared with 75.6% for SSGC alone. This underscores the effectiveness of incorporating k-hop sampling and dynamic resampling in dealing with the dataset’s complex graph structure.

Comparative Assessment: The integration of our models with existing GNN frameworks consistently improved performance across datasets. For instance, GCN, when enhanced with GDRA and GKHDDRA, saw notable improvements in accuracy, emphasizing the value added by our distance recomputation and dynamic resampling mechanisms. Similarly, the integration with SGC and SSGC yielded significant performance boosts, highlighting the synergy between our models and scalable GNN methods in handling large-scale graph structures.
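
The size of these gains can be read directly off Tables 3 to 6. The short script below recomputes the difference, in percentage points, between each backbone and its GKHDDRA-enhanced variant; the dictionary layout and variable names are illustrative, while the numbers are copied from the tables.

```python
# Accuracies (%) copied from Tables 3-6, ordered as (Cora, Pubmed, Citeseer)
base = {
    "GCN":   (81.2, 79.3, 70.9),
    "SGC":   (74.2, 78.2, 71.5),
    "SSGC":  (83.0, 73.6, 75.6),
    "APPNP": (82.3, 71.5, 75.2),
}
with_gkhddra = {
    "GCN":   (82.7, 80.9, 72.3),
    "SGC":   (77.4, 82.5, 74.6),
    "SSGC":  (84.1, 74.7, 77.6),
    "APPNP": (84.6, 74.5, 75.3),
}

# Gain of the GKHDDRA-enhanced variant over its backbone, per dataset
gains = {
    name: tuple(round(new - old, 1) for old, new in zip(base[name], with_gkhddra[name]))
    for name in base
}
for name, (cora, pubmed, citeseer) in gains.items():
    print(f"{name:6s} Cora +{cora}  Pubmed +{pubmed}  Citeseer +{citeseer}")
```

With these numbers every GKHDDRA combination improves on its backbone, with the largest gain for SGC on Pubmed (4.3 points) and the smallest for APPNP on Citeseer (0.1 points).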

Model-Specific Contributions:

  • GKHDA showcased consistent improvements in graph representation learning, particularly in combination with SSGC and APPNP.

  • GDRA excelled in recalibrating the graph topology, which was evident from its strong performance, especially when combined with SGC and SSGC.

  • GKHDDRA emerged as a versatile model, enhancing both graph structure learning and node representation, as reflected in its top-tier results across all datasets, particularly when combined with APPNP.

Conclusion: The experimental results support the effectiveness of our proposed models in enhancing the learning capabilities of GNNs. The integration of GKHDA, GDRA, and GKHDDRA with existing scalable GNN methods improved performance in most settings and demonstrated their compatibility with different backbones and datasets. These findings indicate that our models are effective on the three citation benchmarks considered.

Figure 2: K-Hop SGC (Max Test Acc: 74.2%)
Figure 3: GDRA + K-Hop SGC (Max Test Acc: 75.8%)

References

  • [FCL+22] Jiarui Feng, Yixin Chen, Fuhai Li, Anindya Sarkar, and Muhan Zhang. How powerful are k-hop message passing graph neural networks. 05 2022.
  • [GZB22] Chengcheng Guo, Bo Zhao, and Yanbing Bai. Deepcore: A comprehensive library for coreset selection in deep learning, 2022.
  • [HYL17] William L. Hamilton, Rex Ying, and Jure Leskovec. Inductive representation learning on large graphs. CoRR, abs/1706.02216, 2017.
  • [KWG19] Johannes Klicpera, Stefan Weißenberger, and Stephan Günnemann. Diffusion improves graph learning. 2019.
  • [LJZ+22] Songtao Liu, Shixiong Jing, Tong Zhao, Zengfeng Huang, and Dinghao Wu. Enhancing multi-hop connectivity for graph convolutional networks. 2022.
  • [LLC+23] Zemin Liu, Yuan Li, Nan Chen, Qian Wang, Bryan Hooi, and Bingsheng He. A survey of imbalanced learning on graphs: Problems, techniques, and future directions, 2023.
  • [LTBZ17] Yujia Li, Daniel Tarlow, Marc Brockschmidt, and Richard Zemel. Gated graph sequence neural networks, 2017.
  • [SGS22] S. Schaefer, D. Gehrig, and D. Scaramuzza. Aegnn: Asynchronous event-based graph neural networks. In 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pages 12361–12371, Los Alamitos, CA, USA, jun 2022. IEEE Computer Society.
  • [TDKF16] Dorina Thanou, Xiaowen Dong, Daniel Kressner, and Pascal Frossard. Learning heat diffusion graphs, 2016.
  • [VCC+18] Petar Veličković, Guillem Cucurull, Arantxa Casanova, Adriana Romero, Pietro Liò, and Yoshua Bengio. Graph attention networks. 2018.
  • [WYHL21] Guangtao Wang, Rex Ying, Jing Huang, and Jure Leskovec. Multi-hop attention graph neural network. 2021.
  • [YWZL23] Tianjun Yao, Yingxu Wang, Kun Zhang, and Shangsong Liang. Improving the expressiveness of k-hop message-passing gnns by injecting contextualized substructure information. Proceedings of the 29th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, 2023.
  • [ZLZW22] Tianxiang Zhao, Dongsheng Luo, Xiang Zhang, and Suhang Wang. Topoimb: Toward topology-level imbalance in learning from graphs. 2022.
  • [ZYS+22] Wentao Zhang, Ziqi Yin, Zeang Sheng, Yang Li, Wen Ouyang, Xiaosen Li, Yangyu Tao, Zhi Yang, and Bin Cui. Graph attention multi-layer perceptron. In Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, KDD ’22. ACM, August 2022.

Appendix A Appendix

Model          Cora     Pubmed   Citeseer
GAT            82.1%    77.7%    69%
GraphSAGE      81.5%    70.3%    74%
GDRA           82.5%    77.8%    69.8%
Table 1: Comparison of GDRA and its two basic components (GAT and GraphSAGE)

Model          Cora     Pubmed   Citeseer
GKHDA          82.3%    78.6%    70.7%
GDRA           82.5%    77.8%    69.8%
GKHDDRA        82.4%    79.4%    71.2%
Table 2: Comparison of the three basic DRTR models

Model          Cora     Pubmed   Citeseer
GCN            81.2%    79.3%    70.9%
GCN+GDRA       82.6%    80.1%    71.3%
GCN+GKHDA      82.4%    80.5%    71.7%
GCN+GKHDDRA    82.7%    80.9%    72.3%
Table 3: GCN and GCN + DRTR + Diff comparison

Model          Cora     Pubmed   Citeseer
SGC            74.2%    78.2%    71.5%
SGC+GDRA       75.8%    81.2%    73.1%
SGC+GKHDA      75.1%    81.6%    73.4%
SGC+GKHDDRA    77.4%    82.5%    74.6%
Table 4: SGC and SGC + DRTR + Diff comparison

Model          Cora     Pubmed   Citeseer
SSGC           83.0%    73.6%    75.6%
SSGC+GDRA      83.2%    74.2%    76.4%
SSGC+GKHDA     84.3%    74.5%    76.1%
SSGC+GKHDDRA   84.1%    74.7%    77.6%
Table 5: SSGC and SSGC + DRTR + Diff comparison

Model          Cora     Pubmed   Citeseer
APPNP          82.3%    71.5%    75.2%
APPNP+GDRA     83.5%    73.6%    74.4%
APPNP+GKHDA    83.8%    74.1%    74.5%
APPNP+GKHDDRA  84.6%    74.5%    75.3%
Table 6: APPNP and APPNP + DRTR + Diff comparison

Dataset    Nodes    Edges    Features   Classes
PubMed     19,717   44,338   500        3
Cora       2,708    5,429    1,433      7
CiteSeer   3,312    4,732    3,703      6
Table 7: Comparison of PubMed, Cora, and CiteSeer in Terms of Nodes, Edges, Features, and Classes

Models DR TR k_hop_resampling Heat_Diffusion_Propagation
GDRA ✓ ✓
GKHDA ✓ ✓
GKHDDRA ✓ ✓ ✓ ✓
Table 8: GDRA, GKHDA, GKHDDRA Modules (DR: Distance Recomputator; TR: Topology Reconstructor)

Learning Rate   Weight Decay   Epochs   Patience
0.005           0.001          1000     100
Table 9: Experimental Settings