A Graph is Worth $K$ Words: Euclideanizing Graph using Pure Transformer

Gao, Zhangyang; Dong, Daize; Tan, Cheng; Xia, Jun; Hu, Bozhen; Li, Stan Z.

Computer Science > Machine Learning

arXiv:2402.02464v2 (cs)

[Submitted on 4 Feb 2024 (v1), revised 19 Mar 2024 (this version, v2), latest version 29 May 2024 (v3)]


Abstract: Can we model non-Euclidean graphs as pure language or even Euclidean vectors while retaining their inherent information? The non-Euclidean property has posed a long-term challenge in graph modeling. Despite recent GNN and Graphformer efforts to encode graphs as Euclidean vectors, recovering the original graph from those vectors remains a challenge. We introduce GraphsGPT, featuring a Graph2Seq encoder that transforms non-Euclidean graphs into learnable graph words in a Euclidean space, along with a GraphGPT decoder that reconstructs the original graph from the graph words to ensure information equivalence. We pretrain GraphsGPT on 100M molecules and report four findings: (1) Pretrained Graph2Seq excels in graph representation learning, achieving state-of-the-art results on 8 of 9 graph classification and regression tasks. (2) Pretrained GraphGPT serves as a strong graph generator, as demonstrated by its ability to perform both unconditional and conditional graph generation. (3) Graph2Seq+GraphGPT enables effective graph mixup in the Euclidean space, overcoming a previously known non-Euclidean challenge. (4) Our proposed edge-centric GPT pretraining task is effective in graph fields, as shown by its success in both representation and generation.
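
As a rough illustration of finding (3): once two graphs are each represented by K graph words (fixed-length Euclidean vectors), mixup can be pictured as a convex combination of the two sets of vectors. The sketch below is a minimal example under that assumption only. The function name `mixup_graph_words`, the toy vectors, and the choice of plain linear interpolation are hypothetical and are not taken from the paper or its code; the Graph2Seq encoder and GraphGPT decoder are not shown.

```python
from typing import List

Vector = List[float]


def mixup_graph_words(
    words_a: List[Vector], words_b: List[Vector], lam: float
) -> List[Vector]:
    """Return lam * words_a + (1 - lam) * words_b, word by word.

    Hypothetical helper: assumes both graphs were already encoded into
    the same number K of graph words with the same dimension.
    """
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [0, 1]")
    if len(words_a) != len(words_b):
        raise ValueError("both inputs must contain the same number of words")

    mixed: List[Vector] = []
    for vec_a, vec_b in zip(words_a, words_b):
        if len(vec_a) != len(vec_b):
            raise ValueError("word dimensions must match")
        # Element-wise convex combination of the two word vectors.
        mixed.append([lam * x + (1.0 - lam) * y for x, y in zip(vec_a, vec_b)])
    return mixed


# Toy stand-ins for encoder outputs: K = 2 words of dimension 3 per graph.
words_a = [[1.0, 0.0, 2.0], [0.0, 4.0, 0.0]]
words_b = [[3.0, 2.0, 0.0], [2.0, 0.0, 2.0]]

mixed = mixup_graph_words(words_a, words_b, lam=0.5)
print(mixed)
```

In a full pipeline the mixed words would presumably be passed to a decoder to obtain a graph; that step depends on the pretrained model and is outside this sketch.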

Subjects:	Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Social and Information Networks (cs.SI)
Cite as:	arXiv:2402.02464 [cs.LG]
	(or arXiv:2402.02464v2 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.2402.02464

Submission history

From: Zhangyang Gao
[v1] Sun, 4 Feb 2024 12:29:40 UTC (14,051 KB)
[v2] Tue, 19 Mar 2024 05:27:08 UTC (14,051 KB)
[v3] Wed, 29 May 2024 05:40:35 UTC (13,537 KB)
