FlowGNN: A Dataflow Architecture for Real-Time Workload-Agnostic Graph Neural Network Inference

Sarkar, Rishov; Abi-Karam, Stefan; He, Yuqi; Sathidevi, Lakshmi; Hao, Cong

Abstract:Graph neural networks (GNNs) have recently exploded in popularity thanks to their broad applicability to graph-related problems such as quantum chemistry, drug discovery, and high energy physics. However, meeting demand for novel GNN models and fast inference simultaneously is challenging due to the gap between develo** efficient accelerators and the rapid creation of new GNN models. Prior art focuses on accelerating specific classes of GNNs, such as Graph Convolutional Networks (GCN), but lacks generality to support a wide range of existing or new GNN models. Furthermore, most works rely on graph pre-processing to exploit data locality, making them unsuitable for real-time applications. To address these limitations, in this work, we propose a generic dataflow architecture for GNN acceleration, named FlowGNN, which is generalizable to the majority of message-passing GNNs. The contributions are three-fold. First, we propose a novel and scalable dataflow architecture, which generally supports a wide range of GNN models with message-passing mechanism. The architecture features a configurable dataflow optimized for simultaneous computation of node embedding, edge embedding, and message passing, which is generally applicable to all models. We also propose a rich library of model-specific components. Second, we deliver ultra-fast real-time GNN inference without any graph pre-processing, making it agnostic to dynamically changing graph structures. Third, we verify our architecture on the Xilinx Alveo U50 FPGA board and measure the on-board end-to-end performance. We achieve a speed-up of up to 24-254x against CPU (6226R) and 1.3-477x against GPU (A6000) (with batch sizes 1 through 1024); we also outperform the SOTA GNN accelerator I-GCN by 1.26x speedup and 1.55x energy efficiency over four datasets. Our implementation code and on-board measurement are publicly available on GitHub.

Comments:	13 pages, 10 figures. Accepted at HPCA 2023
Subjects:	Distributed, Parallel, and Cluster Computing (cs.DC); Machine Learning (cs.LG)
Cite as:	arXiv:2204.13103 [cs.DC]
	(or arXiv:2204.13103v2 [cs.DC] for this version)
	https://doi.org/10.48550/arXiv.2204.13103

Computer Science > Distributed, Parallel, and Cluster Computing

Title:FlowGNN: A Dataflow Architecture for Real-Time Workload-Agnostic Graph Neural Network Inference

Submission history

Access Paper:

References & Citations

Bookmark

Bibliographic and Citation Tools

Code, Data and Media Associated with this Article

Demos

Recommenders and Search Tools

arXivLabs: experimental projects with community collaborators