TAPA: A Scalable Task-Parallel Dataflow Programming Framework for Modern FPGAs with Co-Optimization of HLS and Physical Design
Authors:
Licheng Guo,
Yuze Chi,
Jason Lau,
Linghao Song,
Xingyu Tian,
Moazin Khatti,
Weikang Qiao,
Jie Wang,
Ecenur Ustun,
Zhenman Fang,
Zhiru Zhang,
Jason Cong
Abstract:
In this paper, we propose TAPA, an end-to-end framework that compiles a C++ task-parallel dataflow program into a high-frequency FPGA accelerator. Compared to existing solutions, TAPA has two major advantages. First, TAPA provides a set of convenient APIs that allow users to easily express flexible and complex inter-task communication structures. Second, TAPA adopts a coarse-grained floorplanning…
▽ More
In this paper, we propose TAPA, an end-to-end framework that compiles a C++ task-parallel dataflow program into a high-frequency FPGA accelerator. Compared to existing solutions, TAPA has two major advantages. First, TAPA provides a set of convenient APIs that allow users to easily express flexible and complex inter-task communication structures. Second, TAPA adopts a coarse-grained floorplanning step during HLS compilation for accurate pipelining of potential critical paths. In addition, TAPA implements several optimization techniques specifically tailored for modern HBM-based FPGAs. In our experiments with a total of 43 designs, we improve the average frequency from 147 MHz to 297 MHz (a 102% improvement) with no loss of throughput and a negligible change in resource utilization. Notably, in 16 experiments we make the originally unroutable designs achieve 274 MHz on average. The framework is available at https://github.com/UCLA-VAST/tapa and the core floorplan module is available at https://github.com/UCLA-VAST/AutoBridge.
△ Less
Submitted 6 September, 2022;
originally announced September 2022.