Proceedings of the 36th Annual ACM Symposium on Applied Computing | 2021

Improving CNN performance on FPGA clusters through topology exploration

Abstract


Field-programmable gate array (FPGA) platforms have been a popular choice for deploying convolutional neural networks (CNNs) because of their high parallelism and low energy consumption. Because a single FPGA has limited on-chip computation and storage resources, FPGA clusters are becoming promising candidates for improving CNN throughput. In this paper, we first put forward strategies to optimize inter-board resource allocation in FPGA clusters. We then model the multi-board cluster problem using dynamic programming to obtain the optimal topology of the FPGA cluster. Experimental results show that typical well-known CNNs deployed with our proposed FPGA cluster topology achieve an average throughput 4.33X that of single-board solutions and 1.87X that of other state-of-the-art multi-board solutions.

DOI 10.1145/3412841.3441893
Language English
Journal Proceedings of the 36th Annual ACM Symposium on Applied Computing
