50th International Conference on Parallel Processing Workshop | 2021

Impact of AVX-512 Instructions on Graph Partitioning Problems

 
 

Abstract


Graph analysis now pervades society, with applications ranging from advertising and transportation to medical research. Graphs are growing larger and structurally more complex every day, and their increasing size has made many classical algorithms impractically slow. Fortunately, CPU architectures have evolved to handle new and more complex problems through both core-level parallelism and vector-level (SIMD) parallelism. In this paper, we explore how the modern vector architecture of CPUs can help with community detection, partitioning, and coloring kernels by studying two representative algorithms. We consider the Intel Skylake-X and Cascade Lake architectures, which support gather and scatter instructions on 512-bit vectors. Existing vectorized algorithms for classic graph problems, such as BFS and PageRank, do not apply well to community detection; we show that support for gather and scatter is necessary, in particular for implementing reduce-scatter patterns. We evaluate the performance achieved on the two architectures and conclude that good hardware support for scatter instructions is necessary to fully leverage vector processing for graph partitioning problems.

DOI 10.1145/3458744.3473362
Language English
Journal 50th International Conference on Parallel Processing Workshop
