IEEE Access | 2021

Design, Implementation, and Analysis of High-Speed Single-Stage N-Sorters and N-Filters

 
 

Abstract


There is strong interest in developing high-performance hardware sorting systems which can sort a set of elements as quickly as possible. The fastest of the current FPGA systems are sorting networks, in which sets of 2-sorters operate in parallel in each series stage of a multi-stage sorting process. A 2-sorter is a single-stage hardware block which sorts two values, so any list with more than 2 values must be sorted with a series network of 2-sorters. A primary contribution of this work is to provide a general methodology for the design of stable single-stage hardware sorters which sort more than 2 values simultaneously. This general methodology for $N$-sorter design, with $N > 2$, is then adapted for use in modern FPGAs, where it is shown that single-stage 3-sorters up to 9-sorters have speedup ratios from 2.0 to 3.5 versus the comparable state-of-the-art 2-sorter networks. A design system modification is shown to produce even faster single-stage $N$-max and $N$-min filters. When used for max pooling 32-bit data in the fastest analyzed FPGA, a single 9-max filter will process 500 million 9-pixel groups per second (4K: 3840x2160 at 500 frames/second). The single-stage 9-median filter using this design methodology, useful in image processing, is shown to have speedup ratios of 3.0 to 4.1 versus state-of-the-art FPGA network implementations, even though its resource usage is comparable to, often better than, the network implementations. Ten 8-bit 9-median filters operating in parallel in the fastest FPGA will process over 5.4 billion pixels/sec (4K at over 600 frames/second).
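The frame-rate figures quoted above can be sanity-checked with a few lines of arithmetic. The sketch below is only an illustration and is not taken from the paper: it assumes that max pooling uses non-overlapping 9-pixel groups (so each group consumes 9 input pixels) and that the median filter produces one output pixel per filter operation. The function names are hypothetical.

```python
# Sanity check of the throughput figures quoted in the abstract.
# Assumptions (not stated in the source): max pooling uses non-overlapping
# 9-pixel groups, and each median operation yields one output pixel.

WIDTH, HEIGHT = 3840, 2160           # 4K frame size given in the abstract
PIXELS_PER_FRAME = WIDTH * HEIGHT    # 8,294,400 pixels


def max_pool_fps(groups_per_second: float, group_size: int = 9) -> float:
    """Frames per second when each group consumes `group_size` input pixels."""
    return groups_per_second * group_size / PIXELS_PER_FRAME


def median_filter_fps(pixels_per_second: float) -> float:
    """Frames per second when one output pixel is produced per operation."""
    return pixels_per_second / PIXELS_PER_FRAME


pool_fps = max_pool_fps(500e6)          # one 9-max filter, 500 million groups/s
median_fps = median_filter_fps(5.4e9)   # ten 9-median filters, 5.4 billion pixels/s

print(f"max pooling:   {pool_fps:.1f} frames/s")
print(f"median filter: {median_fps:.1f} frames/s")
```

Under these assumptions both results sit above the frame rates stated in the abstract (500 and 600 frames/second respectively), so the quoted figures are internally consistent.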

Volume 9
Pages 2576-2591
DOI 10.1109/ACCESS.2020.3047594
Language English
Journal IEEE Access
