Keol Cho | Researchain

Publication

Featured researches published by Keol Cho.

IEEE Transactions on Consumer Electronics | 2010

Stage-based frame-partitioned parallelization of H.264/AVC decoding

Won-Jin Kim; Keol Cho; Ki-Seok Chung

Strong demand for high-resolution video services has led to active research on high-speed video processing. In particular, the widespread deployment of multi-core systems has accelerated research on high-resolution video processing based on the parallelization of multimedia software. In this paper, we propose a novel parallel H.264/AVC decoding scheme for a homogeneous multi-core platform. Parallelizing H.264/AVC decoding is challenging not only because parallelization may incur significant synchronization overhead but also because the software may have complicated dependencies. To overcome these issues, we propose an approach called Stage-based Frame-Partitioned Parallelization (SFPP). In SFPP, we divide a frame into multiple partitions and execute them in a pipelined fashion. To reduce synchronization overhead, a separate thread is allocated to each stage in the pipeline. In addition, an efficient memory reuse technique is used to reduce the memory requirement. To verify the effectiveness of the proposed approach, we parallelized the FFmpeg H.264/AVC decoder with the proposed technique using OpenMP and carried out experiments on an Intel quad-core platform. The proposed design outperforms the unparallelized FFmpeg H.264/AVC decoder by 53%. We also reduced memory usage by 65% and 81% for high-definition (HD) and full high-definition (FHD) video, respectively, compared with a popular existing method.
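The pipelined execution described in this abstract, with one thread per stage and frame partitions flowing from stage to stage, can be illustrated with a minimal Python sketch. This is not the authors' OpenMP/FFmpeg implementation: the function name `run_pipeline`, the queue-based hand-off, and the toy stage functions are assumptions made purely for illustration.

```python
import queue
import threading

_DONE = object()  # sentinel marking the end of the partition stream


def run_pipeline(partitions, stages):
    """Push each partition through every stage, one thread per stage.

    `stages` is a list of single-argument callables (hypothetical
    stand-ins for decoding stages). FIFO queues connect neighbouring
    stages, so partitions leave the pipeline in their input order.
    """
    queues = [queue.Queue() for _ in range(len(stages) + 1)]

    def worker(stage, q_in, q_out):
        while True:
            item = q_in.get()
            if item is _DONE:
                q_out.put(_DONE)  # propagate shutdown downstream
                return
            q_out.put(stage(item))

    threads = [
        threading.Thread(target=worker, args=(s, queues[i], queues[i + 1]))
        for i, s in enumerate(stages)
    ]
    for t in threads:
        t.start()

    for p in partitions:
        queues[0].put(p)
    queues[0].put(_DONE)

    results = []
    while True:
        item = queues[-1].get()
        if item is _DONE:
            break
        results.append(item)

    for t in threads:
        t.join()
    return results
```

While stage 2 works on partition k, stage 1 can already process partition k+1, which is the overlap that a stage-per-thread pipeline is meant to provide. In CPython the global interpreter lock limits real speed-up for CPU-bound stages, so the sketch shows only the structure, not the performance.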

IEEE Transactions on Consumer Electronics | 2011

Multi-threaded syntax element partitioning for parallel entropy decoding

Won-Jin Kim; Keol Cho; Ki-Seok Chung

Strong demand for high-resolution video services has led to active research on high-speed video processing. In particular, the widespread deployment of multi-core systems has accelerated research on high-resolution video processing based on the parallelization of multimedia software. Although parallelizing other decoding steps on a multi-core platform may improve performance, entropy decoding often becomes a performance bottleneck because it must be processed sequentially. To address this concern, parallel entropy coding algorithms have been proposed. Syntax element partitioning is an algorithm for parallelizing Context Adaptive Binary Arithmetic Coding (CABAC). In this paper, we propose Multi-Threaded Syntax Element Partitioning (MT-SEP) for parallel entropy decoding. One major advantage of software parallel video decoding over hardware implementations is that versatile video codecs can be implemented flexibly. We parallelized the KTA 2.7 decoder with the proposed technique on an Intel quad-core platform and achieved up to a 56% performance improvement using the proposed version of syntax element partitioning.

Journal of Electrical Engineering & Technology | 2017

Simplified 2-Dimensional Scaled Min-Sum Algorithm for LDPC Decoder

Keol Cho; Wang-Heon Lee; Ki-Seok Chung

Among the various decoding algorithms for low-density parity-check (LDPC) codes, the min-sum (MS) algorithm and its modified versions are widely adopted because they are computationally simpler than the sum-product (SP) algorithm, at the cost of a slight loss in decoding performance. In the MS algorithm, the magnitude of the output message from a check node (CN) processing unit is determined by either the smallest or the second-smallest input message, denoted min1 and min2, respectively. It has been shown that multiplying the CN output message by a scaling factor improves decoding performance. Further, Zhong et al. have shown that multiplying min1 and min2 by different scaling factors (called 2-dimensional scaling) considerably increases the performance of the LDPC decoder. In this paper, the simplified 2-dimensional scaled (S2DS) MS algorithm is proposed. In the proposed algorithm, we identify a pair of the most efficient scaling factors whose multiplications can be replaced with combinations of addition and shift operations. Furthermore, one scaling operation is approximated by the difference between min1 and min2. The simulation results show that S2DS achieves error-correcting performance that is close to or better than that of the SP algorithm regardless of coding rate, and that its computational complexity is the lowest among the modified versions of the MS algorithm.
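To make the check node computation concrete, the following Python sketch shows a generic 2-dimensional scaled min-sum update on integer messages, with each scaling done by a shift and a subtraction. The abstract does not state the scaling factors chosen in the paper, so the values 0.75 and 0.875 used here are assumptions, as is the function name `check_node_update`. The paper's further approximation based on the difference between min1 and min2 is not reproduced.

```python
def check_node_update(msgs):
    """Scaled min-sum update for one check node.

    `msgs` holds the integer (fixed-point) messages arriving on each
    edge; at least two edges are expected. Returns the message sent
    back on each edge.
    """
    signs = [1 if m >= 0 else -1 for m in msgs]
    mags = [abs(m) for m in msgs]

    # Product of all input signs.
    total_sign = 1
    for s in signs:
        total_sign *= s

    # Single pass to find the smallest (min1) and second-smallest (min2)
    # magnitudes, plus the edge index where min1 occurs.
    min1 = min2 = None
    idx1 = -1
    for i, a in enumerate(mags):
        if min1 is None or a < min1:
            min2 = min1
            min1, idx1 = a, i
        elif min2 is None or a < min2:
            min2 = a

    # Assumed scaling factors, implemented without multipliers:
    #   0.75  * x == x - (x >> 2)
    #   0.875 * x == x - (x >> 3)   (shifts truncate toward zero here)
    scaled_min1 = min1 - (min1 >> 2)
    scaled_min2 = min2 - (min2 >> 3)

    out = []
    for i, s in enumerate(signs):
        # The edge that supplied min1 receives min2; all others get min1.
        mag = scaled_min2 if i == idx1 else scaled_min1
        # total_sign * s equals the sign product of all *other* edges.
        out.append(total_sign * s * mag)
    return out
```

For the input `[8, -4, 16, -12]`, min1 is 4 (edge 1) and min2 is 8, so the scaled magnitudes are 3 and 7 and the function returns `[3, -7, 3, -3]`.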

IEEE Region 10 Conference | 2016

Implementation of an LDPC decoder on a heterogeneous FPGA-CPU platform using SDSoC

Si-Dong Roh; Keol Cho; Ki-Seok Chung

Because modern hardware architectures are complex, designing hardware systems is challenging. High-level synthesis (HLS) has emerged as an effective hardware synthesis method that reduces engineering cost and design time. Meanwhile, field-programmable gate array (FPGA) devices have improved significantly in both performance and power efficiency, and they are therefore often considered an alternative to application-specific integrated circuits (ASICs) for hardware implementation. SDSoC is a C/C++ development environment that enables developers to leverage both configurable hardware and software implementations. This paper introduces a hardware-software co-design of low-density parity-check (LDPC) decoding synthesized by SDSoC for a heterogeneous FPGA and central processing unit (CPU) platform. The LDPC code is one of the strongest error-correcting codes. To optimize performance, the LDPC decoding process is divided into several stages. Then, either a software or an FPGA implementation is selected for each stage based on its algorithmic characteristics and data dependencies. For stages implemented on the FPGA device, loop unrolling and loop pipelining techniques are applied. Compared to a pure software decoder, the proposed LDPC decoder achieved a speed-up of 4.41 while maintaining the software decoder's BER performance and flexibility for various standards.

International Symposium on Consumer Electronics | 2014

An efficient check node operation circuit for Min-Sum based LDPC decoder

Keol Cho; Ki-Seok Chung

This paper presents a low-power and area-efficient check node operation circuit for LDPC decoders based on the Min-Sum algorithm. By improving a heavily used comparator circuit, our proposed check node unit reduces area and power consumption by 8% and 13%, respectively, compared to conventional LDPC decoders, without degrading decoding speed.

KSII Transactions on Internet and Information Systems | 2017