Publication


Featured research published by Kyungah Kim.


IEEE Transactions on Parallel and Distributed Systems | 2015

Dynamic Load Balancing of Parallel SURF with Vertical Partitioning

Deokho Kim; Minwoo Kim; Kyungah Kim; Minyong Sung; Won Woo Ro

The demand for real-time processing of robust feature detection is one of the major issues in the computer vision field. In order to comply with the requirements, in this paper a parallelization and optimization method to effectively accelerate SURF is proposed. The proposed parallelization method is developed based on a workload analysis of SURF in terms of various aspects, focusing in particular on the load balancing problem. First, the average parallel workload is divided into identical portions using the vertical partitioning method. Then, the load imbalance problem is further resolved using the dynamic partition balancing method. In addition, an optimization method is proposed together with the parallelization method to find and exclude redundant operations in SURF, thus effectively accelerating the feature detection operation when the proposed parallelization method is applied. The proposed method shows a maximum speedup of 19.21 compared to the single threaded performance on a 24-core system, achieving a maximum of 83.80 fps in a real-machine experiment, enabling real-time processing.
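The two partitioning steps in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the per-column cost list, the function names `equal_strips` and `rebalance`, and the prefix-sum rule for moving boundaries are all assumptions made for illustration. The first function splits the image width into equal vertical strips; the second moves the strip boundaries so that each strip carries a similar share of the measured cost.

```python
from bisect import bisect_left
from itertools import accumulate


def equal_strips(width, parts):
    """Split the column range [0, width) into `parts` near-equal vertical strips.

    Returns parts + 1 boundaries; strip i covers columns [b[i], b[i + 1]).
    """
    return [i * width // parts for i in range(parts + 1)]


def rebalance(col_cost, parts):
    """Pick strip boundaries so each strip gets a similar share of the cost.

    col_cost[c] is a hypothetical per-column workload estimate, for example
    the number of interest points found in column c of the previous frame.
    """
    prefix = list(accumulate(col_cost))
    total = prefix[-1]
    bounds = [0]
    for i in range(1, parts):
        target = total * i / parts
        # First column whose cumulative cost reaches the target share.
        cut = bisect_left(prefix, target) + 1
        bounds.append(max(cut, bounds[-1]))
    bounds.append(len(col_cost))
    return bounds


def strip_costs(col_cost, bounds):
    """Total cost carried by each strip."""
    return [sum(col_cost[a:b]) for a, b in zip(bounds, bounds[1:])]


cost = [1, 1, 1, 1, 9, 9, 1, 1]
static = equal_strips(len(cost), 2)
dynamic = rebalance(cost, 2)
print(static, strip_costs(cost, static))
print(dynamic, strip_costs(cost, dynamic))
```

With this toy cost profile the static split leaves one strip with most of the work, while the rebalanced split brings the two strips close to each other.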


IEEE Transactions on Circuits and Systems for Video Technology | 2016

Exploiting Thread-Level Parallelism on HEVC by Employing a Reference Dependency Graph

Minwoo Kim; Deokho Kim; Kyungah Kim; Won Woo Ro

This paper presents an optimized parallel algorithm for the next-generation video codec High Efficiency Video Coding (HEVC). The proposed method provides maximized parallel scalability by exploiting two levels of parallelism: 1) frame level and 2) task level. Frame-level parallelism is exploited using a graph that efficiently provides a parallel coding order of the frames with complex reference dependencies. The proposed reference dependency graph is generated at runtime by a novel construction algorithm that dynamically analyzes the configuration of the HEVC codec. Task-level parallelism is exploited to provide further scalability to frame-level parallelization. A pipelined execution is allowed for independent tasks, which are defined by dividing and categorizing a single coding process into multiple types of tasks. The proposed parallel encoder and decoder do not suffer from loss in coding efficiency because neither constraints nor modifications in coding options are required. The proposed parallel methods result in an average encoding speedup of 1.75, and the aggressive method that exploits additional frame-level parallelism achieves a speedup of 6.52 using eight physical cores.
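The idea of deriving a parallel coding order from frame reference dependencies can be sketched as a level-by-level topological sort. This is only an illustration: the function name `parallel_waves`, the dictionary representation of the graph, and the example reference structure are assumptions, and the paper's own construction algorithm may differ.

```python
def parallel_waves(refs):
    """Group frames into waves that can be coded concurrently.

    refs maps a frame id to the set of frame ids it references. A frame
    becomes ready once every frame it references has been coded.
    """
    remaining = {frame: set(deps) for frame, deps in refs.items()}
    done = set()
    waves = []
    while remaining:
        ready = sorted(f for f, deps in remaining.items() if deps <= done)
        if not ready:
            raise ValueError("cyclic reference dependencies")
        waves.append(ready)
        done.update(ready)
        for frame in ready:
            del remaining[frame]
    return waves


# Hypothetical hierarchical-B style reference structure for nine frames.
gop = {
    0: set(), 8: {0}, 4: {0, 8}, 2: {0, 4}, 6: {4, 8},
    1: {0, 2}, 3: {2, 4}, 5: {4, 6}, 7: {6, 8},
}
for wave in parallel_waves(gop):
    print(wave)
```

Frames in the same wave have no dependencies on each other, so in this sketch they could be handed to separate threads.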


Multimedia Tools and Applications | 2018

Contents-aware partitioning algorithm for parallel high efficiency video coding

Kyungah Kim; Won Woo Ro

We introduce a new parallelization method for high-efficiency video coding (HEVC), which resolves the shortcomings of the existing tile-based parallel processing method. The parallel HEVC performs encoding by dividing a frame into numerous parallel units. This decreases the compression efficiency compared with sequential HEVC, because it artificially breaks the data correlation within a frame, which is called the parallelization overhead. The traditional parallel techniques such as Tiles and wavefront parallel processing (WPP) inherently introduce a high parallelization overhead because they simply divide a frame statically without considering the contents of the frame. The proposed new parallel encoding scheme resolves such problems by partitioning a frame based on the meaningful contents. In order to analyze the correlations within a frame and define the contents, the features within a frame are first extracted and clustered. In the feature clustering algorithm, two factors are considered to balance the workload between parallel units: (1) the number of features in each cluster and (2) the number of coding tree units (CTU) occupied by each cluster. The frame is partitioned based on the result of clustering, and the partitions are encoded in parallel. The proposed scheme achieves a bit-saving of up to 7.21%, with an average of 3.71%, along with an average time-saving of 20.50% compared to the Tiles technique.


International Conference on Image Processing | 2015

True motion compensation with feature detection for frame rate up-conversion

Kyungah Kim; Minwoo Kim; Deokho Kim; Won Woo Ro

This paper presents a feature-based frame rate up-conversion algorithm that provides a more comfortable visual experience by exploiting the true motion of objects. By considering the movement of the objects rather than the pixel values, the proposed method can create interpolated frames that reflect the true movement of the video contents. We first find local features within a frame by using a feature detection algorithm. Then, the local features are matched between adjacent frames and are clustered to form an object region. The interpolated frame is created by using the perspective transformation, which makes it possible to adequately track the dynamic movement of the defined objects. The proposed scheme efficiently resolves the blocking artifact problem and presents outstanding visual quality compared to the conventional block-based motion-compensated interpolation algorithm.
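A perspective transformation maps a point through a 3x3 homography matrix followed by a division by the homogeneous coordinate. The sketch below shows that mapping for a single point. The helper names and the linear halfway rule used in `midpoint_position` are assumptions for illustration only; the abstract does not say how the intermediate position is derived.

```python
def apply_homography(h, point):
    """Map (x, y) through the 3x3 homography h given as nested lists."""
    x, y = point
    xh = h[0][0] * x + h[0][1] * y + h[0][2]
    yh = h[1][0] * x + h[1][1] * y + h[1][2]
    w = h[2][0] * x + h[2][1] * y + h[2][2]
    if w == 0:
        raise ValueError("point maps to infinity")
    return (xh / w, yh / w)


def midpoint_position(h, point):
    """Assumed linear motion: halfway between the original and warped point."""
    wx, wy = apply_homography(h, point)
    return ((point[0] + wx) / 2, (point[1] + wy) / 2)


# A pure translation by (2, 4) written as a homography.
shift = [[1, 0, 2], [0, 1, 4], [0, 0, 1]]
print(apply_homography(shift, (1, 1)))
print(midpoint_position(shift, (1, 1)))
```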


Archive | 2014

Efficient Descriptor-Filtering Algorithm for Speeded Up Robust Features Matching

Minwoo Kim; Deokho Kim; Kyungah Kim; Won Woo Ro

This paper presents an efficient descriptor filtering algorithm for the feature matching process of SURF. The matching algorithm used in OpenSURF compares every pair of feature descriptors by calculating the root-mean-square error of the descriptor vectors. The proposed instant-termination and Bloom filtering algorithm pre-compares the feature descriptors and decides whether the compared descriptor pairs should be further inspected. The proposed pre-comparison process compares the most significant bits of the descriptor for an early decision. Also, the descriptor bits are interleaved to adapt to the Bloom filter, increasing the reliability of the filtering process. Our proposed filtering algorithm effectively reduces the number of root-mean-square error calculations.
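The pre-comparison idea, rejecting descriptor pairs cheaply before computing the full root-mean-square error, can be sketched as follows. This is an assumption-laden illustration rather than the paper's algorithm: it uses the sign of each component as a stand-in for the most significant bit, a Hamming-distance threshold `max_bit_diff` that is hypothetical, and it omits the Bloom filter and bit interleaving entirely.

```python
import math


def msb_signature(desc):
    """Pack one bit per component: 1 if the component is non-negative."""
    sig = 0
    for value in desc:
        sig = (sig << 1) | (1 if value >= 0 else 0)
    return sig


def rmse(a, b):
    """Root-mean-square error between two equal-length descriptors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)) / len(a))


def filtered_match(query, candidates, max_bit_diff=1):
    """Return (best index, best error, number of full RMSE computations)."""
    best, best_err, full_checks = None, math.inf, 0
    query_sig = msb_signature(query)
    for idx, cand in enumerate(candidates):
        bit_diff = bin(query_sig ^ msb_signature(cand)).count("1")
        if bit_diff > max_bit_diff:
            continue  # rejected by the cheap pre-comparison
        full_checks += 1
        err = rmse(query, cand)
        if err < best_err:
            best, best_err = idx, err
    return best, best_err, full_checks


query = [0.5, -0.2, 0.1, -0.4]
candidates = [
    [-0.5, 0.2, -0.1, 0.4],
    [0.4, -0.1, 0.2, -0.5],
    [0.5, 0.3, -0.1, -0.4],
]
print(filtered_match(query, candidates))
```

In this toy run only one of the three candidates survives the signature check, so only one full error calculation is performed.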


Archive | 2013

VIDEO ENCODING METHOD AND APPARATUS FOR PARALLEL PROCESSING USING REFERENCE PICTURE INFORMATION, AND VIDEO DECODING METHOD AND APPARATUS FOR PARALLEL PROCESSING USING REFERENCE PICTURE INFORMATION

Young-O Park; Kwang-Pyo Choi; Chan-Yul Kim; Byeong-Doo Choi; Won-woo Ro; Kyungah Kim; Deokho Kim; Minwoo Kim


IEEE Transactions on Circuits and Systems for Video Technology | 2018

Fast CU Depth Decision for HEVC using Neural Networks

Kyungah Kim; Won Woo Ro


International SoC Design Conference | 2017

Characterizing convolutional neural network workloads on a detailed GPU simulator

Kwanghee Chang; Minsik Kim; Kyungah Kim; Won Woo Ro


International Conference on Consumer Electronics | 2016

Measuring error-tolerance in SRAM architecture on hardware accelerated neural network

Sangheon Kwon; Kyungmin Lee; Yoon-Soo Kim; Kyungah Kim; Changmin Lee; Won Woo Ro


IEEE Transactions on Circuits and Systems for Video Technology | 2016

Exploiting Thread-Level Parallelism on HEVC (High Efficiency Video Coding) Using a Reference Dependency Graph [Powered by NICT]

Minwoo Kim; Deokho Kim; Kyungah Kim; Won Woo Ro
