
Publication


Featured research published by Gyeonghoon Kim.


International Solid-State Circuits Conference | 2012

A 320 mW 342 GOPS Real-Time Dynamic Object Recognition Processor for HD 720p Video Streams

Jinwook Oh; Gyeonghoon Kim; Jun-Young Park; Injoon Hong; Seungjin Lee; Joo-Young Kim; Jeong-Ho Woo; Hoi-Jun Yoo

Moving-object recognition in a video stream is crucial for applications such as unmanned aerial vehicles (UAVs) and mobile augmented reality that require robust and fast recognition in the presence of dynamic camera noise. Devices in such applications suffer from severe motion/camera blur in low-light conditions due to low-sensitivity CMOS image sensors, and therefore require higher computing power to obtain robust results than devices used in still-image applications. Moreover, HD resolution has become so universal that even smartphones support applications at HD resolution. However, many object recognition processors and accelerators reported for mobile applications support only SD resolution due to the computational complexity of object recognition algorithms. This paper presents a moving-target recognition processor for HD video streams, based on a context-aware visual attention model (CAVAM).


International Solid-State Circuits Conference | 2013

A 646GOPS/W multi-classifier many-core processor with cortex-like architecture for super-resolution recognition

Jun-Young Park; Injoon Hong; Gyeonghoon Kim; Youchang Kim; Kyuho Jason Lee; Seong-Wook Park; Kyeongryeol Bong; Hoi-Jun Yoo

Object recognition processors have been reported for autonomous vehicle navigation, smart surveillance, and unmanned air vehicle (UAV) applications [1-3]. Most of these processors adopt a single classifier rather than multiple classifiers, even though multi-classifier systems (MCSs) offer more accurate recognition with higher robustness [4]. In addition, MCSs can incorporate the human visual system (HVS) recognition architecture to reduce computational requirements and enhance recognition accuracy. For example, HMAX models the hierarchical architecture of the HVS for improved recognition accuracy [5]. Compared with SIFT, known to have the best recognition accuracy based on local features extracted from the object [6], HMAX can recognize an object based on global features by template matching and a maximum-pooling operation, without feature segmentation. In this paper we present a multi-classifier many-core processor combining the HMAX and SIFT approaches on a single chip. Through the combined approach, the system can: 1) attend to the target object directly with global context, including complicated backgrounds or camouflaging obstacles; 2) utilize a super-resolution algorithm to recognize highly blurred or small objects; and 3) recognize more than 200 objects in real time by context-aware feature matching.


International Solid-State Circuits Conference | 2014

10.4 A 1.22TOPS and 1.52mW/MHz augmented reality multi-core processor with neural network NoC for HMD applications

Gyeonghoon Kim; Youchang Kim; Kyuho Jason Lee; Seong-Wook Park; Injoon Hong; Kyeongryeol Bong; Dongjoo Shin; Sungpill Choi; Jinwook Oh; Hoi-Jun Yoo

Augmented reality (AR) is being investigated in advanced displays for the augmentation of images in a real-world environment. Wearable systems, such as head-mounted display (HMD) systems, have attempted to support real-time AR as a next-generation UI/UX [1-2], but have failed due to their limited computing power. In a prior work, a chip with limited AR functionality was reported that could perform AR with the help of markers placed in the environment (usually 1D or 2D bar codes) [3]. However, for a seamless visual experience, 3D objects should be rendered directly on the natural video image without any markers. Unlike marker-based AR, markerless AR requires natural feature extraction, general object recognition, 3D reconstruction, and camera-pose estimation to be performed in parallel. For instance, markerless AR for a VGA input-test video consumes ~1.3W at 0.2fps throughput on TI's OMAP4430, which exceeds the power limits for wearable devices. Consequently, a high-performance, energy-efficient markerless AR processor is needed to realize a real-time AR system, especially for HMD applications.


International Solid-State Circuits Conference | 2011

A 57mW embedded mixed-mode neuro-fuzzy accelerator for intelligent multi-core processor

Jinwook Oh; Jun-Young Park; Gyeonghoon Kim; Seungjin Lee; Hoi-Jun Yoo

Artificial intelligence (AI) functions are becoming important in smartphones, portable game consoles, and robots for such intelligent applications as object detection, recognition, and human-computer interfaces (HCI). Most of these functions are realized in software with neural networks (NN) and fuzzy systems (FS), but due to power and speed limitations, a hardware solution is needed. For example, software implementations of object-recognition algorithms like SIFT consume ~10W and incur ~1s of delay even on a 2.4GHz PC CPU. Previously, GPGPUs or ASICs were used to realize AI functions [1-2]. But GPGPUs merely emulate NN/FS with many processing elements to speed up the software, while still consuming a large amount of power. On the other hand, low-power ASICs have mostly been dedicated stand-alone processors, not suitable for porting into many different systems [2].


IEEE Journal of Solid-State Circuits | 2013

An 86 mW 98GOPS ANN-Searching Processor for Full-HD 30 fps Video Object Recognition With Zeroless Locality-Sensitive Hashing

Gyeonghoon Kim; Jinwook Oh; Seungjin Lee; Hoi-Jun Yoo

Approximate nearest neighbor (ANN) searching is an essential task in object recognition. The ANN-searching stage, however, is the main bottleneck in the object recognition process due to increasing database sizes and the high dimensionality of keypoint descriptors. In this paper, a high-throughput ANN-searching processor is proposed for high-resolution (full-HD), real-time (30 fps) video object recognition. The proposed processor adopts an inter-frame cache architecture as a hardware-oriented approach and a zeroless locality-sensitive hashing (zeroless-LSH) algorithm as a software-oriented approach to reduce the external memory bandwidth required in nearest neighbor searching. A four-way set-associative on-chip cache has a dedicated architecture to exploit data correlation at the frame level, while zeroless-LSH minimizes data transactions from external memory at the vector level. The proposed ANN-searching processor is fabricated as part of an object recognition SoC in a 0.13 μm 6-metal CMOS technology. It achieves 62,720 vectors/s throughput and 1140 GOPS/W power efficiency, 1.45x and 1.37x higher than the state-of-the-art, respectively, enabling real-time object recognition for full-HD 30 fps video streams.
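The abstract does not detail the zeroless-LSH algorithm itself, but the general idea of LSH-based ANN search, which the chip accelerates, can be sketched in software: descriptors are hashed into buckets so that a query is compared only against candidates sharing its bucket, which is what cuts the memory traffic of exhaustive search. The random-hyperplane scheme, function names, and parameters below are illustrative assumptions, not the paper's method.

```python
import numpy as np

def build_lsh_index(db, n_bits=8, seed=0):
    """Hash each database descriptor into a bucket via random hyperplanes."""
    rng = np.random.default_rng(seed)
    planes = rng.standard_normal((n_bits, db.shape[1]))
    # sign pattern across hyperplanes -> integer bucket key per descriptor
    keys = (db @ planes.T > 0) @ (1 << np.arange(n_bits))
    index = {}
    for i, k in enumerate(keys):
        index.setdefault(int(k), []).append(i)
    return planes, index

def ann_query(q, db, planes, index):
    """Search only the bucket q hashes into, then rank candidates by L2 distance."""
    n_bits = planes.shape[0]
    key = int((q @ planes.T > 0) @ (1 << np.arange(n_bits)))
    cand = list(index.get(key, range(len(db))))  # fall back to full scan if bucket is empty
    d = np.linalg.norm(db[cand] - q, axis=1)
    return cand[int(np.argmin(d))]
```

A query identical to a stored descriptor lands in that descriptor's bucket and is matched exactly; the trade-off is that true nearest neighbors in other buckets are missed, which is the "approximate" in ANN.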


International Solid-State Circuits Conference | 2016

14.2 A 502GOPS and 0.984mW/MHz dual-mode ADAS SoC with RNN-FIS engine for intention prediction in automotive black-box system

Kyuho Jason Lee; Kyeongryeol Bong; Chang-Hyeon Kim; Jaeeun Jang; Hyunki Kim; Jihee Lee; Kyoung-Rog Lee; Gyeonghoon Kim; Hoi-Jun Yoo

Advanced driver-assistance systems (ADAS) are being adopted in automobiles for forward-collision warning, advanced emergency braking, adaptive cruise control, and lane-keeping assistance. Recently, automotive black boxes have been installed in cars for recording accidents or theft. In this paper, a dual-mode ADAS SoC is proposed to support both high-performance ADAS functionality in driving mode (d-mode) and an ultra-low-power black box in parking mode (p-mode). In p-mode, surveillance recording is triggered intelligently by an intention-prediction engine (IPE), instead of always-on recording, to extend battery life and prevent discharge.


IEEE Journal of Solid-State Circuits | 2013

A 57 mW 12.5 µJ/Epoch Embedded Mixed-Mode Neuro-Fuzzy Processor for Mobile Real-Time Object Recognition

Jinwook Oh; Gyeonghoon Kim; Byeong-Gyu Nam; Hoi-Jun Yoo

A digital/analog mixed-mode processor is proposed to realize a low-power, real-time neuro-fuzzy system for mobile object recognition. It integrates 1024 highly parallel analog processing elements for high-dimensional inference operations and an accurate, fast digital accelerator for the cascaded learning operation of the neuro-fuzzy network. A neuro-fuzzy controller is proposed to manage the mixed-mode operations as a host processor while reducing extra processing delay and power consumption in inter-domain communication. To solve the conventional problems of large-dimensional mixed-mode VLSI systems, such as throughput degradation due to long channel delay, the limited functionality of fixed analog circuits, and mismatches from process variation, the proposed processor adopts a two-stage asynchronous mixed-mode pipeline, flexible channel configuration in each domain, and learning-based calibration, respectively. As a result, the processor consumes only 57 mW on average and achieves 12.5 μJ/epoch energy efficiency for an on-line-learning mixed-mode neuro-fuzzy system with 50 fuzzy rules.


International Symposium on Circuits and Systems | 2014

A 1.61mW mixed-signal column processor for BRISK feature extraction in CMOS image sensor

Kyeongryeol Bong; Gyeonghoon Kim; Injoon Hong; Hoi-Jun Yoo

In mobile object recognition (OR) applications, the power consumed by the image sensor and by data communication between the image sensor and the digital OR processor becomes crucial, since the digital OR processor itself consumes less power in deep sub-micron processes. To reduce the amount of data transferred from the image sensor to the digital OR processor, digital/analog mixed-signal focal-plane processing of Binary Robust Invariant Scalable Keypoints (BRISK) feature extraction in the CMOS image sensor (CIS) is proposed. The proposed CIS processor sends BRISK feature vectors instead of the whole image's pixel data, resulting in a 79% reduction in data communication. In this work, mixed-signal corner detection and successive-approximation-register (SAR)-based scoring are implemented for BRISK feature-point detection. To achieve scale invariance in object recognition, a scale-space is generated and stored in analog line memory. In addition, a noise-reduction scheme is integrated into the column processing chain to remove salt-and-pepper noise, which degrades recognition accuracy. In a post-layout simulation, the proposed system achieves 0.70pW/pixel*frame*feature at 30fps in a 130nm CMOS technology, which is 13.6% lower than the state-of-the-art.
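Salt-and-pepper noise of the kind the column chain removes is classically handled with a small median filter. The sketch below shows the algorithmic idea only, as a software equivalent, not the chip's mixed-signal circuit:

```python
import numpy as np

def median_filter_3x3(img):
    """Suppress salt-and-pepper noise with a 3x3 median filter.
    Edge pixels are handled by reflective padding."""
    p = np.pad(img, 1, mode="reflect")
    h, w = img.shape
    # stack the nine shifted views of the padded image, take per-pixel median
    windows = np.stack([p[r:r + h, c:c + w] for r in range(3) for c in range(3)])
    return np.median(windows, axis=0)
```

An isolated outlier pixel ("salt") is surrounded by eight normal neighbors, so the median of its 3x3 window discards it, while edges and gradients survive far better than under mean filtering.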


IEEE Transactions on Circuits and Systems | 2014

Intelligent Network-on-Chip With Online Reinforcement Learning for Portable HD Object Recognition Processor

Junyoung Park; Injoon Hong; Gyeonghoon Kim; Byeong-Gyu Nam; Hoi-Jun Yoo

An intelligent reinforcement learning (RL) network-on-chip (NoC) is proposed as the communication architecture of a heterogeneous many-core processor for portable HD object recognition. The proposed RL NoC automatically learns bandwidth adjustment and resource allocation in the heterogeneous many-core processor without explicit modeling. By regulating the bandwidth and reallocating cores, the throughput of feature detection and description is increased by 20.4% and 11.5%, respectively. As a result, the overall execution time of object recognition is reduced by 38%. The proposed processor with the RL NoC is implemented in a 65 nm CMOS process, and it demonstrates real-time object recognition for a 720p HD video stream while consuming 235 mW peak power at 200 MHz, 1.2 V.
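The abstract gives no algorithmic detail, but the core idea, learning a bandwidth setting from observed throughput without an explicit traffic model, can be illustrated with a toy single-state Q-learning loop. Everything here (the reward function, the number of bandwidth levels, the hyperparameters) is a hypothetical sketch, not the RL NoC's actual scheme.

```python
import random

def q_learn_bandwidth(reward_fn, n_levels=4, episodes=500, alpha=0.5, eps=0.2, seed=0):
    """Single-state tabular Q-learning: repeatedly pick a bandwidth level
    (epsilon-greedy), observe a throughput reward, and nudge that level's
    value estimate toward the observed reward. Returns the learned best level."""
    rnd = random.Random(seed)
    q = [0.0] * n_levels
    for _ in range(episodes):
        if rnd.random() < eps:
            a = rnd.randrange(n_levels)          # explore a random level
        else:
            a = max(range(n_levels), key=q.__getitem__)  # exploit best so far
        q[a] += alpha * (reward_fn(a) - q[a])    # move estimate toward reward
    return max(range(n_levels), key=q.__getitem__)
```

With a stationary reward, each estimate converges geometrically to its level's true throughput, so the argmax picks the best bandwidth setting without any model of the traffic.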


IEEE Journal of Solid-State Circuits | 2015

A Vocabulary Forest Object Matching Processor With 2.07 M-Vector/s Throughput and 13.3 nJ/Vector Per-Vector Energy for Full-HD 60 fps Video Object Recognition

Kyuho Jason Lee; Gyeonghoon Kim; Jun-Young Park; Hoi-Jun Yoo

Approximate nearest neighbor searching has been studied as the keypoint-matching algorithm for object recognition systems, and its hardware realization has reduced external memory access, the main bottleneck in the object recognition process. However, reducing external memory access alone cannot satisfy the ever-increasing memory bandwidth requirement caused by the rapid growth in image resolution and frame rate of recent applications such as advanced driver-assistance systems. In this paper, a vocabulary forest (VF) processor is proposed that achieves both high accuracy and high speed by integrating an on-chip database (DB) to remove external memory access entirely. An area-efficient reusable-vocabulary-tree architecture is proposed to reduce area, and a propagate-and-compute-array architecture is proposed to enhance the processing speed of the VF. The proposed VF processor speeds up the object-matching stage by 16.4x compared with the state-of-the-art matching processor [Hong et al., Symp. VLSIC, 2013] for high-resolution (full-HD) and real-time (60 fps) video object recognition. It is fabricated in 65 nm CMOS technology and integrated into an object recognition SoC. The proposed VF chip achieves 2.07 M-vector/s throughput and 13.3 nJ/vector energy with 95.7% matching accuracy for 100 objects.
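The vocabulary-tree idea underlying the VF can be sketched in software: a descriptor descends the tree by repeatedly choosing the nearest child center, and the leaf it reaches acts as its "visual word"; two descriptors match if they land on the same leaf, so no exhaustive distance computation against the whole database is needed. The dictionary-based tree layout below is an illustrative assumption, not the chip's data structure.

```python
import numpy as np

def quantize(vec, tree):
    """Descend a vocabulary tree: at each node pick the nearest child center.
    `tree[path]` holds that node's (branch, dim) child-center array; a path
    absent from `tree` is a leaf. Returns the leaf path (the visual word)."""
    path = ()
    while path in tree:
        centers = tree[path]
        path = path + (int(np.argmin(np.linalg.norm(centers - vec, axis=1))),)
    return path
```

For a tree with branching factor b and depth d, each descriptor costs only b*d distance computations instead of one per database entry, which is what makes on-chip matching at Full-HD 60 fps rates plausible.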
