
Publications


Featured research published by Jinwook Oh.


International Solid-State Circuits Conference | 2009

A 201.4 GOPS 496 mW Real-Time Multi-Object Recognition Processor With Bio-Inspired Neural Perception Engine

Joo-Young Kim; Minsu Kim; Seungjin Lee; Jinwook Oh; Kwanho Kim; Sejong Oh; Jeong-Ho Woo; Dong-Hyun Kim; Hoi-Jun Yoo

A 201.4 GOPS real-time multi-object recognition processor is presented with a three-stage pipelined architecture. A visual-perception-based multi-object recognition algorithm is applied to give attention to multiple objects in the input image. For human-like multi-object perception, a neural perception engine is proposed with biologically inspired neural networks and fuzzy logic circuits. In the proposed hardware architecture, three recognition tasks (visual perception, descriptor generation, and object decision) are directly mapped to the neural perception engine, 16 SIMD processors comprising 128 processing elements, and a decision processor, respectively, and executed in a pipeline to maximize object-recognition throughput. For efficient task pipelining, the proposed task/power manager balances the execution times of the three stages based on intelligent workload estimation. In addition, a 118.4 GB/s multi-casting network-on-chip is proposed as the communication architecture, incorporating 21 IP blocks in total. For low-power object recognition, workload-aware dynamic power management is performed at the chip level. The 49 mm2 chip is fabricated in a 0.13 μm 8-metal CMOS process and contains 3.7 M gates and 396 KB of on-chip SRAM. It achieves 60 frame/s multi-object recognition of up to 10 different objects for VGA (640 × 480) video input while dissipating 496 mW at 1.2 V. The resulting 8.2 mJ/frame energy efficiency is 3.2 times higher than that of the state-of-the-art recognition processor.
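The quoted energy efficiency follows directly from the power and frame-rate figures in the abstract; a quick sanity check (hypothetical helper function, values taken from the text):

```python
# Hypothetical helper; the 496 mW and 60 frame/s values come from the abstract.
def energy_per_frame_mj(power_mw: float, fps: float) -> float:
    """Energy per frame in millijoules: mW divided by frames/s gives mJ/frame."""
    return power_mw / fps

print(energy_per_frame_mj(496, 60))  # ≈ 8.27 mJ/frame, consistent with the quoted 8.2
```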


International Solid-State Circuits Conference | 2010

A 345 mW Heterogeneous Many-Core Processor With an Intelligent Inference Engine for Robust Object Recognition

Seungjin Lee; Jinwook Oh; Jun-Young Park; Joonsoo Kwon; Minsu Kim; Hoi-Jun Yoo

Fast and robust object recognition of cluttered scenes presents two main challenges: (1) the large number of features to process requires high computational power, and (2) false matches from background clutter can degrade recognition accuracy. Previously, saliency-based bottom-up visual attention [1,2] increased recognition speed by confining recognition processing to the salient regions. But these schemes had an inherent problem: the accuracy of the attention itself. If attention is paid to a false region, which is common when saliency cannot distinguish between clutter and objects, recognition accuracy is degraded. To improve attention accuracy, we previously reported an algorithm, the Unified Visual Attention Model (UVAM) [3], which incorporates a familiarity map on top of the saliency map for the search of attentive points. It can cross-check the accuracy of attention deployment by combining top-down attention, which searches for “meaningful objects”, with bottom-up attention, which simply looks for conspicuous points. This paper presents a heterogeneous many-core (we use the term “many-core” instead of “multi-core” to emphasize the large number of cores) processor that realizes the UVAM algorithm to achieve fast and robust object recognition of cluttered video sequences.
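As a rough illustration of the UVAM idea, a bottom-up saliency map and a top-down familiarity map can be cross-checked point by point. The element-wise product below is an invented combination rule for illustration only, not the published algorithm:

```python
# Illustrative sketch: combine bottom-up (saliency) and top-down (familiarity)
# cues so that a point must be BOTH conspicuous and object-like to win
# attention. The multiplicative rule and the toy 2x2 maps are assumptions.
def attention_map(saliency, familiarity):
    """Element-wise product of the two maps (invented cross-check rule)."""
    return [[s * f for s, f in zip(srow, frow)]
            for srow, frow in zip(saliency, familiarity)]

def most_attentive_point(att):
    """Return (row, col) of the highest combined attention score."""
    best = max((v, r, c) for r, row in enumerate(att) for c, v in enumerate(row))
    return best[1], best[2]

saliency    = [[0.9, 0.1], [0.2, 0.8]]   # conspicuous points (clutter included)
familiarity = [[0.1, 0.9], [0.3, 0.7]]   # "meaningful object" likelihood
att = attention_map(saliency, familiarity)
print(most_attentive_point(att))  # (1, 1): the point that is salient AND familiar
```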


International Solid-State Circuits Conference | 2012

A 320 mW 342 GOPS Real-Time Dynamic Object Recognition Processor for HD 720p Video Streams

Jinwook Oh; Gyeonghoon Kim; Jun-Young Park; Injoon Hong; Seungjin Lee; Joo-Young Kim; Jeong-Ho Woo; Hoi-Jun Yoo

Moving-object recognition in a video stream is crucial for applications such as unmanned aerial vehicles (UAVs) and mobile augmented reality that require robust and fast recognition in the presence of dynamic camera noise. Devices in such applications suffer from severe motion/camera blur in low-light conditions due to low-sensitivity CMOS image sensors, and therefore require more computing power to obtain robust results than devices used in still-image applications. Moreover, HD resolution has become so universal that even smartphones support HD applications. However, many object recognition processors and accelerators reported for mobile applications support only SD resolution due to the computational complexity of object recognition algorithms. This paper presents a moving-target recognition processor for HD video streams, based on a context-aware visual attention model (CAVAM).


International Solid-State Circuits Conference | 2014

10.4 A 1.22 TOPS and 1.52 mW/MHz augmented reality multi-core processor with neural network NoC for HMD applications

Gyeonghoon Kim; Youchang Kim; Kyuho Jason Lee; Seong-Wook Park; Injoon Hong; Kyeongryeol Bong; Dongjoo Shin; Sungpill Choi; Jinwook Oh; Hoi-Jun Yoo

Augmented reality (AR) is being investigated in advanced displays for the augmentation of images in a real-world environment. Wearable systems, such as head-mounted display (HMD) systems, have attempted to support real-time AR as a next-generation UI/UX [1-2], but have failed due to their limited computing power. In a prior work, a chip with limited AR functionality was reported that could perform AR with the help of markers placed in the environment (usually 1D or 2D bar codes) [3]. However, for a seamless visual experience, 3D objects should be rendered directly on the natural video image without any markers. Unlike marker-based AR, markerless AR requires natural feature extraction, general object recognition, 3D reconstruction, and camera-pose estimation to be performed in parallel. For instance, markerless AR for a VGA input test video consumes ~1.3 W of power at 0.2 fps throughput on TI's OMAP4430, which exceeds the power limits of wearable devices. Consequently, a high-performance, energy-efficient markerless AR processor is needed to realize a real-time AR system, especially for HMD applications.
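The OMAP4430 figures quoted above make the power gap concrete: 1.3 W at 0.2 fps works out to 6.5 J for every frame. A minimal check (hypothetical helper, values from the text):

```python
# Hypothetical helper; 1.3 W and 0.2 fps are the abstract's OMAP4430 numbers.
def joules_per_frame(power_w: float, fps: float) -> float:
    """Energy per frame in joules: W divided by frames/s."""
    return power_w / fps

print(joules_per_frame(1.3, 0.2))  # ≈ 6.5 J/frame, far beyond wearable budgets
```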


Symposium on VLSI Circuits | 2010

A 1.2 mW on-line learning mixed-mode intelligent inference engine for robust object recognition

Jinwook Oh; Seungjin Lee; Minsu Kim; Joonsoo Kwon; Jun-Young Park; Joo-Young Kim; Hoi-Jun Yoo

An intelligent inference engine (IIE) is proposed as a controller for a low-power, high-speed, robust object recognition processor. It contains analog-digital mixed-mode neuro-fuzzy circuits for on-line learning to increase attention efficiency. Implemented in a 0.13 μm CMOS process, it consumes 1.2 mW and achieves 94% average classification accuracy within 1 μs of operation. The 0.765 mm2 IIE achieves 76% attention efficiency, and reduces the power and processing delay of the 50 mm2 recognition processor by up to 37% and 28%, respectively, with 96% recognition accuracy.
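The mixed-mode neuro-fuzzy circuits are analog hardware, but the inference flow they implement can be sketched in software. The toy Sugeno-style classifier below uses invented membership parameters and rules (not the IIE's trained values) to show the fuzzify, rule-firing, and defuzzify steps:

```python
import math

# Software sketch of a tiny Sugeno-style neuro-fuzzy classifier in the spirit
# of the IIE. All parameters are invented for illustration; the real IIE learns
# its parameters on-line in analog-digital mixed-mode circuits.
def gaussian_mf(x, center, sigma):
    """Degree of membership of x in a fuzzy set centered at `center`."""
    return math.exp(-((x - center) ** 2) / (2 * sigma ** 2))

def fuzzy_infer(x, rules):
    """rules: list of (center, sigma, output); weighted-average defuzzification."""
    weights = [gaussian_mf(x, c, s) for c, s, _ in rules]
    return sum(w * o for w, (_, _, o) in zip(weights, rules)) / sum(weights)

# Two rules: "low input -> class 0", "high input -> class 1".
rules = [(0.0, 1.0, 0.0), (1.0, 1.0, 1.0)]
print(fuzzy_infer(0.9, rules) > 0.5)  # True: an input near 1.0 leans to class 1
```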


IEEE Journal of Solid-State Circuits | 2010

A 118.4 GB/s Multi-Casting Network-on-Chip With Hierarchical Star-Ring Combined Topology for Real-Time Object Recognition

Joo-Young Kim; Jun-Young Park; Seungjin Lee; Minsu Kim; Jinwook Oh; Hoi-Jun Yoo

A 118.4 GB/s multi-casting network-on-chip (MC-NoC) is proposed as the communication platform for a real-time object recognition processor. For application-specific NoC design, the target traffic patterns are analyzed in detail. Through topology exploration, we derive a hierarchical star and ring (HS-R) combined architecture for low-latency inter-processor communication. A multi-casting protocol and router are developed to accelerate one-to-many (1-to-N) data transactions. With these two main features, the proposed MC-NoC reduces data transaction time and energy consumption for the target object recognition traffic by 20% and 23%, respectively. The 350 kgate MC-NoC, fabricated in a 0.13 μm CMOS process, consumes 48 mW at 400 MHz and 1.2 V.
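To see why dedicated multi-cast support helps 1-to-N traffic, consider a simple hop-count model of a star topology. This is an illustrative assumption for intuition, not the paper's measurement:

```python
# Illustrative hop-count model (assumption): each unicast copy crosses the hub
# separately (source->hub plus hub->destination), while a multicast packet goes
# up once and is replicated at the hub onto the N down-links.
def unicast_hops(n_dest: int) -> int:
    return 2 * n_dest        # N up-link crossings + N down-link crossings

def multicast_hops(n_dest: int) -> int:
    return 1 + n_dest        # one up-link crossing, N replicated down-links

print(unicast_hops(8), multicast_hops(8))  # 16 vs 9 hops for N = 8
```

The saving grows with fan-out N, which is why accelerating 1-to-N transactions in the router pays off for object-recognition traffic that broadcasts data to many SIMD processors.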


International Solid-State Circuits Conference | 2011

A 57mW embedded mixed-mode neuro-fuzzy accelerator for intelligent multi-core processor

Jinwook Oh; Jun-Young Park; Gyeonghoon Kim; Seungjin Lee; Hoi-Jun Yoo

Artificial intelligence (AI) functions are becoming important in smartphones, portable game consoles, and robots for intelligent applications such as object detection, recognition, and human-computer interfaces (HCI). Most of these functions are realized in software with neural networks (NN) and fuzzy systems (FS), but due to power and speed limitations, a hardware solution is needed. For example, software implementations of object-recognition algorithms like SIFT consume ~10 W and incur ~1 s of delay even on a 2.4 GHz PC CPU. Previously, GPGPUs or ASICs were used to realize AI functions [1-2]. But GPGPUs merely emulate NN/FS with many processing elements to speed up the software, while still consuming a large amount of power. On the other hand, low-power ASICs have mostly been dedicated stand-alone processors, not suitable for porting into many different systems [2].


IEEE Journal of Solid-State Circuits | 2013

An 86 mW 98 GOPS ANN-Searching Processor for Full-HD 30 fps Video Object Recognition With Zeroless Locality-Sensitive Hashing

Gyeonghoon Kim; Jinwook Oh; Seungjin Lee; Hoi-Jun Yoo

Approximate nearest neighbor (ANN) searching is an essential task in object recognition. The ANN-searching stage, however, is the main bottleneck in the object recognition process due to increasing database size and massive dimensions of keypoint descriptors. In this paper, a high throughput ANN-searching processor is proposed for high-resolution (full-HD) and real-time (30 fps) video object recognition. The proposed ANN-searching processor adopts an interframe cache architecture as a hardware-oriented approach and a zeroless locality-sensitive-hashing (zeroless-LSH) algorithm as a software-oriented approach to reduce the external memory bandwidth required in nearest neighbor searching. A four-way set associative on-chip cache has a dedicated architecture to exploit data correlation at the frame level. Zeroless-LSH minimizes data transactions from external memory at the vector level. The proposed ANN-searching processor is fabricated as part of an object recognition SoC using a 0.13 μm 6-metal CMOS technology. It achieves 62 720 vectors/s throughput and 1140 GOPS/W power efficiency, which are 1.45 and 1.37 times higher than the state-of-the-art, respectively, enabling real-time object recognition for full-HD 30 fps video streams.
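The LSH family underlying zeroless-LSH can be sketched generically: random hyperplanes hash nearby descriptors into the same bucket, so a query scans one bucket instead of the whole database. The sketch below is plain random-projection LSH with invented data, not the paper's zeroless variant:

```python
import random

# Generic random-projection LSH sketch for ANN search. Hyperplanes and the
# 100-vector "database" are invented; this is not the zeroless-LSH algorithm.
random.seed(0)

DIM, N_BITS = 8, 4
planes = [[random.gauss(0, 1) for _ in range(DIM)] for _ in range(N_BITS)]

def lsh_key(vec):
    """One sign bit per random hyperplane; nearby vectors tend to collide."""
    return tuple(int(sum(p * v for p, v in zip(plane, vec)) >= 0)
                 for plane in planes)

db = [[random.gauss(0, 1) for _ in range(DIM)] for _ in range(100)]
buckets = {}
for i, v in enumerate(db):
    buckets.setdefault(lsh_key(v), []).append(i)

query = db[0]                       # a query identical to a stored vector
candidates = buckets[lsh_key(query)]
print(0 in candidates)              # True: the query lands in its own bucket
```

Only `candidates` is then compared exhaustively, which is what cuts the external memory traffic the abstract describes.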


IEEE Journal of Solid-State Circuits | 2012

A 92-mW Real-Time Traffic Sign Recognition System With Robust Illumination Adaptation and Support Vector Machine

Jun-Young Park; Joonsoo Kwon; Jinwook Oh; Seungjin Lee; Joo-Young Kim; Hoi-Jun Yoo

A low-power real-time traffic sign recognition system that is robust under various illumination conditions is proposed. It is composed of a Retinex preprocessor and an SVM processor. The Retinex preprocessor performs the Multi-Scale Retinex (MSR) algorithm for robust light and dark adaptation under harsh illumination environments. In the Retinex preprocessor, the recursive Gaussian engine (RGE) and reflectance engine (RE) exploit the parallelism of the MSR tasks with a two-stage pipeline, and a mixed-mode scale generator (SG) with an adaptive neuro-fuzzy inference system (ANFIS) performs parameter optimization for various scene conditions. The SVM processor performs the SVM algorithm for robust traffic sign classification. The proposed algorithm-optimized small-sized kernel cache and memory controller reduce power consumption and memory redundancy by 78% and 35%, respectively. The proposed system is implemented as two separate ICs in a 0.13-μm CMOS process, and the two chips are connected using a network-on-chip off-chip gateway. The system achieves robust sign recognition with 90% accuracy under harsh illumination conditions while consuming just 92 mW at 1.2 V.
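The MSR computation at the heart of the Retinex preprocessor subtracts log-domain Gaussian-smoothed illumination estimates at several scales and averages the results. A 1-D software sketch (illustration only; the RGE/RE pipeline operates on 2-D images, and the signal below is invented):

```python
import math

# 1-D Multi-Scale Retinex sketch: R(x) = mean over scales of
# log I(x) - log (G_sigma * I)(x). Signal and sigmas are invented.
def gaussian_blur_1d(signal, sigma):
    radius = int(3 * sigma)
    kernel = [math.exp(-(i * i) / (2 * sigma * sigma))
              for i in range(-radius, radius + 1)]
    norm = sum(kernel)
    out = []
    for i in range(len(signal)):
        acc = 0.0
        for k, w in enumerate(kernel):
            j = min(max(i + k - radius, 0), len(signal) - 1)  # clamp at edges
            acc += w * signal[j]
        out.append(acc / norm)
    return out

def msr_1d(signal, sigmas):
    """Average the single-scale Retinex outputs across scales."""
    logs = [math.log(s) for s in signal]
    result = [0.0] * len(signal)
    for sigma in sigmas:
        blurred = gaussian_blur_1d(signal, sigma)
        for i in range(len(signal)):
            result[i] += (logs[i] - math.log(blurred[i])) / len(sigmas)
    return result

# A dark-to-bright step edge: MSR amplifies local contrast across the edge
# while flat regions stay near zero, which is the adaptation the RGE provides.
signal = [10.0] * 8 + [200.0] * 8
out = msr_1d(signal, sigmas=[1.0, 2.0])
print(out[7] < 0 < out[8])  # True: dark side of the edge dips, bright side rises
```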


Asian Solid-State Circuits Conference | 2010

A 92 mW 76.8 GOPS vector matching processor with parallel Huffman decoder and query re-ordering buffer for real-time object recognition

Seungjin Lee; Joonsoo Kwon; Jinwook Oh; Jun-Young Park; Hoi-Jun Yoo

A vector matching processor with memory bandwidth optimizations is proposed to achieve real-time matching of 128-dimensional SIFT features extracted from VGA video. The main bottleneck of feature-vector matching is off-chip database access. We employ the locality-sensitive hashing (LSH) algorithm, which reduces the number of database comparisons required to match each query. In addition, database compression using Huffman coding increases the effective external bandwidth. Dedicated parallel Huffman decoder hardware ensures fast decompression of the database. A flexible query re-ordering buffer exploits overlapping accesses between queries by enabling out-of-order query processing to minimize redundant off-chip access. As a result, the 76.8 GOPS feature matching processor, implemented in a 0.13 μm CMOS process, achieves 43 200 queries/s on a 100-object database while consuming a peak power of 92 mW.
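Huffman-coding the database pays off because descriptor symbol distributions are skewed. A minimal sketch with invented data (generic Huffman code lengths computed in software, not the paper's code tables or decoder):

```python
import heapq
from collections import Counter

# Illustrative sketch with invented data: Huffman coding shrinks off-chip
# traffic when symbols are skewed, which is why the processor pairs a
# compressed database with a dedicated parallel decoder.
def huffman_code_lengths(symbols):
    """Return {symbol: code length in bits} for a Huffman code."""
    heap = [(count, i, {sym: 0}) for i, (sym, count)
            in enumerate(Counter(symbols).items())]
    heapq.heapify(heap)
    tie = len(heap)                      # tie-breaker so dicts never compare
    while len(heap) > 1:
        c1, _, t1 = heapq.heappop(heap)
        c2, _, t2 = heapq.heappop(heap)
        merged = {s: d + 1 for s, d in {**t1, **t2}.items()}
        heapq.heappush(heap, (c1 + c2, tie, merged))
        tie += 1
    return heap[0][2]

data = [0] * 90 + [1] * 6 + [2] * 3 + [3] * 1   # skewed descriptor symbols
lengths = huffman_code_lengths(data)
compressed_bits = sum(lengths[s] for s in data)
print(compressed_bits, "bits vs", len(data) * 2, "fixed-width bits")
```

On this toy distribution the Huffman code needs 114 bits where a fixed 2-bit encoding needs 200, illustrating the effective-bandwidth gain the abstract claims.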
