Soo-Ik Chae | Researchain

Publication

Featured research published by Soo-Ik Chae.

Asia and South Pacific Design Automation Conference | 2005

A fast VLSI architecture for full-search variable block size motion estimation in MPEG-4 AVC/H.264

Minho Kim; Ingu Hwang; Soo-Ik Chae

We describe a fast VLSI architecture for full-search motion estimation for blocks of seven different sizes in MPEG-4 AVC/H.264. The proposed variable block size motion estimation (VBSME) architecture consists of a 16x16 PE array, an adder tree, and comparators to find all 41 motion vectors and their minimum SADs for blocks of 16x16, 16x8, 8x16, 8x8, 8x4, 4x8 and 4x4. It employs a 2-D datapath, and its control of the search area data is simple and regular. The proposed VBSME achieves 100% PE utilization by employing a preload register and a search data buffer inside each PE, and allows real-time processing of 4CIF (704x576) video at 15 fps at 100 MHz for a search range of [-32, +31].
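The count of 41 motion vectors follows from the block sizes: a 16x16 macroblock holds sixteen 4x4 blocks, eight 8x4, eight 4x8, four 8x8, two 16x8, two 8x16, and one 16x16. The sketch below is a software illustration only, not the paper's hardware: it computes the sixteen 4x4 SADs for one candidate position and merges them into the larger block SADs, which is the role the abstract assigns to the adder tree. The function names and the test pattern are assumptions made for this example.

```python
# (width x height) label -> (rows, cols) of 4x4 blocks merged into one block
BLOCK_SHAPES = {
    "4x4": (1, 1), "8x4": (1, 2), "4x8": (2, 1), "8x8": (2, 2),
    "16x8": (2, 4), "8x16": (4, 2), "16x16": (4, 4),
}

def sad_4x4_grid(cur, ref):
    """Return a 4x4 grid of SADs, one per 4x4 sub-block of a 16x16 macroblock."""
    grid = [[0] * 4 for _ in range(4)]
    for y in range(16):
        for x in range(16):
            grid[y // 4][x // 4] += abs(cur[y][x] - ref[y][x])
    return grid

def pool(grid, h, w):
    """Sum groups of h-by-w neighbouring 4x4 SADs (h rows, w columns)."""
    out = [[0] * (4 // w) for _ in range(4 // h)]
    for r in range(4):
        for c in range(4):
            out[r // h][c // w] += grid[r][c]
    return out

def all_block_sads(cur, ref):
    """SADs of every block of all seven sizes for one candidate position."""
    grid = sad_4x4_grid(cur, ref)
    return {name: pool(grid, h, w) for name, (h, w) in BLOCK_SHAPES.items()}

# Hypothetical 16x16 test patterns standing in for current and reference pixels.
cur = [[(y * 16 + x) % 256 for x in range(16)] for y in range(16)]
ref = [[(y * 7 + x * 3) % 256 for x in range(16)] for y in range(16)]
sads = all_block_sads(cur, ref)
print(sum(len(row) for g in sads.values() for row in g))  # 41
```

A full search would repeat this for every candidate position in the search range and keep, per block, the position with the minimum SAD.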

IEEE Transactions on Very Large Scale Integration Systems | 2001

Partial bus-invert coding for power optimization of application-specific systems

Youngsoo Shin; Soo-Ik Chae; Kiyoung Choi

This paper presents two bus coding schemes for power optimization of application-specific systems: partial bus-invert coding and its extension to multiway partial bus-invert coding. In the first scheme, only a selected subgroup of bus lines is encoded to avoid unnecessary inversion of relatively inactive and/or uncorrelated bus lines which are not included in the subgroup. In the extended scheme, we partition a bus into multiple subbuses by clustering highly correlated bus lines and then encode each subbus independently. We describe a heuristic algorithm of partitioning a bus into subbuses for each encoding scheme. Experimental results for various examples indicate that both encoding schemes are highly efficient for application-specific systems.
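As a rough illustration of the first scheme, the sketch below inverts only the bus lines selected by a mask and signals the inversion on one extra line. It is a minimal sketch under stated assumptions: the mask is hand-picked rather than chosen by the paper's heuristic, the inversion rule is the simple majority test of basic bus-invert coding, and all function names are hypothetical.

```python
def popcount(v):
    """Number of set bits in a non-negative integer."""
    return bin(v).count("1")

def encode(words, mask):
    """Partial bus-invert: only the lines set in `mask` may be inverted."""
    k = popcount(mask)
    prev_bus = 0
    out = []
    for w in words:
        # Transitions on the selected lines if the word were sent unchanged.
        dist = popcount((w ^ prev_bus) & mask)
        inv = 1 if 2 * dist > k else 0        # assumed majority rule
        bus = w ^ mask if inv else w          # flip only the masked lines
        out.append((bus, inv))
        prev_bus = bus
    return out

def decode(encoded, mask):
    """Undo the inversion using the invert line."""
    return [bus ^ mask if inv else bus for bus, inv in encoded]

def transitions(states):
    """Total bit flips between consecutive bus states, starting from all zeros."""
    total, prev = 0, 0
    for s in states:
        total += popcount(s ^ prev)
        prev = s
    return total

WIDTH = 8                      # hypothetical 8-bit bus
mask = 0x0F                    # assume only the low nibble is highly active
words = [0x0F, 0x00, 0x0F, 0x10]
encoded = encode(words, mask)
# Pack the invert line above the data lines so it is counted as a bus line too.
packed = [(inv << WIDTH) | bus for bus, inv in encoded]
print(transitions(words), transitions(packed))  # 17 5
```

The multiway extension would apply the same encoder independently to each subbus, with one mask and one invert line per subbus.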

International Symposium on Low Power Electronics and Design | 1998

Partial bus-invert coding for power optimization of system level bus

Youngsoo Shin; Soo-Ik Chae; Kiyoung Choi

We present a partial bus-invert coding scheme for power optimization of system level bus. In the proposed scheme, we select a sub-group of bus lines involved in bus encoding to avoid unnecessary inversion of bus lines not in the sub-group, thereby reducing the total number of bus transitions. We propose a heuristic algorithm that selects the sub-group of bus lines for bus encoding. Experiments on benchmark examples indicate that the partial bus-invert coding reduces the total bus transitions by 62.6% on the average, compared to that of the unencoded patterns.

IEEE Journal of Solid-State Circuits | 1999

A 16-bit carry-lookahead adder using reversible energy recovery logic for ultra-low-energy systems

Joonho Lim; Dong-Gyu Kim; Soo-Ik Chae

In this paper, we describe an energy-efficient carry-lookahead adder using reversible energy recovery logic (RERL), which is a new dual-rail reversible adiabatic logic. We also describe an eight-phase, clocked power generator that requires an off-chip inductor. For the energy-efficient design of reversible logic, we explain how to control the overhead of reversibility with a self-energy-recovery circuit. A test chip was implemented with a 0.8 μm CMOS technology, which included two 16-bit carry-lookahead adders to allow fair comparison: an RERL one and a static CMOS one. Experimental results showed that the RERL adder had substantial advantages in energy consumption over the static CMOS one at low operating frequencies. We also confirmed that we could minimize the energy consumption in the RERL circuit by reducing the operating frequency until adiabatic and leakage losses were equal.

IEEE Journal of Solid-State Circuits | 2000

nMOS reversible energy recovery logic for ultra-low-energy applications

Joonho Lim; Dong-Gyu Kim; Soo-Ik Chae

We propose a new fully reversible adiabatic logic, nMOS reversible energy recovery logic (nRERL), which uses nMOS transistors only and a simpler 6-phase clocked power. Its area overhead and energy consumption are smaller, compared with the other fully adiabatic logics. We employed bootstrapped nMOS switches to simplify the nRERL circuits. With the simulation results for a full adder, we confirmed that the nRERL circuit consumed substantially less energy than the other adiabatic logic circuits at low-speed operation. We evaluated a test chip implemented with 0.8-μm CMOS technology, which included a chain of nRERL inverters integrated with a clocked power generator. The nRERL inverter chain of 2400 stages consumed the minimum energy at V_dd = 3.5 V at 55 kHz, where the adiabatic and leakage losses are about equal, which is only 4.50% of the dissipated energy of its corresponding CMOS circuit at V_dd = 0.9 V. In conclusion, nRERL is more suitable than the other adiabatic logic circuits for the applications that do not require high performance but low energy consumption.

Rapid System Prototyping | 2000

Emulator environment based on an FPGA prototyping board

Kyung-Soo Oh; Sangyong Yoon; Soo-Ik Chae

We describe an emulator environment based on an FPGA prototyping board. This emulator environment is for functional verification of a multimedia processor we are developing and for software development and debugging of its application programs. For these purposes, the emulator environment includes a debugging network and provides virtual wires and some utilities, board control functions, and a virtual FPGA board. With this environment we verify the functionality of a multimedia processor and implement its cycle-level simulator.

Design Automation Conference | 2007

Simulink-based MPSoC design flow: case study of Motion-JPEG and H.264

Kai Huang; Sang-Il Han; Katalin Popovici; Lisane B. de Brisolara; Xavier Guerin; Lei Li; Xiaolang Yan; Soo-Ik Chae; Luigi Carro; Ahmed Amine Jerraya

System-level design methodologies have been introduced as a solution to handle the design complexity of embedded multiprocessor SoC (MPSoC) systems. In this paper we describe a system-level design flow starting from Simulink specification, focusing on concurrent hardware and software design and verification at four different abstraction levels: Simulink Combined Algorithm and Architecture Model (CAAM), Virtual Architecture, Transaction-accurate Model and Virtual Prototype. We used two multimedia applications, Motion-JPEG and H.264, to evaluate this design flow. Experimental results show that our design flow can generate various MPSoC architectures from Simulink CAAM correctly and efficiently, allowing processor and task design space exploration at different abstraction levels.

Design Automation Conference | 2004

An efficient scalable and flexible data transfer architecture for multiprocessor SoC with massive distributed memory

Sang-Il Han; Amer Baghdadi; Marius Petru Bonaciu; Soo-Ik Chae; Ahmed Amine Jerraya

Massive data transfer encountered in emerging multimedia embedded applications requires an architecture that can handle both a highly distributed memory structure and multiprocessor computation. The key issue that needs to be solved is then how to manage data transfers between large numbers of distributed memories. To overcome this issue, our paper proposes a scalable Distributed Memory Server (DMS) for multiprocessor SoC (MPSoC). The proposed DMS is composed of: (1) high-performance and flexible memory service access points (MSAPs), which execute data transfers without intervention of the processing elements, (2) data network, and (3) control network. It can handle direct massive data transfer between the distributed memories of an MPSoC. The scalability and flexibility of the proposed DMS are illustrated through the implementation of an MPEG4 video encoder for QCIF and CIF formats. The experiments show clearly how DMS can be adapted to accommodate different SoC configurations requiring various data transfer bandwidths. Synthesis results show that bandwidth can scale up to 28.8 GB/sec.

IEEE Journal of Solid-State Circuits | 2004

A 2.4-GHz 0.25-μm CMOS dual-mode direct-conversion transceiver for Bluetooth and 802.11b

Yeon-Jae Jung; Hoesam Jeong; Eunseok Song; Jungho Lee; Seung-Wook Lee; Donghyeon Seo; Inho Song; Sanghun Jung; Joonbae Park; Deog-Kyoon Jeong; Soo-Ik Chae; Wonchan Kim

A dual-mode transceiver integrates the transmitter of 0-dBm output power and the receiver for both Bluetooth with -87 dBm sensitivity and 802.11b with -86 dBm sensitivity in a single chip. A direct-conversion architecture enables the maximum reuse and the optimal current consumption of the various building blocks in each mode for a low-cost and low-power solution. A single-ended power-amplifier (PA) driver transmits the nominal output power of 0 dBm with 18-dB gain control in 3-dB steps. Only a little area overhead is required in the baseband active filter and programmable gain amplifier (PGA) to provide the dual-mode capability with optimized current consumption. The DC-offset cancellation scheme coupled with PGAs implements the very low high-pass cutoff frequency with a smaller area than required by a simple coupling capacitor. Fabricated in a 0.25-μm CMOS process, the die area is 8.4 mm² including pads, and current consumption in RX is 50 mA for Bluetooth and 65 mA for 802.11b from a 2.7-V supply.

Design Automation Conference | 2006

Buffer memory optimization for video codec application modeled in Simulink

Sang-Il Han; Xavier Guerin; Soo-Ik Chae; Ahmed Amine Jerraya

Reduction of the on-chip memory size is a key issue in video codec system design. Because video codec applications involve complex algorithms that are both data-intensive and control-dependent, memory optimization based on global and precise analysis of data and control dependency is required. We generate a memory-efficient C code from a restricted Simulink model, which can represent both data and control dependency explicitly, by applying two buffer memory optimization techniques: copy removal and buffer sharing. Copy removal is performed while parsing the Simulink model. Buffer sharing requires global scheduling and formal lifetime analysis. Experimental results on an H.264 video decoder show that the buffer memory size and execution time of the C code generated by the proposed method are 71% and 32% less than those of the C code produced by Simulink's C code generator, respectively. When compared to the handwritten C code, the memory size was reduced by 27% while its execution time was increased by only 3%.
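To make the idea of lifetime-based buffer sharing concrete, the sketch below lets two buffers occupy the same memory slot when their lifetimes in a fixed schedule do not overlap. It is a generic greedy illustration, not the method of the paper: the buffer names, sizes, and schedule steps are invented for the example, and the abstract does not specify the actual sharing algorithm.

```python
def share_buffers(buffers):
    """Greedily assign buffers to shared slots.

    buffers: list of (name, size, first_use, last_use), where the lifetime is
    the inclusive range of schedule steps during which the buffer holds live data.
    """
    slots = []  # each slot: {"size", "free_at", "members"}
    for name, size, start, end in sorted(buffers, key=lambda b: b[2]):
        for slot in slots:
            # Reuse a slot only if its previous occupant is dead before we start.
            if slot["free_at"] < start:
                slot["size"] = max(slot["size"], size)
                slot["free_at"] = end
                slot["members"].append(name)
                break
        else:
            slots.append({"size": size, "free_at": end, "members": [name]})
    return slots

# Hypothetical buffers of a four-stage pipeline (sizes in bytes).
buffers = [
    ("a", 256, 0, 1),
    ("b", 384, 1, 2),
    ("c", 256, 2, 3),
    ("d", 128, 3, 4),
]
slots = share_buffers(buffers)
unshared = sum(size for _, size, _, _ in buffers)
shared = sum(slot["size"] for slot in slots)
print(unshared, shared)  # 1024 640
```

Here buffers a and c share one slot and b and d share another, because each pair is never live at the same schedule step.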
