Myung Hoon Sunwoo | Researchain

Publication

Featured research published by Myung Hoon Sunwoo.

IEEE Transactions on Circuits and Systems | 2005

New continuous-flow mixed-radix (CFMR) FFT Processor using novel in-place strategy

Byung G. Jo; Myung Hoon Sunwoo

The paper proposes a new continuous-flow mixed-radix (CFMR) fast Fourier transform (FFT) processor that uses the MR (radix-4/2) algorithm and a novel in-place strategy. The existing in-place strategy supports only a fixed-radix FFT algorithm. In contrast, the proposed in-place strategy can support the MR algorithm, which allows CF FFT computations regardless of the length of FFT. The novel in-place strategy is made by interchanging storage locations of butterfly outputs. The CFMR FFT processor provides the MR algorithm, the in-place strategy, and the CF FFT computations at the same time. The CFMR FFT processor requires only two N-word memories due to the proposed in-place strategy. In addition, it uses one butterfly unit that can perform either one radix-4 butterfly or two radix-2 butterflies. The CFMR FFT processor using the 0.18 μm SEC cell library consists of 37,000 gates excluding memories, requires only 640 clock cycles for a 512-point FFT and runs at 100 MHz. Therefore, the CFMR FFT processor can reduce hardware complexity and computation cycles compared with existing FFT processors.
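
The 640-cycle figure is consistent with a simple stage count. The sketch below is only a back-of-the-envelope estimate: it assumes the single butterfly unit is kept fully busy in every stage and ignores pipeline fill and memory latency, neither of which the abstract specifies. The function name `cfmr_cycles` is hypothetical.

```python
def cfmr_cycles(n: int) -> tuple[int, int, int]:
    """Estimate butterfly cycles for an n-point mixed-radix (4/2) FFT.

    Assumption (not taken from the paper): one radix-4 butterfly or two
    radix-2 butterflies complete per clock cycle, with no stalls.
    Returns (radix-4 stages, radix-2 stages, total cycles).
    """
    if n < 4 or n & (n - 1):
        raise ValueError("n must be a power of two, at least 4")
    log2n = n.bit_length() - 1
    # Use as many radix-4 stages as possible; an odd log2(n) leaves one radix-2 stage.
    radix4_stages, radix2_stages = divmod(log2n, 2)
    # A radix-4 stage has n/4 butterflies (one per cycle).
    # A radix-2 stage has n/2 butterflies (two per cycle), so also n/4 cycles.
    cycles = radix4_stages * (n // 4) + radix2_stages * (n // 4)
    return radix4_stages, radix2_stages, cycles


print(cfmr_cycles(512))  # (4, 1, 640)
```

Under these assumptions a 512-point FFT needs four radix-4 stages and one radix-2 stage at 128 cycles each, which gives the 640 cycles quoted in the abstract.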

IEEE Transactions on Very Large Scale Integration Systems | 2006

New degree computationless modified euclid algorithm and architecture for Reed-Solomon decoder

Jae Hyun Baek; Myung Hoon Sunwoo

This paper proposes a new degree computationless modified Euclid (DCME) algorithm and its dedicated architecture for Reed-Solomon (RS) decoder. This architecture has low hardware complexity compared with conventional modified Euclid (ME) architectures, since it can completely remove the degree computation and comparison circuits. The architecture employing a systolic array requires only the latency of 2t clock cycles to solve the key equation without initial latency. In addition, the DCME architecture using 3t+2 basic cells has regularity and scalability since it uses only one processing element. Hence, the proposed DCME architecture provides the short latency and low-cost RS decoding. The DCME architecture has been synthesized using the 0.25-μm Faraday CMOS standard cell library and operates at 200 MHz. The gate count of the DCME architecture is 21,760. Hence, the RS decoder using the proposed DCME architecture can reduce the total gate count by at least 23% and the total latency to at least 10% compared with conventional ME decoders.

IEEE Transactions on Circuits and Systems for Video Technology | 2014

New Frame Rate Up-Conversion Algorithms With Low Computational Complexity

Un Seob Kim; Myung Hoon Sunwoo

This paper proposes a new frame rate up-conversion (FRUC) algorithm to reduce the computational complexity and to improve the peak signal-to-noise ratio (PSNR) performance. The proposed FRUC algorithm includes prediction-based motion vector smoothing (PMVS), partial average-based motion compensation (PAMC), and intrapredicted hole interpolation (IPHI). PMVS can efficiently remove outliers using motion vectors of neighboring blocks, and PAMC performs motion compensation with the region-based partial average to reduce blocking artifacts of the interpolated frames. For hole interpolation, IPHI uses intraprediction of H.264/AVC to eliminate blurring and also uses fixed weights implemented using only shift operations, which results in low computational complexity. Compared to the existing algorithms that use bilateral motion estimation (ME), the proposed algorithm improves the average PSNR of the interpolated frames by 3.44 dB, and its PSNR is only 0.13 dB lower than that of the existing algorithm that employs unilateral ME; however, it significantly reduces the computational complexity of FRUC, by about 89.3% based on the absolute difference.
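
The quality figures above are PSNR differences. As a reminder of what that metric measures, here is a minimal sketch of the standard PSNR formula, 10·log10(peak² / MSE). It assumes 8-bit pixels (peak value 255) and frames passed as flat sequences of pixel values; the function name `psnr` and the evaluation setup are illustrative assumptions, not details from the paper.

```python
import math


def psnr(reference, test, peak=255):
    """PSNR in dB between two equal-sized frames given as flat pixel sequences."""
    if len(reference) != len(test) or len(reference) == 0:
        raise ValueError("frames must be non-empty and the same size")
    # Mean squared error over all pixels.
    mse = sum((a - b) ** 2 for a, b in zip(reference, test)) / len(reference)
    if mse == 0:
        return math.inf  # identical frames
    return 10 * math.log10(peak ** 2 / mse)
```

For FRUC, one common setup (assumed here) is to drop every other frame of a sequence, interpolate it back, and compute the PSNR of each interpolated frame against the original it replaces.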

International Symposium on Circuits and Systems | 2002

A high-speed FFT processor for OFDM systems

Byung S. Son; Byung G. Jo; Myung Hoon Sunwoo; Yong Serk Kim

This paper proposes a high-speed FFT processor for orthogonal frequency-division multiplexing (OFDM) systems. The proposed architecture uses a single memory for a small hardware size and uses a radix-4 algorithm for high speed. Its memory is partitioned into four banks for high-speed computation. It uses an in-place memory strategy that stores butterfly outputs in the same memory location used by butterfly inputs. The architecture has been modeled by VHDL and logic synthesis has been performed using the Samsung™ 0.5 μm SOG cell library (KG80). The implemented FFT processor consists of 98,326 gates excluding RAM. The processor can operate at 42 MHz and calculate a 256-point complex FFT in 6 μs.
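
To illustrate the in-place idea in software, the sketch below runs a textbook radix-4 decimation-in-frequency FFT that writes each butterfly's four outputs back to the four addresses its inputs were read from. It is a generic illustration, not the paper's architecture: the four-bank memory partitioning is not modeled, and the function names are hypothetical.

```python
import cmath


def fft_radix4_inplace(x):
    """Radix-4 decimation-in-frequency FFT that overwrites the list x.

    Results are left in base-4 digit-reversed order.
    Returns the number of radix-4 stages (base-4 digits of the length).
    """
    n = len(x)
    digits = 0
    while 4 ** digits < n:
        digits += 1
    if n < 4 or 4 ** digits != n:
        raise ValueError("length must be a power of 4")
    span = n
    while span >= 4:
        q = span // 4
        for start in range(0, n, span):
            for k in range(q):
                i0, i1, i2, i3 = (start + k + m * q for m in range(4))
                a, b, c, d = x[i0], x[i1], x[i2], x[i3]
                w1 = cmath.exp(-2j * cmath.pi * k / span)
                w2, w3 = w1 * w1, w1 * w1 * w1
                # Outputs go back to the same four addresses the inputs came from.
                x[i0] = a + b + c + d
                x[i1] = (a - 1j * b - c + 1j * d) * w1
                x[i2] = (a - b + c - d) * w2
                x[i3] = (a + 1j * b - c - 1j * d) * w3
        span = q
    return digits


def digit_reverse_base4(i, digits):
    """Reverse the base-4 digits of i, giving the frequency bin stored at address i."""
    r = 0
    for _ in range(digits):
        r = r * 4 + i % 4
        i //= 4
    return r
```

After the call, address `pos` holds frequency bin `digit_reverse_base4(pos, digits)`. A 256-point transform takes 4 stages of 64 butterflies; if the hardware completed one butterfly per cycle (an assumption, the abstract does not say), 256 cycles at 42 MHz would be about 6.1 μs, in line with the quoted 6 μs.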

Signal Processing Systems | 2001

Design of new DSP instructions and their hardware architecture for high-speed FFT

Jae-Sung Lee; Young Seop Jeon; Myung Hoon Sunwoo

This paper presents new DSP (Digital Signal Processor) instructions and their hardware architecture for high-speed FFT. The instructions perform new operation flows, which are different from the MAC (Multiply and Accumulate) operation on which existing DSP chips heavily depend. This paper proposes the DPU (Data Processing Unit) supporting the instructions and shows it to be two times faster than existing DSP chips for FFT. The architecture has been modeled by the Verilog HDL and logic synthesis has been performed using the 0.35 μm standard cell library. The maximum operating clock frequency is about 144.5 MHz and the architecture will be employed on an application-specific DSP chip.

International Symposium on Circuits and Systems | 2004

Implementation of application-specific DSP for OFDM systems

Jeong Hoo Lee; Jong Ha Moon; Kyung Lan Heo; Myung Hoon Sunwoo; Seung Keun Oh; In Ho Kim

This paper describes the implementation of an application-specific DSP for OFDM modem systems. The proposed instructions can support computations of various blocks in OFDM systems. The data ALU is specially designed and optimized for the proposed instructions. Performance comparisons show that the number of clock cycles improves by over 10% compared with the Carmel DSP and by over 50% compared with the TMS320C62X for FFT computation. However, the size of the DSP is much smaller than existing DSPs. The proposed architecture is implemented using the iPROVE FPGA board and verification is performed using assembly programs that implement most of the OFDM blocks. The SQNR value of the FFT output is about 71 dB.

International Symposium on Circuits and Systems | 2007

Simplified Degree Computationless Modified Euclid's Algorithm and its Architecture

Jaehyun Baek; Myung Hoon Sunwoo

This paper proposes a new simplified degree computationless modified Euclid's algorithm (S-DCME) and its architecture for Reed-Solomon decoders. The proposed S-DCME algorithm uses new initial conditions, and thus it can combine the data path for loading initial values and the data path for a switching operation. Hence, the S-DCME algorithm can reduce the number of multiplexers and has high performance compared with the existing DCME algorithm and the RiBM algorithm. The gate count using the MagnaChip HSI 0.25 μm standard cell library is 17,800.

Signal Processing Systems | 2008

ASIP Approach for Implementation of H.264/AVC

Sung Dae Kim; Myung Hoon Sunwoo

This paper presents an Application Specific Instruction Set Processor (ASIP) for implementation of H.264/AVC, called Video Specific Instruction-set Processor (VSIP). The proposed VSIP has novel instructions and optimized hardware architectures for specific applications, such as intra prediction, in-loop deblocking filter, integer transform, etc. Moreover, VSIP has coprocessors for computation-intensive parts in video signal processing, such as inter prediction and entropy coding. The proposed VSIP has a much smaller area and can dramatically reduce the number of memory accesses compared with commercial DSP chips, which results in low power consumption. Moreover, the proposed hardware accelerators are small and consume little power, and thus they can support real-time video processing. VSIP has been thoroughly verified using an FPGA board having the Xilinx™ Virtex II. VSIP can implement a real-time H.264/AVC decoder. The proposed VSIP is one of the promising solutions for video signal processing.

Asia and South Pacific Design Automation Conference | 2006

ASIP approach for implementation of H.264/AVC

Sung Dae Kim; Jeong Hoo Lee; Chung Jin Hyun; Myung Hoon Sunwoo

This paper presents an application-specific instruction set processor (ASIP) approach for implementation of H.264/AVC. The proposed ASIP has special instructions for intra prediction, deblocking filter, integer transform, etc. In addition, the proposed ASIP has hardware accelerators for inter prediction and entropy coding. Performance comparisons show a significant improvement compared with existing DSPs. The proposed hardware accelerators are small and can support real-time video processing. Moreover, the proposed ASIP can handle various multimedia standards. The results indicate that the ASIP approach is one of the promising solutions for H.264/AVC.

Signal Processing Systems | 2002

Design of a high speed OFDM modem system for powerline communications

Kyung Lan Heo; Sung M. Cho; J.W. Lee; Myung Hoon Sunwoo; Seong Keun Oh

This paper proposes a high-speed OFDM modem architecture for powerline communications. The proposed modem using symmetric communication is designed to be compatible with the HomePlug standard. The HomePlug standard uses a frequency band from DC to 25 MHz, 128 subcarriers for OFDM transmission, and BPSK, DBPSK, and DQPSK modulations for each subcarrier. In particular, this paper proposes algorithms and the associated architectures for signal detection, AGC, and frame synchronization. The AGC and frame synchronization algorithms are based on the symbol power ratio and the sliding cross-correlation of the preamble, respectively. The frame is then synchronized with the position of the minimum correlation value. In addition, an area-efficient integrate-and-dump architecture for frame synchronization is proposed. The proposed architectures have been validated using SPW™ and implemented with Verilog-XL. We show that the BER performance of the proposed modem under an AWGN channel is similar to the theoretical one.
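
The idea of synchronizing on the minimum of a sliding cross-correlation can be sketched with a toy signal. Everything specific here is an assumption made for illustration: the abstract does not describe the preamble structure, so the example uses a hypothetical preamble of three copies of a Barker-13 symbol followed by one sign-inverted copy, with no noise. The function names are also hypothetical.

```python
def sliding_correlation(received, reference):
    """Correlate the reference symbol against every window of the input."""
    span = len(reference)
    return [
        sum(received[n + k] * reference[k] for k in range(span))
        for n in range(len(received) - span + 1)
    ]


def frame_start(received, reference):
    """Return the offset at which the correlation value is smallest."""
    corr = sliding_correlation(received, reference)
    return min(range(len(corr)), key=corr.__getitem__)


# Hypothetical toy preamble: three reference symbols, then one inverted symbol.
BARKER13 = [1, 1, 1, 1, 1, -1, -1, 1, 1, -1, 1, -1, 1]
idle = [0] * 7
preamble = BARKER13 * 3 + [-v for v in BARKER13]
signal = idle + preamble + [0] * 20

print(frame_start(signal, BARKER13))  # 46, the start of the inverted symbol (7 + 3 * 13)
```

With this preamble the correlation peaks positively on each regular symbol and drops to its minimum exactly where the inverted symbol begins, which gives a single unambiguous timing reference.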