Chien-Ming Wu

National Yunlin University of Science and Technology

Publication

Featured research published by Chien-Ming Wu.

IEEE Transactions on Computers | 2004

High-speed, low-complexity systolic designs of novel iterative division algorithms in GF(2^m)

Chien-Hsing Wu; Chien-Ming Wu; Ming-Der Shieh; Yin-Tsung Hwang

We extend the binary algorithm invented by Stein and propose novel iterative division algorithms over GF(2^m) for systolic VLSI realization. While algorithm EBg is a basic prototype with guaranteed convergence in at most 2m - 1 iterations, its variants, algorithms EBd and EBdf, are designed for reduced complexity and fixed critical path delay, respectively. We show that algorithms EBd and EBdf can be mapped to parallel-in parallel-out systolic circuits with low area-time complexities of O(m^2 log log m) and O(m^2), respectively. Compared to the systolic designs based on the extended Euclid's algorithm, our circuits exhibit significant speed and area advantages.
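For readers unfamiliar with Stein-style (binary) division, the following is a minimal sketch of the generic textbook binary division algorithm over GF(2^m) in polynomial basis. It is not the paper's EBg, EBd, or EBdf algorithm, whose details the abstract does not give. The function names and the integer encoding (bit i holds the coefficient of x^i) are assumptions made for illustration.

```python
def gf2m_multiply(x, y, f):
    """Multiply x*y modulo the irreducible polynomial f (shift-and-add)."""
    m = f.bit_length() - 1          # field degree
    result = 0
    while y:
        if y & 1:
            result ^= x             # addition in GF(2) is XOR
        y >>= 1
        x <<= 1
        if (x >> m) & 1:            # degree reached m: reduce by f
            x ^= f
    return result


def gf2m_divide(b, a, f):
    """Return b/a in GF(2)[x]/(f) using only shifts, XORs, and comparisons.

    Invariants: g1*a = b*u and g2*a = b*v (mod f). Assumes a and b are
    already reduced (degree < m) and f is irreducible.
    """
    if a == 0:
        raise ZeroDivisionError("division by zero in GF(2^m)")
    u, v = a, f
    g1, g2 = b, 0
    while u != 1 and v != 1:
        while u & 1 == 0:           # x divides u: halve u and g1
            u >>= 1
            g1 = g1 >> 1 if g1 & 1 == 0 else (g1 ^ f) >> 1
        while v & 1 == 0:           # x divides v: halve v and g2
            v >>= 1
            g2 = g2 >> 1 if g2 & 1 == 0 else (g2 ^ f) >> 1
        if u.bit_length() > v.bit_length():
            u ^= v
            g1 ^= g2
        else:
            v ^= u
            g2 ^= g1
    return g1 if u == 1 else g2


# Sanity check in GF(2^8) with f = x^8 + x^4 + x^3 + x + 1 (0x11B):
# multiplying the quotient back by the divisor recovers the dividend.
F = 0x11B
assert all(gf2m_multiply(gf2m_divide(b, a, F), a, F) == b
           for a in range(1, 256) for b in (0, 1, 0x53, 0xFF))
```

Unlike Euclid-style division, this loop never needs a polynomial long division step, which is one reason binary algorithms are attractive for regular hardware arrays.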

International Symposium on Circuits and Systems | 2001

Design of an efficient FFT processor for DAB system

Hsin-Fu Lo; Ming-Der Shieh; Chien-Ming Wu

This paper describes the design of a Fast Fourier Transform (FFT) processor for the Eureka-147 DAB system. We investigate several possible FFT implementations based on the single butterfly architecture, including an in-place memory structure, to minimize the hardware requirement. We also describe a unified approach to partitioning the whole memory into several banks so as to increase the equivalent memory bandwidth between the memory unit and the butterfly unit, which can be implemented in either radix-2 or high-radix arithmetic. Implementation results demonstrate the applicability of our work to the targeted channel demodulator and its advantages over previous solutions.
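The idea of an in-place, single-butterfly FFT with banked memory can be illustrated with a small sketch. The code below is a generic radix-2 decimation-in-time FFT that overwrites its input, combined with the classic two-bank rule "bank = parity of the address bits". It is an assumption chosen for illustration, not the partitioning scheme of the paper, which the abstract does not specify. The assertion inside the loop checks that the two operands of every butterfly fall in different banks, so both could be fetched in the same cycle.

```python
import cmath


def bank_of(addr):
    """Hypothetical two-bank assignment: parity of the address bits."""
    return bin(addr).count("1") & 1


def bit_reverse(i, bits):
    """Reverse the lowest `bits` bits of i."""
    r = 0
    for _ in range(bits):
        r = (r << 1) | (i & 1)
        i >>= 1
    return r


def fft_in_place(x):
    """Radix-2 DIT FFT that overwrites the list x (length a power of two)."""
    n = len(x)
    assert n > 0 and n & (n - 1) == 0, "length must be a power of two"
    bits = n.bit_length() - 1
    for i in range(n):                      # bit-reversal permutation
        j = bit_reverse(i, bits)
        if i < j:
            x[i], x[j] = x[j], x[i]
    span = 1
    while span < n:
        step = cmath.exp(-1j * cmath.pi / span)   # twiddle increment
        for start in range(0, n, 2 * span):
            w = 1
            for k in range(span):
                i, j = start + k, start + k + span
                # i and j differ in exactly one bit, so their parities differ
                assert bank_of(i) != bank_of(j)
                t = w * x[j]
                x[i], x[j] = x[i] + t, x[i] - t   # results reuse the operand slots
                w *= step
        span *= 2
    return x
```

Because each butterfly writes its results back to the addresses it read, a single memory of n words suffices, and the bank rule doubles the usable bandwidth without changing the schedule.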

International Conference on Consumer Electronics | 1999

Design and implementation of a DAB channel decoder

Ming-Der Shieh; Chien-Ming Wu; Hsiao-Hsing Chou; Min-Hui Chen; Chia-Liang Liu

This paper describes the design of the de-interleaver and punctured Viterbi decoder for the Eureka-147 DAB system and their corresponding VLSI implementations. We emphasize how to efficiently handle the four DAB transmission modes, time/frequency de-interleaving, and path metric/survivor memory management in our development. Results show that our implementation is modular, consumes less silicon area, and is readily extended to meet higher transmission rate requirements. The core size of the resulting chip implementation is 4990 × 4930 μm^2 based on the Taiwan Semiconductor Manufacturing Company (TSMC) 0.6 μm single-polysilicon-triple-metal CMOS process.

International Symposium on Circuits and Systems | 2001

Systolic VLSI realization of a novel iterative division algorithm over GF(2^m): a high-speed, low-complexity design

Chien-Hsing Wu; Chien-Ming Wu; Ming-Der Shieh; Yin-Tsung Hwang

We present a parallel-in parallel-out systolic division circuit over GF(2^m) based on the novel extended Stein's algorithm that provides guaranteed convergence in 2m - 1 iterations. The area-time (AT) complexity of our design is O(m^2) and the achievable maximum clock rate is 1 GHz based on the 0.6 μm technology. Compared to the best systolic design known to date based on the extended Euclid's algorithm, the proposed circuit exhibits significant area and speed advantages.

IEEE Transactions on Very Large Scale Integration Systems | 2005

VLSI architectural design tradeoffs for sliding-window log-MAP decoders

Chien-Ming Wu; Ming-Der Shieh; Chien-Hsing Wu; Yin-Tsung Hwang; Jun-Hong Chen

Turbo codes have received tremendous attention and have entered practical use owing to their excellent error-correcting capability. Investigation of efficient iterative decoder realizations is of particular interest because the underlying soft-input soft-output decoding algorithms usually lead to highly complicated implementations. This paper describes the architectural design and analysis of sliding-window (SW) Log-MAP decoders in terms of a set of predetermined parameters. The derived mathematical representations can be applied to construct a variety of VLSI architectures for different applications. Based on our development, a SW-Log-MAP decoder complying with the specification of third-generation mobile radio systems is realized to demonstrate the performance tradeoffs among latency, average decoding rate, area/computation complexity, and memory power consumption. This paper thus provides useful and general information on practical implementation of SW-Log-MAP decoders.

International Symposium on Circuits and Systems | 1999

An area-efficient versatile Reed-Solomon decoder for ADSL

Jin-Chuan Huang; Chien-Ming Wu; Ming-Der Shieh; Chien-Hsing Wu

We present an area-efficient, bit-serial VLSI architecture for the t-error-correcting, (n, k)-scalable Reed-Solomon (RS) decoder in GF(2^M) based on the modified Euclidean algorithm. With its ability to support a variety of (n, k) RS codes, this RS decoder is suitable for applications such as the asymmetric digital subscriber line (ADSL) and cable modems.

International Symposium on Circuits and Systems | 1998

Efficient management of in-place path metric update and its implementation for Viterbi decoders

Ming-Der Shieh; Ming-Hwa Sheu; Chien-Ming Wu; Wann-Shyang Ju

In-place path metric scheduling is known as an efficient approach for sequential processing of the trellis, where the number of add-compare-select (ACS) units or processors is less than the number of states. In this paper, a systematic approach to partitioning a centralized memory into several banks to increase the memory bandwidth for in-place path metric update in Viterbi decoders is presented. Similar concepts can be extended to distribute the memory banks among the ACS units if the ACS units are scheduled correspondingly to keep the interconnection minimal. Implementation results show that, in terms of the trade-off between hardware overhead and required memory bandwidth, the expected performance improvement can be achieved with the proposed technique, especially for trellises with a long constraint length.
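As a rough illustration of what "in-place" means here, the sketch below updates path metrics for a radix-2 shift-register trellis in a single memory array and compares the result with a conventional two-buffer update. It shows only the generic in-place addressing idea (each butterfly writes its two results back to the two locations it read, so state s lives at address rotr(s, t) after t steps), not the bank partitioning proposed in the paper. The trellis convention (next state = 2*prev + input bit, modulo the number of states), the function names, and the `branch_metric` stand-in are all assumptions.

```python
def rotr(s, r, nu):
    """Rotate the nu-bit value s right by r positions."""
    r %= nu
    return ((s >> r) | (s << (nu - r))) & ((1 << nu) - 1)


def branch_metric(t, prev, nxt):
    # Hypothetical stand-in; a real decoder derives this from received symbols.
    return (3 * prev + 5 * nxt + 7 * t) % 11


def step_two_buffers(pm, t, nu):
    """Reference update: read from pm, write into a fresh array."""
    half = 1 << (nu - 1)
    new = [0] * (2 * half)
    for j in range(half):
        for nxt in (2 * j, 2 * j + 1):       # successors of states j and j+half
            new[nxt] = min(pm[j] + branch_metric(t, j, nxt),
                           pm[j + half] + branch_metric(t, j + half, nxt))
    return new


def step_in_place(mem, t, nu):
    """In-place update: each ACS butterfly overwrites the two words it read."""
    half = 1 << (nu - 1)
    for j in range(half):
        a0, a1 = rotr(j, t, nu), rotr(j + half, t, nu)   # where j, j+half live now
        p0, p1 = mem[a0], mem[a1]
        mem[a0] = min(p0 + branch_metric(t, j, 2 * j),
                      p1 + branch_metric(t, j + half, 2 * j))
        mem[a1] = min(p0 + branch_metric(t, j, 2 * j + 1),
                      p1 + branch_metric(t, j + half, 2 * j + 1))


def schedules_agree(nu=4, steps=11):
    """Run both schedules and compare metrics state by state."""
    ref = [(s * s) % 7 for s in range(1 << nu)]   # arbitrary initial metrics
    mem = list(ref)
    for t in range(steps):
        ref = step_two_buffers(ref, t, nu)
        step_in_place(mem, t, nu)
    return all(mem[rotr(s, steps, nu)] == ref[s] for s in range(1 << nu))
```

The in-place version halves the path metric storage at the cost of a stage-dependent address rotation, which is the kind of bookkeeping a bank-partitioned design also has to manage.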

International Symposium on Circuits and Systems | 2003

Implementation of channel demodulator for DAB system

Chien-Ming Wu; Ming-Der Shieh; Hsin-Fu Lo; Min-Hsiung Hu

This paper describes the VLSI implementation of a Fast Fourier Transform (FFT) processor for the Eureka-147 Digital Audio Broadcasting (DAB) system. We emphasize how to minimize the hardware requirement and efficiently manage the memory to meet the DAB requirement. Implementation results demonstrate the applicability of our work: the design is modular, consumes less silicon area, and is readily extended to high transmission rate applications. The core size of the resulting chip implementation is 2086 × 1806 μm^2 based on the TSMC 0.35 μm 1P4M CMOS process. Performance evaluation reveals that our design for the targeted channel demodulator outperforms previous solutions.

International Symposium on Circuits and Systems | 2002

Memory arrangements in turbo decoders using sliding-window BCJR algorithm

Chien-Ming Wu; Ming-Der Shieh; Chien-Hsing Wu

Turbo coding is a powerful encoding and decoding technique that can provide highly reliable data transmission at extremely low signal-to-noise ratios. Because of the computational complexity of the decoding algorithm employed, the realization of turbo decoders usually requires a large amount of memory and incurs a potentially long decoding delay. Therefore, an efficient memory management strategy becomes one of the key factors in successfully designing turbo decoders. This paper focuses on the development of general formulas for efficient memory management of turbo decoders employing the sliding-window BCJR algorithm. Three simple but general results are presented to evaluate the required memory size, throughput rate, and latency based on the speed and the number of adopted processors. The results thus provide useful and general information on practical implementations of turbo decoders.

International Symposium on Circuits and Systems | 2001

VLSI architecture of extended in-place path metric update for Viterbi decoders

Chien-Ming Wu; Ming-Der Shieh; Chien-Hsing Wu; Ming-Hwa Sheu

Efficient memory management is always the key technique for successfully designing Viterbi decoders. In this paper, a novel and efficient in-place scheduling approach to path metric update and its hardware implementation are developed to increase the equivalent memory bandwidth with limited hardware overhead. The resulting architecture has the following characteristics: (I) The whole memory can be systematically partitioned into several sets of banks, and each set can be treated as a local memory of a specific add-compare-select (ACS) unit. (II) The interconnects between the memory banks and ACS units, as well as those between adjacent ACS units, are regular and simple, making the architecture very suitable for VLSI array implementation. Our approach not only provides a methodology for designing high-performance Viterbi decoders, but also exposes the trade-off between hardware requirement and computation time for updating path metrics, especially for convolutional codes with larger memory order.