Shiann Rong Kuang
National Cheng Kung University
Publication
Featured research published by Shiann Rong Kuang.
IEEE Transactions on Circuits and Systems II: Analog and Digital Signal Processing | 1999
Jer Min Jou; Shiann Rong Kuang; Ren Der Chen
In this work, two designs of low-error fixed-width multipliers are presented for digital signal processing applications: a sign-magnitude parallel multiplier and a two's-complement parallel multiplier. Given two n-bit inputs, the fixed-width multipliers generate n-bit (instead of 2n-bit) products with low product error, while using only about half the area and less delay than a standard parallel multiplier. Each design includes a cost-effective carry-generating circuit so that the product is generated more accurately and quickly. Applying the same approach, a low-error reduced-width multiplier with an output bit-width between n and 2n has also been designed. Experimental results show that the proposed fixed-width and reduced-width multipliers have lower error than all other fixed-width multipliers and are still cost effective. Due to these properties, they are well suited to many multimedia and digital signal processing applications such as digital filtering, arithmetic coding, wavelet transformation, and echo cancellation.
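The trade-off behind a fixed-width multiplier can be illustrated with a small sketch. The code below is not the carry-generating circuit from the paper; it is a generic model, assuming unsigned operands, that drops every partial-product bit below weight 2^n and optionally adds a constant bias as a crude stand-in for a compensation circuit. The function names and the `bias` parameter are hypothetical.

```python
def exact_high(a: int, b: int, n: int) -> int:
    """Upper n bits of the full 2n-bit product (the ideal fixed-width result)."""
    return (a * b) >> n


def truncated_high(a: int, b: int, n: int, bias: int = 0) -> int:
    """Fixed-width product that sums only partial-product bits of weight >= 2**n.

    `bias` is a hypothetical constant compensation term, not the paper's
    carry-generating circuit.
    """
    total = 0
    for i in range(n):
        if not (a >> i) & 1:
            continue
        for j in range(n):
            # Keep the bit a_i * b_j only if it lies in the upper half.
            if (b >> j) & 1 and i + j >= n:
                total += 1 << (i + j)
    return (total >> n) + bias


def error_stats(n: int, bias: int = 0) -> tuple[int, float]:
    """Return (maximum absolute error, mean signed error) over all n-bit inputs."""
    errors = [
        exact_high(a, b, n) - truncated_high(a, b, n, bias)
        for a in range(1 << n)
        for b in range(1 << n)
    ]
    return max(abs(e) for e in errors), sum(errors) / len(errors)


if __name__ == "__main__":
    for bias in (0, 1):
        worst, mean = error_stats(8, bias)
        print(f"bias={bias}: max |error| = {worst}, mean error = {mean:.3f}")
```

Without compensation the truncated result never exceeds the exact one, so the error is one-sided. A compensation term, whether constant as here or input-dependent as in a carry-generating circuit, aims to recentre and shrink that error.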
IEEE Transactions on Circuits and Systems II: Analog and Digital Signal Processing | 2001
Shiann Rong Kuang; Jer Min Jou; Ren Der Chen; Yeu Horng Shiau
Arithmetic coding is an attractive technique for lossless data compression, but it tends to be slow. In this paper, a high-performance dynamic pipelined very large scale integration architecture for on-line adaptive binary arithmetic coding is presented. To obtain a high-throughput pipelined architecture, we first analyze the computation flow of the coding algorithm and modify the operations whose data and/or control dependencies make pipelining difficult. Then, a novel technique called dynamic pipelining is developed to pipeline the coding process efficiently with variant (or run-time determined) pipeline latencies (or data initialization intervals). For the data path design, a systematic design methodology of high-level synthesis and a lower-area but faster fixed-width multiplier are applied, which implement the architecture with little additional hardware. The dynamic pipelined architecture has been designed and simulated in Verilog HDL, and its layout has also been implemented with the 0.8-μm SPDM CMOS process and the ITRI-CCL cell library. Its simulated compression speeds at working frequencies of 25 and 50 MHz are about 6 and 12.5 Mb/s, respectively. A speedup of about two times is achieved, with 30% hardware overhead relative to the original sequential realisation.
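For readers unfamiliar with adaptive binary arithmetic coding, the following minimal sketch shows the interval-narrowing idea and why each step depends on the previous one, which is the kind of dependency that makes pipelining hard. It is not the algorithm or architecture from the paper: it assumes exact rational arithmetic and a simple count-based probability model, and the function names are hypothetical.

```python
from fractions import Fraction


def encode(bits: list[int]) -> tuple[Fraction, Fraction]:
    """Narrow [0, 1) once per bit; return (low, width) of the final interval."""
    low, width = Fraction(0), Fraction(1)
    counts = [1, 1]  # adaptive model: occurrences of 0 and 1 seen so far
    for bit in bits:
        p0 = Fraction(counts[0], counts[0] + counts[1])
        if bit == 0:
            width *= p0               # keep the lower sub-interval
        else:
            low += width * p0         # skip past the sub-interval for 0
            width *= 1 - p0
        counts[bit] += 1              # the model update feeds the next step
    return low, width


def decode(value: Fraction, n: int) -> list[int]:
    """Recover n bits from any value inside the final interval."""
    low, width = Fraction(0), Fraction(1)
    counts = [1, 1]
    out = []
    for _ in range(n):
        p0 = Fraction(counts[0], counts[0] + counts[1])
        split = low + width * p0
        if value < split:
            bit = 0
            width *= p0
        else:
            bit = 1
            low = split
            width *= 1 - p0
        counts[bit] += 1
        out.append(bit)
    return out


if __name__ == "__main__":
    message = [0, 0, 1, 0, 0, 0, 1, 0]
    low, width = encode(message)
    print("interval:", low, "to", low + width)
    print("round trip ok:", decode(low, len(message)) == message)
```

A practical coder replaces the exact fractions with fixed-precision integers and renormalization, and each interval update involves a multiplication, which is where a fast multiplier becomes relevant.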
IEEE Transactions on Circuits and Systems I: Regular Papers | 2002
Jer Min Jou; Yeu Horng Shiau; Pei Y. Chen; Shiann Rong Kuang
Taking advantage of the prediction ability provided by Gray system theory, the Gray prediction search (GrPS) algorithm can determine the motion vectors of image blocks correctly and quickly. A dedicated GrPS chip, which is low in cost and has regular-data-flow computations, is proposed in this paper to support MPEG video resolution in real time. With 0.6-μm CMOS technology, the proposed chip needs a die size of 2.8 × 2.9 mm² with about 54K transistors, and can work at a clock rate of 66 MHz. Since GrPS performs better than other fast search algorithms, such as TSS, CS, PHODS, FSS, and SES, this low-cost GrPS chip is a good candidate for real-time motion estimation.
IEEE Transactions on Circuits and Systems I: Regular Papers | 1999
Jer Min Jou; Shiann Rong Kuang; Ren Der Chen
Color correction, which nonlinearly converts the color coordinates of a scanner into those of a printer, is important and difficult for multimedia applications. An efficient tree-based fuzzy logic approach has been proposed to solve the problem. However, its algorithm design and implementation are slow, costly, and of poor color quality, due to the use of the maximum criterion for defuzzification. In this paper, a new fuzzy tree inference algorithm for color correction is proposed. It is simple and efficient, and it is suited to the center-average method of defuzzification, which gives better color quality. As a result, a fast and cost-efficient implementation with good correction effects for tree-based fuzzy color correction is achieved.
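The difference between the two defuzzification methods named above can be sketched in a few lines. This is a generic illustration of the textbook definitions, not the paper's tree inference algorithm; the rule strengths and output centers are made-up values, and the function names are hypothetical.

```python
def max_criterion(strengths: list[float], centers: list[float]) -> float:
    """Return the center of the single rule with the highest firing strength."""
    best = max(range(len(strengths)), key=lambda i: strengths[i])
    return centers[best]


def center_average(strengths: list[float], centers: list[float]) -> float:
    """Return the average of the rule centers, weighted by firing strength."""
    total = sum(strengths)
    if total == 0:
        raise ValueError("no rule fired")
    return sum(w * c for w, c in zip(strengths, centers)) / total


if __name__ == "__main__":
    # Hypothetical example: two rules fire for one input color.
    strengths = [0.25, 0.75]
    centers = [0.0, 100.0]
    print("maximum criterion:", max_criterion(strengths, centers))
    print("center average:  ", center_average(strengths, centers))
```

The maximum criterion jumps between rule outputs as the strongest rule changes, while the center average blends them, which is one plausible reason it yields smoother color output.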
International Symposium on Circuits and Systems | 2002
Jer Min Jou; Shiann Rong Kuang; Kuang Ming Wu
A key aspect of an IP core's marketability is its ability to be easily integrated across a wide variety of interfaces. In this paper, we propose an efficient hierarchical interface design methodology and models so that a designer can quickly design an IP core's interface, which can be easily integrated into any interface/bus architecture. The proposed methodology and models have been applied to design an MP3 decoder with two different interfaces: an ISA bus interface and a PCI bus interface. The results demonstrate that the methodology and models allow easy IP integration with only a small performance overhead.
IEEE Transactions on Very Large Scale Integration Systems | 2002
Jer Min Jou; Shiann Rong Kuang; Yeu Horng Shiau; Ren Der Chen
Color correction, which nonlinearly converts the color coordinates of an input device such as a scanner or digital camera into those of an output device such as a color laser printer, is important for multimedia applications. In this work, we present a novel dynamic pipelined VLSI architecture for the fuzzy color correction (FCC) algorithm proposed by Jou et al. (see IEEE Trans. Circuits Syst. I, vol.46, p.773-775, June 1998) to meet the speed requirements of time-critical applications. To improve performance, the architecture is dynamically pipelined with unfixed, run-time determined latencies (or data initiation intervals), and speculation is also applied. Together these efficiently solve the difficulty of pipelining caused by the variable execution time of each iteration, as well as the slow execution of FCC. For the data path design, a systematic design methodology of high-level synthesis is used. As a result, the dynamic pipelined architecture achieves a significant speedup (about 2 times) with a slight hardware overhead relative to the sequential one.
International Symposium on Circuits and Systems | 1996
Jer Min Jou; Shiann Rong Kuang; Yuh Lin Chen; Chung Yuan Chiang
This paper describes the design and implementation of a CMOS VLSI chip for data compression and decompression using adaptive binary arithmetic codes. During the design process, the systematic design methodology of high-level synthesis is applied to reach a sound compromise between minimal hardware resources and maximal processing speed. The chip implements a new flexible modeler that estimates the probabilities of binary symbols efficiently using a table-look-up approach with 1024 bytes of SRAM and 288 bytes of ROM. An asynchronous interface circuit is designed for the chip's I/O communication, so that I/O and compression operations can proceed in parallel. Design for testability is applied, and a full scan is implemented in the chip. A prototype 0.8-μm chip has been designed, verified, and fabricated by CIC; it occupies 4.2 × 4.5 mm² of silicon area. The chip achieves a compression and decompression rate of 3 Mbits/sec at a clock rate of 25 MHz.
Electronics Letters | 1997
Jer Min Jou; Shiann Rong Kuang
IEEE Transactions on Circuits and Systems I: Regular Papers | 1998
Shiann Rong Kuang; Jer Min Jou; Yuh Lin Chen
Electronics Letters | 2006
Shiann Rong Kuang; J.-P. Wang