
Publication


Featured research published by Shao-Yi Chien.


IEEE Transactions on Circuits and Systems for Video Technology | 2002

Efficient moving object segmentation algorithm using background registration technique

Shao-Yi Chien; Shyh-Yih Ma; Liang-Gee Chen

An efficient moving object segmentation algorithm suitable for real-time content-based multimedia communication systems is proposed in this paper. First, a background registration technique is used to construct a reliable background image from the accumulated frame difference information. The moving object region is then separated from the background region by comparing the current frame with the constructed background image. Finally, a post-processing step is applied to the obtained object mask to remove noise regions and to smooth the object boundary. In situations where object shadows appear in the background region, a pre-processing gradient filter is applied to the input image to reduce the shadow effect. To meet the real-time requirement, no computationally intensive operation is included in this method. Moreover, the implementation is optimized using parallel processing, and a processing speed of 25 QCIF fps can be achieved on a personal computer with a 450-MHz Pentium III processor. Good segmentation performance is demonstrated by the simulation results.
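As a rough illustration of the background-registration idea described above (not the paper's actual implementation), the sketch below operates on 1-D integer "frames": pixels whose frame difference stays small for several frames are registered as background, and the object mask is then obtained by comparing the current frame against the registered background. The threshold values, the stability counter, and the function name are all illustrative assumptions.

```python
def segment(frames, diff_thr=10, stable_count=3):
    """Return object masks for a sequence of 1-D frames (lists of ints)."""
    n = len(frames[0])
    background = [None] * n   # registered background value, None = unknown yet
    stable = [0] * n          # how many frames each pixel has stayed static
    prev = frames[0]
    masks = []
    for frame in frames[1:]:
        mask = [0] * n
        for i, (p, q) in enumerate(zip(prev, frame)):
            if abs(p - q) < diff_thr:
                stable[i] += 1
                if stable[i] >= stable_count:
                    background[i] = q          # register static pixel
            else:
                stable[i] = 0
            # compare against the background where known, else the previous frame
            ref = background[i] if background[i] is not None else p
            mask[i] = 1 if abs(q - ref) >= diff_thr else 0
        masks.append(mask)
        prev = frame
    return masks
```

A pixel that moves after the background has been registered is flagged even if it was static in the immediately preceding frame, which is the advantage of registration over plain frame differencing.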


IEEE Transactions on Circuits and Systems for Video Technology | 2006

Analysis and architecture design of an HDTV720p 30 frames/s H.264/AVC encoder

Tung-Chien Chen; Shao-Yi Chien; Yu-Wen Huang; Chen-Han Tsai; Ching-Yeh Chen; To-Wei Chen; Liang-Gee Chen

H.264/AVC significantly outperforms previous video coding standards with many new coding tools. However, the better performance comes at the price of extremely high computational complexity and memory access requirements, which makes it difficult to design a hardwired encoder for real-time applications. In addition, because the essential algorithms of H.264/AVC are complex, sequential, and highly data-dependent, both pipelining and parallel processing are difficult to employ. The hardware utilization and throughput are also decreased by the block-, MB-, and frame-level reconstruction loops. In this paper, we describe our techniques for designing an H.264/AVC video encoder for HDTV applications. At the system level, in consideration of the characteristics of the key components and the reconstruction loops, a four-stage macroblock-pipelined system architecture is first proposed with efficient scheduling and a memory hierarchy. At the module level, the design considerations of the significant modules are addressed, followed by their hardware architectures, including low-bandwidth integer motion estimation, parallel fractional motion estimation, a reconfigurable intra-predictor generator, a dual-buffer block-pipelined entropy coder, and a deblocking filter. With these techniques, a prototype chip of the efficient H.264/AVC encoder is implemented with 922.8 K logic gates and 34.72 KB of SRAM at a 108-MHz operating frequency.


IEEE Transactions on Circuits and Systems | 2006

Analysis and architecture design of variable block-size motion estimation for H.264/AVC

Ching-Yeh Chen; Shao-Yi Chien; Yu-Wen Huang; Tung-Chien Chen; Tu-Chih Wang; Liang-Gee Chen

Variable block-size motion estimation (VBSME) has become an important video coding technique, but it increases the difficulty of hardware design. In this paper, we use inter-/intra-level classification and various data flows to analyze the impact of supporting VBSME in different hardware architectures. Furthermore, we propose two hardware architectures that can support traditional fixed block-size motion estimation as well as VBSME with less chip area overhead than previous approaches. By broadcasting reference pixel rows and propagating partial sums of absolute differences (SADs), the first design has fewer reference pixel registers and a shorter critical path. The second design utilizes a two-dimensional distortion array and one adder tree with a reference buffer that maximizes data reuse between successive search candidates. The first design is suitable for low resolutions or a small search range, while the second has the advantages of supporting a high degree of parallelism and VBSME. Finally, we propose an eight-parallel SAD tree with a shared reference buffer for H.264/AVC integer motion estimation (IME). Its processing ability is eight times that of a single SAD tree, while the reference buffer size is only doubled. Moreover, the most critical issue of H.264 IME, the huge memory bandwidth, is overcome: 99.9% of the off-chip memory bandwidth and 99.22% of the on-chip memory bandwidth are saved. We demonstrate a 720p, 30-fps solution at 108 MHz with a 330.2 K gate count and 208 K bits of on-chip memory.
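The key trick behind propagating partial SADs for VBSME can be sketched in software: the sixteen 4x4 SADs of a 16x16 macroblock are computed once, and every larger block size is then derived purely by summing them rather than revisiting pixels. This is a simplified illustration, not the paper's hardware design; the function names and the restriction to 8x8 and 16x16 merges are assumptions.

```python
def sad4x4_grid(cur, ref):
    """cur, ref: 16x16 grids (lists of lists); returns a 4x4 grid of 4x4-block SADs."""
    grid = [[0] * 4 for _ in range(4)]
    for by in range(4):
        for bx in range(4):
            s = 0
            for y in range(4):
                for x in range(4):
                    s += abs(cur[4 * by + y][4 * bx + x] - ref[4 * by + y][4 * bx + x])
            grid[by][bx] = s
    return grid

def merge_sads(grid):
    """Combine the 4x4 SADs into 8x8 SADs and the 16x16 SAD by summation only."""
    sad8 = [[sum(grid[2 * by + dy][2 * bx + dx] for dy in range(2) for dx in range(2))
             for bx in range(2)] for by in range(2)]
    sad16 = sum(sum(row) for row in sad8)
    return sad8, sad16
```

Because the merge stage touches only 4x4 partial sums, supporting all block sizes costs a small adder tree instead of a per-size pixel pipeline, which is the area saving the abstract refers to.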


IEEE Transactions on Circuits and Systems for Video Technology | 2006

Analysis and complexity reduction of multiple reference frames motion estimation in H.264/AVC

Yu-Wen Huang; Bing-Yu Hsieh; Shao-Yi Chien; Shyh-Yih Ma; Liang-Gee Chen

In the new video coding standard H.264/AVC, motion estimation (ME) is allowed to search multiple reference frames. The required computation therefore increases greatly, in proportion to the number of searched reference frames. However, the reduction in prediction residues depends mostly on the nature of the sequences, not on the number of searched frames. Sometimes the prediction residues can be greatly reduced, but frequently much computation is wasted without any improvement in coding performance. In this paper, we propose a context-based adaptive method to speed up multiple-reference-frame ME. Statistical analysis is first applied to the information available for each macroblock (MB) after intra-prediction and inter-prediction from the previous frame. Context-based adaptive criteria are then derived to determine whether it is necessary to search more reference frames. The reference frame selection criteria are related to the selected MB modes, inter-prediction residues, intra-prediction residues, motion vectors of sub-partitioned blocks, and quantization parameters. Many standard video sequences are tested as examples. The simulation results show that the proposed algorithm maintains essentially the same video quality as an exhaustive search of multiple reference frames, while 76%-96% of the computation for searching unnecessary reference frames is avoided. Moreover, our fast reference frame selection is orthogonal to conventional fast block-matching algorithms, and they can easily be combined for even more efficient implementations.
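The early-termination idea can be illustrated with a toy decision rule: further reference frames are searched only while the best residue so far remains large relative to the intra-prediction residue. The specific ratio test below is a made-up stand-in for the paper's context-based criteria, which combine MB modes, motion vectors, and quantization parameters.

```python
def search_reference_frames(residues, intra_residue, ratio=0.5):
    """residues: per-reference-frame SADs, nearest frame first.
    Stop searching as soon as a frame's residue drops below ratio * intra_residue."""
    best_frame, best_sad = 0, residues[0]
    for idx, sad in enumerate(residues):
        if sad < best_sad:
            best_frame, best_sad = idx, sad
        if sad < ratio * intra_residue:   # good enough: skip the remaining frames
            break
    return best_frame, best_sad
```

With residues [100, 40, 90, 10] and an intra residue of 100, the search stops after the second frame, never evaluating the last two, which mirrors how most of the multi-frame computation can be skipped when the nearby frames already predict well.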


IEEE Transactions on Multimedia | 2004

Fast video segmentation algorithm with shadow cancellation, global motion compensation, and adaptive threshold techniques

Shao-Yi Chien; Yu-Wen Huang; Bing-Yu Hsieh; Shyh-Yih Ma; Liang-Gee Chen

Automatic video segmentation plays an important role in real-time MPEG-4 encoding systems. Several video segmentation algorithms have been proposed; however, most are not suitable for real-time applications because of their high computation load and the many parameters that must be set in advance. This paper presents a fast video segmentation algorithm for MPEG-4 camera systems. With change detection and background registration techniques, this algorithm gives satisfying segmentation results with a low computation load: a processing speed of 40 QCIF frames per second can be achieved on a personal computer with an 800-MHz Pentium III processor. In addition, it has a shadow cancellation mode that can deal with lighting changes and shadow effects. A fast global motion compensation algorithm is also included to make it applicable to slightly moving cameras. Furthermore, the required parameters can be decided automatically, giving the proposed algorithm adaptive threshold capability. It can be integrated into MPEG-4 videophone systems and digital cameras.
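One common way to make a change-detection threshold adaptive is to derive it from the statistics of the frame difference itself, e.g. mean plus a multiple of the standard deviation. The estimator below is only a plausible sketch of that general idea; the paper's actual parameter-decision method may differ.

```python
import math

def adaptive_threshold(frame_a, frame_b, k=2.0):
    """Threshold = mean + k * std of the absolute frame difference (1-D frames)."""
    diffs = [abs(a - b) for a, b in zip(frame_a, frame_b)]
    mean = sum(diffs) / len(diffs)
    var = sum((d - mean) ** 2 for d in diffs) / len(diffs)
    return mean + k * math.sqrt(var)

def change_mask(frame_a, frame_b, k=2.0):
    """Mark pixels whose difference exceeds the automatically derived threshold."""
    thr = adaptive_threshold(frame_a, frame_b, k)
    return [1 if abs(a - b) > thr else 0 for a, b in zip(frame_a, frame_b)]
```

Because the threshold scales with the noise level of the difference image, the same code tolerates both clean and noisy inputs without manual tuning, which is the point of the "adaptive threshold" claim above.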


multimedia signal processing | 2008

Fast image segmentation based on K-Means clustering with histograms in HSV color space

Tse-Wei Chen; Yi-Ling Chen; Shao-Yi Chien

A fast and efficient approach for color image segmentation is proposed. In this work, a new quantization technique for HSV color space is implemented to generate a color histogram and a gray histogram for K-Means clustering, which operates across different dimensions in HSV color space. Compared with traditional K-Means clustering, the initialization of the centroids and the number of clusters are automatically estimated in the proposed method. In addition, a post-processing filter is introduced to effectively eliminate small spatial regions. Experiments show that the proposed segmentation algorithm achieves high computational speed, that salient regions of images can be effectively extracted, and that the segmentation results are close to human perception.
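The speed advantage of clustering on histograms rather than raw pixels can be shown in one dimension: K-Means iterates over (bin value, count) pairs, so each iteration costs O(bins x clusters) regardless of how many pixels contributed to the histogram. This is a generic sketch of that idea, not the paper's HSV quantization or automatic cluster estimation.

```python
def kmeans_histogram(hist, centroids, iters=20):
    """hist: {bin_value: pixel_count}; centroids: initial 1-D centroid list."""
    cents = list(centroids)
    for _ in range(iters):
        sums = [0.0] * len(cents)
        counts = [0] * len(cents)
        for value, count in hist.items():
            # assign the whole bin (all its pixels at once) to the nearest centroid
            j = min(range(len(cents)), key=lambda c: abs(value - cents[c]))
            sums[j] += value * count
            counts[j] += count
        # recompute centroids as count-weighted means of their assigned bins
        cents = [sums[j] / counts[j] if counts[j] else cents[j]
                 for j in range(len(cents))]
    return cents
```

A 256-bin histogram of a megapixel image converges in the same time as one of a thumbnail, which is why histogram-based clustering suits real-time segmentation.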


IEEE Transactions on Circuits and Systems for Video Technology | 2007

Fast Algorithm and Architecture Design of Low-Power Integer Motion Estimation for H.264/AVC

Tung-Chien Chen; Yu-Han Chen; Sung-Fang Tsai; Shao-Yi Chien; Liang-Gee Chen

In an H.264/AVC video encoder, integer motion estimation (IME) accounts for 74.29% of the computational complexity and 77.49% of the memory accesses, making it the most critical component for low-power applications. According to our analysis, an optimal low-power IME engine should be a parallel hardware architecture supporting fast algorithms and efficient data reuse (DR). In this paper, a hardware-oriented fast algorithm is proposed with intra-/inter-candidate DR considerations. In addition, based on the systolic array and 2-D adder tree architecture, a ladder-shaped search window data arrangement and an advanced searching flow are proposed to efficiently support inter-candidate DR and reduce latency cycles. According to the implementation results, 97% of the computational complexity is saved by the proposed fast algorithm, and a further 77.6% of the memory bandwidth is saved with the proposed DR techniques at the architecture level. In the ultra-low-power mode, the power consumption is 2.13 mW for real-time encoding of CIF 30-fps video at a 13.5-MHz operating frequency.


international conference on multimedia and expo | 2008

Content-aware image resizing using perceptual seam carving with human attention model

Daw-Sen Hwang; Shao-Yi Chien

In this paper, a new image resizing technique, perceptual seam carving, is proposed. By incorporating both a face map and a saliency map as the human attention model in the energy function, it preserves perceptually important information when the image is downsized. Moreover, a switching scheme between seam carving and resampling is proposed to avoid excessive distortion of the images. Experiments show that the proposed algorithm generates more desirable resized images than cropping, resampling, and conventional seam carving techniques.
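The seam-carving core that the attention model plugs into is a dynamic-programming search for the cheapest 8-connected top-to-bottom path through an energy map. The sketch below shows that standard seam search on a plain energy grid; in perceptual seam carving the energy would additionally be weighted by the face and saliency maps.

```python
def min_vertical_seam(energy):
    """energy: H x W grid of non-negative costs; returns one column index
    per row for the cheapest 8-connected vertical seam."""
    h, w = len(energy), len(energy[0])
    cost = [row[:] for row in energy]
    # forward pass: accumulate the cheapest way to reach each cell
    for y in range(1, h):
        for x in range(w):
            lo, hi = max(0, x - 1), min(w - 1, x + 1)
            cost[y][x] += min(cost[y - 1][lo:hi + 1])
    # backtrack from the cheapest bottom cell
    x = min(range(w), key=lambda c: cost[h - 1][c])
    seam = [x]
    for y in range(h - 1, 0, -1):
        lo, hi = max(0, x - 1), min(w - 1, x + 1)
        x = min(range(lo, hi + 1), key=lambda c: cost[y - 1][c])
        seam.append(x)
    return seam[::-1]
```

Removing the returned column from each row shrinks the image by one pixel in width while routing the cut through the lowest-energy (least attended) content.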


IEEE Transactions on Circuits and Systems for Video Technology | 2003

Predictive watershed: a fast watershed algorithm for video segmentation

Shao-Yi Chien; Yu-Wen Huang; Liang-Gee Chen

The watershed transform is a key operator in video segmentation algorithms. However, its computation load is too large for real-time applications. In this paper, a new fast watershed algorithm for image sequence segmentation, named P-watershed, is proposed. By utilizing the temporal coherence of the video signal, this algorithm updates watersheds instead of searching for watersheds in every frame, which avoids a great deal of redundant computation. The watershed process can be accelerated, and the segmentation results are almost the same as those of conventional algorithms. Moreover, an intra-inter watershed scheme (IP-watershed) is also proposed to further improve the results. Experimental results show that this algorithm can save 20%-50% of the computation without degrading the segmentation results. The algorithm can be combined with any video segmentation algorithm to give more precise segmentation results. An example is shown by combining a background registration and change-detection-based segmentation algorithm with P-watershed. This new video segmentation algorithm gives accurate object masks with acceptable computation complexity.
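The "update instead of recompute" idea can be sketched abstractly: labels from the previous frame are kept wherever the frame difference is small, and a relabeling routine (here a caller-supplied placeholder standing in for the watershed) is invoked only for the changed pixels. This is a structural illustration on 1-D frames, not the actual P-watershed algorithm.

```python
def predictive_update(prev_labels, prev_frame, frame, thr, relabel):
    """Keep prev_labels where the frame difference is below thr; elsewhere take
    fresh labels from relabel(frame, changed_mask), a stand-in for the watershed."""
    changed = [abs(a - b) >= thr for a, b in zip(prev_frame, frame)]
    new_labels = relabel(frame, changed)   # recompute only where the mask is True
    return [n if c else p
            for p, n, c in zip(prev_labels, new_labels, changed)]
```

When most of the frame is static, `relabel` touches only a small fraction of the pixels, which is where the 20%-50% computation saving comes from.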


IEEE Transactions on Circuits and Systems for Video Technology | 2004

Global elimination algorithm and architecture design for fast block matching motion estimation

Yu-Wen Huang; Shao-Yi Chien; Bing-Yu Hsieh; Liang-Gee Chen

The critical path of the hardware for the global elimination algorithm (GEA) is too long to meet the real-time constraints of high-end applications. In this paper, we propose a new parallel GEA and its corresponding architecture. By dividing the candidate blocks into independent groups and finding the most probable candidates of each group in parallel, instead of sequentially searching the whole search range, the parallel design can be realized as an array of GEA processing elements with a much shorter critical path. In addition, the GEA processing element is optimized to reduce the gate count by 30%, and two-dimensional data reuse is organized to save 80% of the SRAM bandwidth, which also substantially reduces power consumption. Simulation results show that our implementation achieves real-time processing of D1 30-Hz video with a search range of H[-64, +63.5], V[-32, +31.5] at an operating frequency of 70 MHz and a gate count of 113 K. Compared with full search, our gate count is six times smaller at the same frequency, and the PSNR loss is at most 0.1-0.2 dB.
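The elimination principle behind GEA can be sketched in two stages: a cheap, subsampled SAD first ranks all candidate blocks, and the full SAD is evaluated only for the few survivors. The subsampling stride and survivor count below are illustrative assumptions, and real GEA uses group averages rather than this simple subsampling.

```python
def cheap_sad(cur, ref, step=2):
    """Approximate SAD using every step-th pixel (the cheap elimination stage)."""
    return sum(abs(c - r) for c, r in zip(cur[::step], ref[::step]))

def full_sad(cur, ref):
    """Exact SAD over all pixels (the refinement stage)."""
    return sum(abs(c - r) for c, r in zip(cur, ref))

def gea_match(cur, candidates, keep=2):
    """candidates: list of reference blocks (1-D lists, same length as cur).
    Returns the index of the best match among the cheap-SAD survivors."""
    ranked = sorted(range(len(candidates)),
                    key=lambda i: cheap_sad(cur, candidates[i]))
    survivors = ranked[:keep]                 # eliminate everything else
    return min(survivors, key=lambda i: full_sad(cur, candidates[i]))
```

Because most candidates are rejected by the cheap stage, the expensive full-SAD hardware only runs on a handful of blocks per search, and the parallel variant in the paper evaluates the cheap stage for independent candidate groups concurrently.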

Collaboration


Dive into Shao-Yi Chien's collaborations.

Top Co-Authors

Liang-Gee Chen, National Taiwan University
Tse-Wei Chen, National Taiwan University
Yu-Wen Huang, National Taiwan University
You-Ming Tsao, National Taiwan University
Li-Fu Ding, National Taiwan University
Chia-Han Lee, Center for Information Technology
Jui-Hsin Lai, National Taiwan University
Yi-Nung Liu, National Taiwan University
Yung-Lin Huang, National Taiwan University