Publication


Featured research published by Chuan-Yung Tsai.


International Symposium on Circuits and Systems | 2005

Architecture design of H.264/AVC decoder with hybrid task pipelining for high definition videos

To-Wei Chen; Yu-Wen Huang; Tung-Chien Chen; Yu-Han Chen; Chuan-Yung Tsai; Liang-Gee Chen

The most critical issue of an H.264/AVC decoder is the system architecture design with balanced pipelining schedules and proper degrees of parallelism. In this paper, a hybrid task pipelining scheme is first presented to greatly reduce the internal memory size and bandwidth. Block-level, macroblock-level, and macroblock/frame-level pipelining schedules are arranged for CAVLD/IQ/IT/INTRA_PRED, INTER_PRED, and DEBLOCK, respectively. Appropriate degrees of parallelism for each pipeline task are also proposed. Moreover, efficient modules are contributed. The CAVLD unit smoothly decodes the bitstream into symbols without bubble cycles. The INTER_PRED unit highly exploits the data reuse between interpolation windows of neighboring blocks to save 60% of external memory bandwidth. The DEBLOCK unit doubles the processing capability of our previous work with only 35.3% of logic gate count overhead. The proposed baseline profile decoder architecture can support up to 2048 × 1024 30 fps videos with 217 K logic gates, 10 KB SRAMs, and 528.9 MB/s bus bandwidth when operating at 120 MHz.
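To put the quoted operating point in perspective, the following is a minimal sketch of the average cycle budget per macroblock at 2048 × 1024, 30 fps, and 120 MHz. It assumes the standard 16 × 16 macroblock size and ignores pipeline stalls and memory latency; the function names are hypothetical and are not taken from the paper.

```python
# Rough per-macroblock cycle budget for the figures quoted in the abstract.
# Assumes 16x16 macroblocks and an ideal, stall-free pipeline.

MB_SIZE = 16  # luma samples per macroblock side


def macroblocks_per_second(width, height, fps):
    """Number of 16x16 macroblocks that must be decoded each second."""
    return (width // MB_SIZE) * (height // MB_SIZE) * fps


def cycles_per_macroblock(clock_hz, width, height, fps):
    """Average clock cycles available to process one macroblock."""
    return clock_hz / macroblocks_per_second(width, height, fps)


mb_rate = macroblocks_per_second(2048, 1024, 30)
budget = cycles_per_macroblock(120_000_000, 2048, 1024, 30)
print(mb_rate, round(budget, 1))
```

Under these assumptions the decoder has to sustain 245,760 macroblocks per second, which leaves roughly 488 cycles per macroblock on average for all pipeline tasks.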


Midwest Symposium on Circuits and Systems | 2005

Bandwidth optimized motion compensation hardware design for H.264/AVC HDTV decoder

Chuan-Yung Tsai; Tung-Chien Chen; To-Wei Chen; Liang-Gee Chen

Design of H.264/AVC motion compensation (MC) is very challenging because of the high memory bandwidth and low hardware utilization caused by the new functionalities of variable block size and the 6-tap interpolation filter. In this paper, the vertically integrated double Z (VIDZ) schedule, and the interpolation window reuse (IWR) and interpolation window classification (IWC) bandwidth reduction schemes are proposed to keep the MC highly utilized and save 60-80% of memory bandwidth. The proposed MC hardware is implemented at 120 MHz with 47 K logic gates and can support a 2048 × 1024 30 fps H.264/AVC HDTV decoder with less than 200 MB/s memory bandwidth.
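As an illustration of why small blocks combined with a 6-tap filter are costly, the sketch below applies the H.264/AVC half-sample luma filter taps (1, -5, 20, 20, -5, 1) and counts the reference pixels a block needs when filtering is required in both directions. It does not implement the VIDZ, IWR, or IWC schemes; the function names are hypothetical, and 8-bit samples are assumed.

```python
TAPS = (1, -5, 20, 20, -5, 1)  # H.264/AVC half-sample luma filter taps


def half_pel(samples):
    """Interpolate the half-sample position between samples[2] and samples[3]."""
    assert len(samples) == 6
    acc = sum(t * s for t, s in zip(TAPS, samples))
    # Round, divide by 32, and clip to the 8-bit range.
    return min(255, max(0, (acc + 16) >> 5))


def window_pixels(block_w, block_h):
    """Reference pixels fetched for one block: 5 extra rows and columns."""
    return (block_w + 5) * (block_h + 5)


def overhead(block_w, block_h):
    """Ratio of fetched reference pixels to pixels actually predicted."""
    return window_pixels(block_w, block_h) / (block_w * block_h)


print(half_pel([100] * 6))          # flat area stays flat
print(window_pixels(4, 4), overhead(4, 4))
print(window_pixels(16, 16), overhead(16, 16))
```

A 4 × 4 block needs a 9 × 9 window (81 pixels for 16 predicted pixels), whereas a 16 × 16 block needs 21 × 21 (441 pixels for 256). Neighboring small blocks therefore fetch heavily overlapping windows, which is the redundancy that window reuse targets.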


IEEE Transactions on Circuits and Systems II: Express Briefs | 2006

Architecture Design of Context-Based Adaptive Variable-Length Coding for H.264/AVC

Tung-Chien Chen; Yu-Wen Huang; Chuan-Yung Tsai; Bing-Yu Hsieh; Liang-Gee Chen

Context-based adaptive variable-length coding (CAVLC) is a new and important feature of the latest video coding standard, H.264/AVC. The direct VLSI implementation of CAVLC modified from the conventional run-length coding architecture will lead to low throughput and utilization. In this brief, an efficient CAVLC design is proposed. The main concept is the two-stage block pipelining scheme for parallel processing of two 4 × 4 blocks. When one block is processed by the scanning engine to collect the required symbols, its previous block is handled by the coding engine to translate symbols into bitstream. Our dual-block-pipelined architecture doubles the throughput and utilization of CAVLC at high bit rates. Moreover, a zero skipping technique is adopted to reduce up to 90% of cycles at low bit rates. Last but not least, Exp-Golomb coding for other general symbols and bitstream encapsulation for the network abstraction layer are integrated with CAVLC as a complete H.264/AVC baseline profile entropy coder. Simulation shows that our design is capable of real-time processing for 1920 × 1088 30-fps videos with 23.6 K logic gates at 100 MHz.
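For readers unfamiliar with the Exp-Golomb coding mentioned above, here is a minimal software sketch of the unsigned and signed codes as defined for H.264/AVC syntax elements. It only illustrates the code construction and says nothing about the hardware design in the brief; the function names are hypothetical.

```python
def exp_golomb_ue(code_num):
    """Return the unsigned Exp-Golomb codeword for code_num as a bit string."""
    if code_num < 0:
        raise ValueError("code_num must be non-negative")
    info = bin(code_num + 1)[2:]          # binary form of code_num + 1
    return "0" * (len(info) - 1) + info   # zero prefix, then the value bits


def exp_golomb_se(value):
    """Signed variant: map v > 0 to 2v - 1 and v <= 0 to -2v, then encode."""
    code_num = 2 * value - 1 if value > 0 else -2 * value
    return exp_golomb_ue(code_num)


for k in range(4):
    print(k, exp_golomb_ue(k))
```

The first four unsigned codewords are 1, 010, 011, and 00100. Small values get short codes, which suits syntax elements whose values are usually close to zero.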


Symposium on VLSI Circuits | 2007

2.8 to 67.2mW Low-Power and Power-Aware H.264 Encoder for Mobile Applications

Tung-Chien Chen; Yu-Han Chen; Chuan-Yung Tsai; Sung-Fang Tsai; Shao-Yi Chien; Liang-Gee Chen

A 2.8 to 67.2 mW H.264 encoder is implemented on a 12.8 mm² die with 0.18 μm CMOS technology. The proposed parallel architectures along with fast algorithms and data reuse schemes enable 77.9% power savings. The power awareness is provided through a flexible system hierarchy that supports content-aware algorithms and module-wise gated clock.


IEEE Transactions on Circuits and Systems for Video Technology | 2007

Single Reference Frame Multiple Current Macroblocks Scheme for Multiple Reference Frame Motion Estimation in H.264/AVC

Tung-Chien Chen; Chuan-Yung Tsai; Yu-Wen Huang; Liang-Gee Chen

Due to multiple reference frame motion estimation (MRF-ME), an H.264/AVC encoder requires ultrahigh memory bandwidth. The conventional multiple reference frames single current macroblock (MRSC) scheme only considers the data reuse within one frame, and requires on-chip memory size and off-chip memory bandwidth in proportion to the number of reference frames. In this paper, a single reference frame multiple current macroblocks (SRMC) scheme is presented to further exploit the data reuse at frame level. With frame-level rescheduling of the motion estimation (ME) procedures in different reference frames, one loaded search window can be utilized by multiple current MBs in different original frames. The demanded on-chip memory size and off-chip memory bandwidth for MRF-ME can thus be reduced to those supporting only one reference frame. Moreover, based on the SRMC scheme, an architecture prototype with a two-stage mode decision flow is proposed. For HDTV specifications, 62.21 KB (74.8%) of SRAM and 364.3 MB/s (62.6%) of system bandwidth are saved in comparison with the MRSC scheme.


IEEE Transactions on Circuits and Systems for Video Technology | 2009

Algorithm and Architecture Design of Power-Oriented H.264/AVC Baseline Profile Encoder for Portable Devices

Yu-Han Chen; Tung-Chien Chen; Chuan-Yung Tsai; Sung-Fang Tsai; Liang-Gee Chen

Because video services are becoming popular on portable devices, power becomes the primary design issue for video coders nowadays. H.264/AVC is an emerging video coding standard which can provide outstanding coding performance and thus is suitable for mobile applications. In this paper, we target a power-efficient H.264/AVC encoder. The main power consumption in an H.264/AVC encoding system is induced by data access of motion estimation (ME). At first, we propose hardware-oriented algorithms and corresponding parallel architectures of integer ME (IME) and fractional ME (FME) to achieve memory access power reduction. Then, a parameterized encoding system and flexible system architecture are proposed to provide power scalability and hardware efficiency, respectively. Finally, our design is implemented under TSMC 0.18 μm CMOS technology with 12.84 mm² core area. The required hardware resources are 452.8 K logic gates and 16.95 KB SRAMs. The power consumption ranges from 67.2 to 43.5 mW under D1 (720 × 480) 30 frames/s video encoding, and more than 128 operating configurations are provided.


International Symposium on Circuits and Systems | 2006

Low power and power aware fractional motion estimation of H.264/AVC for mobile applications

Tung-Chien Chen; Yu-Han Chen; Chuan-Yung Tsai; Liang-Gee Chen

In this paper, low power design techniques from the algorithm level to the architecture level are proposed for fractional motion estimation in H.264/AVC. The proposed AMPD algorithm can reduce power by 50.8% with up to 0.1 dB quality drop. The proposed parallel architecture with an efficient memory hierarchy can efficiently reuse data and save 61.6% of power. Furthermore, power-aware functionality is included. Our design can gracefully vary the quality degradation over 0.1-3.9 dB in response to a power consumption of 22.58-1.64 mW. This power-oriented design is very efficient for different mobile applications in various power situations.


International Conference on Multimedia and Expo | 2009

Single-iteration full-search fractional motion estimation for quad full HD H.264/AVC encoding

Pei-Kuei Tsung; Wei-Yin Chen; Li-Fu Ding; Chuan-Yung Tsai; Tzu-Der Chuang; Liang-Gee Chen

Fractional motion estimation (FME) is widely used in video compression standards. In H.264/AVC, the precision of the motion vector is down to quarter pixels to improve the coding efficiency. However, FME occupies over 45% of the computation complexity in an H.264 encoder, and this high complexity limits the processing capability. In this paper, a single-iteration full-search FME is proposed. Through algorithm and architecture co-optimization, the bandwidth to the frame buffer is reduced by 31%. Furthermore, 82% of the circuit area for the Hadamard transformation and subtraction is saved compared with the direct implementation. Compared with prior art, the proposed design supports 3.39× higher throughput with only 0.02 dB PSNR drop. Thus, the specification of 4096 × 2160 quad full high definition H.264/AVC FME processing can be achieved.


International Conference on Multimedia and Expo | 2006

Low Power Entropy Coding Hardware Design for H.264/AVC Baseline Profile Encoder

Chuan-Yung Tsai; Tung-Chien Chen; Liang-Gee Chen

Low power hardware design for entropy coding of the H.264/AVC baseline profile encoder is urgently needed for the growing number of mobile applications. However, previous works perform poorly in terms of power. In this paper, the first low power context-based adaptive variable length coding (CAVLC) scheme, named the side information aided (SIA) symbol look ahead (SLA) one-pass CAVLC, is proposed, with the non-zero and abs-one SIA flags. A reconfigurable architecture for the SLA module is also proposed to support the low power CAVLC scheme efficiently. The resultant hardware power is reduced by 69% to only 3.7 mW at 27 MHz and 1.8 V for CIF-sized video coding. The total logic gate count is 27 K gates.


International Conference on Acoustics, Speech, and Signal Processing | 2007

Low Power Cache Algorithm and Architecture Design for Fast Motion Estimation in H.264/AVC Encoder System

Chuan-Yung Tsai; Chen-Han Chung; Yu-Han Chen; Tung-Chien Chen; Liang-Gee Chen

Low power motion estimation (ME) for H.264/AVC is an important research issue because of the growing number of mobile applications of the H.264/AVC encoder. In this paper, a low power cache algorithm and architecture for fast ME in H.264/AVC are proposed to replace the conventional search range (SR) memory. With the block translation (BT) cache architecture, the search trajectory prediction (STP) prefetching algorithm, and the ultra low power cache miss hiding (CMH) strategy, SR memory writing power is reduced by 35% and SR memory static power by 67% for D1 videos. Combining fast ME with the proposed cache provides a total solution for low power ME hardware.

Collaboration


Dive into Chuan-Yung Tsai's collaborations.

Top Co-Authors

Liang-Gee Chen

National Taiwan University

Tung-Chien Chen

National Taiwan University

Yu-Han Chen

National Taiwan University

Yu-Wen Huang

National Taiwan University

To-Wei Chen

National Taiwan University

Sung-Fang Tsai

National Taiwan University

Chen-Han Tsai

National Taiwan University

Chung-Jr Lian

National Taiwan University

Hung-Chi Fang

National Taiwan University

Yu-Ju Lee

National Taiwan University
