Yao-Chang Yang
National Chung Cheng University
Publication
Featured research published by Yao-Chang Yang.
international solid-state circuits conference | 2006
Chien-Chang Lin; Jia-Wei Chen; Hsiu-Cheng Chang; Yao-Chang Yang; Yi-Huan Ou Yang; Ming-Chih Tsai; Jiun-In Guo; Jinn-Shyan Wang
In this paper, a low-cost H.264/AVC video decoder design is presented for high-definition television (HDTV) applications. Through optimization from algorithmic and architectural perspectives, the proposed design achieves real-time H.264 decoding of HD1080 video (1920 × 1088 @ 30 Hz) when operating at 120 MHz with 320 mW power dissipation. Fabricated in the TSMC one-poly six-metal 0.18 μm CMOS technology, the proposed design occupies 2.9 × 2.9 mm² of silicon area, with a hardware complexity of 160K gates and 4.5K bytes of local memory.
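As a rough illustration of what the HD1080 figures quoted above imply, the following sketch works out the average clock-cycle budget per macroblock. It assumes the standard 16 × 16 H.264 macroblock size and a uniform budget across macroblocks; the function name and constants are illustrative and are not taken from the paper.

```python
# Average cycle budget per macroblock for real-time decoding.
# Assumption: 16x16 macroblocks and an evenly spread cycle budget.
WIDTH, HEIGHT = 1920, 1088   # HD1080 frame size quoted in the abstract
FPS = 30                     # frame rate quoted in the abstract
CLOCK_HZ = 120_000_000       # 120 MHz operating frequency
MB_SIZE = 16                 # H.264 macroblock edge length in luma samples


def cycles_per_macroblock(width, height, fps, clock_hz, mb_size=MB_SIZE):
    """Return the average number of clock cycles available per macroblock."""
    mbs_per_frame = (width // mb_size) * (height // mb_size)
    mbs_per_second = mbs_per_frame * fps
    return clock_hz / mbs_per_second


budget = cycles_per_macroblock(WIDTH, HEIGHT, FPS, CLOCK_HZ)
print(f"{budget:.1f} cycles per macroblock")  # prints "490.2 cycles per macroblock"
```

Under these assumptions, the whole decoding pipeline has on the order of 490 cycles per macroblock on average.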
international solid-state circuits conference | 2007
Hsiu-Cheng Chang; Jia-Wei Chen; Ching-Lung Su; Yao-Chang Yang; Yao Li; C.W. Chang; Ze-Min Chen; Wei-Sen Yang; Chien-Chang Lin; Ching-Wen Chen; Jinn-Shyan Wang; Jiun-In Guo
A dynamic quality-scalable H.264 video encoder is presented for power-adaptive video encoding. In 0.13 μm CMOS technology, it requires 470K gates and 13.3 KB of SRAM and consumes 7 mW/183 mW when encoding 30 fps CIF/HD720 video. Compared to the state-of-the-art design for real-time HD720 video encoding, a 49% reduction in gate count and a 61% reduction in internal memory are achieved.
IEEE Transactions on Circuits and Systems for Video Technology | 2009
Yao-Chang Yang; Jiun-In Guo
In this letter, we propose a high-throughput VLSI architecture design for H.264 high-profile context-based adaptive binary arithmetic coding (HP CABAC) decoding for HDTV applications. To speed up the inherently sequential CABAC decoding, we eliminate the bottleneck by proposing a look-ahead decision parsing technique on the grouped context table with cache registers, which reduces the cycle count by 62% on average compared with the original CABAC decoding. In addition, the proposed design supports the macroblock-adaptive frame/field coding tools in H.264 main-profile coding and the 8 × 8 transform in H.264 high-profile coding. It achieves real-time H.264 CABAC decoding up to L4.1 @ 30 frames/s, with a maximum of 60 Mbits/s, when operating at 105 MHz.
international conference on multimedia and expo | 2006
Yao-Chang Yang; Chien-Chang Lin; Hsiu-Cheng Chang; Ching-Lung Su; Jiun-In Guo
In this paper, we present a high-throughput VLSI architecture design for context-based adaptive binary arithmetic decoding (CABAD) in MPEG-4 AVC/H.264. To speed up the inherently sequential operations in CABAD, we break down the processing bottleneck by proposing a look-ahead codeword parsing technique on the segmented context tables with cache registers, which reduces the cycle count by up to 53% on average. Based on a 0.18 μm CMOS technology, the proposed design outperforms the existing design by reducing hardware cost by 40% while achieving about 1.6 times the data throughput.
signal processing systems | 2007
C.W. Chang; Jia-Wei Chen; Hsiu-Cheng Chang; Yao-Chang Yang; Jinn-Shyan Wang; Jiun-In Guo
In this paper, we propose a quality-scalable H.264/AVC baseline intra encoder with two hardware sharing mechanisms and three timing optimization schemes. The proposed hardware sharing schemes share the common terms among the intra prediction of different modes to reduce the hardware cost. The proposed timing optimization schemes are used to improve the data throughput rate. The proposed design supports clock rates of 26/33/47 MHz and 70/85 MHz to encode SD and HD720 video sequences, respectively, at 30 fps with different quality levels. Based on a 0.13 μm CMOS technology, the proposed design costs 170K gates and 4.43 KB of internal SRAM at a clock rate of 130 MHz.
international conference on multimedia and expo | 2010
Chih-Chuan Yang; Kheng-Joo Tan; Yao-Chang Yang; Jiun-In Guo
Fractional motion estimation (FME) searches subpixel positions for blocks of various sizes to find the best matching candidate, which improves the compression efficiency but leads to high computational complexity. In this paper, we propose a low-complexity FME design supporting adaptive mode selection (AMS). Exploiting both the mode correlation among neighboring macroblocks and the integer motion estimation result, we can select FME modes adaptively to reduce complexity while maintaining good video quality. The simulation results show that the average number of modes processed in FME is reduced to 2 for HD1080 video, instead of the 7 modes processed in the JM reference software. With a tiny PSNR drop (~0.026 dB), the proposed design achieves a 71% reduction in computational complexity compared to JM14.2. At an operating frequency of 300 MHz, the proposed design can support real-time processing of H.264 videos with resolutions up to 4K × 2K.
asia pacific conference on circuits and systems | 2006
Ching-Lung Su; Wei-Sen Yang; Ya-Li Chen; Yao-Chang Yang; Ching-Wen Chen; Jiun-In Guo; Shau-Yin Tseng
In this paper, a low-complexity, high-quality motion estimation architecture design is proposed for MPEG-4 AVC/H.264 video coding applications. The proposed design is based on a low-complexity algorithm that reduces complexity by over 90% at the cost of 0.06968 dB and 0.08296 dB PSNR drops, compared to the JM9.3 full search with a ±32 search range, at CIF and D1 formats, respectively. In addition, the algorithm provides a scalable search range. We have also exploited an on-chip memory rotation scheme and a configurable sum-of-absolute-differences processor to reduce the on-chip memory bandwidth and the hardware cost. Based on the TSMC 0.18 μm CMOS technology, the proposed design costs 47.9K gates, 4 Kbits of Cur./Ref. pixel buffer, and 22 Kbits of SRAM, with a maximum working frequency of 125 MHz. The proposed design can achieve real-time motion estimation on D1 video and HD720 video when operating at 40 MHz and 105 MHz, respectively.
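For readers unfamiliar with the baseline mentioned above, the following is a minimal sketch of exhaustive (full-search) block matching using the sum of absolute differences (SAD) as the cost. It illustrates the kind of reference search the abstract compares against, not the paper's low-complexity algorithm or its hardware architecture. The function names, the tiny 8 × 8 test frames, and the 4 × 4 block size are all assumptions made for illustration.

```python
def sad(cur, ref, bx, by, rx, ry, n):
    """Sum of absolute differences between the n x n block of `cur` at
    (bx, by) and the n x n block of `ref` at (rx, ry)."""
    return sum(
        abs(cur[by + j][bx + i] - ref[ry + j][rx + i])
        for j in range(n)
        for i in range(n)
    )


def full_search(cur, ref, bx, by, n, search_range):
    """Test every displacement within +/- search_range and return
    (best_sad, dx, dy); candidates falling outside `ref` are skipped."""
    height, width = len(ref), len(ref[0])
    best = None
    for dy in range(-search_range, search_range + 1):
        for dx in range(-search_range, search_range + 1):
            rx, ry = bx + dx, by + dy
            if rx < 0 or ry < 0 or rx + n > width or ry + n > height:
                continue
            cost = sad(cur, ref, bx, by, rx, ry, n)
            if best is None or cost < best[0]:
                best = (cost, dx, dy)
    return best


# Hypothetical test data: an 8x8 reference frame with unique pixel values,
# and a current frame whose content is the reference shifted by (2, 1).
ref = [[16 * y + x for x in range(8)] for y in range(8)]
cur = [[ref[min(y + 1, 7)][min(x + 2, 7)] for x in range(8)] for y in range(8)]

result = full_search(cur, ref, bx=2, by=2, n=4, search_range=2)
print(result)  # prints (0, 2, 1)
```

The cost of this exhaustive search grows with the square of the search range, which is why reduced-complexity algorithms such as the one described in the abstract are attractive for hardware.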
asia and south pacific design automation conference | 2011
Cheng-An Chien; Yao-Chang Yang; Hsiu-Cheng Chang; Jia-Wei Chen; Cheng-Yen Chang; Jiun-In Guo; Jinn-Shyan Wang; Ching-Hwa Cheng
This paper proposes a dual-mode video decoder with 4-level temporal/spatial scalability and 32/64-bit adjustable memory bus width. A design automation environment for simulation and verification is established to automatically verify the correctness and completeness of the proposed design. Using a 0.13 μm CMOS technology, it comprises 439K gates and 10.9 KB of SRAM and consumes 2~328 mW in decoding CIF~HD1080 videos at 3.75~30 fps when operating at 1~150 MHz, respectively.
international soc design conference | 2008
Jui-Chin Chu; Liang-Fei Su; Yao-Chang Yang; Jiun-In Guo; Ching-Lung Su
In this paper, a low-cost, low-power multi-mode entropy decoder is proposed. The proposed design is compatible with entropy decoding for the JPEG, MPEG-1/2/4, H.264, and VC-1 coding standards. It adopts codeword table merging and sharing, and integrates the various entropy decoding schemes into a single programmable design. To reduce the required memory space, a generic look-up table partition strategy covering the various video coding standards is proposed. In addition, the low-power concept of giving high-probability data paths lower capacitance is taken into account. The proposed multi-mode entropy decoder is implemented in a TSMC 0.13 μm process at the cost of 113,884 gates and 0.54 KB of SRAM. Its maximum operating frequency reaches 166 MHz, which can support entropy decoding of high-definition video beyond HD1080 @ 48 fps.
asian solid state circuits conference | 2009
Cheng-An Chien; Yao-Chang Yang; Hsiu-Cheng Chang; Jiun-In Guo; Jia-Wei Chen; Jinn-Shyan Wang; Chin-Hsien Wang; Hsiang-Hui Huang; Ching-Hwa Cheng
The first dual mode video decoder with 4-level temporal/spatial scalability and 32/64-bit adjustable memory bus width is proposed. A design automation environment of simulation and verification is established to automatically verify the correctness and completeness of the proposed design. Using a 0.13 μm CMOS technology, it comprises 439Kgates/10.9KB SRAM and consumes 2~328mW in decoding CIF~HD1080 videos at 3.75~30fps when operating at 1~150MHz, respectively.