Zong-Yi Chen | Researchain

Publication

Featured research published by Zong-Yi Chen.

International Conference on Consumer Electronics | 2011

Low power architecture design and hardware implementations of deblocking filter in H.264/AVC

Hua-Chang Chung; Zong-Yi Chen; Pao-Chi Chang

An adaptive in-loop deblocking filter (DF) is standardized in H.264/AVC to reduce blocking artifacts and improve compression efficiency. This paper proposes a low-power DF architecture with a hybrid, intelligent edge-skip filtering order. A four-stage pipeline speeds up the DF process, and the proposed Horizontal Edge Skip Processing Architecture (HESPA) provides an edge-skip-aware mechanism for filtering horizontal edges that both reduces power consumption and cuts the filtering process to 100 clock cycles per macroblock (MB). In addition, the architecture uses its buffers efficiently to store temporary data without violating the standard-defined data dependency: a suitable edge filtering order increases the reuse of intermediate data, which improves system throughput and further reduces power consumption. Simulation results show that more than 34% of the logic power measured on an FPGA is saved when HESPA is enabled. Furthermore, the proposed architecture is implemented with a 0.18 μm standard cell library and uses 19.8K gates at a clock frequency of 200 MHz, which compares competitively with other state-of-the-art designs in terms of hardware cost.
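As a rough sanity check on the figures quoted in this abstract, the throughput implied by 100 clock cycles per macroblock at 200 MHz can be worked out directly. This is only an upper-bound sketch: the 1920x1080 frame size and the helper names are assumptions for illustration and do not come from the paper.

```python
# Figures quoted in the abstract.
CLOCK_HZ = 200_000_000      # 200 MHz clock frequency
CYCLES_PER_MB = 100         # filtering cycles per macroblock

MB_SIZE = 16                # an H.264/AVC macroblock covers 16x16 luma samples


def ceil_div(a, b):
    """Integer ceiling division."""
    return -(-a // b)


def macroblocks_per_frame(width, height):
    """Number of macroblocks needed to cover a frame (partial blocks are padded)."""
    return ceil_div(width, MB_SIZE) * ceil_div(height, MB_SIZE)


def max_frames_per_second(width, height):
    """Upper bound on the deblocking frame rate if every MB takes CYCLES_PER_MB cycles."""
    mb_per_second = CLOCK_HZ / CYCLES_PER_MB
    return mb_per_second / macroblocks_per_frame(width, height)


if __name__ == "__main__":
    # Assumed example resolution (not taken from the paper).
    print(macroblocks_per_frame(1920, 1080))
    print(round(max_frames_per_second(1920, 1080), 1))
```

Under these assumptions a 1920x1080 frame needs 8160 macroblocks, so the filter alone would not be the bottleneck for real-time high-definition video.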

International Symposium on Circuits and Systems | 2009

An H.264 spatio-temporal hierarchical fast motion estimation algorithm for high-definition video

Yu-Shin Cheng; Zong-Yi Chen; Pao-Chi Chang

In H.264 video coding, the computational complexity is much higher than in previous video coding standards because of the variable block sizes and multiple reference frames used in the motion compensation process. This paper proposes a hierarchical H.264 fast motion estimation algorithm that decreases coding complexity in both the spatial and temporal domains when encoding high-definition video. In the spatial domain, a fast search method with a hierarchical subsampling structure decreases the memory access bandwidth of the search points. In the temporal domain, a linear motion model further reduces the search ranges of the multiple reference frames. The search algorithm is particularly suitable for implementation in a parallel-processing architecture with limited hardware resources. Simulation results show that the proposed algorithm reduces the computational complexity of Full Search in JM by up to 98.2%, with less than 0.1 dB of video quality degradation.
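The temporal-domain idea can be illustrated with a minimal sketch. It assumes that "linear motion model" means constant-velocity motion, so a motion vector found for the nearest reference frame is scaled by the temporal distance to predict the search centre in farther reference frames. The function names and the search radius are hypothetical and are not taken from the paper.

```python
def predict_mv(mv_nearest, ref_distance):
    """Scale the motion vector found for the nearest reference frame
    (distance 1) to a reference frame `ref_distance` frames away,
    assuming constant-velocity (linear) motion."""
    dx, dy = mv_nearest
    return (dx * ref_distance, dy * ref_distance)


def reduced_search_window(mv_nearest, ref_distance, radius=2):
    """Return (x_min, x_max, y_min, y_max) of a small window centred on the
    predicted vector, instead of a full search range around (0, 0).
    `radius` is an assumed tuning parameter."""
    cx, cy = predict_mv(mv_nearest, ref_distance)
    return (cx - radius, cx + radius, cy - radius, cy + radius)


if __name__ == "__main__":
    # A block moved by (3, -1) relative to the previous frame; look two frames back.
    print(reduced_search_window((3, -1), ref_distance=2))
```

With a radius of 2 the window holds 25 candidate positions, compared with 1089 for a full search over a range of ±16, which is the kind of saving a reduced search range aims for.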

Journal of Visual Communication and Image Representation | 2017

Rough mode cost-based fast intra coding for high-efficiency video coding

Zong-Yi Chen; Pao-Chi Chang

A fast intra coding algorithm based on rough mode cost (RMC) is proposed for HEVC. Image complexity and gradient are considered. RMC values are used for fast CU depth decisions, and the RMC ratio for fast PU intra mode decisions. Fast TU depth decisions using the TU partition of the mode with the least RMC outperform the default method.

The quadtree-based coding unit (CU) and transform unit (TU) structure, as well as the various prediction units (PUs) of HEVC, considerably increase encoding complexity in intra coding and inter coding. This paper proposes a rough mode cost (RMC)-based algorithm for accelerating CU/TU depth decisions and PU mode decisions in HEVC intra coding. For CU depth decisions, RMC values are used for the fast determination of the CU partition. For PU mode decisions, modes with higher RMCs are removed from the candidate list to reduce the number of modes tested. For TU depth decisions, the TU partition of the mode with the least RMC is used to determine the TU partitions of the remaining modes. The proposed TU partitioning method outperforms the default method in the reference software. The proposed algorithm reduces encoding time by approximately 51% on average, with a 0.69% increase in the Bjøntegaard-Delta (BD) rate.
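The PU mode pruning step described above can be sketched as follows. The abstract does not give the exact criterion, so this example assumes a simple ratio test against the smallest RMC; the function name, the threshold value, and the sample costs are all hypothetical.

```python
def prune_modes_by_rmc(rmc_by_mode, ratio_threshold=1.2):
    """Keep only intra modes whose rough mode cost (RMC) is within
    `ratio_threshold` times the smallest RMC, ordered by increasing cost.

    rmc_by_mode: dict mapping intra mode index -> RMC value.
    ratio_threshold: assumed tuning parameter, not a value from the paper.
    """
    if not rmc_by_mode:
        return []
    best = min(rmc_by_mode.values())
    limit = best * ratio_threshold
    kept = [mode for mode, cost in rmc_by_mode.items() if cost <= limit]
    return sorted(kept, key=lambda mode: rmc_by_mode[mode])


if __name__ == "__main__":
    # Made-up costs for four candidate modes.
    candidates = {0: 100.0, 1: 110.0, 10: 119.0, 26: 150.0}
    print(prune_modes_by_rmc(candidates))
```

Only the surviving modes would then go through the full rate-distortion check, which is where the encoding time is saved.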

Data Compression Conference | 2015

SVM-Based Fast Intra CU Depth Decision for HEVC

Yen-Chun Liu; Zong-Yi Chen; Jiunn-Tsair Fang; Pao-Chi Chang

In this paper, a fast CU depth decision algorithm based on support vector machines (SVMs) is proposed to reduce the computational complexity of HEVC intra coding. Applying SVMs provides a systematic way to develop the criteria for early CU splitting and termination. Appropriate features for training the SVM models are extracted from the spatial domain and the pixel domain. An artificial neural network is used to analyze the impact of each extracted feature on the CU size decision, and different weights are assigned to the outputs of the SVMs. The experimental results show that the proposed fast algorithm saves up to 58.9% of encoding time, and 46.5% on average, compared with HM 12.1.

International Conference on Multimedia and Expo | 2010

Fast inter-layer motion estimation algorithm on spatial scalability in H.264/AVC scalable extension

Zong-Yi Chen; Jhe-Wei Syu; Pao-Chi Chang

This paper proposes a fast inter-layer motion estimation algorithm for spatial scalability in the scalable video coding extension of H.264/AVC. In enhancement-layer motion estimation, the relation between two motion vector predictors, one from the base layer and one from the enhancement layer, is used to reduce the number of searches. Additionally, the mode correlations of motion estimation in the temporal direction are used to save more encoding time. The simulation results show that the proposed algorithm reduces computation time by up to 67.4% compared with JSVM 9.12, with less than 0.0476 dB of video quality degradation.

Archive | 2014

An Efficient Fast CU Depth and PU Mode Decision Algorithm for HEVC

Zong-Yi Chen; He-Yan Chen; Pao-Chi Chang

High Efficiency Video Coding (HEVC) is a new video coding standard that improves coding efficiency significantly. To achieve the best performance, the HEVC encoder evaluates all possible candidates to determine the best coding unit (CU) depth and prediction unit (PU) mode. This substantially increases computational complexity, which might become an obstacle for practical applications. This paper proposes a fast CU and PU decision algorithm to reduce the encoding time of HEVC. By referring to the spatial and temporal depth information of the CU and the motion/texture characteristics of the PU, the proposed algorithm skips rarely used depths and modes in certain situations. The experimental results show that the proposed method achieves, on average, a 57% time saving in the high efficiency configuration and 61% in the low complexity configuration, with negligible rate-distortion loss compared with the reference software.

Journal of Visual Communication and Image Representation | 2016

Computational complexity allocation and control for inter-coding of high efficiency video coding with fast coding unit split decision

Jiunn-Tsair Fang; Zong-Yi Chen; Chang-Rui Lai; Pao-Chi Chang

A computational complexity allocation and control method for the low-delay P-frame configuration of the HEVC encoder. The complexity allocation includes the group of pictures layer, the frame layer, and the CU layer in the HEVC encoder. Motion vector estimation information is applied for CU complexity allocation and depth split determination.

HEVC provides the quadtree structure of the coding unit (CU) with four coding-tree depths to facilitate high coding efficiency. However, compared with previous standards, the HEVC encoder increases computational complexity considerably, making it unsuitable for applications on power-constrained devices. This study therefore proposes a computational complexity allocation and control method for the low-delay P-frame configuration of the HEVC encoder. The complexity allocation covers the group of pictures (GOP) layer, the frame layer, and the CU layer in the HEVC encoder. Each layer uses its own method to distribute the complexity. In particular, motion vector estimation information is applied for CU complexity allocation and depth split determination. The total computational complexity can thus be reduced to 80%, 60%, or even lower. Experimental results showed that the average BD-PSNR decreased by approximately 0.1 dB and the BD-bitrate increased by 2% when the target complexity was reduced to 60%.

International Conference on Image Processing | 2008

Coding-gain-based complexity control for H.264 video encoder

Ming-Chen Chien; Zong-Yi Chen; Pao-Chi Chang

The allowable computational complexity of video encoding is limited in a power-constrained system. Different video frames are associated with different motions and contexts, and so with different computational complexities if no complexity control is used. Variation in computational complexity leads to encoding delay jitter. Typically, motion estimation (ME) consumes much more computation than other encoding tools. This work proposes a practical complexity control method based on a complexity analysis of an H.264 video encoder that determines the coding gain of each encoding tool in the encoder. Experiments performed on programming-optimized source code show that the computational complexity associated with each frame is kept below a given limit, with very little R-D performance degradation under a reasonable constraint compared with the unconstrained case.

Journal of Electronic Imaging | 2014

Computation reduction in high-efficiency video coding based on the similarity of transform unit blocks

Zong-Yi Chen; Jiunn-Tsair Fang; Chung-Shian Chiang; Pao-Chi Chang

The new video coding standard, high-efficiency video coding, adopts a quadtree structure to provide variable transform sizes in the transform coding process. The heuristic examination of transform unit (TU) modes substantially increases the computational complexity compared to previous video coding standards. Thus, efficiently reducing the TU candidate modes is crucial. In the proposed similarity-check scheme, sub-TU blocks are categorized into a strongly similar case or a weakly similar case, and the early TU termination or early TU splitting procedure is performed. For the strongly similar case, a property called zero-block inheritance, combined with a zero-block detection technique, is applied to terminate the TU search process early. For the weakly similar case, the gradients of residuals representing the similarity of coefficients are used to skip the current TU mode or stop the TU splitting process. In particular, the computation time is further reduced because all the information required for the proposed mode decision criteria is derived before the transform coding is performed. The experimental results revealed that the proposed algorithm can save ~64% of the TU encoding time on average in inter prediction, with a negligible rate-distortion loss.

International Symposium on Consumer Electronics | 2013

Low power multi-lane MIPI CSI-2 receiver design and hardware implementations

Yueh-Chuan Lu; Zong-Yi Chen; Pao-Chi Chang

This paper proposes a low-power multi-Lane Mobile Industry Processor Interface (MIPI) Camera Serial Interface 2 (CSI-2) receiver architecture that adopts an 8-Byte parallel CSI protocol layer for hardware implementation. The proposed scheme works with 4 data Lanes at 1 Gb/s per Lane, i.e. a maximum data rate of 4 Gb/s, while running at 62.5 MHz. This lengthens the time available for logic operations from 8 ns (125 MHz) to 16 ns (62.5 MHz) without throughput degradation. The supply voltage (1.2 V) can therefore be lowered, which also reduces power consumption. The proposed architecture is implemented in 0.13 μm CMOS technology, and the total gate count is 32.7 K. It not only reduces the operating clock rate but also reduces logic power consumption, measured on chip, by 37% to 43%.
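The clock figures in this abstract follow from simple arithmetic, sketched below. The lane count, lane rate, and 8-Byte width are from the abstract; the 4-Byte baseline is an assumption inferred from the quoted 125 MHz figure, and the function names are illustrative only.

```python
LANES = 4
LANE_RATE_BPS = 1_000_000_000   # 1 Gb/s per data Lane


def required_clock_hz(bytes_per_cycle):
    """Clock needed by a parallel protocol layer that consumes
    `bytes_per_cycle` bytes every cycle to keep up with all Lanes."""
    total_bps = LANES * LANE_RATE_BPS
    return total_bps / (bytes_per_cycle * 8)


def clock_period_ns(clock_hz):
    """Time available for logic in one clock cycle, in nanoseconds."""
    return 1e9 / clock_hz


if __name__ == "__main__":
    # 4-Byte width is an assumed baseline; 8-Byte width is the proposed design.
    for width in (4, 8):
        clk = required_clock_hz(width)
        print(width, clk / 1e6, clock_period_ns(clk))
```

Doubling the datapath width halves the required clock, so each cycle has twice as long to settle, which is what leaves room to lower the supply voltage.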
