Xingyu Zhang
Hong Kong University of Science and Technology
Publication
Featured research published by Xingyu Zhang.
multimedia signal processing | 2013
Yongfang Shi; Oscar Chi Lim Au; Hong Zhang; Xingyu Zhang; Luheng Jia; Wei Dai; Wenjing Zhu
High Efficiency Video Coding (HEVC) is the next-generation video coding standard beyond H.264/AVC. Whereas H.264/AVC offers at most 9 intra prediction modes, HEVC provides 35 intra prediction modes (IPM) to improve coding efficiency, which inevitably places a heavy complexity burden on the encoder. To speed up the HEVC encoder, a novel fast mode decision (FMD) algorithm for HEVC intra prediction is proposed. We analyze the costs generated by rough mode decision (RMD), which is already incorporated in the HM software, and find that the RMD costs, listed by mode number, generally follow the same trend as the rate-distortion optimization (RDO) costs. Further, the local salient modes, whose RMD costs drop significantly compared with those of adjacent modes, tend to be promising candidates for the optimal mode. Based on these observations, we further reduce the number of candidates for the RDO process. Experimental results show that the proposed algorithm saves 19.0% of encoding time on average (up to 33.6%) while causing negligible RD performance loss (0.4% BD-Rate increase on average) compared with the HM 7.0 anchor.
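The notion of a "local salient mode" in the abstract above can be illustrated with a minimal sketch. The abstract does not give the exact criterion, so the rule used here (a local minimum whose cost is below a fraction of the mean of its two neighbours), the function name `local_salient_modes`, the `drop_ratio` parameter, and the cost values are all assumptions for illustration, not the authors' method.

```python
def local_salient_modes(rmd_costs, drop_ratio=0.9):
    """Return mode indices whose RMD cost drops sharply relative to neighbours.

    rmd_costs  -- list of costs indexed by mode number (hypothetical values)
    drop_ratio -- assumed threshold: a mode counts as salient when its cost is
                  a strict local minimum and is below drop_ratio times the
                  mean of its two adjacent costs
    """
    salient = []
    for m in range(1, len(rmd_costs) - 1):
        left, cur, right = rmd_costs[m - 1], rmd_costs[m], rmd_costs[m + 1]
        is_local_min = cur < left and cur < right
        if is_local_min and cur < drop_ratio * (left + right) / 2:
            salient.append(m)
    return salient


# Hypothetical costs for nine consecutive modes.
costs = [100, 98, 70, 97, 99, 95, 96, 60, 94]
print(local_salient_modes(costs))  # [2, 7]
```

Mode 5 is also a local minimum in this example, but its drop is too shallow to pass the assumed threshold, so only modes 2 and 7 would be kept as extra RDO candidates.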
IEEE Transactions on Image Processing | 2014
Xingyu Zhang; Christophe Gisquet; Edouard Francois; Feng Zou; Oscar Chi Lim Au
In this paper, we investigate a new inter-channel coding mode called LM mode proposed for the next generation video coding standard called high efficiency video coding. This mode exploits inter-channel correlation using reconstructed luma to predict chroma linearly with parameters derived from neighboring reconstructed luma and chroma pixels at both encoder and decoder to avoid overhead signaling. In this paper, we analyze the LM mode and prove that the LM parameters for predicting original chroma and reconstructed chroma are statistically the same. We also analyze the error sensitivity of the LM parameters. We identify some LM mode problematic situations and propose three novel LM-like modes called LMA, LML, and LMO to address the situations. To limit the increase in complexity due to the LM-like modes, we propose some fast algorithms with the help of some new cost functions. We further identify some potentially-problematic conditions in the parameter estimation (including regression dilution problem) and introduce a novel model correction technique to detect and correct those conditions. Simulation results suggest that considerable BD-rate reduction can be achieved by the proposed LM-like modes and model correction technique. In addition, the performance gain of the two techniques appears to be essentially additive when combined.
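The linear luma-to-chroma prediction described in the abstract above can be sketched with an ordinary least-squares fit. This is a floating-point illustration only: it assumes the neighbouring luma samples have already been brought to chroma resolution, and the names `lm_parameters` and `predict_chroma` are hypothetical. It does not reproduce the integer arithmetic of any reference software, nor the LMA, LML, or LMO variants.

```python
def lm_parameters(luma_nb, chroma_nb):
    """Least-squares fit of chroma ~ alpha * luma + beta.

    luma_nb, chroma_nb -- equal-length lists of neighbouring reconstructed
                          luma and chroma samples (assumed co-located)
    """
    n = len(luma_nb)
    sum_l = sum(luma_nb)
    sum_c = sum(chroma_nb)
    sum_ll = sum(l * l for l in luma_nb)
    sum_lc = sum(l * c for l, c in zip(luma_nb, chroma_nb))
    denom = n * sum_ll - sum_l * sum_l
    if denom == 0:
        # Flat luma neighbourhood: fall back to predicting the chroma mean.
        return 0.0, sum_c / n
    alpha = (n * sum_lc - sum_l * sum_c) / denom
    beta = (sum_c - alpha * sum_l) / n
    return alpha, beta


def predict_chroma(rec_luma_block, alpha, beta):
    """Predict a chroma block from the co-located reconstructed luma block."""
    return [[alpha * l + beta for l in row] for row in rec_luma_block]


# Neighbours that follow chroma = 0.5 * luma + 3 exactly.
alpha, beta = lm_parameters([10, 20, 30, 40], [8, 13, 18, 23])
print(alpha, beta)  # 0.5 3.0
print(predict_chroma([[50, 60]], alpha, beta))  # [[28.0, 33.0]]
```

Because both encoder and decoder can run the same fit on already-reconstructed neighbours, no parameters need to be signalled, which is the overhead saving the abstract refers to.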
international symposium on circuits and systems | 2013
Yongfang Shi; Oscar Chi Lim Au; Xingyu Zhang; Hong Zhang; Rui Ma; Luheng Jia
The nested quadtree-based partitioning scheme of HEVC contributes greatly to the coding efficiency improvement, but it also adds significant complexity to the encoder. This paper introduces a fast prediction unit (PU) level quadtree depth decision (FPDD) algorithm. It exploits the inherent correlation of the PU quadtree structure between the current largest coding unit (LCU) and its spatial and temporal neighbors. To reduce error propagation, we also propose a confidence grading scheme that prevents poorly predicted LCUs from being referenced by others. Results show that the proposed algorithm reduces encoding time by 20.0% on average (up to 39.3%) while causing negligible RD performance loss (0.2% BD-Rate increase on average) compared with HM 7.0.
IEEE Transactions on Circuits and Systems for Video Technology | 2013
Chao Pang; Oscar Chi Lim Au; Feng Zou; Jingjing Dai; Xingyu Zhang; Wei Dai
In this paper, an analytical framework for frame-level dependent bit allocation (DBA) in hybrid video coding is proposed. First, the dependency of neighboring frames is quantitatively measured with the proposed inter-frame dependency model (IFDM). Based on the proposed IFDM, the problem of frame-level DBA among a number of frames of different frame types is studied, and the optimal solution is achieved through successive convex optimization. To prove the validity of the proposed framework, a case study of the current state-of-the-art standard H.264/AVC is conducted. Experimental results show that a significant gain of up to 0.9 dB in PSNR can be obtained.
IEEE Journal of Selected Topics in Signal Processing | 2013
Feng Zou; Oscar C. Au; Chao Pang; Jingjing Dai; Xingyu Zhang; Lu Fang
The directional intra prediction (IP) in H.264/AVC and HEVC tends to cause the residue to be anisotropic. To transform the IP residue, Mode Dependent Directional Transform (MDDT) based on Karhunen Loève transform (KLT) can achieve better energy compaction than DCT, with one transform assigned to each prediction mode. However, due to the data variation, different residue blocks with the same IP mode may not have the same statistical properties. Instead of constraining one transform for each IP mode, in this paper, we propose a novel rate-distortion optimized transform (RDOT) scheme which allows a set of specially trained transforms to be available to all modes, and each block can choose its preferred transform to minimize the rate-distortion (RD) cost. We define a cost function which is an estimate of the true RD cost and use the Lloyd-type algorithm (a sequence of transform optimization and data reclassification alternately) to find the optimal set of transforms. The proposed RDOT scheme is implemented in HM9.0 software of HEVC. Experimental results suggest that RDOT effectively achieves 1.6% BD-Rate reduction under the Intra Main condition and 1.6% BD-Rate reduction under the Intra High Efficiency (HE) 10bit condition.
international symposium on circuits and systems | 2013
Hong Zhang; Oscar Chi Lim Au; Yongfang Shi; Xingyu Zhang; Ketan Tang; Yuanfang Guo
High-Efficiency Video Coding (HEVC) is the newest video coding standard, which can reduce the bit rate by 50% compared with existing standards. The key features and new tools in HEVC are designed for natural video sequences captured by a real camera. Unlike natural video, screen content contains many more edges in text and icon regions. Current video coding standards may blur or even remove low-contrast edges, which are very important in screen content for the human eye to recognize characters and icons. Therefore, this paper proposes an effective modification to HEVC to preserve the low-contrast edges in screen content. First, a discrete Laplacian filter is adopted for edge detection; we then adaptively adjust QPs for low-contrast edge regions, which are detected using our designed measurement of edge contrast. Experimental results show that nearly all the regions containing low-contrast edges can be detected, and that adjusting the QPs for these regions protects the edges well with no RD performance reduction.
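As a rough illustration of the edge-detection step mentioned in the abstract above, the sketch below applies a 4-neighbour discrete Laplacian and flags responses that are non-trivial but small. The abstract specifies neither the kernel nor the contrast measurement, so the kernel choice, the `low` and `high` thresholds, and the function names are all assumptions; the thresholding is only a stand-in for the authors' edge-contrast measure.

```python
def laplacian_response(img):
    """4-neighbour discrete Laplacian on interior pixels; borders stay 0.

    img -- 2D list of luma values (rows of equal length)
    """
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            out[y][x] = (img[y - 1][x] + img[y + 1][x]
                         + img[y][x - 1] + img[y][x + 1]
                         - 4 * img[y][x])
    return out


def low_contrast_edge_mask(img, low=4, high=40):
    """Flag pixels whose Laplacian magnitude lies in [low, high).

    low, high -- hypothetical thresholds separating flat areas, low-contrast
                 edges, and strong edges
    """
    resp = laplacian_response(img)
    return [[low <= abs(v) < high for v in row] for row in resp]


# A faint vertical edge: luma steps from 100 to 110.
image = [[100, 100, 110, 110] for _ in range(4)]
for row in low_contrast_edge_mask(image):
    print(row)
```

In a scheme like the one described, blocks covering flagged pixels would then be given a lower QP; that mapping is not shown here because the abstract gives no details on it.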
international conference on multimedia and expo | 2013
Xingyu Zhang; Oscar Chi Lim Au; Chao Pang; Wei Dai; Yuanfang Guo; Lu Fang
Transform coefficients take up a significant portion of the bitstream generated by a video codec. For every non-zero coefficient, the sign information is sent separately from the amplitude. Statistically, the signs of the transform coefficients are rarely predictable, so bypass mode is usually used in the subsequent CABAC engine to encode the sign information. In H.265/HEVC, a novel multiple sign bits hiding (MSBH) technique is adopted to hide one sign for each selected coefficient group (CG). In this paper, we propose to hide one additional sign bit (ASBH) for each selected transform block (TB), in a way that coexists with MSBH. To verify the effectiveness of the proposed ASBH, we implemented it in the state-of-the-art HM9.0 test model of H.265/HEVC. Experimental results show that the proposed ASBH outperforms the anchor for all tested sequences, with only trivial complexity added at the decoder side.
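The general idea behind sign bit hiding can be sketched as follows: the sign of one coefficient is not transmitted but inferred from the parity of the sum of absolute levels. The sketch assumes the convention that even parity means positive and odd parity means negative, and it uses a naive encoder adjustment (bumping the last non-zero level by one) where a real encoder would pick the adjustment by rate-distortion cost. It illustrates parity-based hiding in general, not the proposed ASBH, whose details the abstract does not give; `hide_sign` and `recover_sign` are hypothetical names.

```python
def hide_sign(coeffs):
    """Adjust levels so parity of sum(|level|) encodes the first non-zero sign.

    coeffs -- quantized levels in scan order
    Assumed convention: even parity -> positive, odd parity -> negative.
    """
    nonzero = [i for i, v in enumerate(coeffs) if v != 0]
    out = list(coeffs)
    if not nonzero:
        return out
    want_odd = out[nonzero[0]] < 0
    is_odd = sum(abs(v) for v in out) % 2 == 1
    if want_odd != is_odd:
        # Naive fix-up: grow the magnitude of the last non-zero level by 1.
        last = nonzero[-1]
        out[last] += 1 if out[last] > 0 else -1
    return out


def recover_sign(magnitudes):
    """Decoder side: infer the hidden sign (+1 or -1) from parity alone."""
    return -1 if sum(magnitudes) % 2 == 1 else 1


levels = hide_sign([0, -3, 2, 0, 1])
print(levels)                                  # [0, -3, 2, 0, 2]
print(recover_sign([abs(v) for v in levels]))  # -1
```

The saving is one bypass-coded bin per hidden sign, paid for by the occasional distortion from the forced level adjustment.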
international symposium on circuits and systems | 2013
Yuanfang Guo; Oscar C. Au; Ketan Tang; Jiahao Pang; Wenxiu Sun; Lingfeng Xu; Jiali Li; Xingyu Zhang
Halftone image watermarking has been explored and developed rapidly over the past decade. However, there are still issues to be studied. This paper presents a data hiding method called Data Hiding by Dual Color Conjugate Error Diffusion (DHDCCED), which hides a binary secret pattern in two error-diffused color halftone images such that the secret pattern is revealed when the two images are overlaid. The experimental results show that DHDCCED significantly improves both the correct decoding rate and the visual quality of the revealed secret pattern compared with the existing method, Color Conjugate Error Diffusion (CCED).
international conference on acoustics, speech, and signal processing | 2014
Xingyu Zhang; Ming-Ting Sun; Lu Fang; Oscar Chi Lim Au
Most digital cameras use a single sensor coupled with a Color Filter Array (CFA) to capture images, and apply demosaicking to interpolate the full color images. In reality, the CFA image is noisy, which causes problems in the demosaicking process. This paper proposes a Joint Denoising and Demosaicking based on inter-Color correlation (JDDC) scheme. We propose a new framework that linearly combines an extracted luminance image and low-passed RGB images to obtain a full color image. Given that the noise in the extracted luminance image and in the low-passed RGB images is non-stationary and partially correlated, we modify the classical Non-Local Means (NLM) filter to denoise the extracted luminance image and the low-passed RGB images before the combination. Experimental results verify the effectiveness of the proposed scheme both objectively and subjectively.
visual communications and image processing | 2013
Wenjing Zhu; Oscar Chi Lim Au; Haitao Yang; Wei Dai; Hong Zhang; Xingyu Zhang
Scalable video coding (SVC), an extension of the H.264/AVC video coding standard, was introduced to provide scalability in different dimensions for adaptation to heterogeneous networks and terminals. After the finalization of the new video coding standard called High Efficiency Video Coding (HEVC), the effort of the standardization committee has been redirected to the investigation of the scalable extension of HEVC. In addition to the basic inter-layer texture prediction mechanism, several other coding tools were proposed to improve coding performance. Among those tools, the generalized residual prediction (GRP) scheme achieves the most significant coding gain, but it also consumes a large amount of computational power. In this paper, the GRP mechanism is formulated and analyzed in detail. In addition, combining the GRP mechanism with the merge mode in HEVC is proposed to simplify the existing GRP mechanism. Results show that a better trade-off can be achieved by greatly reducing computational complexity while maintaining most of the coding gain.