Huanqiang Zeng
Huaqiao University
Publication
Featured research published by Huanqiang Zeng.
IEEE Transactions on Circuits and Systems for Video Technology | 2009
Huanqiang Zeng; Canhui Cai; Kai-Kuang Ma
The intra-mode and inter-mode predictions have been made available in H.264/AVC for effectively improving coding efficiency. However, exhaustively checking all the prediction modes to identify the best one (commonly referred to as exhaustive mode decision) greatly increases computational complexity. In this paper, a fast mode decision algorithm, called the motion activity-based mode decision (MAMD), is proposed to speed up the encoding process by reducing the number of modes required to be checked in a hierarchical manner, as follows. For each macroblock, the proposed MAMD algorithm always starts by checking the rate-distortion (RD) cost computed at the SKIP mode for a possible early termination, which occurs once the RD cost value is below a predetermined "low" threshold. On the other hand, if the RD cost exceeds another "high" threshold, this indicates that only the intra modes are worth checking. If the computed RD cost falls between the two thresholds, the remaining seven modes, which are classified into three motion activity classes in our work, will be examined, and only one of the three classes will be chosen for further mode checking. The motion activity is measured quantitatively as the maximum city-block length of the motion vectors taken from a set of adjacent macroblocks (i.e., the region of support, ROS). This measurement is then used to determine the most probable motion-activity class for the current macroblock. Experimental results have shown that, on average, the proposed MAMD algorithm reduces the computational complexity by 62.96%, while incurring only a 0.059 dB loss in PSNR (peak signal-to-noise ratio) and a 0.19% increment in the total bit rate compared with exhaustive mode decision, which is the default approach in the JM reference software.
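The decision flow described in this abstract can be sketched roughly as follows. This is an illustrative sketch only, not the authors' implementation: the function names, the class labels, and the `class_bounds` values are hypothetical, and the abstract does not say which of the seven modes belongs to which motion activity class or how the thresholds are chosen.

```python
from typing import List, Tuple

MotionVector = Tuple[int, int]


def motion_activity(ros_mvs: List[MotionVector]) -> int:
    """Maximum city-block (L1) length over the motion vectors in the ROS."""
    return max((abs(dx) + abs(dy) for dx, dy in ros_mvs), default=0)


def select_mode_group(skip_rd_cost: float, t_low: float, t_high: float,
                      ros_mvs: List[MotionVector],
                      class_bounds: Tuple[int, int] = (1, 4)) -> str:
    """Pick which group of modes to check for one macroblock.

    class_bounds are assumed cut points between the three motion
    activity classes; the abstract does not give the actual values.
    """
    if skip_rd_cost < t_low:
        return "SKIP"    # early termination on the SKIP mode
    if skip_rd_cost > t_high:
        return "INTRA"   # only the intra modes are worth checking
    activity = motion_activity(ros_mvs)
    if activity <= class_bounds[0]:
        return "LOW_ACTIVITY_CLASS"
    if activity <= class_bounds[1]:
        return "MEDIUM_ACTIVITY_CLASS"
    return "HIGH_ACTIVITY_CLASS"
```

For example, with neighbouring motion vectors `[(0, 1), (2, -1)]` the motion activity is 3, so a SKIP cost between the two thresholds would route the macroblock to the (assumed) medium class.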
IEEE Transactions on Circuits and Systems for Video Technology | 2010
Huanqiang Zeng; Kai-Kuang Ma; Canhui Cai
The intra mode prediction via exhaustive mode decision used in H.264/advanced video coding (AVC) effectively improves the coding efficiency, but at the expense of higher computational complexity. In this letter, a fast intra mode decision algorithm, called the hierarchical intra mode decision (HIMD), is proposed to speed up the mode decision process by reducing the number of modes required to be checked for each macroblock. The novelty of the proposed HIMD algorithm lies in the following aspects. 1) An early decision with adaptive thresholding is developed for the mode decision of the luma component. 2) The candidate modes are selected according to their Hadamard distances and prediction directions. 3) Only one of the hierarchical paths will be chosen to compute its least rate-distortion cost. Experimental results have shown that the proposed HIMD algorithm reduces computational complexity by 85.75% on average, while incurring only a 0.164 dB loss in peak signal-to-noise ratio (PSNR) and a 2.336% increment in the total bit rate compared with exhaustive mode decision, which is the default approach in the joint model reference software.
IEEE Transactions on Circuits and Systems for Video Technology | 2017
Jianqing Zhu; Huanqiang Zeng; Shengcai Liao; Zhen Lei; Canhui Cai; Li Xin Zheng
Person re-identification (Re-ID) aims to match person images captured from two non-overlapping cameras. In this paper, a deep hybrid similarity learning (DHSL) method for person Re-ID based on a convolutional neural network (CNN) is proposed. In our approach, a lightweight CNN simultaneously extracts a pair of learned features from the input image pair. Then, both the elementwise absolute difference and the elementwise multiplication of the CNN feature pair are calculated. Finally, a hybrid similarity function is designed to measure the similarity between the feature pair, which is realized by learning a group of weight coefficients to project the elementwise absolute difference and multiplication into a similarity score. Consequently, the proposed DHSL method is able to reasonably balance the complexity of feature learning and metric learning in a CNN, so that the performance of person Re-ID is improved. Experiments on three challenging person Re-ID databases, QMUL GRID, VIPeR, and CUHK03, show that the proposed DHSL method is superior to multiple state-of-the-art person Re-ID methods.
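A minimal sketch of the hybrid similarity step may help make it concrete. It assumes the projection is a plain linear combination with an optional bias. The function name, the `bias` term, and the weight values in the example are hypothetical, and in the paper the weights are learned inside the CNN rather than passed in by hand.

```python
from typing import Sequence


def hybrid_similarity(f1: Sequence[float], f2: Sequence[float],
                      w_diff: Sequence[float], w_mul: Sequence[float],
                      bias: float = 0.0) -> float:
    """Project |f1 - f2| and f1 * f2 (both elementwise) to one score."""
    abs_diff = [abs(a - b) for a, b in zip(f1, f2)]
    product = [a * b for a, b in zip(f1, f2)]
    score = sum(w * d for w, d in zip(w_diff, abs_diff))
    score += sum(w * p for w, p in zip(w_mul, product))
    return score + bias


# Hand-picked weights for illustration: negative on the difference term
# (a larger difference lowers the score), positive on the product term.
score = hybrid_similarity([1.0, 2.0], [3.0, 1.0],
                          w_diff=[-1.0, -1.0], w_mul=[1.0, 1.0])
```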
IEEE Transactions on Circuits and Systems for Video Technology | 2014
Huanqiang Zeng; Xiaolan Wang; Canhui Cai; Jing Chen; Yan Zhang
Multiview video coding (MVC) adopts a hierarchical B picture prediction structure and offers many prediction modes to effectively remove the spatial, temporal, and inter-view redundancies inherent in multiview video (MVV), but at the price of extremely high computational complexity. To address this problem, a fast MVC method that jointly uses adaptive prediction structure (APS) and hierarchical mode decision (HMD) is proposed in this paper. The complexity reduction is achieved by: 1) designing four APSs for different MVV contents based on the fact that the contribution of the inter-view prediction varies from sequence to sequence and 2) developing an HMD scheme based on the observation that the relationship between the rate distortion (RD) cost and the size of the prediction mode is a unimodal function. In particular, for the current group of pictures of the input MVV, the prediction structure is adaptively selected based on its characteristic, which is measured by the ratio of the average RD cost of the base view frames to the sum of the average RD cost of the base view frames and that of the anchor frames in nonbase views, and then an HMD scheme is performed to skip the checking process of unlikely modes. The experimental results have shown that, compared with the exhaustive mode decision in the MVC, the proposed algorithm reduces the computational complexity by 83.49% on average, while incurring only a 0.086 dB loss in Bjontegaard delta peak signal-to-noise ratio and a 2.97% increment in the total Bjontegaard delta bit rate.
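The two ingredients named in this abstract can be sketched as follows, under stated assumptions. The ratio follows the wording of the abstract directly. The early-stopping search is only a generic way to exploit a unimodal cost curve and is not claimed to be the paper's HMD scheme. The function names and the example costs are hypothetical, and the mapping from the ratio to one of the four APSs is not given in the abstract, so it is not shown.

```python
from typing import Callable, List, Tuple


def rd_cost_ratio(avg_rd_base: float, avg_rd_nonbase_anchor: float) -> float:
    """Base-view average RD cost over the sum of both average RD costs."""
    return avg_rd_base / (avg_rd_base + avg_rd_nonbase_anchor)


def unimodal_mode_search(modes_by_size: List[str],
                         rd_cost: Callable[[str], float]) -> Tuple[str, int]:
    """Walk a non-empty list of modes ordered by block size.

    Stops as soon as the RD cost stops decreasing, which is valid only
    if the cost is unimodal in mode size. Returns (best mode, number of
    modes actually checked).
    """
    best_mode = modes_by_size[0]
    best_cost = rd_cost(best_mode)
    checked = 1
    for mode in modes_by_size[1:]:
        cost = rd_cost(mode)
        checked += 1
        if cost >= best_cost:
            break  # past the minimum: skip the remaining modes
        best_mode, best_cost = mode, cost
    return best_mode, checked


# Made-up costs for illustration; the last mode is never evaluated.
example_costs = {"16x16": 10.0, "16x8": 8.0, "8x8": 9.0, "4x4": 12.0}
result = unimodal_mode_search(list(example_costs), example_costs.__getitem__)
```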
International Conference on Image Processing | 2016
Zhangkai Ni; Lin Ma; Huanqiang Zeng; Canhui Cai; Kai-Kuang Ma
Since the human visual system (HVS) is highly sensitive to edges, a novel image quality assessment (IQA) metric for assessing screen content images (SCIs) is proposed in this paper. The key novelty lies in the use of an existing parametric edge model to extract two types of salient attributes, namely edge contrast and edge width, for the distorted SCI under assessment and for its original SCI, respectively. The similarity between the two images is then measured on each attribute independently. The obtained similarity scores are combined using our proposed edge-width pooling strategy to generate the final IQA score, which is intended to be consistent with the judgment made by the HVS. Experimental results have shown that the proposed IQA metric is more consistent with the HVS in evaluating the image quality of distorted SCIs than other state-of-the-art IQA metrics.
Neurocomputing | 2016
Huanqiang Zeng; Jing Chen; Xiaolin Cui; Canhui Cai; Kai-Kuang Ma
This paper proposes a new local texture descriptor, called quad binary pattern (QBP). Compared with local binary pattern (LBP), the QBP is more robust for feature extraction in complex scenes (e.g., luminance change, similar target and background color) and has lower computational complexity. To demonstrate its effectiveness, the proposed QBP is further applied to mean-shift tracking, in which a joint color-QBP model is developed to effectively represent the color and texture characteristics of the target region. Extensive simulation results have demonstrated that the proposed algorithm improves the tracking speed and accuracy compared with standard mean-shift tracking and with mean-shift tracking based on the joint color-LBP model.
IEEE Transactions on Image Processing | 2017
Zhangkai Ni; Lin Ma; Huanqiang Zeng; Jing Chen; Canhui Cai; Kai-Kuang Ma
In this paper, an accurate full-reference image quality assessment (IQA) model developed for assessing screen content images (SCIs), called the edge similarity (ESIM), is proposed. It is inspired by the fact that the human visual system (HVS) is highly sensitive to edges, which are often encountered in SCIs; therefore, essential edge features are extracted and exploited for conducting IQA for the SCIs. The key novelty of the proposed ESIM lies in the extraction and use of three salient edge features: edge contrast, edge width, and edge direction. The first two attributes are simultaneously generated from the input SCI based on a parametric edge model, while the last one is derived directly from the input SCI. These three features are extracted for the reference SCI and the distorted SCI individually. The degree of similarity for each edge attribute is then computed independently, and the results are combined using our proposed edge-width pooling strategy to generate the final ESIM score. To evaluate the performance of the proposed ESIM model, a new SCI database, denoted as SCID and the largest of its kind, is established in our work and made publicly available for download. Our database contains 1800 distorted SCIs that are generated from 40 reference SCIs. For each SCI, nine distortion types are investigated, and five degradation levels are produced for each distortion type. Extensive simulation results have clearly shown that the proposed ESIM model is more consistent with the perception of the HVS in the evaluation of distorted SCIs than multiple state-of-the-art IQA methods.
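Two points in this abstract lend themselves to a small worked example: the database size, which follows from 40 references × 9 distortion types × 5 levels, and the idea of pooling a similarity map into one score. The pooling below is a generic weighted average that uses edge width as the weight. That is an assumption made for illustration, since the abstract does not give the actual pooling formula, and the function name and sample values are hypothetical.

```python
from itertools import product
from typing import Sequence

# Database size: every (reference, distortion type, level) combination.
distorted_images = list(product(range(40), range(9), range(5)))
num_distorted = len(distorted_images)


def weighted_pool(similarity: Sequence[float],
                  edge_width: Sequence[float]) -> float:
    """Weighted average of per-pixel similarity values.

    Assumes edge width acts as a non-negative per-pixel weight with a
    positive sum; the paper's exact pooling rule may differ.
    """
    total_weight = sum(edge_width)
    weighted_sum = sum(s * w for s, w in zip(similarity, edge_width))
    return weighted_sum / total_weight


pooled = weighted_pool([1.0, 0.5], [1.0, 3.0])
```

With these sample values the pixel with the wider edge dominates, so the pooled score sits closer to 0.5 than to 1.0.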
Signal-Image Technology and Internet-Based Systems | 2014
Wenjie Xiang; Canhui Cai; Zhangxin Wang; Huanqiang Zeng; Jing Chen
Intra prediction in HEVC uses new techniques, such as the tree coding structure and up to 35 intra prediction modes, to achieve high coding efficiency, but it incurs heavy computational complexity as well. To reduce the computational complexity, a novel fast mode decision algorithm for intra prediction in HEVC is proposed in this paper. In the proposed algorithm, by using the texture characteristic of the current prediction unit (PU) and the similarity between neighboring PUs, the proposed rough and refine decision schemes are applied sequentially to reduce the number of candidate prediction modes that need to be checked, so that the computational complexity is greatly reduced. Experimental results have demonstrated that the proposed algorithm reduces computational complexity by 25% on average with negligible loss of coding efficiency, compared with the fast intra mode decision method implemented in the HEVC reference software.
Conference on Industrial Electronics and Applications | 2014
Zhangxin Wang; Huanqiang Zeng; Jing Chen; Canhui Cai
High Efficiency Video Coding (HEVC) is the newest video coding standard, developed by the Joint Collaborative Team on Video Coding (JCT-VC), which consists of the ITU-T Video Coding Experts Group (VCEG) and the ISO/IEC Moving Picture Experts Group (MPEG). To improve the coding efficiency, HEVC adopts many new key techniques, such as the coding tree unit, prediction unit, transform unit, multi-angle intra prediction, variable PU motion estimation, and sample adaptive offset. Compared with its predecessor, H.264/AVC, HEVC is able to save about 50% of the bit rate while maintaining almost the same perceptual video quality. This paper aims to illustrate the key techniques of the HEVC standard and to preview its extensions in the near future.
Multidimensional Systems and Signal Processing | 2017
Aisheng Yang; Huanqiang Zeng; Jing Chen; Jianqing Zhu; Canhui Cai
With the advances in understanding the perceptual properties of the human visual system, perceptual video coding, which aims to incorporate human perceptual mechanisms into video coding to maximize the perceptual coding efficiency, has become an essential research topic. Since the newest video coding standard, high efficiency video coding (HEVC), does not fully consider the perceptual characteristics of the input video, a perceptual feature guided rate distortion optimization (RDO) method is presented in this paper to improve its perceptual coding performance. In the proposed method, for each coding tree unit, the spatial perceptual feature (i.e., gradient magnitude ratio) and the temporal perceptual feature (i.e., gradient magnitude similarity deviation ratio) are extracted by considering the spatial and temporal perceptual correlations. These perceptual features are then used to guide the RDO process by perceptually adjusting the corresponding Lagrangian multiplier. With the proposed method incorporated into HEVC, extensive simulation results have demonstrated that the approach can significantly improve the perceptual coding performance and obtain better visual quality of the reconstructed video, compared with the original RDO in HEVC.
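The effect of adjusting the Lagrangian multiplier can be illustrated with the standard RD cost J = D + λR. The sketch below simply scales λ by a per-CTU factor. How the paper derives that factor from the two perceptual features is not stated in the abstract, so `perceptual_scale` is a hypothetical input, and the candidate names and numbers are made up for illustration.

```python
from typing import List, Tuple

# (mode name, distortion D, rate R) for each candidate coding option.
Candidate = Tuple[str, float, float]


def rd_cost(distortion: float, rate: float, lam: float) -> float:
    """Standard Lagrangian rate-distortion cost J = D + lambda * R."""
    return distortion + lam * rate


def choose_mode(candidates: List[Candidate], lam: float,
                perceptual_scale: float = 1.0) -> str:
    """Return the candidate with the lowest cost under a scaled lambda."""
    adjusted_lam = lam * perceptual_scale
    best = min(candidates, key=lambda c: rd_cost(c[1], c[2], adjusted_lam))
    return best[0]


candidates = [("A", 100.0, 10.0), ("B", 60.0, 40.0)]
default_choice = choose_mode(candidates, lam=1.0)
scaled_choice = choose_mode(candidates, lam=1.0, perceptual_scale=2.0)
```

With the unscaled multiplier the lower-distortion option "B" wins (cost 100 versus 110). Doubling the multiplier makes rate more expensive, so the cheaper-to-code option "A" wins instead (120 versus 140).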