Publication


Featured research published by Debin Zhao.


systems man and cybernetics | 2008

The CAS-PEAL Large-Scale Chinese Face Database and Baseline Evaluations

Wen Gao; Bo Cao; Shiguang Shan; Xilin Chen; Delong Zhou; Xiaohua Zhang; Debin Zhao

In this paper, we describe the acquisition and contents of a large-scale Chinese face database: the CAS-PEAL face database. The goals of creating the CAS-PEAL face database include the following: 1) providing face recognition researchers worldwide with different sources of variations, particularly pose, expression, accessories, and lighting (PEAL), and exhaustive ground-truth information in one uniform database; 2) advancing the state-of-the-art face recognition technologies aiming at practical applications by using off-the-shelf imaging equipment and by designing normal face variations in the database; and 3) providing a large-scale face database of Mongolian. Currently, the CAS-PEAL face database contains 99 594 images of 1040 individuals (595 males and 445 females). A total of nine cameras are mounted horizontally on an arc arm to simultaneously capture images across different poses. Each subject is asked to look straight ahead, up, and down to obtain 27 images in three shots. Five facial expressions, six accessories, and 15 lighting changes are also included in the database. A selected subset of the database (CAS-PEAL-R1, containing 30 863 images of the 1040 subjects) is available to other researchers now. We discuss the evaluation protocol based on the CAS-PEAL-R1 database and present the performance of four algorithms as a baseline to do the following: 1) elementarily assess the difficulty of the database for face recognition algorithms; 2) provide reference evaluation results for researchers using the database; and 3) identify the strengths and weaknesses of the commonly used algorithms.


international soi conference | 2003

Illumination normalization for robust face recognition against varying lighting conditions

Shiguang Shan; Wen Gao; Bo Cao; Debin Zhao

Evaluations of state-of-the-art academic face recognition algorithms and commercial systems have shown that the recognition performance of most current technologies degrades under variations in illumination. We investigate several illumination normalization methods and propose some novel solutions. The main contributions include: (1) a gamma intensity correction (GIC) method that normalizes the overall image intensity at a given illumination level; (2) a region-based strategy combining GIC and histogram equalization (HE) to further eliminate the side-lighting effect; (3) a quotient illumination relighting (QIR) method that synthesizes images under a predefined normal lighting condition from face images captured under nonnormal lighting conditions. These methods are evaluated and compared on the Yale illumination face database B and the Harvard illumination face database. Considerable improvements are observed. Some conclusions are given at the end.
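
As a rough illustration of the gamma intensity correction idea in item (1), the sketch below picks the gamma whose corrected image is closest to a reference image in squared error. The grid search, the squared-error criterion, the mapping 255 * (I / 255) ** (1 / gamma), and all function names are assumptions made for this example; the abstract does not specify the paper's exact formulation.

```python
def gamma_correct(image, gamma):
    """Map each 0..255 intensity I to 255 * (I / 255) ** (1 / gamma)."""
    return [[255.0 * (p / 255.0) ** (1.0 / gamma) for p in row] for row in image]


def squared_error(a, b):
    """Sum of squared pixel differences between two equally sized images."""
    return sum((x - y) ** 2 for ra, rb in zip(a, b) for x, y in zip(ra, rb))


def gamma_intensity_correction(image, reference, gammas=None):
    """Grid-search the gamma that brings `image` closest to `reference`.

    The candidate grid (0.2 to 5.0 in steps of 0.1) is an arbitrary choice
    for this sketch.
    """
    if gammas is None:
        gammas = [k / 10 for k in range(2, 51)]
    best = min(gammas, key=lambda g: squared_error(gamma_correct(image, g), reference))
    return best, gamma_correct(image, best)


# Toy data: darken a small reference image, then try to recover it.
reference = [[64.0, 128.0], [192.0, 255.0]]
dark = [[255.0 * (p / 255.0) ** 2.0 for p in row] for row in reference]
best_gamma, corrected = gamma_intensity_correction(dark, reference)
```

In this toy case the darkening squares the normalized intensities, so the search should settle on the gamma that takes the square root again.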


Image and Vision Computing | 2005

Fast and robust text detection in images and video frames

Qixiang Ye; Qingming Huang; Wen Gao; Debin Zhao

Text in images and video frames carries important information for visual content understanding and retrieval. In this paper, by using multiscale wavelet features, we propose a novel coarse-to-fine algorithm that is able to locate text lines even against complex backgrounds. First, in the coarse detection, after the wavelet energy feature is calculated to locate all possible text pixels, a density-based region growing method is developed to connect these pixels into regions, which are further separated into candidate text lines by structural information. Second, in the fine detection, with four kinds of texture features extracted to represent the texture pattern of a text line, a forward search algorithm is applied to select the most effective features. Finally, an SVM classifier is used to identify true text from the candidates based on the selected features. Experimental results show that this approach can detect text lines quickly and robustly under various conditions.


visual communications and image processing | 2011

Fast mode decision algorithm for intra prediction in HEVC

Liang Zhao; Li Zhang; Siwei Ma; Debin Zhao

High Efficiency Video Coding (HEVC) is an ongoing video compression standard and a successor to H.264/AVC. It aims to provide significantly improved compression performance as compared to all existing video coding standards. In the intra prediction of HEVC, this is being achieved by providing up to 35 intra modes with larger coding units. The optimal mode is selected first by a rough mode decision (RMD) process over all of the available modes, and then by the rate-distortion optimization (RDO) process for the final decision. Because coding units of every size are traversed in both procedures, mode selection is very time-consuming. A gradient-based fast mode decision algorithm is therefore proposed in this paper to reduce the computational complexity of HEVC. Prior to intra prediction, gradient directions are calculated and a gradient-mode histogram is generated for each coding unit. Based on the distribution of the histogram, only a small subset of the candidate modes is chosen for the RMD and RDO processes. As compared to the default encoding scheme in the HEVC test model HM 4.0, experimental results show that the fast intra mode decision scheme provides almost 20% time savings on average in all-intra low-complexity cases, with negligible loss of coding efficiency.
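
The gradient histogram step can be pictured with the following minimal sketch. It assumes Sobel gradients, eight uniform orientation bins, and an amplitude of |gx| + |gy|; these choices and the function names are hypothetical, and the mapping from bins to actual HEVC intra mode indices is not reproduced here.

```python
import math


def gradient_direction_histogram(block, num_bins=8):
    """Accumulate gradient amplitude into uniform orientation bins over [0, pi)."""
    height, width = len(block), len(block[0])
    hist = [0.0] * num_bins
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            # Sobel responses in the horizontal and vertical directions.
            gx = (block[y - 1][x + 1] + 2 * block[y][x + 1] + block[y + 1][x + 1]
                  - block[y - 1][x - 1] - 2 * block[y][x - 1] - block[y + 1][x - 1])
            gy = (block[y + 1][x - 1] + 2 * block[y + 1][x] + block[y + 1][x + 1]
                  - block[y - 1][x - 1] - 2 * block[y - 1][x] - block[y - 1][x + 1])
            amplitude = abs(gx) + abs(gy)
            if amplitude == 0:
                continue
            # Orientation of the gradient, ignoring its sign. The edge itself
            # runs perpendicular to this direction.
            angle = math.atan2(gy, gx) % math.pi
            bin_index = min(int(angle / math.pi * num_bins), num_bins - 1)
            hist[bin_index] += amplitude
    return hist


def candidate_bins(hist, keep=3):
    """Return the indices of the `keep` strongest orientation bins."""
    order = sorted(range(len(hist)), key=lambda b: hist[b], reverse=True)
    return order[:keep]


# Toy 8x8 block with a single vertical edge between columns 3 and 4.
block = [[0] * 4 + [100] * 4 for _ in range(8)]
hist = gradient_direction_histogram(block)
top = candidate_bins(hist, keep=1)
```

An encoder following this idea would evaluate only the prediction modes associated with the strongest bins instead of the full mode set.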


IEEE Transactions on Multimedia | 2007

Joint Source-Channel Rate-Distortion Optimization for H.264 Video Coding Over Error-Prone Networks

Yuan Zhang; Wen Gao; Yan Lu; Qingming Huang; Debin Zhao

For a typical video distribution system, the video contents are first compressed and then stored in local storage or transmitted to the end users through networks. When the compressed videos are transmitted through error-prone networks, error robustness becomes an important issue. In past years, a number of rate-distortion (R-D) optimized coding mode selection schemes have been proposed for error-resilient video coding, including a recursive optimal per-pixel estimate (ROPE) method. However, the ROPE-related approaches assume integer-pixel motion-compensated prediction rather than subpixel prediction, so their extension to H.264 is not straightforward. Alternatively, an error-robust R-D optimization (ER-RDO) method has been included in the H.264 test model, in which the estimate of pixel distortion is derived by simulating the decoding process multiple times in the encoder. The computational complexity of this approach is very high. To address this problem, we propose a new end-to-end distortion model for R-D optimized coding mode selection, in which the overall distortion is taken as the sum of several separable distortion items. Thus, it can suppress the approximation errors caused by pixel averaging operations such as subpixel prediction. Based on the proposed end-to-end distortion model, a new Lagrange multiplier is derived for R-D optimized coding mode selection in packet-loss environments by taking the network conditions into account. The rate control and complexity issues are also discussed in this paper.
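
Lagrangian mode selection in general picks the mode that minimizes J = D + lambda * R. The sketch below illustrates that generic rule with a deliberately simplified expected distortion, D = d_source + p * d_channel, where p is the packet loss rate. This two-term model, the fixed lambda, the mode names, and the numbers are all assumptions for illustration; they are not the distortion model or the Lagrange multiplier derived in the paper.

```python
def expected_distortion(mode, loss_rate):
    """Toy end-to-end distortion: source term plus loss-weighted channel term."""
    return mode["d_source"] + loss_rate * mode["d_channel"]


def select_mode(modes, loss_rate, lagrange_multiplier):
    """Return the name of the mode minimizing J = D + lambda * R."""
    def cost(name):
        mode = modes[name]
        return expected_distortion(mode, loss_rate) + lagrange_multiplier * mode["rate"]
    return min(modes, key=cost)


# Hypothetical per-macroblock statistics: inter coding is cheap in bits but
# suffers more when packets are lost; intra coding costs more bits but limits
# the damage.
modes = {
    "inter": {"rate": 100, "d_source": 10.0, "d_channel": 400.0},
    "intra": {"rate": 300, "d_source": 8.0, "d_channel": 120.0},
}

choice_clean = select_mode(modes, loss_rate=0.0, lagrange_multiplier=0.1)
choice_lossy = select_mode(modes, loss_rate=0.2, lagrange_multiplier=0.1)
```

With these toy numbers the cheaper inter mode wins on a loss-free channel, while a 20% loss rate shifts the decision toward intra coding.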


Signal Processing-image Communication | 2009

Joint video/depth rate allocation for 3D video coding based on view synthesis distortion model

Yanwei Liu; Qingming Huang; Siwei Ma; Debin Zhao; Wen Gao

Joint video/depth rate allocation is an important optimization problem in 3D video coding. To address this problem, this paper proposes a distortion model to evaluate the synthesized view without access to the captured original view. The proposed distortion model is an additive model that accounts for the video-coding-induced distortion and the depth-quantization-induced distortion, as well as the inherent geometry distortion. Depth-quantization-induced distortion not only considers the warping error distortion, which is described by a piecewise linear model with the video power spectral property, but also takes into account the warping error correlation distortion between two source reference views. Geometry distortion is approximated from that of the adjacent view synthesis. Based on the proposed distortion model, a joint rate allocation method is proposed to seek the optimal trade-off between video bit-rate and depth bit-rate for maximizing the view synthesis quality. Experimental results show that the proposed distortion model is capable of approximately estimating the actual distortion for the synthesized view, and that the proposed rate allocation method can achieve almost the same rate allocation performance as the full-search method at lower computational cost. Moreover, the proposed rate allocation method incurs lower computational cost than the hierarchical-search method at high bit-rates while providing almost equivalent rate allocation performance.


Image and Vision Computing | 2006

Object detection using spatial histogram features

Hongming Zhang; Wen Gao; Xilin Chen; Debin Zhao

In this paper, we propose an object detection approach using spatial histogram features. As spatial histograms consist of marginal distributions of an image over local patches, they can preserve texture and shape information of an object simultaneously. We employ the Fisher criterion and mutual information to measure the discriminability and feature correlation of spatial histogram features. We further train a hierarchical classifier by combining cascade histogram matching and a support vector machine. The cascade histogram matching is trained via automatically selected discriminative features. A forward sequential selection method is presented to construct uncorrelated and discriminative feature sets for support vector machine classification. We evaluate the proposed approach on two different kinds of objects: car and video text. Experimental results show that the proposed approach is efficient and robust in object detection.
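
To make the notion of histograms over local patches concrete, here is a minimal sketch that splits an image into a grid of patches and concatenates one normalized histogram per patch. It assumes plain intensity quantization and a fixed rectangular grid, and it compares features by histogram intersection; the paper's actual per-pixel encoding, patch layout, and matching measure are not given in the abstract, so all of these are illustrative choices.

```python
def spatial_histogram(image, grid=(2, 2), bins=4, max_value=256):
    """Concatenate normalized intensity histograms computed on a grid of patches."""
    height, width = len(image), len(image[0])
    rows, cols = grid
    feature = []
    for r in range(rows):
        for c in range(cols):
            hist = [0] * bins
            for y in range(r * height // rows, (r + 1) * height // rows):
                for x in range(c * width // cols, (c + 1) * width // cols):
                    # Quantize the intensity into one of `bins` equal ranges.
                    hist[image[y][x] * bins // max_value] += 1
            total = sum(hist) or 1
            feature.extend(v / total for v in hist)
    return feature


def histogram_intersection(a, b):
    """Similarity of two feature vectors: sum of element-wise minima."""
    return sum(min(x, y) for x, y in zip(a, b))


# Toy 4x4 image whose four quadrants each hold a single intensity.
image = [
    [0, 0, 255, 255],
    [0, 0, 255, 255],
    [64, 64, 128, 128],
    [64, 64, 128, 128],
]
feature = spatial_histogram(image)
similarity = histogram_intersection(feature, feature)
```

Because each patch keeps its own histogram, two images with the same global intensity distribution but a different spatial arrangement produce different features.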


systems man and cybernetics | 2007

Large-Vocabulary Continuous Sign Language Recognition Based on Transition-Movement Models

Gaolin Fang; Wen Gao; Debin Zhao

The major challenge that sign language recognition (SLR) now faces is developing methods that solve large-vocabulary continuous sign problems. In this paper, transition-movement models (TMMs) are proposed to handle transition parts between two adjacent signs in large-vocabulary continuous SLR. To tackle the mass of transition movements arising from a large vocabulary size, a temporal clustering algorithm, derived from k-means by using dynamic time warping as its distance measure, is proposed to dynamically cluster them; then, an iterative segmentation algorithm for automatically segmenting transition parts from continuous sentences and training these TMMs through a bootstrap process is presented. Owing to their excellent generalization, the clustered TMMs are well suited to large-vocabulary continuous SLR. Lastly, TMMs together with sign models are viewed as candidates of the Viterbi search algorithm for recognizing continuous sign language. Experiments demonstrate that continuous SLR based on TMMs has good performance over a large vocabulary of 5113 Chinese signs and obtains an average accuracy of 91.9%.
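
The clustering idea, grouping variable-length sequences with dynamic time warping (DTW) as the distance, can be sketched as follows. Averaging sequences of different lengths is not straightforward, so this example substitutes a medoid update (the member with the smallest total DTW distance to its cluster) for the k-means centroid step; that substitution, the one-dimensional toy sequences, and the function names are assumptions, not the paper's algorithm.

```python
def dtw_distance(a, b):
    """Classic dynamic time warping cost between two 1-D sequences."""
    inf = float("inf")
    n, m = len(a), len(b)
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = abs(a[i - 1] - b[j - 1])
            cost[i][j] = step + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


def cluster_sequences(seqs, k, iterations=10):
    """K-medoids-style clustering using DTW; the first k sequences seed the clusters."""
    centers = list(range(k))
    assignment = []
    for _ in range(iterations):
        # Assign every sequence to its nearest medoid under DTW.
        assignment = [
            min(range(k), key=lambda c: dtw_distance(s, seqs[centers[c]]))
            for s in seqs
        ]
        new_centers = []
        for c in range(k):
            members = [i for i, a in enumerate(assignment) if a == c]
            if not members:
                new_centers.append(centers[c])
                continue
            # New medoid: the member closest to all other members in total.
            new_centers.append(min(
                members,
                key=lambda i: sum(dtw_distance(seqs[i], seqs[j]) for j in members),
            ))
        if new_centers == centers:
            break
        centers = new_centers
    return assignment, centers


# Toy trajectories: two rising and two falling, with different lengths.
seqs = [[0, 0, 1, 2, 3], [10, 9, 8, 7], [0, 1, 1, 2, 3, 3], [10, 10, 9, 8, 7]]
assignment, centers = cluster_sequences(seqs, k=2)
```

DTW lets sequences that trace the same shape at different speeds fall into the same cluster, which a fixed-length Euclidean distance could not do here.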


IEEE Transactions on Circuits and Systems for Video Technology | 2014

Image Restoration Using Joint Statistical Modeling in a Space-Transform Domain

Jian Zhang; Debin Zhao; Ruiqin Xiong; Siwei Ma; Wen Gao

This paper presents a novel strategy for high-fidelity image restoration by characterizing both local smoothness and nonlocal self-similarity of natural images in a unified statistical manner. The main contributions are three-fold. First, from the perspective of image statistics, a joint statistical modeling (JSM) in an adaptive hybrid space-transform domain is established, which offers a powerful mechanism of combining local smoothness and nonlocal self-similarity simultaneously to ensure a more reliable and robust estimation. Second, a new form of minimization functional for solving the image inverse problem is formulated using JSM under a regularization-based framework. Finally, in order to make JSM tractable and robust, a new Split Bregman-based algorithm is developed to efficiently solve the above severely underdetermined inverse problem associated with theoretical proof of convergence. Extensive experiments on image inpainting, image deblurring, and mixed Gaussian plus salt-and-pepper noise removal applications verify the effectiveness of the proposed algorithm.


IEEE Transactions on Circuits and Systems for Video Technology | 2008

Wyner–Ziv-Based Multiview Video Coding

Xun Guo; Yan Lu; Feng Wu; Debin Zhao; Wen Gao

Utilizing video correlations among views improves the coding efficiency of multiview video compression, but it usually requires an expensive system to collect videos from different cameras and jointly compress them. Building on recent developments in distributed video coding, this paper proposes a new multiview video coding scheme based on the Wyner-Ziv (WZ) coding technique, in which the complicated temporal and interview correlation exploration process is shifted from the encoder side to the decoder side, so that broadband raw data traffic and highly intensive computation for joint encoding can be avoided. At the encoder side, a wavelet-based WZ scheme is proposed to compress the video of every camera. Furthermore, in order to better utilize correlation in the wavelet domain, all coefficients are organized as in SPIHT, bit plane by bit plane. At the decoder side, a more flexible prediction technique that can jointly utilize temporal and view correlations is proposed to generate side information. Finally, experimental results show that the proposed scheme significantly outperforms conventional intra-frame coding, with better random access, and is even close to inter-frame coding in efficiency. Furthermore, the compressed data is much more robust when it is transmitted over an error-prone channel.

Collaboration

Top Co-Authors

Xiaopeng Fan
Harbin Institute of Technology

Xianming Liu
Harbin Institute of Technology

Feng Jiang
Harbin Institute of Technology

Shaohui Liu
Harbin Institute of Technology