Publication


Featured research published by Baihua Xiao.


Computer Vision and Pattern Recognition | 2012

Sparse representation for face recognition based on discriminative low-rank dictionary learning

Long Ma; Chunheng Wang; Baihua Xiao; Wen Zhou

In this paper, we propose a discriminative low-rank dictionary learning algorithm for sparse representation. Sparse representation seeks the sparsest coefficients to represent a test signal as a linear combination of the bases in an over-complete dictionary. Motivated by low-rank matrix recovery and completion, we assume that data from the same pattern are linearly correlated; if we stack these data points as column vectors of a dictionary, the dictionary should be approximately low-rank. An objective function combining sparse coefficients, class discrimination, and rank minimization is proposed and optimized during dictionary learning. We apply the algorithm to face recognition. Numerous experiments showing improved performance over previous dictionary learning methods validate the effectiveness of the proposed algorithm.
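The abstract does not give the optimization details, but the rank-minimization term in objectives like this one is typically handled with singular value thresholding, the proximal operator of the nuclear norm. A minimal sketch (the data, shapes, and threshold below are illustrative, not from the paper):

```python
import numpy as np

def svt(D, tau):
    """Singular value thresholding: the proximal operator of the nuclear
    norm. Shrinking singular values toward zero drives a matrix toward
    low rank, the property assumed of same-class dictionary columns."""
    U, s, Vt = np.linalg.svd(D, full_matrices=False)
    return U @ np.diag(np.maximum(s - tau, 0.0)) @ Vt

# Stacking linearly correlated samples from one pattern as columns gives
# an (approximately) low-rank matrix; SVT recovers that structure.
rng = np.random.default_rng(0)
basis = rng.standard_normal((20, 2))   # a 2-dimensional pattern
D_noisy = basis @ rng.standard_normal((2, 10)) + 0.01 * rng.standard_normal((20, 10))
D_lowrank = svt(D_noisy, tau=0.5)
```

Here `D_noisy` has full numerical rank, while `D_lowrank` is essentially rank 2, matching the underlying pattern.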


Pattern Recognition Letters | 2013

Scene text detection using graph model built upon maximally stable extremal regions

Cunzhao Shi; Chunheng Wang; Baihua Xiao; Yang Zhang; Song Gao

Scene text detection can be formulated as a bi-label (text and non-text regions) segmentation problem. However, due to the high degree of intraclass variation among scene characters and the limited number of training samples, a single information source or classifier is not enough to segment text from the non-text background. In this paper, we therefore propose a novel scene text detection approach using a graph model built upon Maximally Stable Extremal Regions (MSERs) to incorporate various information sources into one framework. Concretely, after detecting MSERs in the original image, an irregular graph whose nodes are MSERs is constructed to label the MSERs as text or non-text regions. Carefully designed features contribute to the unary potential, which assesses the individual penalties for labeling an MSER node as text or non-text, while color and geometric features define the pairwise potential, which penalizes likely discontinuities. By minimizing the cost function via the graph cut algorithm, the different information carried by the cost function can be optimally balanced to obtain the final MSER labeling. The proposed method is naturally context-relevant and scale-insensitive. Experimental results on the ICDAR 2011 competition dataset show that the proposed approach outperforms state-of-the-art methods in both recall and precision.
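The graph energy described above (unary penalties per MSER plus pairwise discontinuity penalties) can be illustrated on a toy graph. The paper minimizes such an energy exactly with graph cut; the brute-force search below is only for clarity, and all numbers are made up:

```python
import itertools
import numpy as np

# unary[i, l]: cost of giving node i label l (0 = non-text, 1 = text).
unary = np.array([
    [0.9, 0.1],   # node 0: features strongly suggest text
    [0.8, 0.2],   # node 1: likely text
    [0.1, 0.9],   # node 2: strongly non-text
])
# (i, j, w): neighboring MSERs i and j pay w if their labels disagree.
edges = [(0, 1, 0.5), (1, 2, 0.1)]

def energy(labels):
    e = sum(unary[i, l] for i, l in enumerate(labels))
    e += sum(w for i, j, w in edges if labels[i] != labels[j])
    return e

best = min(itertools.product([0, 1], repeat=len(unary)), key=energy)
```

`best` comes out as `(1, 1, 0)`: the strong unary evidence keeps node 2 non-text even though it neighbors a text node.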


Computer Vision and Pattern Recognition | 2013

Scene Text Recognition Using Part-Based Tree-Structured Character Detection

Cunzhao Shi; Chunheng Wang; Baihua Xiao; Yang Zhang; Song Gao; Zhong Zhang

Scene text recognition has attracted great interest from the computer vision community in recent years. In this paper, we propose a novel scene text recognition method using part-based tree-structured character detection. Unlike the conventional multi-scale sliding-window character detection strategy, which does not make use of character-specific structure information, we use a part-based tree structure to model each type of character so as to detect and recognize characters at the same time. For word recognition, we build a Conditional Random Field model on the potential character locations to incorporate detection scores, spatial constraints, and linguistic knowledge into one framework. The final word recognition result is obtained by minimizing the cost function defined on the random field. Experimental results on a range of challenging public datasets (ICDAR 2003, ICDAR 2011, SVT) demonstrate that the proposed method significantly outperforms state-of-the-art methods in both character detection and word recognition.
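Part-based character models of this kind score a detection by summing part appearance responses and penalizing deformation from each part's ideal offset. A simplified star-shaped sketch (a tree model generalizes this root-to-part structure; the scores and deformation weight are invented):

```python
import numpy as np

def score_placement(root_xy, parts):
    """parts: list of (appearance_score, ideal_offset_from_root, actual_xy).
    Sum appearance scores, subtract a quadratic deformation penalty."""
    total = 0.0
    for app, ideal, actual in parts:
        dx = np.asarray(actual) - np.asarray(root_xy) - np.asarray(ideal)
        total += app - 0.1 * float(dx @ dx)   # 0.1 is an illustrative weight
    return total

well_placed = score_placement((0, 0), [(2.0, (0, 5), (0, 5)), (1.5, (3, 0), (3, 0))])
distorted = score_placement((0, 0), [(2.0, (0, 5), (2, 8)), (1.5, (3, 0), (6, 1))])
```

A placement matching the modeled structure (`well_placed`) scores higher than a distorted one, which is what lets detection and recognition happen jointly.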


Computer Vision and Pattern Recognition | 2013

Cross-View Action Recognition via a Continuous Virtual Path

Zhong Zhang; Chunheng Wang; Baihua Xiao; Wen Zhou; Shuang Liu; Cunzhao Shi

In this paper, we propose a novel method for cross-view action recognition via a continuous virtual path that connects the source view and the target view. Each point on this virtual path is a virtual view obtained by a linear transformation of the action descriptor. All the virtual views are concatenated into an infinite-dimensional feature that characterizes continuous changes from the source to the target view. However, these infinite-dimensional features cannot be used directly. We therefore propose a virtual view kernel that computes the similarity between two infinite-dimensional features and can readily be used to construct any kernelized classifier. In addition, many unlabeled samples are available from the target view and can be utilized to improve classifier performance, so we present a constraint strategy to exploit the information they contain. The rationale behind the constraint is that any action video belongs to only one class. Our method is verified on the IXMAS dataset, and the experimental results demonstrate that it achieves better performance than state-of-the-art methods.
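The virtual view kernel accumulates descriptor similarities along the path of interpolated transformations. A discretized sketch (the paper derives a closed form; the linear interpolation from the identity to a target-view transform `T` is the assumption here):

```python
import numpy as np

def virtual_view_kernel(x, y, T, steps=100):
    """Average <A(t)x, A(t)y> over virtual views A(t) on the path from
    the source view (identity) to the target-view transform T."""
    I = np.eye(len(x))
    total = 0.0
    for t in np.linspace(0.0, 1.0, steps):
        A = (1.0 - t) * I + t * T
        total += float((A @ x) @ (A @ y))
    return total / steps

k = virtual_view_kernel(np.array([1.0, 2.0]), np.array([3.0, 4.0]), np.eye(2))
```

With `T` equal to the identity the kernel reduces to the ordinary inner product, and it is symmetric in `x` and `y`, so it can back any kernelized classifier.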


IEEE Signal Processing Letters | 2012

Action Recognition Using Context-Constrained Linear Coding

Zhong Zhang; Chunheng Wang; Baihua Xiao; Wen Zhou; Shuang Liu

Although the traditional bag-of-words model has shown promising results for action recognition, it takes no account of the relationships among spatio-temporal points and suffers from serious quantization error. In this letter, we propose a novel coding strategy called context-constrained linear coding (CLC) to overcome these limitations. We first calculate the contextual distance between local descriptors and each codeword by considering spatio-temporal contextual information. Then, linear coding using the contextual distance is adopted to alleviate the quantization error. Our method is verified on two challenging databases (KTH and UCF Sports), and the experimental results demonstrate that it achieves better results than previous action recognition methods.
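Linear coding against the codewords nearest under a distance is the core of such strategies. The sketch below uses a plain Euclidean distance as a stand-in for the paper's contextual distance, with reconstruction plus sum-to-one constraints in the style of locality-constrained linear coding:

```python
import numpy as np

def linear_code(x, codebook, dist, k=2, lam=1e-4):
    """Code descriptor x over its k closest codewords under `dist`,
    solving the constrained least-squares system used in
    locality-constrained linear coding."""
    idx = np.argsort(dist)[:k]
    B = codebook[idx]                          # k nearest codewords
    C = (B - x) @ (B - x).T + lam * np.eye(k)  # local covariance
    w = np.linalg.solve(C, np.ones(k))
    w /= w.sum()                               # sum-to-one constraint
    code = np.zeros(len(codebook))
    code[idx] = w
    return code

codebook = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
x = np.array([1.0, 0.0])
dist = np.linalg.norm(codebook - x, axis=1)    # stand-in distance
code = linear_code(x, codebook, dist)
```

Because `x` coincides with the first codeword, nearly all of the coding weight lands there; unlike hard vector quantization, the soft weights keep the quantization error small.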


ACM Multimedia | 2012

Deep nonlinear metric learning with independent subspace analysis for face verification

Xinyuan Cai; Chunheng Wang; Baihua Xiao; Xue Chen; Ji Zhou

Face verification is the task of determining, by analyzing face images, whether a person is who he or she claims to be. It is a very challenging problem due to large variations in lighting, background, expression, hairstyle, and occlusion. The crucial problem is computing the similarity of two face vectors, and metric learning provides a viable solution. Many metric learning algorithms have been proposed, but they are usually limited to learning a linear transformation (i.e., finding a global Mahalanobis metric). In this paper, we propose a nonlinear metric learning method that learns an explicit mapping from the original space to an optimal subspace using a deep Independent Subspace Analysis network. Compared to kernel methods, which can also learn nonlinear transformations, our method is a deep and local learning architecture and therefore has a more powerful ability to capture the nature of highly variable datasets. We evaluate our method on the LFW benchmark, and the results show performance very comparable to state-of-the-art methods (92.28% accuracy) while maintaining simplicity and good generalization ability.
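One ISA layer applies linear filters, squares the responses, pools them within fixed subspaces, and takes a square root; stacking such layers yields the deep nonlinear mapping. A sketch with invented filter shapes:

```python
import numpy as np

def isa_layer(x, W, subspace_size=2):
    """Independent Subspace Analysis layer: filter responses are squared
    and pooled within subspaces of `subspace_size` filters each, giving
    features invariant to sign (and, with learned W, to local changes)."""
    u = (W @ x) ** 2
    return np.sqrt(u.reshape(-1, subspace_size).sum(axis=1))

rng = np.random.default_rng(0)
W = rng.standard_normal((4, 8))   # 4 filters -> 2 subspace features
x = rng.standard_normal(8)
feat = isa_layer(x, W)
```

Note that `isa_layer(x, W)` equals `isa_layer(-x, W)`: the square-and-pool stage is what gives the representation its robustness to nuisance variation.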


IEEE Geoscience and Remote Sensing Letters | 2015

Automatic Cloud Detection for All-Sky Images Using Superpixel Segmentation

Shuang Liu; Linbo Zhang; Zhong Zhang; Chunheng Wang; Baihua Xiao

Cloud detection plays an essential role in meteorological research and has received considerable attention in recent years. However, this problem is particularly challenging due to the diverse characteristics of clouds. In this letter, a novel algorithm based on superpixel segmentation (SPS) is proposed for cloud detection. In our proposed strategy, a series of superpixels is obtained adaptively by the SPS algorithm according to the characteristics of the clouds. We first calculate a local threshold for each superpixel and then determine a threshold matrix for the whole image. Finally, clouds are detected by comparing the image with the obtained threshold matrix. Experimental results show that our proposed algorithm achieves better performance than current cloud detection algorithms.
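The threshold-matrix construction is easy to sketch: compute a local threshold per superpixel, tile it into a matrix, and compare. The per-superpixel rule below (region mean plus an offset) is a stand-in for the paper's rule, and the superpixel labels are assumed given:

```python
import numpy as np

def cloud_mask(image, superpixels, offset=0.0):
    """Build a threshold matrix from per-superpixel local thresholds and
    mark pixels brighter than their local threshold as cloud."""
    thresh = np.zeros_like(image, dtype=float)
    for label in np.unique(superpixels):
        region = superpixels == label
        thresh[region] = image[region].mean() + offset
    return image > thresh

image = np.array([[0.2, 0.8, 0.1, 0.9],
                  [0.2, 0.8, 0.1, 0.9]])
superpixels = np.array([[0, 0, 1, 1],
                        [0, 0, 1, 1]])
mask = cloud_mask(image, superpixels)
```

Each region is judged against its own statistics, which is what lets the method adapt to locally varying sky brightness.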


IEEE Transactions on Information Forensics and Security | 2013

Attribute Regularization Based Human Action Recognition

Zhong Zhang; Chunheng Wang; Baihua Xiao; Wen Zhou; Shuang Liu

Recently, attributes have been introduced as a form of high-level semantic information to help improve classification accuracy. Multitask learning, which shares low-level features between attributes and actions, is an effective methodology for achieving this goal. Yet such methods neglect the constraints that attributes impose on classes and may therefore fail to capture the semantic relationship between attributes and actions. In this paper, we explicitly consider this attribute-action relationship for human action recognition and, correspondingly, modify the multitask learning model by adding attribute regularization. In this way, the learned model not only shares low-level features but is also regularized according to the semantic constraints. In addition, since attribute and class labels carry different amounts of semantic information, we treat attribute classifiers and action classifiers separately within the multitask learning framework for further performance improvement. Our method is verified on three challenging datasets (KTH, UIUC, and Olympic Sports), and the experimental results demonstrate that it achieves better results than previous methods on human action recognition.
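The attribute regularizer can be pictured as a penalty pulling each action classifier toward the attribute classifiers weighted by that class's attribute signature. An illustrative squared-loss version (all shapes, the signature matrix `S`, and the weight `lam` are assumptions, not the paper's exact formulation):

```python
import numpy as np

def attribute_regularized_loss(W_act, W_attr, S, X, y_act, y_attr, lam=0.1):
    """Multitask loss: action task + attribute tasks + a regularizer
    tying action weights to attribute weights via the class-attribute
    signature matrix S (shape: classes x attributes)."""
    act_loss = np.mean((X @ W_act - y_act) ** 2)
    attr_loss = np.mean((X @ W_attr - y_attr) ** 2)
    reg = np.sum((W_act - W_attr @ S.T) ** 2)   # semantic constraint
    return act_loss + attr_loss + lam * reg

rng = np.random.default_rng(1)
X = rng.standard_normal((5, 3))
S = np.eye(2)                        # toy signature matrix
W_attr = rng.standard_normal((3, 2))
W_act = W_attr @ S.T                 # consistent with the attributes
loss = attribute_regularized_loss(W_act, W_attr, S, X, X @ W_act, X @ W_attr)
```

With perfectly fitted, attribute-consistent weights the loss is zero; an action classifier that drifts from its attribute signature pays the `lam`-weighted penalty.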


International Conference on Machine Vision | 2007

A robust system for text extraction in video

Jingchao Zhou; Lei Xu; Baihua Xiao; Ruwei Dai; Si si

This paper presents a novel system for extracting caption text in video. First, text regions are detected with emphasis on the recall rate, and a multiple-stage verification scheme is then adopted to discard false alarms and boost the precision rate. Second, a text polarity estimation algorithm is provided; based on it, multiple-frame enhancement is conducted to strengthen the contrast between text and its background. Finally, a connected component filtering method is proposed to generate clean segmentation results and improve recognition performance. Experimental results confirm that the proposed system is robust and efficient.
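The multiple-frame enhancement step exploits the fact that caption text is static across frames while the background moves. A minimal sketch, where min/max pooling over frames stands in for the paper's enhancement and polarity handling:

```python
import numpy as np

def enhance(frames, bright_text=True):
    """For bright text on a darker, moving background, the per-pixel
    minimum over frames keeps only consistently bright (text) pixels;
    for dark text, the maximum plays the symmetric role."""
    stack = np.stack(frames).astype(float)
    return stack.min(axis=0) if bright_text else stack.max(axis=0)

frame_a = np.array([[1.0, 0.0]])   # a text pixel, then a flickering background pixel
frame_b = np.array([[1.0, 1.0]])
enhanced = enhance([frame_a, frame_b])
```

The static text pixel survives at full intensity while the flickering background pixel is suppressed, raising the text/background contrast before segmentation.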


IEEE Transactions on Circuits and Systems for Video Technology | 2014

Scene Text Recognition Using Structure-Guided Character Detection and Linguistic Knowledge

Cunzhao Shi; Chunheng Wang; Baihua Xiao; Song Gao; Jinlong Hu

Scene text recognition has attracted great interest from the computer vision community in recent years. In this paper, we propose a novel scene text recognition method integrating structure-guided character detection and linguistic knowledge. We use a part-based tree structure to model each category of characters so as to detect and recognize characters simultaneously. Since the character models make use of both local appearance and global structure information, the detection results are more reliable. For word recognition, we combine the detection scores and a language model into the posterior probability of the character sequence from the Bayesian decision view. The final word recognition result is obtained by maximizing the character sequence posterior probability via the Viterbi algorithm. Experimental results on a range of challenging public datasets (ICDAR 2003, ICDAR 2011, SVT) demonstrate that the proposed method achieves state-of-the-art performance in both character detection and word recognition.
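Maximizing the character-sequence posterior over detection scores and a bigram language model is a standard Viterbi decode. A self-contained sketch (all log-scores below are invented):

```python
import numpy as np

def viterbi(det_logp, trans_logp):
    """det_logp: T x V per-slot character detection log-scores;
    trans_logp: V x V bigram log-probabilities. Returns the character
    sequence maximizing their sum (the log-posterior up to a constant)."""
    T, V = det_logp.shape
    dp = det_logp[0].copy()
    back = np.zeros((T, V), dtype=int)
    for t in range(1, T):
        scores = dp[:, None] + trans_logp + det_logp[t][None, :]
        back[t] = scores.argmax(axis=0)
        dp = scores.max(axis=0)
    path = [int(dp.argmax())]
    for t in range(T - 1, 0, -1):
        path.append(int(back[t][path[-1]]))
    return path[::-1]

det_logp = np.array([[0.0, -1.0], [-0.1, -0.2]])
trans_logp = np.array([[-2.0, -0.1], [-0.1, -2.0]])
best = viterbi(det_logp, trans_logp)
```

In this toy case the bigram term overrides the slightly better greedy per-slot choice, which is exactly how linguistic knowledge corrects ambiguous detections.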

Collaboration


Dive into Baihua Xiao's collaborations.

Top Co-Authors

Chunheng Wang (Chinese Academy of Sciences)
Cunzhao Shi (Chinese Academy of Sciences)
Zhong Zhang (Chinese Academy of Sciences)
Shuang Liu (Chinese Academy of Sciences)
Wen Zhou (Chinese Academy of Sciences)
Yunxue Shao (Chinese Academy of Sciences)
Song Gao (Chinese Academy of Sciences)
Xinyuan Cai (Chinese Academy of Sciences)
Chengzuo Qi (Chinese Academy of Sciences)
Xue Chen (Chinese Academy of Sciences)