Network


Latest external collaborations at the country level. Dive into the details by clicking on the dots.

Hotspot


Dive into the research topics where Cunzhao Shi is active.

Publication


Featured research published by Cunzhao Shi.


Pattern Recognition Letters | 2013

Scene text detection using graph model built upon maximally stable extremal regions

Cunzhao Shi; Chunheng Wang; Baihua Xiao; Yang Zhang; Song Gao

Scene text detection can be formulated as a bi-label (text and non-text regions) segmentation problem. However, due to the high degree of intra-class variation of scene characters and the limited number of training samples, a single information source or classifier is not sufficient to segment text from the non-text background. In this paper, we therefore propose a novel scene text detection approach that uses a graph model built upon Maximally Stable Extremal Regions (MSERs) to incorporate multiple information sources into one framework. Concretely, after detecting MSERs in the original image, an irregular graph whose nodes are MSERs is constructed to label each MSER as a text or non-text region. Carefully designed features contribute to the unary potential, which assesses the individual penalty for labeling an MSER node as text or non-text, while color and geometric features define the pairwise potential, which penalizes likely discontinuities. By minimizing the cost function via the graph cut algorithm, the different information sources carried by the cost function are optimally balanced to obtain the final MSER labeling. The proposed method is naturally context-relevant and scale-insensitive. Experimental results on the ICDAR 2011 competition dataset show that the proposed approach outperforms state-of-the-art methods in both recall and precision.
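The unary-plus-pairwise energy described above can be sketched on a toy graph. The paper minimizes it with graph cut; for a handful of nodes the same energy can be minimized exhaustively, which is what this stand-in does. All nodes, costs, and edge weights below are invented for illustration:

```python
from itertools import product

# unary[i][label] = penalty for assigning `label` (0 = non-text, 1 = text)
# to MSER node i, as produced by the carefully designed features.
unary = [
    [2.0, 0.5],   # node 0: features say it looks like text
    [1.8, 0.6],   # node 1: also text-like
    [0.4, 2.2],   # node 2: background-like
]
# pairwise weight ~ color/geometry similarity between neighboring MSERs;
# paying it when neighbors take different labels penalizes discontinuities.
edges = {(0, 1): 1.5, (1, 2): 0.2}

def energy(labels):
    e = sum(unary[i][l] for i, l in enumerate(labels))
    e += sum(w for (i, j), w in edges.items() if labels[i] != labels[j])
    return e

# Exhaustive minimization stands in for the graph cut solver on this toy size.
best = min(product([0, 1], repeat=len(unary)), key=energy)
print(best)  # (1, 1, 0): nodes 0 and 1 labeled text, node 2 non-text
```

On real images the node count makes enumeration infeasible, which is why the paper relies on the polynomial-time s-t min-cut equivalence for this binary labeling energy.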


Computer Vision and Pattern Recognition (CVPR) | 2013

Scene Text Recognition Using Part-Based Tree-Structured Character Detection

Cunzhao Shi; Chunheng Wang; Baihua Xiao; Yang Zhang; Song Gao; Zhong Zhang

Scene text recognition has attracted great interest from the computer vision community in recent years. In this paper, we propose a novel scene text recognition method using part-based tree-structured character detection. Unlike the conventional multi-scale sliding-window character detection strategy, which does not exploit character-specific structure information, we use a part-based tree structure to model each type of character so as to detect and recognize characters at the same time. For word recognition, we build a Conditional Random Field model on the potential character locations to incorporate the detection scores, spatial constraints, and linguistic knowledge into one framework. The final word recognition result is obtained by minimizing the cost function defined on the random field. Experimental results on a range of challenging public datasets (ICDAR 2003, ICDAR 2011, SVT) demonstrate that the proposed method significantly outperforms state-of-the-art methods in both character detection and word recognition.


Computer Vision and Pattern Recognition (CVPR) | 2013

Cross-View Action Recognition via a Continuous Virtual Path

Zhong Zhang; Chunheng Wang; Baihua Xiao; Wen Zhou; Shuang Liu; Cunzhao Shi

In this paper, we propose a novel method for cross-view action recognition via a continuous virtual path that connects the source view and the target view. Each point on this virtual path is a virtual view, obtained by a linear transformation of the action descriptor. All the virtual views are concatenated into an infinite-dimensional feature that characterizes the continuous change from the source view to the target view. However, these infinite-dimensional features cannot be used directly. We therefore propose a virtual view kernel to compute the similarity between two infinite-dimensional features, which can readily be used to construct any kernelized classifier. In addition, many unlabeled samples from the target view are available and can be exploited to improve classifier performance, so we present a constraint strategy to mine the information contained in the unlabeled samples. The rationale behind the constraint is that any action video belongs to only one class. Our method is verified on the IXMAS dataset, and the experimental results demonstrate that it achieves better performance than state-of-the-art methods.
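The kernel idea above can be sketched under illustrative assumptions: take the source and target views as linear maps V0 and V1, a virtual view at position t on the path as V_t = (1 - t)V0 + tV1, and the kernel between descriptors x and y as the integral over t of the inner product of V_t x and V_t y. The matrix sizes and data here are toy values; the closed form follows from integrating V_t^T V_t term by term:

```python
import numpy as np

rng = np.random.default_rng(0)
d = 5
V0 = rng.standard_normal((4, d))   # source-view transform (toy sizes)
V1 = rng.standard_normal((4, d))   # target-view transform
x = rng.standard_normal(d)
y = rng.standard_normal(d)

# Integrating V_t^T V_t over t in [0, 1] gives
#   M = (V0^T V0 + V1^T V1) / 3 + (V0^T V1 + V1^T V0) / 6,
# since the integrals of (1-t)^2, t^2, and t(1-t) are 1/3, 1/3, and 1/6.
M = (V0.T @ V0 + V1.T @ V1) / 3 + (V0.T @ V1 + V1.T @ V0) / 6
k_closed = x @ M @ y

# Sanity check: the integrand is quadratic in t, so 3-point Simpson is exact.
f = lambda t: (((1 - t) * V0 + t * V1) @ x) @ (((1 - t) * V0 + t * V1) @ y)
k_simpson = (f(0.0) + 4 * f(0.5) + f(1.0)) / 6
```

Because the kernel reduces to a finite quadratic form x^T M y, it can be plugged into any kernelized classifier even though the concatenated feature is infinite-dimensional.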


IEEE Transactions on Circuits and Systems for Video Technology | 2014

Scene Text Recognition Using Structure-Guided Character Detection and Linguistic Knowledge

Cunzhao Shi; Chunheng Wang; Baihua Xiao; Song Gao; Jinlong Hu

Scene text recognition has attracted great interest from the computer vision community in recent years. In this paper, we propose a novel scene text recognition method integrating structure-guided character detection and linguistic knowledge. We use a part-based tree structure to model each category of characters so as to detect and recognize characters simultaneously. Since the character models exploit both local appearance and global structure information, the detection results are more reliable. For word recognition, we combine the detection scores and a language model into the posterior probability of the character sequence from a Bayesian decision view. The final word recognition result is obtained by maximizing this posterior probability via the Viterbi algorithm. Experimental results on a range of challenging public datasets (ICDAR 2003, ICDAR 2011, SVT) demonstrate that the proposed method achieves state-of-the-art performance in both character detection and word recognition.
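The decoding step can be sketched as a standard Viterbi pass over candidate characters: detection log-scores act as emissions and a bigram language model supplies transition scores. The candidate sets, scores, and bigram table below are invented for illustration:

```python
# det[i][c] = log detection score of character c at word position i (toy values)
det = [
    {"c": -0.2, "e": -1.6},
    {"a": -0.3, "o": -1.2},
    {"t": -0.1, "r": -2.0},
]
# toy bigram language model: plausible pairs are free, others are penalized
bigram = lambda a, b: 0.0 if (a, b) in {("c", "a"), ("a", "t"), ("o", "r")} else -1.0

def viterbi(det, bigram):
    # Dynamic program: best log score of any path ending in each character.
    prev = dict(det[0])
    back = []
    for scores in det[1:]:
        cur, ptr = {}, {}
        for c, s in scores.items():
            best_prev = max(prev, key=lambda p: prev[p] + bigram(p, c))
            cur[c] = prev[best_prev] + bigram(best_prev, c) + s
            ptr[c] = best_prev
        back.append(ptr)
        prev = cur
    # Trace back the maximum-posterior character sequence.
    last = max(prev, key=prev.get)
    word = [last]
    for ptr in reversed(back):
        word.append(ptr[word[-1]])
    return "".join(reversed(word))

print(viterbi(det, bigram))  # "cat"
```

Here the language model overrides the weaker detections so the jointly most probable sequence is chosen, which is the point of combining detection scores and linguistic knowledge in one posterior.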


Pattern Recognition | 2014

End-to-end scene text recognition using tree-structured models

Cunzhao Shi; Chunheng Wang; Baihua Xiao; Song Gao; Jinlong Hu

Detecting and recognizing text in natural images is quite challenging and has received much attention from the computer vision community in recent years. In this paper, we propose a robust end-to-end scene text recognition method that utilizes tree-structured character models and normalized pictorial structure word models. For each category of characters, we build a part-based tree-structured model (TSM) that exploits character-specific structure information as well as local appearance information. The TSM detects each part of the character and recognizes its unique structure, seamlessly combining character detection and recognition. Because TSMs can accurately detect characters against complex backgrounds, for text localization we apply them to all characters in the coarse text detection regions to eliminate false positives and to search for possibly missing characters. For word recognition, we propose a normalized pictorial structure (PS) framework to deal with the bias caused by words of different lengths. Experimental results on a range of challenging public datasets (ICDAR 2003, ICDAR 2011, SVT) demonstrate that the proposed method outperforms state-of-the-art methods in both text localization and word recognition.
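The length bias that normalization addresses can be sketched in a few lines: a raw pictorial-structure cost is a sum of per-character (unary) and per-adjacent-pair (pairwise) terms, so it grows with word length and unfairly favors short lexicon words. Dividing by the number of terms is one simple normalization; the exact scheme in the paper may differ, and the costs below are invented:

```python
# Raw PS cost = sum of unary terms (one per character) plus pairwise terms
# (one per adjacent character pair); normalizing by the term count makes
# words of different lengths comparable.
def normalized_cost(unary_costs, pairwise_costs):
    total = sum(unary_costs) + sum(pairwise_costs)
    return total / (len(unary_costs) + len(pairwise_costs))

short = normalized_cost([0.4, 0.5], [0.3])                      # 2-letter word
long_ = normalized_cost([0.4, 0.5, 0.4, 0.5], [0.3, 0.2, 0.3])  # 4-letter word
```

With raw sums the 4-letter word (2.6) could never beat the 2-letter word (1.2) even with uniformly better per-term fits; after normalization the longer word's lower average cost wins.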


Document Analysis Systems (DAS) | 2012

Adaptive Graph Cut Based Binarization of Video Text Images

Cunzhao Shi; Baihua Xiao; Chunheng Wang; Yang Zhang

Interactive image segmentation, which requires the user to provide certain hard constraints, has shown promising performance for object segmentation. In this paper, we treat the characters in a text image as a special kind of object and propose an adaptive graph-cut-based text binarization method to segment text from the background. The main contributions of the paper are: 1) to make the binarization locally adaptive to uneven backgrounds, the text region image is first roughly split into several sub-images, on each of which graph cut is applied; and 2) exploiting the unique characteristics of text, we automatically classify some pixels as text or background with high confidence, and these serve as hard-constraint seeds from which graph cut extracts text from the background by spreading the seeds through the whole sub-image. The experimental results show that our approach achieves better performance in both character extraction accuracy and recognition accuracy.
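The split-and-seed idea can be sketched as follows. The margin-based thresholds and the synthetic image are our assumptions (the paper derives its seeds from text-specific cues), and the graph cut step that labels the undecided pixels is omitted:

```python
import numpy as np

def seed_labels(img, n_splits=2, margin=0.25):
    """Split the text image column-wise into sub-images; inside each, mark
    confidently dark pixels as text seeds (1), confidently bright pixels as
    background seeds (0), and leave the rest undecided (-1) for graph cut."""
    h, w = img.shape
    labels = np.full((h, w), -1)
    for block in np.array_split(np.arange(w), n_splits):
        sub = img[:, block]
        span = sub.max() - sub.min()
        lo = sub.min() + margin * span   # below this: high-confidence text
        hi = sub.max() - margin * span   # above this: high-confidence background
        labels[:, block] = np.where(sub <= lo, 1, np.where(sub >= hi, 0, -1))
    return labels

# Synthetic uneven background: a left-to-right brightness gradient with two
# dark "strokes"; the right stroke (140) is brighter than parts of the left
# background, so a single global threshold would misclassify it.
img = np.tile(np.linspace(120, 220, 8), (4, 1))
img[1:3, [1, 5]] = [40, 140]
labels = seed_labels(img)
```

Because the thresholds are recomputed per sub-image, the 140-valued stroke is still seeded as text in its own block, which is precisely the local adaptivity the splitting buys.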


IEEE Transactions on Image Processing | 2015

Stroke Detector and Structure Based Models for Character Recognition: A Comparative Study

Cunzhao Shi; Song Gao; Meng-Tao Liu; Chengzuo Qi; Chunheng Wang; Baihua Xiao

Characters, man-made symbols composed of strokes arranged in a certain structure, provide semantic information and play an indispensable role in our daily life. In this paper, we try to exploit the intrinsic characteristics of characters and explore stroke- and structure-based methods for character recognition. First, we introduce two existing part-based models that recognize characters by detecting their elastic stroke-like parts. To utilize strokes of various scales, we propose learning a discriminative multi-scale stroke detector-based representation (DMSDR) for characters. However, both the part-based models and DMSDR require manually labeled parts or key points for training. To learn discriminative stroke detectors automatically, we further propose a discriminative spatiality embedded dictionary learning-based representation (DSEDR) for character recognition. We make a comparative study of the tree-structured model (TSM), mixtures-of-parts TSM, DMSDR, and DSEDR for character recognition on three challenging scene character recognition (SCR) datasets as well as two handwritten digit recognition datasets. A series of experiments is conducted on these datasets with various experimental setups. The results demonstrate the suitability of stroke detector-based models for recognizing characters with deformations and distortions, especially when training samples are limited.


International Conference on Document Analysis and Recognition (ICDAR) | 2013

Adaptive Scene Text Detection Based on Transferring Adaboost

Song Gao; Chunheng Wang; Baihua Xiao; Cunzhao Shi; Yonghui Zhang; Zhijian Lv; Yanqin Shi

Detecting text in scene images is very challenging due to complex backgrounds, various fonts, and different illumination conditions. Without prior knowledge, a detector trained on a large number of samples may still perform badly on a test image because of the disparity between the distributions of the training and testing samples. In this paper, we propose adapting a pre-trained generic scene text detector to new scenes via transfer learning. In particular, we choose cascade Adaboost as the detector and re-weight the pre-selected features according to their ability to classify high-confidence samples. The proposed adaptation mechanism has been evaluated on the ICDAR 2011 scene text detection competition dataset, and the encouraging experimental results are comparable with the latest published algorithms.
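The re-weighting step can be sketched as follows: keep the weak classifiers (features) already selected by the generic detector, but recompute each one's voting weight from its error on high-confidence samples collected from the new scene. The weak classifiers, target samples, and use of the standard Adaboost weight formula are all illustrative assumptions:

```python
import math

# Two toy weak classifiers acting on 2-D feature vectors, standing in for the
# pre-selected Haar-like features of the generic cascade.
weak_clfs = [
    lambda x: 1 if x[0] > 0.5 else -1,
    lambda x: 1 if x[1] > 0.3 else -1,
]

# High-confidence (pseudo-labeled) samples from the target scene.
target = [((0.9, 0.1), 1), ((0.8, 0.7), 1), ((0.2, 0.6), -1), ((0.1, 0.1), -1)]

def adapt_weights(clfs, samples, eps=1e-6):
    """Recompute each weak classifier's vote from its target-scene error."""
    alphas = []
    for h in clfs:
        err = sum(h(x) != y for x, y in samples) / len(samples)
        err = min(max(err, eps), 1 - eps)               # keep the log finite
        alphas.append(0.5 * math.log((1 - err) / err))  # standard Adaboost weight
    return alphas

alphas = adapt_weights(weak_clfs, target)
```

The first classifier separates the target samples perfectly and gets a large weight, while the second is at chance level on the new scene and its vote drops to zero: features that transfer well dominate the adapted ensemble.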


IEEE Geoscience and Remote Sensing Letters | 2017

Deep Convolutional Activations-Based Features for Ground-Based Cloud Classification

Cunzhao Shi; Chunheng Wang; Yu Wang; Baihua Xiao

Ground-based cloud classification is crucial for meteorological research and has received great attention in recent years. However, it is very challenging due to the extreme appearance variations under different atmospheric conditions. Although convolutional neural networks have achieved remarkable performance in image classification, their suitability for cloud classification had not been evaluated. In this letter, we propose using deep convolutional activations-based features (DCAFs) for ground-based cloud classification. Considering the unique characteristics of clouds, we believe the rich local texture information may be more important than the global layout information, and we therefore give a comprehensive evaluation of features based on both shallow and deep convolutional layers. Experimental results on two challenging public datasets demonstrate that although the realization of DCAFs is quite straightforward, without any use-dependent tricks, they outperform conventional hand-crafted features considerably.
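Turning convolutional activations into a fixed-length descriptor can be sketched as below. A real pipeline would take the activation tensor from a chosen layer of a pretrained CNN; here a random tensor stands in so the pooling step itself is runnable, and the pooling choice (per-channel spatial averaging plus L2 normalization) is one common, assumed realization:

```python
import numpy as np

rng = np.random.default_rng(0)
act = rng.random((256, 13, 13))   # stand-in for (channels, height, width) activations

def dcaf(act):
    """Collapse each channel's response map over all spatial positions, then
    L2-normalize: one value per filter, i.e. a texture-like summary that
    discards the global layout, matching the intuition in the abstract."""
    v = act.reshape(act.shape[0], -1).mean(axis=1)
    return v / np.linalg.norm(v)

feat = dcaf(act)   # 256-D descriptor, ready for an off-the-shelf classifier
```

Pooling over spatial positions makes the descriptor insensitive to where a texture occurs in the sky image, which suits clouds better than layout-preserving features.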


International Conference on Image Processing (ICIP) | 2014

Learning co-occurrence strokes for scene character recognition based on spatiality embedded dictionary

Song Gao; Chunheng Wang; Baihua Xiao; Cunzhao Shi; Wen Zhou; Zhong Zhang

A robust scene text extraction system can be used in many areas. In this work, we propose learning the co-occurrence of local strokes for robust character recognition using a spatiality embedded dictionary (SED). Unlike spatial pyramids, which partition images into grids to incorporate spatial information, our SED associates every codeword with a particular response region and thus introduces more precise spatial information for character recognition. After localized soft coding and max pooling in the first layer, a sparse dictionary is learned to model the co-occurrence of several local strokes, which further improves classification performance. Experiments on benchmark datasets demonstrate the effectiveness of our method, and the results outperform state-of-the-art algorithms.
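The first-layer coding step can be sketched as follows. The dictionary, descriptors, response-region assignment, and the soft-coding temperature are all invented for illustration; the region-restricted pooling emulates how SED ties each codeword to its own spatial response region instead of pyramid grid cells:

```python
import numpy as np

rng = np.random.default_rng(1)
D = rng.random((4, 8))             # dictionary: 4 codewords of dimension 8
X = rng.random((10, 8))            # 10 local stroke descriptors from one character
region = rng.integers(0, 4, 10)    # which codeword's response region each falls in

def localized_soft_code(x, D, beta=10.0):
    """Soft-assign descriptor x to codewords: nearer codewords get larger,
    exponentially decaying weights, normalized to sum to one."""
    dist = ((D - x) ** 2).sum(axis=1)
    code = np.exp(-beta * dist)
    return code / code.sum()

codes = np.array([localized_soft_code(x, D) for x in X])
# Spatiality-embedded max pooling: codeword k pools only over descriptors
# lying inside its own response region (0 if the region caught nothing).
pooled = np.array([codes[region == k, k].max(initial=0.0) for k in range(4)])
```

The pooled vector would then feed the second-layer sparse dictionary that models stroke co-occurrence; restricting each codeword's pooling to its own region is what gives the finer spatial information the abstract contrasts with spatial pyramids.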

Collaboration


Dive into Cunzhao Shi's collaboration network.

Top Co-Authors

Chunheng Wang (Chinese Academy of Sciences)
Baihua Xiao (Chinese Academy of Sciences)
Song Gao (Chinese Academy of Sciences)
Chengzuo Qi (Chinese Academy of Sciences)
Yang Zhang (Chinese Academy of Sciences)
Jian Xu (Chinese Academy of Sciences)
Yanna Wang (Chinese Academy of Sciences)
Zhong Zhang (Chinese Academy of Sciences)
Yu Wang (Chinese Academy of Sciences)
Wen Zhou (Chinese Academy of Sciences)