Cao Shi | Researchain

Publication

Featured research published by Cao Shi.

document recognition and retrieval | 2013

Graphic composite segmentation for PDF documents with complex layouts

Canhui Xu; Zhi Tang; Xin Tao; Cao Shi

Converting PDF books to a re-flowable format has recently attracted considerable interest in the area of e-book reading. Robust graphic segmentation is highly desirable for making PDF converters more practical. To cope with varied layouts, a multi-layer concept is introduced to segment graphic composites, including photographic images and drawings with text insets or surrounded by text elements. Both image-based analysis and the inherent advantages of born-digital documents are exploited in this multi-layer layout analysis method. By combining clustering of low-level page elements in the PDF document with connected component analysis on a synthetically generated PNG image of the page, graphic composites can be segmented in PDF documents with complex layouts. Experimental results on graphic composite segmentation of PDF document pages show satisfactory performance.
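The connected component analysis mentioned in this abstract can be illustrated with a minimal sketch. It assumes the rendered page has already been binarized into a grid of 0s and 1s; the function name `connected_components`, the 4-connectivity choice, and the sample `page` grid are illustrative assumptions, not details taken from the paper.

```python
from collections import deque

def connected_components(grid):
    """Find 4-connected regions of 1s in a binary grid.

    Returns one bounding box (top, left, bottom, right) per region,
    in row-major order of each region's first pixel.
    """
    rows = len(grid)
    cols = len(grid[0]) if grid else 0
    seen = [[False] * cols for _ in range(rows)]
    boxes = []
    for r in range(rows):
        for c in range(cols):
            if grid[r][c] != 1 or seen[r][c]:
                continue
            # Breadth-first flood fill from this unvisited foreground pixel.
            queue = deque([(r, c)])
            seen[r][c] = True
            top, left, bottom, right = r, c, r, c
            while queue:
                y, x = queue.popleft()
                top, bottom = min(top, y), max(bottom, y)
                left, right = min(left, x), max(right, x)
                for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    ny, nx = y + dy, x + dx
                    if (0 <= ny < rows and 0 <= nx < cols
                            and grid[ny][nx] == 1 and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            boxes.append((top, left, bottom, right))
    return boxes

# Hypothetical binarized page fragment with three separate blobs.
page = [
    [1, 1, 0, 0, 0],
    [1, 1, 0, 0, 1],
    [0, 0, 0, 0, 1],
    [0, 1, 0, 0, 0],
]
print(connected_components(page))
# prints [(0, 0, 1, 1), (1, 4, 2, 4), (3, 1, 3, 1)]
```

Each bounding box is a candidate graphic region that a later step could merge with nearby text elements; that merging step is not shown here.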

document recognition and retrieval | 2013

Character feature integration of Chinese calligraphy and font

Cao Shi; Jianguo Xiao; Wenhua Jia; Canhui Xu

A framework is proposed in this paper to generate a new hybrid character type by integrating the local contour features of Chinese calligraphy with the structural features of computer fonts. To capture the traditional artistic expression of calligraphy, a multi-directional spatial filter is applied for local contour feature extraction. The contour of each character image is then divided into sub-images. The sub-images at the same position across various characters are modeled by a Gaussian distribution. According to this probability distribution, dilation and erosion operators are designed to adjust the boundary of the font image. New Chinese character images are thereby generated that possess both the contour features of artistic calligraphy and the elaborate structural features of the font. Experimental results demonstrate that the new characters are visually acceptable, and that the proposed framework is an effective and efficient strategy for automatically generating hybrid characters of calligraphy and font.

Proceedings of SPIE | 2013

Graph-based layout analysis for PDF documents

Canhui Xu; Zhi Tang; Xin Tao; Yun Li; Cao Shi

To increase the flexibility and enrich the reading experience of e-books on small portable screens, a graph-based method is proposed to perform layout analysis on Portable Document Format (PDF) documents. Born-digital documents have inherent advantages, such as representing text and fractional images in explicit form, which can be straightforwardly exploited. To integrate traditional image-based document analysis with the inherent metadata provided by a PDF parser, the page primitives, including text, image and path elements, are processed to produce text and non-text layers for separate analysis. The graph-based method is developed at the superpixel representation level, and page text elements corresponding to vertices are used to construct an undirected graph. The Euclidean distance between adjacent vertices is applied in a top-down manner to cut the spanning tree formed by Kruskal’s algorithm. Edge orientation is then used in a bottom-up manner to extract text lines from each sub-tree. Non-textual objects, on the other hand, are segmented by connected component analysis. For each segmented text and non-text composite, a 13-dimensional feature vector is extracted for labelling purposes. Experimental results on selected pages from PDF books are presented.
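The top-down cutting step can be sketched as follows: build a minimum spanning tree over text-element positions with Kruskal's algorithm, then remove tree edges longer than a distance threshold so that each remaining sub-tree forms one group. This is only an illustration under stated assumptions: it uses a complete graph over hypothetical centroid coordinates rather than the paper's adjacency graph, the names `kruskal_mst`, `cut_clusters` and `max_edge` are invented here, and the bottom-up edge-orientation step is not shown.

```python
import math

def kruskal_mst(points):
    """Return the minimum spanning tree edges (weight, i, j) of 2-D points."""
    n = len(points)
    edges = sorted(
        (math.dist(points[i], points[j]), i, j)
        for i in range(n) for j in range(i + 1, n)
    )
    parent = list(range(n))

    def find(a):
        # Union-find lookup with path halving.
        while parent[a] != a:
            parent[a] = parent[parent[a]]
            a = parent[a]
        return a

    mst = []
    for weight, i, j in edges:
        root_i, root_j = find(i), find(j)
        if root_i != root_j:
            parent[root_i] = root_j
            mst.append((weight, i, j))
    return mst

def cut_clusters(points, max_edge):
    """Drop MST edges longer than max_edge and return groups of point indices."""
    n = len(points)
    adjacency = {i: [] for i in range(n)}
    for weight, i, j in kruskal_mst(points):
        if weight <= max_edge:
            adjacency[i].append(j)
            adjacency[j].append(i)
    seen, clusters = set(), []
    for start in range(n):
        if start in seen:
            continue
        stack, group = [start], []
        seen.add(start)
        while stack:
            v = stack.pop()
            group.append(v)
            for u in adjacency[v]:
                if u not in seen:
                    seen.add(u)
                    stack.append(u)
        clusters.append(sorted(group))
    return clusters

# Hypothetical centroids of text elements on two well-separated lines.
centroids = [(0, 0), (10, 0), (20, 0), (0, 50), (10, 50)]
print(cut_clusters(centroids, max_edge=15))
# prints [[0, 1, 2], [3, 4]]
```

The threshold here is a fixed number chosen for the toy data; the abstract does not say how the cut distance is selected.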

Archive | 2012

Automatic Generation of Chinese Character Based on Human Vision and Prior Knowledge of Calligraphy

Cao Shi; Jianguo Xiao; Wenhua Jia; Canhui Xu

Prior knowledge of Chinese calligraphy is modeled in this paper, and the hierarchical relationship of strokes and radicals is represented by a novel five-layer framework. A calligraphist’s unique calligraphy skill is analyzed, and his particular strokes, radicals and layout patterns provide the raw elements for the proposed five layers. Criteria of visual aesthetics based on Marr’s vision assumption are built for the proposed algorithm for automatic generation of Chinese characters. Bayesian statistics is introduced to characterize the character generation process as a Bayesian dynamic model, in which the parameters to translate, rotate and scale strokes and radicals are controlled by the state equation, while the proposed visual aesthetics criteria are applied in the measurement equation. Experimental results show that the automatically generated characters have almost the same visual acceptance as the calligraphist’s artwork.

Proceedings of SPIE | 2014

Nonlinear and non-Gaussian Bayesian based handwriting beautification

Cao Shi; Jianguo Xiao; Canhui Xu; Wenhua Jia

A framework is proposed in this paper to beautify handwriting effectively and efficiently by means of a novel nonlinear and non-Gaussian Bayesian algorithm. In the proposed framework, the format and size of the handwriting image are first normalized, and then a computer typeface is applied to optimize the visual effect of the handwriting. Bayesian statistics is exploited to characterize the handwriting beautification process as a Bayesian dynamic model. The model parameters that translate, rotate and scale the typeface are controlled by the state equation, and the matching optimization between the handwriting and the transformed typeface is carried out by the measurement equation. Finally, the new typeface, which is transformed from the original one and achieves the best nonlinear and non-Gaussian optimization, is the beautification result of the handwriting. Experimental results demonstrate that the proposed framework provides a creative handwriting beautification methodology that improves visual acceptance.
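For orientation, a Bayesian dynamic model with a state equation and a measurement equation is commonly written in the generic form below. The symbols are assumptions introduced here, not the paper's notation: $x_k$ would collect the translation, rotation and scale parameters of the typeface, $z_k$ the observed match against the handwriting, and $w_k$, $v_k$ noise terms that need not be Gaussian.

```latex
\begin{aligned}
x_k &= f_k(x_{k-1}, w_k) && \text{(state equation)} \\
z_k &= h_k(x_k, v_k) && \text{(measurement equation)} \\
p(x_k \mid z_{1:k}) &\propto p(z_k \mid x_k) \int p(x_k \mid x_{k-1})\, p(x_{k-1} \mid z_{1:k-1})\, dx_{k-1}
\end{aligned}
```

When $f_k$ or $h_k$ is nonlinear, or the noise is non-Gaussian, the integral in the last line generally has no closed form and is approximated numerically, for example by sampling.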

Archive | 2012

Integration of Text Information and Graphic Composite for PDF Document Analysis

Canhui Xu; Zhi Tang; Xin Tao; Cao Shi

The trend toward large-scale digitization has greatly motivated research on processing PDF documents with little structural information. Challenging problems such as graphic segmentation integrated with text remain unsolved for the successful practical application of PDF layout analysis. To cope with PDF documents, a hybrid method incorporating text information and graphic composites is proposed to segment pages that are difficult to handle with traditional methods. Specifically, text information is derived accurately from born-digital documents, in which low-level structure elements are embedded in explicit form. Page text elements are then clustered by a graph-based method according to proximity and feature similarity. Meanwhile, graphic components are extracted by means of texture and morphological analysis. By integrating the clustered text elements with the image-based graphic components, the graphics are segmented for layout analysis. Experimental results on pages of PDF books show satisfactory performance.

Proceedings of SPIE | 2014

Visual improvement for bad handwriting based on Monte-Carlo method

Cao Shi; Jianguo Xiao; Canhui Xu; Wenhua Jia

A visual improvement algorithm based on Monte Carlo simulation is proposed in this paper to enhance the visual effect of bad handwriting. The improvement process uses a well-designed typeface to optimize the bad handwriting image. In this process, a series of linear operators for image transformation are defined to transform the typeface image so that it approaches the handwriting image, and the specific parameters of the linear operators are estimated by the Monte Carlo method. Visual improvement experiments illustrate that the proposed algorithm can effectively enhance the visual effect of a handwriting image while maintaining the original handwriting features, such as tilt, stroke order and drawing direction. The proposed algorithm has great potential for application on tablet computers and the mobile Internet to improve the user experience of handwriting.
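The idea of estimating transformation parameters by random sampling can be sketched on a toy problem. The example assumes both shapes are reduced to corresponding 2-D points and uses a plain random search over scale, shear (tilt) and translation; the functions `transform`, `mismatch` and `monte_carlo_fit`, the parameter ranges, and the sample data are all hypothetical, and the paper's actual operators and estimator may differ.

```python
import random

def transform(points, scale, shear, tx, ty):
    """Apply scale, horizontal shear (tilt), then translation to 2-D points."""
    return [(scale * (x + shear * y) + tx, scale * y + ty) for x, y in points]

def mismatch(a, b):
    """Mean squared distance between corresponding points of two shapes."""
    return sum((ax - bx) ** 2 + (ay - by) ** 2
               for (ax, ay), (bx, by) in zip(a, b)) / len(a)

def monte_carlo_fit(typeface, handwriting, samples=5000, seed=0):
    """Random search: keep the sampled parameters with the lowest mismatch."""
    rng = random.Random(seed)
    best_params = (1.0, 0.0, 0.0, 0.0)  # start from the identity transform
    best_cost = mismatch(transform(typeface, *best_params), handwriting)
    for _ in range(samples):
        params = (
            rng.uniform(0.5, 2.0),   # scale
            rng.uniform(-1.0, 1.0),  # shear
            rng.uniform(-5.0, 5.0),  # x translation
            rng.uniform(-5.0, 5.0),  # y translation
        )
        cost = mismatch(transform(typeface, *params), handwriting)
        if cost < best_cost:
            best_params, best_cost = params, cost
    return best_params, best_cost

# Hypothetical data: a square "typeface" outline and a tilted, shifted copy.
typeface = [(0, 0), (4, 0), (4, 4), (0, 4)]
handwriting = transform(typeface, 1.2, 0.3, 2.0, -1.0)
params, cost = monte_carlo_fit(typeface, handwriting)
print("estimated (scale, shear, tx, ty):", params)
print("remaining mismatch:", cost)
```

Because the typeface is moved toward the handwriting rather than the other way round, properties of the handwriting such as its tilt carry over into the fitted result.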

Proceedings of SPIE | 2014

Automatic generation of Chinese character using features fusion from calligraphy and font

Cao Shi; Jianguo Xiao; Canhui Xu; Wenhua Jia

A spatial-statistics-based contour feature representation is proposed to extract local contour features from Chinese calligraphy characters, and a feature fusion strategy is designed to automatically generate new hybrid characters, making good use of the contour features of calligraphy and the structural features of fonts. The feature fusion strategy employs dilation and erosion operations iteratively to inject the contour features extracted from Chinese calligraphy into the font, which is similar to “pad” and “cut” in a sculpting process. Experimental results demonstrate that the generated hybrid characters hold both the contour features of calligraphy and the structural features of the font. In particular, two Chinese calligraphy techniques called “Fei Bai” and “Zhang Mo” are imitated in the hybrid characters. “Fei Bai” describes the phenomenon in which part of a stroke fades out due to the fast movement of the hair brush or a lack of ink, and “Zhang Mo” describes the condition in which the hair brush holds so much ink that strokes overlap.
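The “pad” and “cut” analogy corresponds to the two basic morphological operators, which can be sketched on a tiny binary image. This shows only plain dilation and erosion with a cross-shaped neighbourhood; how the paper steers them with the extracted calligraphy contour features is not reproduced, and the names `CROSS`, `dilate`, `erode` and the sample `glyph` are assumptions for illustration.

```python
# Cross-shaped structuring element: the pixel itself plus its 4 neighbours.
CROSS = ((0, 0), (1, 0), (-1, 0), (0, 1), (0, -1))

def _neighbour_hits(img, r, c):
    """For each offset in CROSS, report whether that pixel is inside the image and set."""
    rows, cols = len(img), len(img[0])
    return [
        0 <= r + dy < rows and 0 <= c + dx < cols and img[r + dy][c + dx] == 1
        for dy, dx in CROSS
    ]

def dilate(img):
    """'Pad': a pixel becomes 1 if it or any 4-neighbour is 1."""
    return [[int(any(_neighbour_hits(img, r, c))) for c in range(len(img[0]))]
            for r in range(len(img))]

def erode(img):
    """'Cut': a pixel stays 1 only if it and all 4-neighbours are 1."""
    return [[int(all(_neighbour_hits(img, r, c))) for c in range(len(img[0]))]
            for r in range(len(img))]

# Hypothetical 5x5 glyph containing a single ink pixel in the centre.
glyph = [[0] * 5 for _ in range(5)]
glyph[2][2] = 1

padded = dilate(glyph)
for row in padded:
    print("".join("#" if v else "." for v in row))
```

Dilating the single pixel yields a five-pixel cross, and eroding that cross returns the original pixel, which is why alternating the two operators can thicken or thin a stroke boundary locally.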

Archive | 2013