Zhixin Shi
University at Buffalo
Publication
Featured research published by Zhixin Shi.
First International Workshop on Document Image Analysis for Libraries, 2004. Proceedings. | 2004
Zhixin Shi; Venu Govindaraju
A new text line location and separation algorithm for complex handwritten documents is proposed. The algorithm is based on the application of a fuzzy directional runlength. The proposed technique was tested on a variety of complex handwritten document images, including postal parcel images and historical handwritten documents such as Newton's and Galileo's manuscripts. Preliminary testing showed a success rate of 93% on the test set.
International Conference on Document Analysis and Recognition | 2009
Zhixin Shi; Srirangaraj Setlur; Venu Govindaraju
In this paper, we present a new text line extraction method for handwritten Arabic documents. The proposed technique is based on a generalized adaptive local connectivity map (ALCM) using a steerable directional filter. The algorithm is designed to solve the particularly complex problems seen in handwritten documents, such as fluctuating, touching, or crossing text lines. The proposed algorithm consists of three steps. First, a steerable filter is used to probe and determine foreground intensity along multiple directions at each pixel while generating the ALCM. The ALCM is then binarized using an adaptive thresholding algorithm to get a rough estimate of the location of the text lines. In the second step, connected component analysis is used to classify text and non-text patterns in the generated ALCM to refine the location of the text lines. Finally, the text lines are separated by superimposing the text line patterns in the ALCM on the original document image and extracting the connected components covered by the pattern mask. Analysis of experimental results on the DARPA MADCAT Arabic handwritten document data indicates that the method is robust and is capable of correctly isolating handwritten text lines even on challenging document images.
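The final separation step described above can be illustrated with a minimal sketch. It assumes a binarized document image (1 = ink) and a label map of text line patterns (0 = no line) are already available. The function names, the 8-connectivity, and the majority vote used to resolve components that overlap several line patterns are illustrative assumptions, not details taken from the paper.

```python
from collections import Counter, deque


def connected_components(binary):
    """Yield a list of (y, x) pixels for each 8-connected foreground component."""
    height, width = len(binary), len(binary[0])
    seen = [[False] * width for _ in range(height)]
    for y0 in range(height):
        for x0 in range(width):
            if binary[y0][x0] != 1 or seen[y0][x0]:
                continue
            seen[y0][x0] = True
            queue, pixels = deque([(y0, x0)]), []
            while queue:
                y, x = queue.popleft()
                pixels.append((y, x))
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < height and 0 <= nx < width
                                and binary[ny][nx] == 1 and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            yield pixels


def assign_components_to_lines(binary, line_mask):
    """Group components by the line label whose mask covers most of their pixels.

    Majority voting is an assumption made for this sketch.
    """
    lines = {}
    for pixels in connected_components(binary):
        votes = Counter(line_mask[y][x] for y, x in pixels if line_mask[y][x] != 0)
        if not votes:
            continue  # component is not covered by any line pattern
        label = votes.most_common(1)[0][0]
        lines.setdefault(label, []).append(pixels)
    return lines


# Toy example: two ink blobs, each covered by a different line pattern.
binary = [[1, 1, 0, 0],
          [0, 0, 0, 0],
          [0, 0, 1, 1]]
line_mask = [[1, 1, 1, 1],
             [0, 0, 0, 0],
             [2, 2, 2, 2]]
lines = assign_components_to_lines(binary, line_mask)
```

In this toy case each blob falls under exactly one pattern, so `lines` holds one component per label. A component that touches two text lines would need a splitting rule, which this sketch does not attempt.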
International Conference on Document Analysis and Recognition | 2005
Zhixin Shi; Srirangaraj Setlur; Venu Govindaraju
This paper presents an algorithm that uses an adaptive local connectivity map to retrieve text lines from complex handwritten documents such as handwritten historical manuscripts. The algorithm is designed to solve the particularly complex problems seen in handwritten documents. These problems include fluctuating text lines, touching or crossing text lines, and low-quality images that do not lend themselves easily to binarization. The algorithm is based on connectivity features similar to local projection profiles, which can be directly extracted from gray-scale images. The proposed technique is robust and has been tested on a set of complex historical handwritten documents such as Newton's and Galileo's manuscripts. Preliminary testing shows a successful location rate of above 95% for the test set.
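To make the idea of a connectivity feature "similar to local projection profiles" concrete, here is a minimal sketch that sums ink intensity over a horizontal window centred on each pixel of a gray-scale image. The function names, the fixed window size, the 0 = black / 255 = white convention, and the fixed threshold are assumptions for illustration; the abstract does not specify these details.

```python
def local_connectivity_map(gray, half_window):
    """Sum ink intensity over a horizontal window centred on each pixel."""
    height, width = len(gray), len(gray[0])
    alcm = [[0] * width for _ in range(height)]
    for y in range(height):
        # Invert so dark ink contributes high values (assumes 0 = black, 255 = white).
        ink = [255 - v for v in gray[y]]
        # Prefix sums let each window sum be computed in constant time.
        prefix = [0]
        for v in ink:
            prefix.append(prefix[-1] + v)
        for x in range(width):
            lo = max(0, x - half_window)
            hi = min(width, x + half_window + 1)
            alcm[y][x] = prefix[hi] - prefix[lo]
    return alcm


def binarize(alcm, threshold):
    """Fixed global threshold, used here as a simple stand-in."""
    return [[1 if v >= threshold else 0 for v in row] for row in alcm]


# Toy example: one text row with two ink pixels separated by a gap.
gray = [[255] * 6,
        [255, 0, 255, 255, 0, 255],
        [255] * 6]
alcm = local_connectivity_map(gray, half_window=2)
mask = binarize(alcm, threshold=255)
```

In the toy image the window bridges the gap between the two ink pixels, so the middle row of `mask` becomes one continuous pattern while the empty rows stay at zero. This is the property that lets such a map respond to text lines rather than to individual strokes.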
Pattern Recognition | 1997
Zhixin Shi; Venu Govindaraju
A new method for segmenting connected handwritten digit strings is presented. Unlike traditional methods, where segmentation points are uniquely determined to cut the piece of stroke joining the connected numerals, our approach identifies regions that serve as potential segmentation points. The regions are identified by a thorough analysis of the trajectory of strokes.
Pattern Recognition Letters | 2006
Zhixin Shi; Venu Govindaraju
A feature extraction method using the chaincode representation of fingerprint ridge contours is presented. The representation allows efficient image quality enhancement and detection of fine minutiae feature points. The direction field is estimated from a set of selected chaincodes. The original gray-scale image is enhanced using a dynamic filtering scheme that takes advantage of the estimated direction flow of the contours. Minutiae are generated using ridge contour following.
International Conference on Document Analysis and Recognition | 2009
Safwan Wshah; Zhixin Shi; Venu Govindaraju
We propose a new algorithm for segmentation of off-line handwritten Arabic words. The algorithm splits connected letters into smaller segments, each of which contains no more than three letters. Each letter may be segmented into at most five pieces. In addition to improving the recognition of Arabic words, another potential application of the proposed segmentation method is to build a small lexicon consisting of combinations of no more than three letters. Generally, it is very hard to generate a lexicon for recognition of unconstrained handwritten Arabic documents due to the large number of words in the Arabic language. The algorithm has been tested on over 6300 words from 45 different documents written by 18 writers. The system is able to segment more than 93% of the words into segments each containing at most one letter, 6% of the words into segments that contain two letters, and 3% of the words into segments that contain three letters.
Scopus | 2004
Sargur N. Srihari; Zhixin Shi
Document storage and retrieval capabilities of the CEDAR-FOX forensic handwritten document examination system are described. The system is designed for automated and semiautomated analysis of scanned handwritten documents. For library creation, the system provides functionalities for (i) entering document metadata, e.g., identification number, writer, and other collateral information, (ii) creating a textual transcript of the image content at the word level, and (iii) including automatically extracted document-level features, e.g., stroke width, slant, and word gaps, as well as finer features that capture the structural characteristics of characters and words. To extract these features, the system performs page analysis, page segmentation, line separation, word segmentation, and finally recognition of characters and words. The extracted features are used for writer identification by matching against a library built as a database. The system design is driven by questioned document examination, with its emphasis on writer identification. Several query modalities are permitted for retrieval: (i) document level: the entire document image is the query; (ii) partial image: a region of interest (ROI) of a document; (iii) word image, also called word spotting; (iv) text keyword: the user can type in keywords, including words in the documents, case numbers, person names, times, and preregistered keywords such as brief descriptions of the case. The system has been implemented using Microsoft Visual C++ and tested using the MySQL database system from MySQL AB. It provides a graphical user interface for forensic document identification, verification, and analysis.
International Conference on Document Analysis and Recognition | 1997
Zhixin Shi; Sargur N. Srihari; Y.-C. Shiu; Vemulapati Ramanaprasad
Proposes a system for the segmentation and recognition of totally unconstrained handwritten numeral strings. The system is composed of several document analysis modules, namely a preprocessing module, a segmentation module and a recognition module. The preprocessing module includes connected component analysis, identifying substrings with touching digits and estimating the number of digits in the substring. The segmentation module is built with a new segmentation algorithm based on a thorough stroke analysis using contour representation of the strokes. In the recognition module, a high-performance digit recognizer is used for the isolated digit images after segmentation, and then a simple postprocessing routine is called for those cases where some punctuation marks or delimiters such as dashes, commas or periods are included in the numeral string. Due to the high performance of the segmentation module, the system is efficient and robust with a high recognition performance.
Pattern Recognition Letters | 1996
Zhixin Shi; Venu Govindaraju
Preprocessing of character images is an important step in recognition. We describe a method of enhancing binary character images to assist the subsequent recognition process in both handwritten and machine-printed documents. The method performs selective and adaptive stroke "filling" with a neighborhood operator that emphasizes stroke connectivity. An improvement of 7 percentage points was realized in recognition of address fields.
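As a rough illustration of stroke filling with a neighborhood operator, the sketch below turns a background pixel into ink when enough of its eight neighbors are already ink. The function name `fill_strokes`, the fixed `min_neighbors` threshold, and the single-pass design are assumptions for this example. The method in the abstract is described as selective and adaptive, which this fixed-threshold version does not reproduce.

```python
def fill_strokes(binary, min_neighbors=5):
    """Fill background pixels that are mostly surrounded by ink (1 = ink)."""
    height, width = len(binary), len(binary[0])
    filled = [row[:] for row in binary]  # leave the input untouched
    for y in range(height):
        for x in range(width):
            if binary[y][x] == 1:
                continue
            # Count ink pixels among the 8 neighbors that lie inside the image.
            count = 0
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    if dy == 0 and dx == 0:
                        continue
                    ny, nx = y + dy, x + dx
                    if 0 <= ny < height and 0 <= nx < width:
                        count += binary[ny][nx]
            if count >= min_neighbors:
                filled[y][x] = 1
    return filled


# Toy example: a one-pixel hole inside a stroke is closed.
stroke_with_hole = [[1, 1, 1],
                    [1, 0, 1],
                    [1, 1, 1]]
repaired = fill_strokes(stroke_with_hole)
```

Counting neighbors from the original image rather than the partially filled copy keeps the result independent of scan order, at the cost of needing repeated passes to close larger gaps.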
International Conference on Document Analysis and Recognition | 2003
Zhixin Shi; Venu Govindaraju
A skew angle estimation approach based on the application of a fuzzy directional runlength is proposed for complex address images. The proposed technique was tested on a variety of USPS parcel images including both machine-printed and handwritten addresses. The testing results showed a success rate of more than 90% on the test set.