Stephen W. K. Lam | Researchain

Publication

Featured research published by Stephen W. K. Lam.

Pattern Recognition | 1996

Gradient-based contour encoding for character recognition

Geetha Srikantan; Stephen W. K. Lam; Sargur N. Srihari

We describe novel methods of feature extraction for recognition of single isolated character images. Our approach is flexible in that the same algorithms can be used, without modification, for feature extraction in a variety of OCR problems. These include handwritten, machine-print, grayscale, binary and low-resolution character recognition. We use the gradient representation as the basis for extraction of low-level, structural and stroke-type features. These algorithms require only a few simple arithmetic operations per image pixel, which makes them suitable for real-time applications. A description of the algorithms and experiments with several data sets are presented in this paper. Experimental results using artificial neural networks are presented. Our results demonstrate high performance of these features when tested on data sets distinct from the training data.

International Conference on Document Analysis and Recognition | 1993

Anatomy of a form reader

Stephen W. K. Lam; Ladan Javanbakht; Sargur N. Srihari

Forms are used extensively in today's offices. The task of an automated form reader is to locate data filled in on a form and to encode the content into appropriate symbolic descriptions. The challenges in form reading are due to high volume and large variety. A robust form reader with high adaptability and trainability is described. The form reader consists of two modules: a field registration module and a data recognition module. The field registration module acquires knowledge about the forms of interest, and the data recognition module recognizes text data on filled forms using the acquired knowledge. The capability of the reader increases progressively through supervised learning. The form reader has been trained to read a large variety of forms with machine-printed data. The adaptability and trainability of the system have been demonstrated through experiments.

International Conference on Pattern Recognition | 1990

Reading newspaper text

Stephen W. K. Lam; Dacheng Wang; Sargur N. Srihari

The authors describe a method for segmenting a newspaper page image into labeled macro components (blocks) and recognizing the content. Connected component analysis is used to segment a newspaper image into several rectangular blocks and to filter connected components into character and noncharacter components. Textural analysis is then used to classify the remaining noncharacter components into graphics and photographs. Experimental results indicate that these techniques work very well.
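
The connected-component step in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `connected_components` and `split_by_size`, the 8-connectivity, and the single size threshold `max_char_side` are all assumptions made for illustration, since the abstract does not state the actual filtering criteria.

```python
from collections import deque


def connected_components(binary):
    """Return bounding boxes (top, left, bottom, right) of 8-connected
    foreground regions in a 2D list of 0/1 values, in row-major scan order."""
    rows, cols = len(binary), len(binary[0])
    seen = [[False] * cols for _ in range(rows)]
    boxes = []
    for r in range(rows):
        for c in range(cols):
            if not binary[r][c] or seen[r][c]:
                continue
            # Breadth-first flood fill from this unvisited foreground pixel.
            queue = deque([(r, c)])
            seen[r][c] = True
            top, left, bottom, right = r, c, r, c
            while queue:
                y, x = queue.popleft()
                top, bottom = min(top, y), max(bottom, y)
                left, right = min(left, x), max(right, x)
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < rows and 0 <= nx < cols
                                and binary[ny][nx] and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            boxes.append((top, left, bottom, right))
    return boxes


def split_by_size(boxes, max_char_side):
    """Hypothetical filter: boxes no larger than max_char_side on either
    side are treated as character candidates, the rest as noncharacter."""
    chars, others = [], []
    for top, left, bottom, right in boxes:
        height, width = bottom - top + 1, right - left + 1
        if max(height, width) <= max_char_side:
            chars.append((top, left, bottom, right))
        else:
            others.append((top, left, bottom, right))
    return chars, others
```

In this sketch the noncharacter boxes would be the input to a later texture-based classifier, which is not shown.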

International Conference on Document Analysis and Recognition | 1995

The design of a nearest-neighbor classifier and its use for Japanese character recognition

Tao Hong; Stephen W. K. Lam; Jonathan J. Hull; Sargur N. Srihari

The nearest neighbor (NN) approach is a powerful nonparametric technique for pattern classification tasks. In this paper, algorithms for prototype reduction, hierarchical prototype organization and fast NN search are described. To remove redundant category prototypes and to avoid redundant comparisons, the algorithms exploit geometrical information of a given prototype set, which is represented approximately by computing the k-nearest/farthest neighbors of each prototype. The performance of an NN classifier using those algorithms for Japanese character recognition is reported.
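
To make the idea of prototype reduction concrete, the sketch below uses Hart's condensed nearest-neighbor rule. This is a swapped-in, textbook technique and not the algorithm of the paper, which relies on k-nearest/farthest neighbor information that the abstract does not spell out. The names `nearest_label` and `condense` are hypothetical.

```python
def nearest_label(prototypes, point):
    """Return the label of the prototype closest to `point`
    (squared Euclidean distance). Each prototype is (features, label)."""
    best = min(
        prototypes,
        key=lambda proto: sum((a - b) ** 2 for a, b in zip(proto[0], point)),
    )
    return best[1]


def condense(samples):
    """Hart's condensed NN rule: start with one sample and add only those
    samples that the current subset misclassifies, until a full pass over
    the data adds nothing."""
    kept = [samples[0]]
    changed = True
    while changed:
        changed = False
        for features, label in samples:
            if nearest_label(kept, features) != label:
                kept.append((features, label))
                changed = True
    return kept
```

The reduced set classifies every training sample the same way the full set does under 1-NN, while storing fewer prototypes; a hierarchical organization or fast search structure would be layered on top of such a set.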

Machine Vision Applications | 1992

Gray-scale character recognition using boundary features

Stephen W. K. Lam; Anthony C. Girardin; Sargur N. Srihari

Optical character recognition (OCR) traditionally applies to binary-valued imagery, although text is always scanned and stored in gray-scale. Binarization of a multivalued image may remove important topological information from characters and introduce noise into the character background. Low-quality imagery, produced by poorly printed text and improper image lift, magnifies the shortcomings of this process. A character classifier is proposed to recognize gray-scale characters by extracting structural features from character outlines. A fast, local-contrast-based gray-scale edge detector has been developed to locate character boundaries. A pixel is considered an edge pixel if its gray value is below a threshold and it has a neighbor whose gray value is above the threshold. Edges are then thinned to one pixel wide. Structural features are extracted from the edges by convolving them with a set of feature templates. Currently, 16 features, such as strokes, curves, and corners, are considered. Extracted features are compressed to form a binary vector with 576 features, which is used as input to a classifier. This approach is being tested on machine-printed characters extracted from mail address blocks. Characters are sampled at 300 ppi and quantized with 8 bits. Experimental results also demonstrate that recognition rates can be improved by enhancing image quality prior to boundary detection.
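
The edge-pixel rule stated in this abstract can be sketched as follows. The sketch assumes dark text on a light background, an 8-connected neighborhood, and a single global `threshold`; the paper describes a local-contrast-based detector, so its threshold is presumably chosen locally. The function name `edge_pixels` is hypothetical, and thinning and template convolution are not shown.

```python
def edge_pixels(image, threshold):
    """Return a 0/1 mask the same size as `image` (a 2D list of gray values).

    A pixel is marked 1 when its gray value is below `threshold` and at
    least one of its 8 neighbors has a gray value above `threshold`.
    """
    rows, cols = len(image), len(image[0])
    mask = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            if image[r][c] >= threshold:
                continue  # not dark enough to belong to a character
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    if dr == 0 and dc == 0:
                        continue
                    nr, nc = r + dr, c + dc
                    if (0 <= nr < rows and 0 <= nc < cols
                            and image[nr][nc] > threshold):
                        mask[r][c] = 1
    return mask
```

For a solid dark square on a light background, only the outer ring of the square is marked, because interior pixels have no neighbor above the threshold.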

KBCS '89 Proceedings of the International Conference on Knowledge Based Computer Systems | 1989

Newspaper Image Understanding

Venu Govindaraju; Stephen W. K. Lam; Debashish Niyogi; David B. Sher; Rohini K. Srihari; Sargur N. Srihari; Dacheng Wang

Understanding printed documents such as newspapers is a common intelligent activity of humans. Making a computer perform the task of analyzing a newspaper image and deriving useful high-level representations requires the development and integration of techniques in several areas, including pattern recognition, computer vision, language understanding and artificial intelligence. We describe the organization and several components of a newspaper image understanding system that begins with digitized images of newspaper pages and produces symbolic representations at several different levels. Such representations include: the visual sketch (connected components extracted from the background), physical layout (spatial extents of blocks corresponding to text, half-tones, graphics), logical layout (organization of story components), block primitives (e.g., recognized characters and words in text blocks, lines in graphics, faces in photographs, etc.), and semantic nets corresponding to photographic and textual blocks (individually, as well as grouped together as stories). We describe algorithms for deriving several of the representations and describe the interaction of different modules.

IS&T/SPIE's Symposium on Electronic Imaging: Science and Technology | 1993

Representing lexicons by modified trie for fast partial-string matching

Stephen W. K. Lam; Xin Shen; Sheila X. Zhao; Sargur N. Srihari

A fast lexicon search based on a trie is presented. Traditionally, a successful trie retrieval requires each element of the input string to find an exact match on a particular path in the trie. However, this approach cannot verify a garbled input string, which is generated due to imperfect character segmentation and recognition by an OCR system. The string may contain multiple candidates for a character position, or no candidate due to segmentation error. The proposed representation scheme, an extension of the trie, allows lexicon look-up even with only some of the string elements specified. An input string is presented as a regular expression, and all the paths in the trie that satisfy the expression are considered as word candidates. The candidates can then be ranked based on character classifier decisions. Confusion in character recognition can be handled by allowing an expression component to have more than one character candidate, while segmentation error can be overcome by postulating a word region to contain a certain number of unknown characters. Three extensions have been made to trie data structures for position-independent access and fast exhaustive search of all valid paths: (1) bidirectional linkage between two nodes at adjacent levels to enable trie traversal in either direction, (2) linkage of the nodes with the same letter at the same word position, so that all the words which have the same letter at a specific position can be located immediately, and (3) an index table which records the entry points of letters at specific word positions in the trie, in order to facilitate trie access at an arbitrary letter position. This representation scheme has been tested on postal address field recognition. The trie represents 11,157 words; the average access time is 0.02 sec on a SUN SPARC2, with an average of 3 candidates.
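
The partial-string look-up described in this abstract can be illustrated with a plain trie and a depth-first search. The sketch deliberately omits the paper's three extensions (bidirectional links, same-letter links, and the index table), so it shows only the matching semantics, not the fast access. The names `TrieNode`, `build_trie`, and `match`, and the pattern encoding (a set of candidate characters per position, or `None` for an unknown character), are assumptions made for illustration.

```python
class TrieNode:
    def __init__(self):
        self.children = {}    # letter -> TrieNode
        self.is_word = False  # True when a lexicon word ends here


def build_trie(words):
    """Insert every word of the lexicon into a fresh trie."""
    root = TrieNode()
    for word in words:
        node = root
        for ch in word:
            node = node.children.setdefault(ch, TrieNode())
        node.is_word = True
    return root


def match(root, pattern):
    """Return all lexicon words consistent with `pattern`.

    Each pattern element is a set of candidate characters for that
    position, or None when the character is unknown.
    """
    results = []

    def walk(node, depth, prefix):
        if depth == len(pattern):
            if node.is_word:
                results.append(prefix)
            return
        allowed = pattern[depth]
        for ch, child in sorted(node.children.items()):
            if allowed is None or ch in allowed:
                walk(child, depth + 1, prefix + ch)

    walk(root, 0, "")
    return results
```

For example, with the lexicon `["main", "maple", "mail", "park", "pail"]`, the pattern `[{"m", "p"}, {"a"}, None, {"l", "k"}]` stands for a four-letter word whose first letter was confused between "m" and "p" and whose third letter is unknown. The returned candidates could then be ranked by classifier confidence, as the abstract suggests.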

IS&T/SPIE's Symposium on Electronic Imaging: Science and Technology | 1993

Gradient-based contour encoding for gray-scale character recognition

Geetha Srikantan; Stephen W. K. Lam; Sargur N. Srihari

We present a fast new algorithm for grayscale character recognition. By operating on grayscale images directly, we attempt to maximize the information that can be extracted and avoid the binarization on which traditional recognition is based. Previous work applied gradient-based contour encoding in a restricted domain where objects were size-invariant and fewer patterns had to be distinguished; a simpler feature set sufficed in this domain. In character recognition, a feature extractor has to handle character images of arbitrary size. Our algorithm extracts local contour features based on the pixel gradients present in an image. A gradient map, i.e., gradient magnitude and direction at every pixel, is computed; this map is thresholded to avoid responses to noise and spurious artifacts, and is then partitioned coarsely. Character contours are encoded by quantizing gradient directions into a few representative direction bins. This method does not require that images be normalized to a fixed grid size before feature extraction. Features extracted by this method have been used to train a 2-layer neural network classifier. Experiments with machine-printed numeric (10-class) and alphanumeric and punctuation (77-class) character images, captured at 300 ppi from mailpiece images, have been conducted. Currently, 99.4% of the numerals and 93.4% of the 77-class images are recognized correctly on a test set of 5,420 numeric and 24,988 character images. When shape and case confusions are allowed, the recognition performance improves to 97.5%. A similar experiment with binarized 77-class images resulted in 92.1% correctly recognized test images. This method is being extended to handwritten character recognition.
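
The pipeline in this abstract (gradient map, magnitude threshold, coarse partition, direction quantization) can be sketched as below. Several details are assumptions rather than the authors' choices: the gradient is estimated with central differences, border pixels are skipped, and each feature is a count of pixels per cell and direction bin. The name `gradient_features` and the parameters `grid`, `bins`, and `min_magnitude` are hypothetical.

```python
import math


def gradient_features(image, grid=2, bins=4, min_magnitude=1.0):
    """Return a list of grid * grid * bins counts for a 2D list of gray values.

    The image is split into grid x grid cells; within each cell, pixels
    whose gradient magnitude reaches `min_magnitude` vote for one of
    `bins` quantized gradient directions.
    """
    rows, cols = len(image), len(image[0])
    features = [0] * (grid * grid * bins)
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            # Central-difference estimate of the gradient (an assumption).
            gx = image[r][c + 1] - image[r][c - 1]
            gy = image[r + 1][c] - image[r - 1][c]
            if math.hypot(gx, gy) < min_magnitude:
                continue  # suppress weak responses such as noise
            angle = math.atan2(gy, gx) % (2 * math.pi)
            direction = int(angle / (2 * math.pi) * bins) % bins
            # Cell indices scale with image size, so no resizing is needed.
            cell = (r * grid // rows) * grid + (c * grid // cols)
            features[cell * bins + direction] += 1
    return features
```

Because the cell index is computed relative to the image dimensions, the feature vector has the same length for any image size, which mirrors the abstract's point that images need not be normalized to a fixed grid before feature extraction.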

IS&T/SPIE's Symposium on Electronic Imaging: Science & Technology | 1995

Visual similarity analysis of Chinese characters and its uses in Japanese OCR

Tao Hong; Stephen W. K. Lam; Jonathan J. Hull; Sargur N. Srihari

Traditionally, a Chinese or Japanese optical character reader (OCR) has to represent each character category individually as one or more feature prototypes, or a structural description which is a composition of manually derived components such as radicals. Here we propose a new approach in which various kinds of visual similarities between different Chinese characters are analyzed automatically at the feature level. Using this method, character categories are related to each other by training on fonts; and character images from a text page can be related to each other based on the visual similarities they share. This method provides a way to interpret character images from a text page systematically, instead of a sequence of isolated character recognitions. The use of the method for post processing in Japanese text recognition is also discussed.

Systems, Man and Cybernetics | 1991

Frame-based knowledge representation for multi-domain document layout analysis

Stephen W. K. Lam; Sargur N. Srihari

A frame-based knowledge representation scheme is proposed to describe documents from different domains. The scheme is specifically designed to capture characteristics shared by documents and it allows a document image processing system to be designed without prior knowledge of the problem domain. A document is described by a model, which is a hierarchy of document layout objects. Layout objects with the same parent are related by their spatial and contextual constraints. Layout objects at the lowest level of the hierarchy are retrieved from document imagery. The robustness of this representation scheme is demonstrated by its application to three different document domains.
