Publication


Featured research published by Richard G. Casey.


Computer Graphics and Image Processing | 1982

Block segmentation and text extraction in mixed text/image documents

Friedrich M. Wahl; Kwan Y. Wong; Richard G. Casey

The segmentation and classification of digitized printed documents into regions of text and images is a necessary first processing step in document analysis systems. It is shown that a constrained run length algorithm is well suited to partition most documents into areas of text lines, solid black lines, and rectangular boxes enclosing graphics and halftone images. During the processing these areas are labeled and meaningful features are calculated. By making use of the regular appearance of text lines as textured stripes, a linear adaptive classification scheme is constructed to discriminate text regions from others.
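The abstract does not spell out the constrained run length algorithm, so the following is only a minimal sketch of run-length smoothing as it is commonly described: short white runs in a binary row are filled with black, the same is done along columns, and the two results are combined with a logical AND. The function names, the thresholds `c_h` and `c_v`, and the choice to fill short runs that touch the image border are assumptions made for illustration.

```python
def smooth_row(row, c):
    """Fill every run of 0s (white) whose length is <= c with 1s (black)."""
    out = list(row)
    i, n = 0, len(row)
    while i < n:
        if row[i] == 0:
            j = i
            while j < n and row[j] == 0:
                j += 1                      # j ends one past the white run
            if j - i <= c:                  # short gap: merge neighbours
                out[i:j] = [1] * (j - i)
            i = j
        else:
            i += 1
    return out


def smooth_image(img, c_h, c_v):
    """Smooth rows and columns separately, then AND the two bitmaps."""
    horiz = [smooth_row(r, c_h) for r in img]
    cols = [smooth_row(list(col), c_v) for col in zip(*img)]
    vert = [list(r) for r in zip(*cols)]    # transpose back to row order
    return [[a & b for a, b in zip(hr, vr)] for hr, vr in zip(horiz, vert)]
```

With a suitable horizontal threshold, the characters of a text line merge into one dark stripe, which is the kind of block that a later labeling and classification step can work on.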


IBM Journal of Research and Development | 1973

Decomposition of a data base and the theory of Boolean switching functions

Claude Delobel; Richard G. Casey

The notion of a functional relation among the attributes of a data set can be fruitfully applied in the structuring of an information system. These relations are meaningful both to the user of the system in his semantic understanding of the data, and to the designer in implementing the system. An important equivalence between operations with functional relations and operations with analogous Boolean functions is demonstrated in this paper. The equivalence is computationally helpful in exploring the properties of a given set of functional relations, as well as in the task of partitioning a data set into subfiles for efficient implementation.
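One way to picture the link between functional relations and Boolean functions is to read each relation "X determines Y" as the implication "X implies Y" and to derive consequences by forward chaining. The sketch below computes the set of attributes determined by a starting set; it is the standard attribute-closure procedure, not necessarily the construction used in the paper, and the attribute names are hypothetical.

```python
def closure(attrs, fds):
    """Return all attributes determined by `attrs` under `fds`.

    `fds` is a list of (lhs, rhs) pairs of sets, each read as the
    functional relation lhs -> rhs (equivalently, lhs implies rhs).
    """
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # Fire the implication when its premise already holds.
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


# Hypothetical example: emp -> dept and dept -> mgr.
fds = [({"emp"}, {"dept"}), ({"dept"}, {"mgr"})]
determined = closure({"emp"}, fds)
```

A set of attributes whose closure covers a whole subfile can serve as a key for it, which is the kind of question that arises when partitioning a data set.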


IBM Systems Journal | 1990

Intelligent forms processing

Richard G. Casey; David R. Ferguson

The automatic reading of optically scanned forms consists of two major components: extraction of the data image from the form and interpretation of the image as coded alphanumerics. The second component is also known as optical character recognition, or OCR. We have implemented a method for entry of a wide variety of forms that contain machine-printed data and that are often produced in business environments. The function, called Intelligent Forms Processing (IFP), accepts conventional forms that call for information to be printed in designated blank areas, but in which the information may exceed boundaries due to poor registration during printing. The human eye easily accommodates data that impinge on form boundaries or on background text; however, the same powers of discrimination applied to machine processing pose a technical challenge. The IFP system uses a setup phase to create a model of each form that is to be read. Scanned forms containing data are compared against the matching form model. Special algorithms are employed to extract data fields while removing background printing (e.g., form lines) intersecting the data. The extracted data images are interpreted by an OCR process that reads typical monospace fonts. New fonts may be added easily in a separate design mode. If the data are alphabetic, a lexicon may be assembled to define the possible entries.


IEEE Transactions on Computers | 1968

An Autonomous Reading Machine

Richard G. Casey; George Nagy

An unconventional approach to character recognition is developed. The resulting system is based solely on the statistical properties of the language; it can therefore read printed text with no previous training or a priori information about the structure of the characters. The known letter-pair frequencies of the language are used to identify the printed symbols in the following manner.


IEEE Transactions on Information Theory | 1984

Decision tree design using a probabilistic model (Corresp.)

Richard G. Casey; George Nagy

A sequential optical character recognition algorithm, ideally suited for implementation by means of microprocessors with limited storage capabilities, is formulated in terms of a binary decision tree. Upper bounds on the recognition performance are derived in terms of the stability of the digitized picture elements. The design process is described in detail. The algorithm is tested on single-font typewritten characters and the experimental and theoretical results are compared.


International Conference on Document Analysis and Recognition | 1993

Optical recognition of chemical graphics

Richard G. Casey; Stephen K. Boyer; Paul Donald Healey; Alex Miller; Bernadette Oudot; Karl S. Zilles

A prototype system for encoding chemical structure diagrams from scanned printed documents is described. The system distinguishes a structure diagram from other printed material on a page image using size and spacing characteristics. It distinguishes line graphics from symbols in an intermediate vectorization stage. Line information is mapped into a connection diagram that represents atomic bonds. Atomic symbols are identified by means of chemical drawing conventions and optical character recognition. The final coded output interfaces to conventional chemistry software for database storage and retrieval, publishing, and modeling.


Communications of the ACM | 1977

An encoding method for multifield sorting and indexing

Michael W. Blasgen; Richard G. Casey; Kapali P. Eswaran

Sequences of character strings with an order relation imposed between sequences are considered. An encoding scheme is described which produces a single, order-preserving string from a sequence of strings. The original sequence can be recovered from the encoded string, and one sequence of strings precedes another if and only if the encoding of the first precedes the encoding of the second. The strings may be variable length, without a maximum length restriction, and no symbols need be reserved for control purposes. Hence any symbol may occur in any string. The scheme is useful for multifield sorting, multifield indexing, and other applications where ordering on more than one field is important.
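The abstract does not give the encoding itself. As an illustration of the stated properties only, the sketch below uses a well-known byte-stuffing technique, which may differ from the paper's scheme: a zero byte inside a field is written as `00 FF`, and each field ends with `00 00`. The function names are assumptions, and the decoder assumes well-formed input.

```python
def encode(fields):
    """Encode a sequence of byte strings into one order-preserving string."""
    out = bytearray()
    for f in fields:
        out += f.replace(b"\x00", b"\x00\xff")  # escape embedded zero bytes
        out += b"\x00\x00"                      # field terminator
    return bytes(out)


def decode(data):
    """Recover the original sequence of fields from an encoded string."""
    fields, cur, i = [], bytearray(), 0
    while i < len(data):
        if data[i] == 0:
            if data[i + 1] == 0xFF:             # escaped zero byte
                cur.append(0)
            else:                               # 00 00 closes the field
                fields.append(bytes(cur))
                cur = bytearray()
            i += 2
        else:
            cur.append(data[i])
            i += 1
    return tuple(fields)


samples = [(b"a", b"b"), (b"ab",), (b"a",), (b"a\x00", b""),
           (b"", b"z"), (b"\x00\xff",), (b"\x00",), ()]
by_fields = sorted(samples)                     # field-by-field comparison
by_encoding = sorted(samples, key=encode)       # plain byte-string comparison
```

Because the terminator sorts below every escaped or ordinary byte, comparing two encodings byte by byte gives the same result as comparing the original sequences field by field, so a single-key sort or index can stand in for a multifield one.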


IBM Journal of Research and Development | 1983

A processor-based OCR system

Richard G. Casey; Chentung Robert Jih

A low-cost optical character recognition (OCR) system can be realized by means of a document scanner connected to a CPU through an interface. The interface performs elementary image processing functions, such as noise filtering and thresholding of the video image from the scanner. The processor receives a binary image of the document, formats the image into individual character patterns, and classifies the patterns one-by-one. A CPU implementation is highly flexible and avoids much of the development and manufacturing costs for special-purpose, parallel circuitry typically used in commercial OCR. A processor-based recognition system has been investigated for reading documents printed in fixed-pitch conventional type fonts, such as occur in routine office typing. Novel, efficient methods for tracking a print line, resolving it into individual character patterns, detecting underscores, and eliminating noise have been devised. A previously developed classification technique, based on decision trees, has been extended in order to improve reading accuracy in an environment of considerable character variation, including the possibility that documents in the same font style may be produced using quite different print technologies. The system has been tested on typical office documents, and also on artificial stress documents, obtained from a variety of typewriters.


Communications of the ACM | 1973

Design of tree structures for efficient querying

Richard G. Casey

A standard information retrieval operation is to determine which records in a data collection satisfy a given query expressed in terms of data values. The process of locating the desired responses can be represented by a tree search model. This paper poses an optimization problem in the design of such trees to serve a well-specified application. The problem is academic in the sense that ordinarily the optimal tree cannot be implemented by means of practical techniques. On the other hand, it is potentially useful for the comparison it affords between observed performance and that of an intuitively attractive ideal search procedure. As a practical application of such a model this paper considers the design of a novel tree search scheme based on a bit vector representation of data and shows that essentially the same algorithm can be used to design either an ideal search tree or a bit-vector tree. An experimental study of a small formatted file illustrates the concepts.


IBM Journal of Research and Development | 1982

Automatic scaling of digital print fonts

Richard G. Casey; Theodore D. Friedman; Kwan Y. Wong

New raster-based printers form character patterns using carefully designed matrices of dots. It is desirable to be able to use fonts designed for one printer on a different machine, but to do so the dot matrix patterns should first be scaled to the second printer's resolution. If the scaling is carried out as a simple interpolation, however, severe degradation in the appearance of the characters may occur. A new algorithm reduces such degradation by recognizing attributes associated with print character quality in the original patterns and then correcting the scaled patterns in order to maintain those attributes. Attributes that are detected and preserved during scaling include local and global symmetries, stroke width, sharpness of corners, and smoothness of contour. The method has been used both to scale low-resolution fonts to a finer representation and to reduce the scale of high-resolution photocomposer fonts for output on an office-type printer.
