Publication


Featured research published by Gary E. Kopec.


Communications of the ACM | 1994

Editing images of text

Steven C. Bagley; Gary E. Kopec

Recent advances in image scanning, storage, and retrieval have stimulated interest in incorporating scanned documents within electronic document systems. Integrating scanned documents with structured documents is an important step toward making electronic document processing universally available. Most approaches to this problem are based on one of two paradigms: bitmap editing and format conversion. Conventional bitmap editors and paint programs treat scanned images as simple pixel arrays without internal structure. A major strength of this approach is that bitmap editors are applicable to an open-ended class of images. Moreover, almost by definition they preserve the detailed format and typographic design of the input material. In particular, if distortions due to scanning and printing are ignored, a bitmap editor behaves as an identity system if no editing operations are performed. There are numerous scenarios in which editing an image while preserving the appearance of unedited material is important. In general, such scenarios involve documents that originate in image form and are to be retained in image form after modification. Examples include last-minute correction of spelling mistakes before photocopying, modifying viewgraphs at meetings, exchange of document drafts by fax, and recreational forgery (the construction of obvious parodies for humorous purposes). While it would seem that bitmap editors are ideally suited to such applications, in practice their utility is often limited. The primary problem is that current bitmap editors support only relatively low-level editing operations. Typical facilities include selection of polygonal regions; cutting, pasting and copying selected regions; and painting operations such as bit setting, clearing, and complementation. No attempt is made to classify the content of the image (e.g., as text or line art) and no operations are provided that assist higher-level content-specific operations. Thus, for example, using a paint program to delete a character from the middle of a word is a tedious, manual process.


IEEE Transactions on Image Processing | 1993

Least-squares font metric estimation from images

Gary E. Kopec

The problem of determining font metrics from measurements on images of typeset text is discussed, and least-squares procedures for font metric estimation are developed. When kerning is not present, sidebearing estimation involves solving a set of linear equations, called the sidebearing normal equations. More generally, simultaneous sidebearing and kerning term estimation involves an iterative procedure in which a modified set of sidebearing normal equations is solved during each iteration. Character depth estimates are obtained by solving a set of baseline normal equations. In a preliminary evaluation of the proposed procedures on scanned text images in three fonts, the root-mean-square set width estimation error was about 0.2 pixel. An application of font metric estimation to text image editing is discussed.
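As an illustration of the least-squares idea (not the paper's sidebearing normal equations), the sketch below assumes each observed origin-to-origin advance along a text line equals the set width of the preceding character plus measurement noise, and recovers per-character set widths by ordinary least squares. The function name and toy data are invented for illustration.

```python
import numpy as np

def estimate_set_widths(lines):
    """lines: list of (chars, x_origins) pairs for individual text lines,
    where chars is a string and x_origins the observed glyph origin x-coords."""
    alphabet = sorted({c for chars, _ in lines for c in chars})
    index = {c: i for i, c in enumerate(alphabet)}
    rows, rhs = [], []
    for chars, xs in lines:
        for i in range(len(chars) - 1):
            row = np.zeros(len(alphabet))
            row[index[chars[i]]] = 1.0      # advance attributed to character i
            rows.append(row)
            rhs.append(xs[i + 1] - xs[i])   # observed origin-to-origin advance
    widths, *_ = np.linalg.lstsq(np.array(rows), np.array(rhs), rcond=None)
    return dict(zip(alphabet, widths))

# toy usage: true set widths a = 10, b = 12, with small measurement noise
lines = [("abab", [0.0, 10.1, 22.0, 31.9]), ("ba", [0.0, 12.05])]
print(estimate_set_widths(lines))  # roughly {'a': 10.0, 'b': 11.98}
```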


IEEE Transactions on Pattern Analysis and Machine Intelligence | 1997

Supervised template estimation for document image decoding

Gary E. Kopec; Mauricio Lomelin

An approach to supervised training of character templates from page images and unaligned transcriptions is proposed. The template training problem is formulated as one of constrained maximum likelihood parameter estimation within the document image decoding framework. This leads to a three-phase iterative training algorithm consisting of transcription alignment, aligned template estimation (ATE), and channel estimation steps. The maximum likelihood ATE problem is shown to be NP-complete and, thus, an approximate solution approach is developed. An evaluation of the training procedure in a document-specific decoding task, using the University of Washington UW-II database of scanned technical journal articles, is described.
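As a rough sketch of the channel-estimation step, the code below assumes the simple asymmetric bit-flip channel often used in document image decoding models and estimates its two flip probabilities by counting disagreements between the ideal bitmap implied by the aligned templates and the observed page. The names and data are illustrative, not the paper's notation.

```python
import numpy as np

def estimate_channel(ideal, observed):
    """Estimate asymmetric bit-flip probabilities from aligned binary images.

    ideal, observed: equally shaped 0/1 arrays, where `ideal` is the bitmap
    implied by the current templates and alignment, and `observed` is the
    scanned page. Returns (alpha0, alpha1):
      alpha0 = P(observed black | ideal white)  (background noise)
      alpha1 = P(observed white | ideal black)  (foreground dropout)
    """
    ideal = np.asarray(ideal, dtype=bool)
    observed = np.asarray(observed, dtype=bool)
    white, black = ~ideal, ideal
    alpha0 = (observed & white).sum() / max(white.sum(), 1)
    alpha1 = (~observed & black).sum() / max(black.sum(), 1)
    return float(alpha0), float(alpha1)

# toy usage with a tiny synthetic page
ideal = np.array([[1, 1, 0, 0], [1, 0, 0, 0]])
observed = np.array([[1, 0, 0, 1], [1, 0, 0, 0]])
print(estimate_channel(ideal, observed))  # background flip 0.2, dropout ~0.33
```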


International Conference on Document Analysis and Recognition | 1993

Automatic generation of custom document image decoders

Gary E. Kopec; Philip A. Chou

A framework for document image recognition, called document image decoding (DID), that supports the automatic generation of custom document recognition systems from user-specified document models is discussed. A document recognition problem is viewed as consisting of a message source, an imager, a noisy channel, and an image decoder (recognizer). The inputs to a decoder generator are explicit models for the message source, imager and channel; the output is a specialized program that decodes an image in terms of these models. The models used in DID are based on a stochastic attribute grammar model of document production. Use of an automatically generated decoder to analyze telephone yellow pages is described.
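To make the communication-theory decomposition concrete, here is a toy rendition of the first three components (message source output, imager, channel). The templates, fixed glyph widths, and independent pixel-flip noise are assumptions far simpler than the stochastic attribute grammar models the paper describes.

```python
import numpy as np

# Hypothetical 3x3 character templates standing in for the imager's glyphs.
TEMPLATES = {
    "|": np.array([[0, 1, 0], [0, 1, 0], [0, 1, 0]]),
    "-": np.array([[0, 0, 0], [1, 1, 1], [0, 0, 0]]),
}

def imager(message):
    """Transducer-like step: map a message string to an ideal bitmap by
    abutting the templates of its characters left to right."""
    return np.hstack([TEMPLATES[c] for c in message])

def channel(ideal, flip_prob=0.05, rng=None):
    """Noisy channel: flip each pixel independently with probability flip_prob."""
    rng = np.random.default_rng(rng)
    flips = rng.random(ideal.shape) < flip_prob
    return np.where(flips, 1 - ideal, ideal)

ideal = imager("|-|")          # message source output -> ideal image
noisy = channel(ideal, rng=0)  # observed image handed to the decoder
print(noisy)
```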


International Conference on Acoustics, Speech, and Signal Processing | 1993

Document image decoding using Markov source models

Gary E. Kopec; Philip A. Chou

The authors describe a communication theory approach to document image recognition, patterned after the use of hidden Markov models in speech recognition. A document recognition problem is viewed as consisting of three elements: an image generator, a noisy channel, and an image decoder. A document image generator is a Markov source which combines a message source with an imager. The message source produces a string of symbols which contains the information to be transmitted. The imager is modeled as a finite-state transducer, which converts the message into an ideal bitmap. The channel transforms the ideal image into a noisy observed image. The decoder estimates the message from the observed image by finding the a posteriori most probable path through the combined source and channel models using a Viterbi-like algorithm. Application of the proposed method to decoding telephone yellow pages is described.
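The decoding step can be illustrated on a deliberately small one-dimensional analogue: fixed-width character templates, an independent bit-flip channel, and a Viterbi-style dynamic program over column positions. The templates, widths, uniform character prior, and flip probability below are all assumed for illustration and are far simpler than the paper's two-dimensional Markov source models.

```python
import math
import numpy as np

# Hypothetical templates of different widths (3 rows tall).
TEMPLATES = {
    "i": np.array([[0, 1], [0, 1], [0, 1]]),           # width 2
    "m": np.array([[1, 0, 1], [1, 1, 1], [1, 0, 1]]),  # width 3
}
ALPHA = 0.1  # assumed symmetric bit-flip probability of the channel
LOG_CHAR = {c: math.log(1.0 / len(TEMPLATES)) for c in TEMPLATES}  # uniform source

def emission_logp(observed_cols, template):
    """log P(observed columns | template) under independent bit flips."""
    matches = int((observed_cols == template).sum())
    mismatches = template.size - matches
    return matches * math.log(1 - ALPHA) + mismatches * math.log(ALPHA)

def viterbi_decode(observed):
    """Best labelling/segmentation of `observed` (3 x W binary array) by dynamic
    programming over column positions, in the spirit of the paper's decoder."""
    W = observed.shape[1]
    best = [-math.inf] * (W + 1)
    back = [None] * (W + 1)
    best[0] = 0.0
    for x in range(W):
        if best[x] == -math.inf:
            continue
        for c, tpl in TEMPLATES.items():
            w = tpl.shape[1]
            if x + w > W:
                continue
            score = best[x] + LOG_CHAR[c] + emission_logp(observed[:, x:x + w], tpl)
            if score > best[x + w]:
                best[x + w], back[x + w] = score, (x, c)
    # trace back the a posteriori most probable path
    chars, x = [], W
    while x > 0:
        x, c = back[x]
        chars.append(c)
    return "".join(reversed(chars))

ideal = np.hstack([TEMPLATES[c] for c in "imi"])
print(viterbi_decode(ideal))  # -> "imi" on the noiseless image
```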


International Conference on Acoustics, Speech, and Signal Processing | 1994

Heuristic image decoding using separable source models

Anthony C. Kam; Gary E. Kopec

This paper describes an approach to reducing the computational cost of document image decoding using Markov source models. The kernel of the approach is a type of informed best-first search algorithm, called the iterated complete path (ICP) algorithm. ICP reduces computation by performing full Viterbi decoding only in those regions of the decoding trellis likely to contain the best path. These regions are identified by upper bounding the full decoding score using simple heuristic functions. Three types of heuristics have been explored, based on horizontal pixel projection, adjacent row scores, and decoding a reduced-resolution image. Speedup factors of 3-25 have been obtained using these heuristics to decode text pages and telephone yellow page columns, leading to decoding times of about 1 minute per text page and 3 minutes per yellow page column on a four-processor machine.
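The ICP algorithm itself operates on a two-dimensional decoding trellis; the sketch below only captures its central pruning idea under the assumption that the heuristic never underestimates the exact score: expensive exact scoring (standing in for full Viterbi decoding of a region) is performed lazily, and only where the cheap upper bound says the winner could be. All names and the toy data are hypothetical.

```python
import heapq
import itertools

def best_by_upper_bound(candidates, heuristic, exact_score):
    """Return the candidate with the highest exact score while calling
    `exact_score` (expensive) only when the cheap upper bound `heuristic`
    leaves that candidate in contention. Requires heuristic(c) >= exact_score(c)."""
    counter = itertools.count()  # tie-breaker so the heap never compares candidates
    heap = [(-heuristic(c), next(counter), False, c) for c in candidates]
    heapq.heapify(heap)
    while heap:
        neg, _, is_exact, c = heapq.heappop(heap)
        if is_exact:
            # Every remaining key is an upper bound <= this exact score, so c wins.
            return c, -neg
        # Replace the optimistic bound with the expensive exact score and retry.
        heapq.heappush(heap, (-exact_score(c), next(counter), True, c))
    raise ValueError("no candidates")

# toy usage: "h" is a projection-style optimistic score, "e" the exact score
regions = [{"id": 0, "h": 10, "e": 7}, {"id": 1, "h": 9, "e": 8}, {"id": 2, "h": 5, "e": 5}]
best, score = best_by_upper_bound(regions, lambda r: r["h"], lambda r: r["e"])
print(best["id"], score)  # -> 1 8; region 2 is never exactly scored
```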


International Conference on Image Processing | 1994

Document image decoding

Gary E. Kopec; Philip A. Chou

Document image decoding (DID) is a proposed generic framework for document recognition that is based on an explicit communication theory view of the processes of document creation, transmission and interpretation. This paper presents a brief summary of work in DID carried out at the Xerox Palo Alto Research Center over the past several years.


International Conference on Acoustics, Speech, and Signal Processing | 1989

An LPC-based spectral similarity measure for speech recognition in the presence of co-channel speech interference

Gary E. Kopec; Marcia A. Bush

The authors present an alternative to the enhancement paradigm for co-channel speech recognition, in which target-interference separation and target recognition occur simultaneously, driven by a model of the recognition vocabulary. The method is based on an LPC (linear predictive coding) spectral similarity measure which allows a reference spectrum to match only a subset of the poles of a noisy input spectrum, rather than requiring a whole-spectrum comparison. A preliminary evaluation of the proposed method in a speaker-trained isolated-digit recognition task suggests a reduction in error rate of 50-70% at low target-interference ratios, as compared to a conventional whole-spectrum similarity measure.
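A rough sketch of the subset-matching idea: rather than a whole-spectrum comparison, a reference model is scored by how well each of its poles is covered by some pole of the noisy input, so extra poles contributed by the interference carry no penalty. The pole-domain distance and function names below are assumptions for illustration, not the paper's LPC similarity measure.

```python
import numpy as np

def lpc_poles(a):
    """Poles of an all-pole model 1 / A(z) with A(z) = 1 + a[1] z^-1 + ... + a[p] z^-p.
    `a` is the full coefficient vector including the leading 1."""
    return np.roots(a)

def subset_pole_distance(ref_poles, obs_poles):
    """Asymmetric similarity: each reference pole is matched to its nearest
    observed pole; unmatched (interference) poles in the observation are ignored,
    unlike a whole-spectrum comparison, which would penalize them."""
    ref = np.atleast_1d(ref_poles)
    obs = np.atleast_1d(obs_poles)
    d = np.abs(ref[:, None] - obs[None, :])  # pairwise distances in the z-plane
    return float(d.min(axis=1).sum())

# toy usage: observation = target poles plus two extra interference poles
ref = lpc_poles([1.0, -1.2, 0.72])                     # hypothetical target model
obs = np.concatenate([ref, [0.3 + 0.8j, 0.3 - 0.8j]])  # target + interference
print(subset_pole_distance(ref, obs))                  # ~0: interference ignored
```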


International Conference on Image Processing | 1996

Document image decoding in the Berkeley Digital Library

Gary E. Kopec

The UC Berkeley Environmental Digital Library Project is one of six university-led projects that were initiated in the fall of 1994 as part of a four-year digital library initiative sponsored by the NSF, NASA and ARPA. The Berkeley project is particularly interesting from an image processing perspective because its testbed collection consists almost entirely of scanned materials. As a result, the Berkeley project is making extensive use of document recognition and other image analysis technology to provide content-based access to the collection. The Document Image Decoding (DID) group at Xerox PARC is a member of the Berkeley team and is investigating the application of DID techniques to providing high-quality (accurate and properly structured) transcriptions of scanned documents. The paper briefly describes the Berkeley project, discusses some of its recognition requirements and presents an example of an advanced structured document created using DID technology.


International Conference on Image Processing | 1996

Document image decoding approach to character template estimation

Gary E. Kopec; Mauricio Lomelin

An approach to supervised training of document-specific character templates from sample page images and unaligned transcriptions is presented. The template estimation problem is formulated as one of constrained maximum likelihood parameter estimation within the document image decoding (DID) framework. This leads to a two-phase iterative training algorithm consisting of transcription alignment and aligned template estimation (ATE) steps. The ATE step is the heart of the algorithm and involves assigning template pixel colors to maximize likelihood while satisfying a template disjointness constraint. In one large-scale experiment, use of document-specific templates resulted in a character error rate that was about an order of magnitude less than that of a commercial omni-font OCR program.
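The per-pixel decision in the ATE step can be sketched in a stripped-down form that ignores the template disjointness constraint: under an assumed asymmetric bit-flip channel, a template pixel is colored black exactly when the black hypothesis is more likely given its aligned observations. The channel parameters and names below are illustrative.

```python
import numpy as np

def estimate_template(samples, alpha0=0.05, alpha1=0.1):
    """Maximum-likelihood template from aligned binary glyph samples.

    samples: array of shape (n, H, W) of 0/1 glyph images, already aligned.
    alpha0 = P(observed black | template white); alpha1 = P(observed white | template black).
    Each template pixel is set to black when that choice has the higher likelihood
    for its stack of observations (the disjointness constraint is not modeled here)."""
    samples = np.asarray(samples, dtype=float)
    k = samples.sum(axis=0)  # black counts per pixel
    n = samples.shape[0]
    ll_black = k * np.log(1 - alpha1) + (n - k) * np.log(alpha1)
    ll_white = k * np.log(alpha0) + (n - k) * np.log(1 - alpha0)
    return (ll_black > ll_white).astype(int)

# toy usage: 3 noisy aligned samples of a 2x2 glyph whose true left column is black
samples = [[[1, 0], [1, 0]],
           [[1, 0], [0, 0]],
           [[1, 1], [1, 0]]]
print(estimate_template(samples))  # -> [[1 0], [1 0]] for these parameters
```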

Collaboration


Gary E. Kopec's co-authors and their affiliations.

Top Co-Authors

Marcia A. Bush

Fairchild Semiconductor International

Victor W. Zue

Massachusetts Institute of Technology
