Philip A. Chou
Xerox
Publication
Featured research published by Philip A. Chou.
IS&T/SPIE's Symposium on Electronic Imaging: Science & Technology | 1995
Gary E. Kopec; Philip A. Chou; David A. Maltz
This paper describes a Markov source model for a simple subset of printed music notation. The model is based on the Adobe Sonata music symbol set and a message language of our own design. Chord imaging is the most complex part of the model. Much of the complexity follows from a rule of music typography that requires the noteheads for adjacent pitches to be placed on opposite sides of the chord stem. This rule leads to a proliferation of cases for other typographic details such as dot placement. We describe the language of message strings accepted by the model and discuss some of the imaging issues associated with various aspects of the message language. We also point out some aspects of music notation that appear problematic for a finite-state representation. Development of the model was greatly facilitated by the duality between image synthesis and image decoding. Although our ultimate objective was a music image model for use in decoding, most of the development proceeded by using the evolving model for image synthesis, since it is computationally far less costly to image a message than to decode an image.
IS&T/SPIE's Symposium on Electronic Imaging: Science & Technology | 1995
Philip A. Chou; Gary E. Kopec
Document image decoding (DID) refers to the process of document recognition within a communication theory framework. In this framework, a logical document structure is a message communicated by encoding the structure as an ideal image, transmitting the ideal image through a noisy channel, and decoding the degraded image into a logical structure as close to the original message as possible, on average. Thus document image decoding is document image recognition where the recognizer performs optimal reconstruction by explicitly modeling the source of logical structures, the encoding procedure, and the channel noise. In previous work, we modeled the source and encoder using probabilistic finite-state automata and transducers. In this paper, we generalize the source and encoder models using context-free attribute grammars. We employ these models in a document image decoder that uses a dynamic programming algorithm to minimize the probability of error between original and reconstructed structures. The dynamic programming algorithm is a generalization of the Cocke-Younger-Kasami parsing algorithm.
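For orientation, the Cocke-Younger-Kasami algorithm mentioned above can be sketched in its ordinary one-dimensional, probabilistic form. This is a minimal illustration under stated assumptions, not the generalized decoder of the paper: it handles only a string of tokens and a grammar in Chomsky normal form, with no attributes and no image model. The function name `cyk_best_parse` and the toy `Column`/`Listing` grammar are hypothetical.

```python
import math
from collections import defaultdict


def cyk_best_parse(tokens, lexical, binary, start):
    """Most probable parse of `tokens` under a grammar in Chomsky normal form.

    lexical: {(A, word): prob} for rules A -> word
    binary:  {(A, B, C): prob} for rules A -> B C
    Returns (log_prob, tree), or None if no parse rooted at `start` exists.
    """
    n = len(tokens)
    # chart[(i, j)][A] = (log_prob, tree) of the best derivation of tokens[i:j] from A
    chart = defaultdict(dict)

    # Spans of length 1: apply lexical rules.
    for i, word in enumerate(tokens):
        cell = chart[(i, i + 1)]
        for (a, w), p in lexical.items():
            if w == word:
                lp = math.log(p)
                if a not in cell or lp > cell[a][0]:
                    cell[a] = (lp, (a, word))

    # Longer spans: combine the best sub-derivations at every split point k.
    for span in range(2, n + 1):
        for i in range(n - span + 1):
            j = i + span
            cell = chart[(i, j)]
            for k in range(i + 1, j):
                left, right = chart[(i, k)], chart[(k, j)]
                for (a, b, c), p in binary.items():
                    if b in left and c in right:
                        lp = math.log(p) + left[b][0] + right[c][0]
                        if a not in cell or lp > cell[a][0]:
                            cell[a] = (lp, (a, left[b][1], right[c][1]))

    return chart[(0, n)].get(start)


# Hypothetical toy grammar: a column is two or more listings,
# and each listing is a name followed by a number.
LEXICAL = {("Name", "name"): 1.0, ("Number", "number"): 1.0}
BINARY = {
    ("Listing", "Name", "Number"): 1.0,
    ("Column", "Listing", "Listing"): 0.5,
    ("Column", "Listing", "Column"): 0.5,
}

result = cyk_best_parse(["name", "number", "name", "number"], LEXICAL, BINARY, "Column")
```

Here `result` holds the log-probability and the tree of the best parse. A token sequence the toy grammar cannot derive, such as `["name", "name"]`, yields `None`.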
IS&T/SPIE's Symposium on Electronic Imaging: Science and Technology | 1993
Gary E. Kopec; Philip A. Chou
This paper describes a communication theory approach to document image recognition, patterned after the use of hidden Markov models in speech recognition. In general, a document recognition problem is viewed as consisting of three elements -- an image generator, a noisy channel, and an image decoder. A document image generator is a Markov source (stochastic finite-state automaton) which combines a message source with an imager. The message source produces a string of symbols, or text, which contains the information to be transmitted. The imager is modeled as a finite-state transducer which converts the one-dimensional message string into an ideal two-dimensional bitmap. The channel transforms the ideal image into a noisy observed image. The decoder estimates the message, given the observed image, by finding the a posteriori most probable path through the combined source and channel models using a Viterbi-like dynamic programming algorithm. The proposed approach has been applied to the problem of decoding scanned telephone yellow pages to extract names and numbers from the listings. A finite-state model for yellow page columns was constructed and used to decode a database of scanned column images containing about 1100 individual listings. Overall, 99.5% of the listings were correctly recognized, with character classification rates of 98% and 99.6%, respectively, for the names and numbers.
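The decoding step described above, finding the most probable path through a Markov source given noisy observations, can be illustrated with the textbook Viterbi algorithm on a one-dimensional toy problem. This is a sketch under stated assumptions rather than the paper's decoder: the two states (`ink`, `blank`), the transition probabilities, and the 10% pixel-flip channel are all hypothetical, and the real model operates on two-dimensional bitmaps.

```python
import math


def viterbi(observations, states, log_start, log_trans, log_emit):
    """Return (best_log_prob, best_path) for a non-empty observation sequence."""
    # best[s] = log-probability of the best state path ending in s
    best = {s: log_start[s] + log_emit[s][observations[0]] for s in states}
    back = []  # back[t][s] = best predecessor of state s at step t + 1
    for obs in observations[1:]:
        prev, best, pointers = best, {}, {}
        for s in states:
            p = max(states, key=lambda r: prev[r] + log_trans[r][s])
            best[s] = prev[p] + log_trans[p][s] + log_emit[s][obs]
            pointers[s] = p
        back.append(pointers)

    # Trace back from the best final state.
    last = max(states, key=lambda s: best[s])
    path = [last]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    path.reverse()
    return best[last], path


# Hypothetical toy model: the source emits runs of "ink" or "blank"; the ideal
# pixel is 1 for ink and 0 for blank; the channel flips a pixel with probability 0.1.
STATES = ["ink", "blank"]
LOG_START = {s: math.log(0.5) for s in STATES}
LOG_TRANS = {
    r: {s: math.log(0.8 if r == s else 0.2) for s in STATES} for r in STATES
}
LOG_EMIT = {
    "ink": {1: math.log(0.9), 0: math.log(0.1)},
    "blank": {0: math.log(0.9), 1: math.log(0.1)},
}


def decode(observed_pixels):
    """Decode a noisy pixel sequence into the most probable state path."""
    return viterbi(observed_pixels, STATES, LOG_START, LOG_TRANS, LOG_EMIT)


log_prob, path = decode([1, 1, 0, 1, 1])
```

With these numbers, the isolated `0` in the middle of the sequence is more cheaply explained as channel noise than as two state changes, so the decoded path stays in `ink` throughout.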
IS&T/SPIE 1994 International Symposium on Electronic Imaging: Science and Technology | 1994
Gary E. Kopec; Philip A. Chou
Document image decoding (DID) is a recently proposed generic framework for document recognition that is based on an explicit communication theory view of the processes of document creation, transmission, and interpretation. DID views a document recognition problem as containing four elements -- a message (information) source, an encoder (formatter and renderer), a noisy channel (e.g., printer, scanner) and an image decoder (recognizer). Application of DID to a particular recognition problem involves developing stochastic models for the source, encoder, and channel processes. The DID approach to modeling is based on the use of stochastic attributed context-free grammars. DID supports an approach to image decoding whose kernel is an informed best-first search algorithm, called the iterated complete path (ICP) algorithm, that is similar to branch-and-bound and related heuristic search and optimization techniques. The inputs to the decoder generator are a Markov source model and values for channel parameters. The generator creates the necessary computation schedules and outputs an optimized in-line C program that implements the decoder. The customized decoder program is then compiled, linked with a support library and used to decode images.
Archive | 1994
Mohan Vishwanath; Philip A. Chou
Archive | 1995
Gary E. Kopec; Philip A. Chou; Leslie T. Niles
Archive | 1995
Gary E. Kopec; Philip A. Chou; Leslie T. Niles
Archive | 1995
Gary E. Kopec; Philip A. Chou
Archive | 1995
Gary E. Kopec; Philip A. Chou; Leslie T. Niles
Archive | 1994
Vijay Balasubramanian; Francine R. Chen; Philip A. Chou; Donald G. Kimber; Alex D. Poon; Karon A. Weber; Lynn D. Wilcox