Patrick M. Kelly
Los Alamos National Laboratory
Publication
Featured research published by Patrick M. Kelly.
IEEE Transactions on Pattern Analysis and Machine Intelligence | 1997
Judith Hochberg; Patrick M. Kelly; Timothy R. Thomas; Lila Kerns
We describe an automated script identification system for typeset document images. Templates for each script are created by clustering textual symbols from a training set. Symbols from new images are compared to the templates to find the best script. Our current system processes thirteen scripts with minimal preprocessing and high accuracy.
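The matching step described above can be sketched roughly as follows: each symbol extracted from a new image is compared against every script's templates, and the document is assigned the script whose templates match most often. The feature vectors and the Euclidean distance used here are illustrative assumptions, not the paper's actual symbol representation.

```python
import math

def dist(a, b):
    """Euclidean distance between two feature vectors (an assumption)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def identify_script(symbols, templates):
    """Vote for the script whose templates best match each symbol.

    templates: {script_name: [template_vector, ...]}
    """
    votes = {script: 0 for script in templates}
    for sym in symbols:
        best = min(templates,
                   key=lambda s: min(dist(sym, t) for t in templates[s]))
        votes[best] += 1
    return max(votes, key=votes.get)
```

With per-symbol voting, a handful of poorly matched symbols does not flip the document-level decision.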
International Journal on Document Analysis and Recognition | 1999
Judith Hochberg; Kevin Bowers; Michael Cannon; Patrick M. Kelly
A system for automatically identifying the script used in a handwritten document image is described. The system was developed using a 496-document dataset representing six scripts, eight languages, and 279 writers. Documents were characterized by the mean, standard deviation, and skew of five connected component features. A linear discriminant analysis was used to classify new documents, and tested using writer-sensitive cross-validation. Classification accuracy averaged 88% across the six scripts. The same method, applied within the Roman subcorpus, discriminated English and German documents with 85% accuracy.
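The per-document statistics used above (mean, standard deviation, and skew of each connected-component feature) can be sketched as follows; the feature values themselves are placeholders, and the downstream linear discriminant analysis is omitted.

```python
import math

def summarize(values):
    """Mean, standard deviation, and skew of one feature across a document."""
    n = len(values)
    mean = sum(values) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / n)
    # Standardized third moment; zero for a symmetric distribution.
    skew = sum(((v - mean) / sd) ** 3 for v in values) / n if sd else 0.0
    return mean, sd, skew
```

Applying `summarize` to each of the five connected-component features yields the fifteen-dimensional document descriptor the classifier operates on.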
Storage and Retrieval for Image and Video Databases | 1996
Julio E. Barros; James C. French; Worthy N. Martin; Patrick M. Kelly; T. Michael Cannon
Dissimilarity measures, the basis of similarity-based retrieval, can be viewed as a distance and a similarity-based search as a nearest neighbor search. Though there has been extensive research on data structures and search methods to support nearest-neighbor searching, these indexing and dimension-reduction methods are generally not applicable to non-coordinate data and non-Euclidean distance measures. In this paper we reexamine and extend previous work of other researchers on best match searching based on the triangle inequality. These methods can be used to organize both non-coordinate data and non-Euclidean metric similarity measures. The effectiveness of the indexes depends on the actual dimensionality of the feature set, data, and similarity metric used. We show that these methods provide significant performance improvements and may be of practical value in real-world databases.
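The triangle-inequality pruning at the heart of this approach can be sketched as below: each database object's distance to a few pivot objects is precomputed, and at query time the bound d(q, x) >= |d(q, p) - d(x, p)| lets many candidates be rejected without computing their true distance. Pivot selection and the Euclidean metric here are simplifying assumptions; the method works for any metric.

```python
import math

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def build_index(data, pivots, dist=euclidean):
    """Precompute each object's distance to every pivot."""
    return [[dist(x, p) for p in pivots] for x in data]

def best_match(query, data, pivots, index, dist=euclidean):
    q_to_p = [dist(query, p) for p in pivots]
    best, best_d = None, float("inf")
    evaluated = 0
    for x, x_to_p in zip(data, index):
        # Lower bound from the triangle inequality over all pivots.
        bound = max(abs(qp - xp) for qp, xp in zip(q_to_p, x_to_p))
        if bound >= best_d:
            continue  # cannot beat the current best; skip the real distance
        evaluated += 1
        d = dist(query, x)
        if d < best_d:
            best, best_d = x, d
    return best, best_d, evaluated
```

Because the bound is a true lower bound, pruning never discards the nearest neighbor; its effectiveness, as the abstract notes, depends on the intrinsic dimensionality of the data and metric.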
International Journal on Document Analysis and Recognition | 1999
Michael Cannon; Judith Hochberg; Patrick M. Kelly
We present a useful method for assessing the quality of a typewritten document image and automatically selecting an optimal restoration method based on that assessment. We use five quality measures that assess the severity of background speckle, touching characters, and broken characters. A linear classifier uses these measures to select a restoration method. On a 139-document corpus, our methodology reduced the corpus OCR character error rate from 20.27% to 12.60%.
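The selection step can be sketched as a linear classifier that scores each candidate restoration method from the five quality measures and picks the highest-scoring one. The method labels and weights below are made-up placeholders, not the trained coefficients or restoration methods from the paper.

```python
# Hypothetical restoration-method labels for illustration only.
METHODS = ["none", "despeckle", "thicken", "thin"]

def select_method(quality, weights, biases):
    """Pick the restoration method with the highest linear score.

    quality: the five quality measures for one document image.
    weights: {method: five-element weight vector}; biases: {method: float}.
    """
    scores = [sum(w * q for w, q in zip(weights[m], quality)) + biases[m]
              for m in METHODS]
    return METHODS[max(range(len(METHODS)), key=scores.__getitem__)]
```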
international symposium on neural networks | 1992
Patrick M. Kelly; Don R. Hush; James M. White
The learning vector quantization (LVQ) algorithm is a common method that allows a set of reference vectors for a distance classifier to adapt to a given training set. The authors have developed a similar learning algorithm, the LVQ using the Mahalanobis distance metric (LVQ-MM), which manipulates hyperellipsoidal cluster boundaries as opposed to reference vectors. Regions of the input feature space are first enclosed by ellipsoidal decision boundaries, and then these boundaries are iteratively modified to reduce classification error. Results obtained by classifying the Iris data set are provided.
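The Mahalanobis distance at the core of LVQ-MM, and the resulting hyperellipsoidal decision regions, can be sketched as follows. This shows only the initial cluster fit and the distance-based classification; the paper's iterative boundary-adjustment step is omitted.

```python
import numpy as np

def fit_cluster(points):
    """Return the mean and inverse covariance of one cluster."""
    mu = points.mean(axis=0)
    cov = np.cov(points, rowvar=False)
    return mu, np.linalg.inv(cov)

def mahalanobis_sq(x, mu, inv_cov):
    """Squared Mahalanobis distance; level sets are hyperellipsoids."""
    d = x - mu
    return float(d @ inv_cov @ d)

def classify(x, clusters):
    """Assign x to the cluster with the smallest Mahalanobis distance."""
    return min(range(len(clusters)),
               key=lambda i: mahalanobis_sq(x, *clusters[i]))
```

Unlike plain LVQ, which compares against point reference vectors, the covariance term shapes each cluster's boundary to the spread of its data.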
Proceedings of SPIE | 1993
Patrick M. Kelly; James M. White
Interpreting remotely sensed data typically requires expensive, specialized computing machinery capable of storing and manipulating large amounts of data quickly. In this paper, we present a method for accurately analyzing and categorizing remotely sensed data on much smaller, less expensive platforms. Data size is reduced in such a way as to retain the integrity of the original data, where the format of the resultant data set lends itself well to providing an efficient, interactive method of data classification.
applied imagery pattern recognition workshop | 1994
Julio E. Barros; James C. French; Worthy N. Martin; Patrick M. Kelly; James M. White
This paper discusses our view of image databases, content-based retrieval, and our experiences with an experimental system. We present a methodology in which efficient representation and indexing are the basis for retrieval of images by content as well as associated external information. In the example system images are indexed and accessed based on properties of the individual regions in the image. Regions in each image are indexed by their spectral characteristics, as well as by their shape descriptors and position information. The goal of the system is to reduce the number of images that need to be inspected by a user by quickly excluding substantial parts of the database. The system avoids exhaustive searching through the image database when a query is submitted.
applied imagery pattern recognition workshop | 1995
Patrick M. Kelly; T. Michael Cannon
This paper presents results from our experience with CANDID (comparison algorithm for navigating digital image databases), which was designed to facilitate image retrieval by content using a query-by-example methodology. A global signature describing the texture, shape, or color content is first computed for every image stored in a database, and a normalized similarity measure between probability density functions of feature vectors is used to match signatures. This method can be used to retrieve images from a database that are similar to a user-provided example image. Results for three test applications are included.
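Query-by-example retrieval in the spirit of CANDID can be sketched as below: every image carries a global signature, and signatures are ranked by a normalized similarity measure. The histogram-style signatures and the normalized inner product used here are assumptions for illustration, not the exact CANDID signature or measure, which is defined over probability density functions of feature vectors.

```python
import math

def normalize(hist):
    """Scale a signature to unit length (zero signatures pass through)."""
    s = math.sqrt(sum(v * v for v in hist))
    return [v / s for v in hist] if s else hist

def similarity(sig_a, sig_b):
    """Normalized inner product: 1.0 for identical signatures."""
    return sum(a * b for a, b in zip(normalize(sig_a), normalize(sig_b)))

def retrieve(query_sig, database, top_k=3):
    """Return the top_k image names most similar to the query signature."""
    ranked = sorted(database.items(),
                    key=lambda kv: similarity(query_sig, kv[1]),
                    reverse=True)
    return [name for name, _ in ranked[:top_k]]
```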
Storage and Retrieval for Image and Video Databases | 1996
Patrick M. Kelly; T. Michael Cannon; Julio E. Barros
The CANDID project (comparison algorithm for navigating digital image databases) employs probability density functions (PDFs) of localized feature information to represent the content of an image for search and retrieval purposes. A similarity measure between PDFs is used to identify database images that are similar to a user-provided query image. Unfortunately, signature comparison involving PDFs is a very time-consuming operation. In this paper, we look into some efficiency considerations when working with PDFs. Since PDFs can take on many forms, we look into tradeoffs between accurate representation and efficiency of manipulation for several data sets. In particular, we typically represent each PDF as a Gaussian mixture (i.e., as a weighted sum of Gaussian kernels) in the feature space. We find that by constraining all Gaussian kernels to have principal axes that are aligned to the natural axes of the feature space, computations involving these PDFs are simplified. We can also constrain the Gaussian kernels to be hyperspherical rather than hyperellipsoidal, simplifying computations even further, and yielding an order of magnitude speedup in signature comparison. This paper illustrates the tradeoffs encountered when using these constraints.
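The constrained kernels discussed above can be sketched as follows: with axis-aligned (diagonal) covariances the Gaussian density factorizes over dimensions, and with hyperspherical kernels a single shared variance per kernel suffices, which is what removes most of the per-kernel arithmetic. The mixture weights and parameters below are illustrative placeholders.

```python
import math

def diag_gauss(x, mu, var):
    """Axis-aligned Gaussian: one variance per dimension, density factorizes."""
    p = 1.0
    for xi, mi, vi in zip(x, mu, var):
        p *= math.exp(-(xi - mi) ** 2 / (2 * vi)) / math.sqrt(2 * math.pi * vi)
    return p

def sphere_gauss(x, mu, var):
    """Hyperspherical Gaussian: a single shared variance for all dimensions."""
    d = len(x)
    sq = sum((xi - mi) ** 2 for xi, mi in zip(x, mu))
    return math.exp(-sq / (2 * var)) / (2 * math.pi * var) ** (d / 2)

def mixture(x, kernels, density):
    """Evaluate a Gaussian mixture: kernels is a list of (weight, mu, var)."""
    return sum(w * density(x, mu, v) for w, mu, v in kernels)
```

When all per-dimension variances are equal, the two forms agree exactly; the spherical form just trades flexibility for fewer parameters and cheaper evaluation.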
Storage and Retrieval for Image and Video Databases | 1995
Julio E. Barros; James C. French; Worthy N. Martin; Patrick M. Kelly
Current feature-based image databases can typically perform efficient and effective searches on scalar feature information. However, many important features, such as graphs, histograms, and probability density functions, have more complex structure. Mechanisms to manipulate complex feature data are not currently well understood and must be further developed. The work we discuss in this paper explores techniques for the exploitation of spectral distribution information in a feature-based image database. A six-band image was segmented into regions and spectral information for each region was maintained. A similarity measure for the spectral information is proposed and experiments are conducted to test its effectiveness. The objective of our current work is to determine if these techniques are effective and efficient at managing this type of image feature data.