John C. Handley
Xerox
Publication
Featured research published by John C. Handley.
Data Mining and Knowledge Discovery | 2007
Eamonn J. Keogh; Stefano Lonardi; Chotirat Ann Ratanamahatana; Li Wei; Sang-Hee Lee; John C. Handley
The vast majority of data mining algorithms require the setting of many input parameters. The dangers of working with parameter-laden algorithms are twofold. First, incorrect settings may cause an algorithm to fail to find the true patterns. Second, a perhaps more insidious problem is that the algorithm may report spurious patterns that do not really exist, or greatly overestimate the significance of the reported patterns. This is especially likely when the user fails to understand the role of parameters in the data mining process. Data mining algorithms should have as few parameters as possible. A parameter-light algorithm would limit our ability to impose our prejudices, expectations, and presumptions on the problem at hand, and would let the data itself speak to us. In this work, we show that recent results in bioinformatics, learning, and computational theory hold great promise for a parameter-light data-mining paradigm. The results are strongly connected to Kolmogorov complexity theory. However, as a practical matter, they can be implemented using any off-the-shelf compression algorithm with the addition of just a dozen lines of code. We will show that this approach is competitive with or superior to many of the state-of-the-art approaches in anomaly/interestingness detection, classification, and clustering, with empirical tests on time series/DNA/text/XML/video datasets. As further evidence of the advantages of our method, we will demonstrate its effectiveness in solving a real-world classification problem in recommending printing services and products.
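The abstract does not spell out the measure itself, so the following is only a minimal sketch of one compression-based dissimilarity that fits the description: the compressed size of two concatenated inputs divided by the sum of their individual compressed sizes. The function names and the choice of `zlib` as the off-the-shelf compressor are assumptions made for illustration, not details taken from the paper.

```python
import random
import zlib


def compressed_size(data: bytes) -> int:
    """Size in bytes of the zlib-compressed input (compressor choice is an assumption)."""
    return len(zlib.compress(data, 9))


def compression_dissimilarity(x: bytes, y: bytes) -> float:
    """Ratio C(xy) / (C(x) + C(y)).

    Values well below 1 suggest that x and y share structure the compressor
    can exploit; values near 1 suggest they have little in common.
    """
    return compressed_size(x + y) / (compressed_size(x) + compressed_size(y))


# Illustrative inputs: two copies of a repetitive sequence and one noisy sequence.
rng = random.Random(0)
repetitive_a = b"abcabcabcabc" * 50
repetitive_b = b"abcabcabcabc" * 50
noisy = bytes(rng.randrange(256) for _ in range(600))

similar_score = compression_dissimilarity(repetitive_a, repetitive_b)
different_score = compression_dissimilarity(repetitive_a, noisy)
```

In this toy setup the pair of repetitive sequences scores lower than the repetitive/noisy pair, and no parameter beyond the compressor itself has to be tuned.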
Paleobiology | 2009
Linda C. Ivany; Carlton E. Brett; Heather L. B. Wall; Patrick D. Wall; John C. Handley
The concept of coordinated stasis, manifest as a pattern of long intervals of concurrent taxonomic and ecologic persistence separated by comparatively abrupt periods of biotic change, has been challenged in recent studies that claim a lack of prolonged persistence of taxa and associations. A key problem has been the difficulty of distinguishing faunal change owing to localized, short-term environmental fluctuation or patchiness from that indicating regionally pervasive, long-term evolutionary or ecological change. Here, we use an extensive database from the Middle Devonian Hamilton Group of the Appalachian Basin to test for taxonomic and ecologic persistence within this ecological-evolutionary subunit, a succession of purported relative stability. Replicate samples collected from many localities and stratigraphic horizons over a wide geographic area allow us to address the effects of small-scale environmental variation and localized faunal patchiness while exploring basin-scale variation in faunal composition within and between the formations of the Hamilton Group. Observed stratigraphic distributions of fossils are consistent with a scenario in which all taxa are present from bottom to top of the Hamilton Group, and absences result only from sampling failure. Although small-scale variation in faunal composition indeed does occur, there is no more variation among formations than occurs within them. Assemblages from different formations, whether they are defined by taxonomic or ecologic composition, are statistically indistinguishable according to several independent metrics, including ANOSIM and a maximum likelihood estimation that evaluates stratigraphic turnover using the Bayesian Information Criterion. Simulated data sets indicate that test results are most consistent with species-level extinction of 2.6% per Myr within the Hamilton Group, far lower than the Givetian rate of 11.5% per Myr generic extinction derived from a global database. Such faunal persistence over the ∼5.5 Myr encompassed by this unit is consistent with the pattern of coordinated stasis. Earlier studies showing greater amounts of temporal turnover in Hamilton Group faunas are likely influenced by their smaller geographic scale of analysis, suggesting that regional studies done elsewhere may yield similar results.
International Conference on Image Processing | 2002
Salil Prabhakar; Hui Cheng; John C. Handley; Zhigang Fan; Ying-wei Lin
High-level (semantic) image classification can be achieved by analysis of low-level image attributes geared for the particular classes. In this paper, we propose a novel application of known image processing and classification techniques to achieve such a high-level classification of color images. Our image classification algorithm uses three low-level image features (texture, color, and edge characteristics) to classify a color image into one of two classes: business graphics or natural picture. We have achieved an accuracy of 96.6% on our database of 209 images using a combination of tree and neural network classifiers.
Paleobiology | 2012
Jocelyn A. Sessa; Timothy J. Bralower; Mark E. Patzkowsky; John C. Handley; Linda C. Ivany
The late Mesozoic through early Cenozoic is an interval of significant biologic turnover and ecologic reorganization within marine assemblages, but the timing and causes of these changes remain poorly understood. Here, we quantify the pattern and timing of shifts in the diversity (richness and evenness) and ecology of local (i.e., sample level) mollusk-dominated assemblages during this critical interval using field-collected and published data sets from the U.S. Gulf Coastal Plain. We test whether the biologic and ecologic patterns observed primarily at the global level during this time are also expressed at the local level, and whether the end-Cretaceous (K/Pg) mass extinction and recovery moderated these trends. To explore whether environment had any effect on these patterns, we examine data from shallow subtidal and offshore settings. Assemblages from both settings recovered to pre-extinction diversity levels rapidly, in less than 7 million years. Following initial recovery, diversity remained unchanged in both settings. The trajectory of ecological restructuring was distinct for each setting in the wake of the K/Pg extinction. In offshore assemblages, the abundance and number of predatory carnivorous taxa dramatically increased, and surficial sessile suspension feeders were replaced by more active suspension feeders. In contrast, shallow subtidal assemblages did not experience ecological reorganization following the K/Pg extinction. The distinct ecological patterns displayed in each environment follow onshore-offshore patterns of innovation, whereby evolutionary novelties first appear in onshore settings relative to offshore habitats. Increased predation pressure may explain the significant ecological restructuring of offshore assemblages, whereby the explosive radiation of predators drove changes in their prey. Habitat-specific ecological restructuring, and its occurrence solely during the recovery interval, implies that disturbance and incumbency were also key in mediating these ecological changes.
International Conference on Document Analysis and Recognition | 2005
John C. Handley; Anoop M. Namboodiri; Richard Zanibbi
We present a document understanding system in which the arrangement of lines of text and block separators within a document are modeled by stochastic context free grammars. A grammar corresponds to a document genre; our system may be adapted to a new genre simply by replacing the input grammar. The system incorporates an optical character recognition system that outputs characters, their positions and font sizes. These features are combined to form a document representation of lines of text and separators. Lines of text are labeled as tokens using regular expression matching. The maximum likelihood parse of this stream of tokens and separators yields a functional labeling of the document lines. We describe business card and business letter applications.
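The token-labeling step described above, in which each recognized line of text is assigned a token type by regular expression matching before parsing, might look roughly like the sketch below. The token names and patterns are hypothetical examples for a business-card genre. They are not taken from the system described, and the grammar-based parsing stage is not shown.

```python
import re

# Hypothetical token patterns for a business-card genre (illustrative only).
TOKEN_PATTERNS = [
    ("EMAIL", re.compile(r"^[\w.+-]+@[\w-]+(\.[\w-]+)+$")),
    ("PHONE", re.compile(r"^\+?[\d\s().-]{7,}$")),
    ("URL", re.compile(r"^(https?://|www\.)\S+$", re.IGNORECASE)),
]


def label_line(line: str) -> str:
    """Return the name of the first pattern matching the whole line, else 'TEXT'."""
    text = line.strip()
    for name, pattern in TOKEN_PATTERNS:
        if pattern.match(text):
            return name
    return "TEXT"


# Lines as they might come out of OCR for a single card (made-up data).
card_lines = ["Jane Doe", "+1 (555) 010-1234", "jane.doe@example.com", "www.example.com"]
token_stream = [label_line(line) for line in card_lines]
```

A stream of labels like this, interleaved with separators, is the kind of input a grammar-based parser could then assign functional roles to.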
International Conference on Image Processing | 2002
Salil Prabhakar; Hui Cheng; Raja Bala; John C. Handley; Ying-wei Lin
Business graphics are an important class of digital imagery. Such images are computer-generated, and comprise synthetic elements such as solid fills, line art, and color sweeps. Often these images are first printed and then scanned for further electronic reuse. The printing and scanning process destroys the synthetic structure of a graphics image, and furthermore introduces distortions due to halftoning and other forms of printer and scanner noise. Subsequent reproductions usually amplify these distortions thus resulting in rapid degradation of image quality. It would thus be desirable to detect and reconstruct the original synthetic structure from the scanned image. This paper presents an effort in this direction, namely a method to detect color sweeps in scanned images. Once detected, the synthetic signature of the sweep is derived, namely its starting and ending color. This information can be used to optimize subsequent image processing operations such as rendering to an output device, or image compression. This work represents a novel application of known image processing techniques to extract semantic information from graphics images.
Document Recognition and Retrieval | 2000
John C. Handley
A table in a document is a rectilinear arrangement of cells where each cell contains a sequence of words. Several lines of text may compose one cell. Cells may be delimited by horizontal or vertical lines, but often this is not the case. A table analysis system is described which reconstructs table formatting information from table images whether or not the cells are explicitly delimited. Inputs to the system are word bounding boxes and any horizontal and vertical lines that delimit cells. Using a sequence of carefully-crafted rules, multi-line cells and their interrelationships are found even though no explicit delimiters are visible. This robust system is a component of a commercial document recognition system.
PALAIOS | 2009
John C. Handley; H. David Sheets; Charles E. Mitchell
The hypothesis of coordinated stasis (CS) holds that taxa within ecological communities show a pattern of persistence over geologic time (faunal stability). This hypothesis has been examined by looking for evidence of stasis and change in paleocommunity structure based on patterns of taxon abundances obtained from bulk samples of fossil assemblages. Community structure based on taxon counts is often investigated using distance-based clustering methods, employing Analysis of Similarity (ANOSIM) or other multivariate statistical tools. We propose a new method for analyzing trends in community structure by viewing taxon counts from bulk samples as a time series and assessing stasis and change based on a probabilistic assessment of the continuity of the species distribution patterns. In this flexible approach, taxon counts from samples ordered in time or space (or grouped by a geologically informed hypothesis) are modeled as a sequence of multinomial or Bernoulli outcomes drawn from an underlying ecological distribution. The optimal model of community structure may then be chosen from a set of hypotheses about those distributions, based on Akaike's Information Criterion, an information-theoretic measure of fit that penalizes the likelihood of the fitted model by the number of parameters needed to attain the fit. CS is the model in which the underlying probabilities for each sample are constant across different samples. In other words, under CS the most likely scenario is that all samples are drawn from the same underlying taxon abundance distribution. We propose that this approach is a powerful and flexible method to statistically assess CS, as well as other hypotheses about community structure, and we demonstrate this method using paleoecological data from the literature.
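As a rough illustration of the model comparison described above, the sketch below scores two hypotheses for a set of taxon-count samples with Akaike's Information Criterion (AIC = 2k - 2 ln L): a stasis model in which every sample shares one multinomial distribution, and a change model in which every sample has its own. The function names and the count data are invented for illustration, only these two extreme hypotheses are considered, and this is not the authors' implementation.

```python
import math


def multinomial_loglik(counts, probs):
    """Multinomial log-likelihood without the combinatorial coefficient.

    The coefficient depends only on the counts, so it is identical for every
    model fitted to the same data and cancels when AIC values are compared.
    """
    return sum(n * math.log(p) for n, p in zip(counts, probs) if n > 0)


def aic_stasis(samples):
    """AIC when all samples share one taxon abundance distribution."""
    n_taxa = len(samples[0])
    totals = [sum(column) for column in zip(*samples)]
    grand_total = sum(totals)
    probs = [t / grand_total for t in totals]  # pooled maximum likelihood estimate
    loglik = sum(multinomial_loglik(s, probs) for s in samples)
    return 2 * (n_taxa - 1) - 2 * loglik


def aic_change(samples):
    """AIC when each sample has its own taxon abundance distribution."""
    n_taxa = len(samples[0])
    loglik = 0.0
    for s in samples:
        total = sum(s)
        probs = [c / total for c in s]  # per-sample maximum likelihood estimate
        loglik += multinomial_loglik(s, probs)
    return 2 * len(samples) * (n_taxa - 1) - 2 * loglik


# Made-up counts for three taxa in three samples ordered in time.
stable_samples = [[50, 30, 20], [48, 32, 20], [52, 29, 19]]
shifting_samples = [[80, 15, 5], [50, 30, 20], [10, 30, 60]]

stable_prefers_stasis = aic_stasis(stable_samples) < aic_change(stable_samples)
shifting_prefers_change = aic_change(shifting_samples) < aic_stasis(shifting_samples)
```

The model with the lower AIC is preferred. With the nearly constant counts the extra parameters of the change model are not worth their penalty, while the strongly shifting counts favor the change model.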
Systems, Man, and Cybernetics | 1998
John C. Handley
Optical character recognition is perhaps the most studied application of pattern recognition. Recent work has increased accuracy in two ways. Combination of individual classifier outputs overcomes deficiencies of features and trainability of single classifiers. OCR systems take page images as input and output strings of recognized characters. Due to character segmentation errors, characters can be split or merged preventing output combination character-by-character. Merging of output strings is done using string alignment algorithms.
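To make the alignment step concrete, here is a minimal sketch of a global string alignment based on edit distance, applied to two hypothetical OCR outputs for the same word. It only lines up the two strings so that they can be compared position by position. The unit costs, the gap symbol, and the function name are assumptions, and the paper's actual alignment and voting scheme may differ.

```python
def align(a: str, b: str, gap: str = "-"):
    """Globally align two strings with unit-cost edits; return the padded strings."""
    n, m = len(a), len(b)
    # cost[i][j] holds the edit distance between a[:i] and b[:j].
    cost = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        cost[i][0] = i
    for j in range(1, m + 1):
        cost[0][j] = j
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            substitute = cost[i - 1][j - 1] + (a[i - 1] != b[j - 1])
            cost[i][j] = min(substitute, cost[i - 1][j] + 1, cost[i][j - 1] + 1)

    # Trace back from the bottom-right corner to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and cost[i][j] == cost[i - 1][j - 1] + (a[i - 1] != b[j - 1]):
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif i > 0 and cost[i][j] == cost[i - 1][j] + 1:
            out_a.append(a[i - 1])
            out_b.append(gap)
            i -= 1
        else:
            out_a.append(gap)
            out_b.append(b[j - 1])
            j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b))


# One recognizer reads "modern"; another merges "rn" into "m" (made-up outputs).
aligned_a, aligned_b = align("modern", "modem")
```

Once the strings have equal length, each aligned column can be treated as a set of candidate characters, with a gap marking a split or merged character, and combined by whatever voting rule is preferred.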
PLOS ONE | 2013
Judith Nagel-Myers; Gregory P. Dietl; John C. Handley; Carlton E. Brett
The fossil record is the only source of information on the long-term dynamics of species assemblages. Here we assess the degree of ecological stability of the epifaunal pterioid bivalve assemblage (EPBA), which is part of the Middle Devonian Hamilton fauna of New York, the type example of the pattern of coordinated stasis, in which long intervals of faunal persistence are terminated by turnover events induced by environmental change. Previous studies have used changes in abundance structure within specific biofacies as evidence for a lack of ecological stability of the Hamilton fauna. By comparing data on relative abundance, body size, and predation, indexed as the frequency of unsuccessful shell-crushing attacks, of the EPBA, we show that abundance structure varied through time, but body-size structure and predation pressure remained relatively stable. We suggest that the energetic set-up of the Hamilton fauna's food web was able to accommodate changes in species attributes, such as fluctuating prey abundances. Ecological redundancy in prey resources, adaptive foraging of shell-crushing predators (arising from predator behavioral or adaptive switching in prey selection in response to changing prey abundances), and allometric scaling of predator-prey interactions are discussed as potential stabilizing factors contributing to the persistence of the Hamilton fauna's EPBA. Our study underscores the value and importance of multiple lines of evidence in tests of ecological stability in the fossil record.