C. Taillie
Pennsylvania State University
Publications
Featured research published by C. Taillie.
Journal of the American Statistical Association | 1982
G. P. Patil; C. Taillie
Abstract This paper puts forth the view that diversity is an average property of a community and identifies that property as species rarity. An intrinsic diversity ordering of communities is defined and is shown to be equivalent to stochastic ordering. Also, the sensitivity of an index to rare species is developed, culminating in a crossing-point theorem and a response theory to perturbations. Diversity decompositions, analogous to the analysis of variance, are discussed for two-way classifications and mixtures. The paper concludes with a brief survey of genetic diversity, linguistic diversity, industrial concentration, and income inequality.
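The "diversity as average rarity" view can be illustrated with a minimal sketch. It assumes the one-parameter rarity family R(p) = (1 - p^beta) / beta, which the abstract does not spell out; the function name `diversity` and the example community are hypothetical.

```python
import math

def diversity(p, beta):
    """Average rarity sum_i p_i * R(p_i), assuming R(p) = (1 - p**beta) / beta.

    beta = -1 gives S - 1 (species count), beta -> 0 gives the Shannon
    index, and beta = 1 gives the Simpson index 1 - sum p_i**2.
    """
    p = [x for x in p if x > 0]
    if abs(beta) < 1e-12:
        # Limiting rarity as beta -> 0 is -log(p).
        return sum(x * -math.log(x) for x in p)
    return sum(x * (1 - x ** beta) / beta for x in p)

# Hypothetical three-species community (relative abundances sum to 1).
community = [0.5, 0.3, 0.2]
species_count = diversity(community, -1)  # S - 1
shannon = diversity(community, 0)
simpson = diversity(community, 1)
```

Lower values of beta weight rare species more heavily, which is one way to read the abstract's remark about an index's sensitivity to rare species.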
Environmental and Ecological Statistics | 2004
G. P. Patil; C. Taillie
There is a declared need for statistical science and software infrastructure that supports geoinformatic surveillance through spatial and spatiotemporal hotspot detection. A hotspot is something unusual: an anomaly, aberration, outbreak, elevated cluster, critical resource area, etc. The declared need may be for monitoring, etiology, management, or early warning. The responsible factors may be natural, accidental, or intentional. This proof-of-concept paper suggests methods and tools for hotspot detection across geographic regions and across networks. The investigation proposes development of statistical methods and tools that have immediate potential for use in critical societal areas, such as public health and disease surveillance, ecosystem health, water resources and water services, transportation networks, persistent poverty typologies and trajectories, environmental justice, biosurveillance and biosecurity, among others. We introduce, for multidisciplinary use, an innovation of the circle-based spatial and spatiotemporal scan statistic that is popular in the health area. Our innovation employs the notion of an upper level set and is accordingly called the upper level set scan statistic; it points to a sophisticated analytical and computational system as the next generation of the currently popular SaTScan. The success of surveillance rests on the capability to detect elevated clusters. But clusters can be of any shape and cannot all be captured by circles; relying on circles alone is likely to produce more false alarms and a greater false sense of security. What we need is the capability to detect arbitrarily shaped clusters. The proposed upper level set scan statistic is expected to fill this need.
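As a rough illustration of how upper level sets can yield arbitrarily shaped candidate zones, the sketch below enumerates the connected components of {cells with rate >= g} for each observed rate level g. The abstract does not give this construction in detail, so treat it as an assumption; the function name, the cell labels, and the rates are hypothetical, and the likelihood evaluation of each zone is omitted.

```python
def upper_level_set_zones(rates, neighbors):
    """Return candidate zones: connected components of {cell: rate >= g}
    for every observed rate level g (illustrative construction only)."""
    zones = set()
    for g in sorted(set(rates.values()), reverse=True):
        active = {c for c, r in rates.items() if r >= g}
        seen = set()
        for start in sorted(active):
            if start in seen:
                continue
            # Depth-first search restricted to the active cells.
            component, stack = set(), [start]
            while stack:
                cell = stack.pop()
                if cell in component:
                    continue
                component.add(cell)
                stack.extend(n for n in neighbors[cell]
                             if n in active and n not in component)
            seen |= component
            zones.add(frozenset(component))
    return zones

# Hypothetical strip of five cells A-B-C-D-E with made-up rates.
rates = {"A": 0.9, "B": 0.8, "C": 0.1, "D": 0.7, "E": 0.2}
neighbors = {"A": ["B"], "B": ["A", "C"], "C": ["B", "D"],
             "D": ["C", "E"], "E": ["D"]}
zones = upper_level_set_zones(rates, neighbors)
```

The candidate zones follow the adjacency structure of the cells rather than a fixed geometric shape, which is the contrast with circle-based scanning that the abstract draws.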
Environmental and Ecological Statistics | 2004
G. P. Patil; C. Taillie
This paper is concerned with the question of ranking a finite collection of objects when a suite of indicator values is available for each member of the collection. The objects can be represented as a cloud of points in indicator space, but the different indicators (coordinate axes) typically convey different comparative messages and there is no unique way to rank the objects while taking all indicators into account. A conventional solution is to assign a composite numerical score to each object by combining the indicator information in some fashion. Consciously or otherwise, every such composite involves judgments (often arbitrary or controversial) about tradeoffs or substitutability among indicators. Rather than trying to combine indicators, we take the view that the relative positions in indicator space determine only a partial ordering and that a given pair of objects may not be inherently comparable. Working with Hasse diagrams of the partial order, we study the collection of all rankings that are compatible with the partial order (linear extensions). In this way, an interval of possible ranks is assigned to each object. The intervals can be very wide, however. Noting that ranks near the ends of each interval are usually infrequent under linear extensions, a probability distribution is obtained over the interval of possible ranks. This distribution, called the rank-frequency distribution, turns out to be unimodal (in fact, log-concave) and represents the degree of ambiguity involved in attempting to assign a rank to the corresponding object. Stochastic ordering of probability distributions imposes a partial order on the collection of rank-frequency distributions. This collection of distributions is in one-to-one correspondence with the original collection of objects and the induced ordering on these objects is called the cumulative rank-frequency (CRF) ordering; it extends the original partial order. 
Although the CRF ordering need not be linear, it can be iterated to yield a fixed point of the CRF operator. We hypothesize that the fixed points of the CRF operator are exactly the linear orderings. The CRF operator treats each linear extension as an equal “voter” in determining the CRF ranking. It is possible to generalize to a weighted CRF operator by giving linear extensions differential weights either on mathematical grounds (e.g., number of jumps) or empirical grounds (e.g., indicator concordance). Explicit enumeration of all possible linear extensions is computationally impractical unless the number of objects is quite small. In such cases, the rank-frequencies can be estimated using discrete Markov chain Monte Carlo (MCMC) methods.
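For a very small collection, the rank-frequency distributions described above can be obtained by brute-force enumeration of linear extensions. The sketch below is illustrative only: the function name, the three-object poset, and the convention that rank 1 comes first are assumptions, and the enumeration is factorial in the number of objects, which is why the abstract points to MCMC for larger cases.

```python
from collections import Counter
from itertools import permutations

def rank_frequencies(objects, below):
    """Enumerate linear extensions by brute force and tabulate ranks.

    `below` is a set of pairs (a, b) meaning a precedes b in the partial
    order. Returns {object: {rank: relative frequency}}, rank 1 = first.
    """
    counts = {o: Counter() for o in objects}
    n_ext = 0
    for perm in permutations(objects):
        pos = {o: i for i, o in enumerate(perm)}
        # Keep only orderings compatible with every comparable pair.
        if all(pos[a] < pos[b] for a, b in below):
            n_ext += 1
            for o in objects:
                counts[o][pos[o] + 1] += 1
    return {o: {r: c / n_ext for r, c in sorted(counts[o].items())}
            for o in objects}

# Hypothetical poset: a precedes b; c is comparable to neither.
freqs = rank_frequencies(["a", "b", "c"], {("a", "b")})
```

Here the three linear extensions are (a, b, c), (a, c, b), and (c, a, b), so object c can occupy any rank with equal frequency while a and b each have a two-rank interval.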
Environmental and Ecological Statistics | 1995
Amarjot Kaur; G. P. Patil; A. K. Sinha; C. Taillie
The paper provides an up-to-date annotated bibliography of the literature on ranked set sampling. The bibliography includes all pertinent papers known to the authors, and is intended to cover applications as well as theoretical developments. The annotations are arranged in chronological order and are intended to be sufficiently complete and detailed that a reading from beginning to end would provide a statistically mature reader with a state-of-the-art survey of ranked set sampling, including historical development, current status, and future research directions and applications. A final section of the paper gives a listing of all annotated papers, arranged in alphabetical order by author.
Environmental and Ecological Statistics | 1999
G. P. Patil; A. K. Sinha; C. Taillie
Abu-Dayyeh, H., and Muttlak, H.A. (1996). Using ranked set sampling for hypothesis tests on the scale parameter of the exponential and uniform distributions. Pakistan Journal of Statistics, 12, 131–138. Aragon, E., Gore, S.D., and Patil, G.P. (1994). Environmental sampling, observational economy, and extreme values. Parisankhyan Samikkha, 1, 71–81. Aragon, E., Patil, G.P., and Taillie, C. (1999). A performance indicator for ranked set sampling using ranking error probability matrix. Environmental and Ecological Statistics, 6. (To appear). Balakrishnan, N., and Rao, C.R. (1997). A note on the best linear unbiased estimation based on order statistics. American Statistician, 51, 181–185. Balasubramanian, K., and Balakrishnan, N. (1993). Duality principle in order statistics. Journal of the Royal Statistical Society B, 687–691. Barabesi, L. (1998). The computation of the distribution of the sign test statistics for ranked set sampling. Communication in Statistics, Simulation and Computation, 27, 833–842. Barabesi, L., and Fattorini, L. (1994). Kernel estimators for the intensity of a spatial point process by point-to-plant distances: Random sampling vs. ranked set sampling. TR 94-0402, Center for Statistical Ecology and Environmental Statistics, Department of Statistics, Penn State University, University Park, PA. Barnett, V. (1999). Ranked set sample design for environmental investigations. Environmental and Ecological Statistics, 6. (To appear). Barnett, V., and Moore, K. (1997). Best linear unbiased estimates in ranked set sampling with particular reference to imperfect ordering. Journal of Applied Statistics, 24, 697–710. Barreto, M.C.M., and Barnett, V. (1999). Best linear unbiased estimators for the simple linear regression model using ranked-set sampling. Environmental and Ecological Statistics, 6. (To appear). Bhoj, D.S. (1997). Estimation of parameters of the extreme value distribution using ranked set sampling. Communications in Statistics – Theory and Methods, 26, 653–. Bhoj, D.S., and Ahsanullah, M. (1996). Estimation of parameters of the generalized geometric distribution using ranked set sampling. Biometrics, 52, 685–694. Bohn, L.L. (1993). A two-sample procedure for ranked-set samples. Technical Report Number 426, Department of Statistics, University of Florida, Gainesville, FL. Bohn, L.L. (1996). A review of nonparametric ranked set sampling methodology. Communication in Statistics – Theory and Methods, 25, 2675–. Bohn, L.L., and Wolfe, D.A. (1992). Nonparametric two-sample procedures for ranked-set samples data. Journal of the American Statistical Association, 87, 552–561. Bohn, L.L., and Wolfe, D.A. (1994). The effect of imperfect judgment rankings on properties of procedures based on the ranked-set samples analog of the Mann-Whitney-Wilcoxon statistics. Journal of the American Statistical Association, 89, 168–176. Boyles, R.A., and Samaniego, F.J. (1986). Estimating a distribution function based on nomination sampling. Journal of the American Statistical Association, 81, 1039–1045. Environmental and Ecological Statistics 6, 91–98 (1999)
Handbook of Statistics | 1994
Jeffrey H. Gove; G. P. Patil; Benee F. Swindel; C. Taillie
Abstract The diversity of an ecological community is defined in terms of the average species rarity of that community using both dichotomous- and rank-type rarity measures. Common diversity indices and profiles are developed using this definition and the concept of intrinsic diversity ordering is presented. The use of these diversity measures is illustrated in case studies involving the plant communities in two distinct forested ecosystems of the southeastern and northeastern United States. In these case studies, the objective is the comparative assessment of diversity changes over time and between treatments. Diversity profiles are further utilized in nonlinear mathematical programming models for uneven-aged stand management. The models present strategies for maximization and maintenance of structural diversity within uneven-aged stands. Finally, a principal components regression technique is presented which facilitates prediction of plant species diversity; the user need only be able to classify an individual into one of five life-forms — significantly reducing the taxonomic skills required for diversity assessment.
Environmental and Ecological Statistics | 2005
Denice H. Wardrop; Joseph A. Bishop; M. Easterling; Kristen C. Hychka; Wayne L. Myers; G. P. Patil; C. Taillie
Abstract The Atlantic Slope Consortium (ASC) is a project designed to develop and test a set of indicators in coastal systems that are ecologically appropriate, economically reasonable, and relevant to society. The suite of indicators will produce integrated assessments of the condition, health and sustainability of aquatic ecosystems based on ecological and socioeconomic information compiled at the scale of estuarine segments and small watersheds. The research mandate of the ASC project is the following: Using a universe of watersheds, covering a range of social choices, we ask two questions:
• How “good” can the environment be, given those social choices?
• What is the intellectual model of condition within those choices, i.e., what are the causes of condition and what are the steps for improvement?
As a basis for compiling ecological indicators, a watershed classification system was required for the experimental design. The goal was to develop approximately five categories of watersheds for each physiographic province, utilizing landscape and land use parameters that would be predictive of aquatic resource condition. All 14-digit Hydrologic Unit Code (HUC) watersheds in the Mid-Atlantic region would then be classified according to the regime. Five parameters were utilized for the classification: three land cover categories (forested, agricultural, and urban); median slope or median elevation; and total variance of land covers in 1-km-radius circles positioned on all stream convergence points in a specified 14-digit HUC watershed. Cluster analysis utilizing these five parameters resulted in approximately five well-defined watershed classes per physiographic province. The distribution of all watersheds in the Mid-Atlantic region across these categories provides a unique report on the probable condition of watersheds in the region.
Ecological Modelling | 1995
Jeffrey H. Gove; G. P. Patil; C. Taillie
Abstract A mathematical programming model is presented which yields an optimal diameter distribution that is at least as diverse as some antecedent or target distribution. At the heart of this model is a set of constraints that ensures this outcome as long as a feasible solution to the model is found. The theory of intrinsic diversity ordering, which forms the basis for the constraint set derivation, is also discussed. The set of diversity-maintaining constraints presented is completely general and may be added to other mathematical programming formulations where quantities other than horizontal structural diversity are of interest. Two examples are given which illustrate the use of the model.
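The abstract does not list the constraints themselves. One way to sketch an "at least as diverse" check is to state intrinsic diversity ordering through right tail sums of the ranked relative abundances, requiring each tail sum of the candidate to be no smaller than the corresponding tail sum of the target. The function names and example distributions below are hypothetical, and the tail-sum formulation is an assumption about how the constraint set is derived.

```python
def right_tail_sums(p):
    """Tail sums T_k: total of the ranked abundances after the k largest,
    for k = 1 .. S-1."""
    ranked = sorted(p, reverse=True)
    total = sum(ranked)
    tails, head = [], 0.0
    for x in ranked[:-1]:
        head += x
        tails.append(total - head)
    return tails

def at_least_as_diverse(candidate, target, tol=1e-12):
    """True if every tail sum of `candidate` is >= that of `target`."""
    n = max(len(candidate), len(target))
    # Pad the shorter distribution with empty classes so lengths match.
    pad = lambda v: list(v) + [0.0] * (n - len(v))
    tc = right_tail_sums(pad(candidate))
    tt = right_tail_sums(pad(target))
    return all(c >= t - tol for c, t in zip(tc, tt))

# Hypothetical relative frequencies over three diameter classes.
target = [0.6, 0.3, 0.1]
candidate = [0.4, 0.35, 0.25]
more_even = at_least_as_diverse(candidate, target)
# A pair whose tail sums cross: neither distribution dominates the other.
crossing = (at_least_as_diverse([0.5, 0.5, 0.0], [0.6, 0.2, 0.2]),
            at_least_as_diverse([0.6, 0.2, 0.2], [0.5, 0.5, 0.0]))
```

Because each tail-sum inequality is linear in the abundances, conditions of this kind can be added to a mathematical program as ordinary linear constraints once the ranking of the classes is fixed.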
Statistical Data Analysis and Inference | 1989
G. P. Patil; C. Taillie
Fisher (1934) recognized that the method of ascertainment can influence the form for the distribution of recorded observations, a concept that was later formalized by Rao (1965) into what is termed “weighted distribution.” This paper has two parts. The first reviews various structural properties of weighted distributions, both univariate and bivariate. The effect of weighted observations on Bayesian inference is also considered. The second part deals with three applications of weighted distributions: (1) in stochastic population dynamics, the exploited population size distribution is a weighted form of the natural (unexploited) size distribution; (2) Diaconis and Efron have introduced the double exponential family (DEF) for addressing overdispersion in data. The DEF is shown to be a weighted distribution with the unusual feature that the weight function involves the parameters (specifically, the mean) of the original distribution; (3) weight functions have been employed by Iyengar and Greenhouse (1988) to model the effects of publication bias in a meta-analysis study. We extend their model so as to account for heterogeneity as well as publication bias. For the particular data examined by Iyengar and Greenhouse, likelihood ratio tests indicate that heterogeneity, and not publication bias, is responsible for the observed pattern of effect sizes.
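A small numerical sketch may help fix the idea of a weighted distribution, taken here in the standard form f_w(x) = w(x) f(x) / E[w(X)]. The function name `weighted_pmf`, the Poisson example, and the truncation of the support at 60 are illustrative assumptions, not taken from the paper.

```python
import math

def weighted_pmf(pmf, weight):
    """Return the weighted pmf w(x) f(x) / E[w(X)] for a dict {x: f(x)}."""
    norm = sum(weight(x) * fx for x, fx in pmf.items())
    return {x: weight(x) * fx / norm for x, fx in pmf.items()}

lam = 3.0
# Poisson(lam) pmf on a truncated support; the omitted tail is negligible.
poisson = {k: math.exp(-lam) * lam ** k / math.factorial(k)
           for k in range(60)}

# Size-biased sampling corresponds to the weight function w(x) = x.
size_biased = weighted_pmf(poisson, lambda x: x)
```

With w(x) = x, the weighted Poisson pmf at k matches the original pmf at k - 1, so size-biased recording shifts the Poisson distribution up by one.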
Landscape Ecology | 2001
Glen D. Johnson; Wayne L. Myers; G. P. Patil; C. Taillie
When the objective is to characterize landscapes with respect to relative degree and type of forest (or other critical habitat) fragmentation, it is difficult to decide which variables to measure and what type of discriminatory analysis to apply. It is also desirable to incorporate multiple measurement scales. In response, a new method has been developed that responds to changes in both the marginal and spatial distributions of land cover in a raster map. Multiscale features of the map are captured in a sequence of successively coarsened resolutions based on the random filter for degrading raster map resolutions. Basically, the entropy of spatial pattern associated with a particular pixel resolution is calculated, conditional on the pattern of the next coarser ‘parent’ resolution. When the entropy is plotted as a function of changing resolution, we obtain a simple two-dimensional graph called a ‘conditional entropy profile’, thus providing a graphical visualization of multi-scale fragmentation patterns.
Using eight-category raster maps derived from 30-meter resolution LANDSAT Thematic Mapper images, the conditional entropy profile was obtained for each of 102 watersheds covering the state of Pennsylvania (USA). A suite of more conventional single-resolution landscape measurements was also obtained for each watershed using the FRAGSTATS program. After dividing the watersheds into three major physiographic provinces, cluster analysis was performed within each province using various combinations of the FRAGSTATS variables, land cover proportions and variables describing the conditional entropy profiles. Measurements of both spatial pattern and marginal land cover proportions were necessary to clearly discriminate the watersheds into distinct clusters for most of the state; however, the Piedmont province essentially only required the land cover proportions. In addition to land cover proportions, only the variables describing a conditional entropy profile appeared to be necessary for the Ridge and Valley province, whereas only the FRAGSTATS variables appeared to be necessary for the Appalachian Plateaus province. Meanwhile, the graphical representation of conditional entropy profiles provided a visualization of multi-scale fragmentation that was quite sensitive to changing pattern.
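The conditional entropy profile can be sketched in simplified form as follows. Several details are assumptions rather than the authors' procedure: each parent pixel covers a 2x2 block of children and copies the category of one randomly chosen child (one reading of the random filter), the map is square with a power-of-two side, entropy is in bits, and all function names are hypothetical.

```python
import math
import random
from collections import Counter

def entropy(counter):
    """Shannon entropy (bits) of the empirical distribution in a Counter."""
    total = sum(counter.values())
    return -sum(c / total * math.log2(c / total) for c in counter.values())

def coarsen_and_conditional_entropy(grid, rng):
    """Coarsen once and return (parent grid, H(child block | parent)).

    The conditional entropy is computed as H(block, parent) - H(parent)
    from the observed frequencies.
    """
    n = len(grid)
    parents, joint, marginal = [], Counter(), Counter()
    for i in range(0, n, 2):
        row = []
        for j in range(0, n, 2):
            block = (grid[i][j], grid[i][j + 1],
                     grid[i + 1][j], grid[i + 1][j + 1])
            parent = rng.choice(block)  # random filter (assumed form)
            joint[(parent, block)] += 1
            marginal[parent] += 1
            row.append(parent)
        parents.append(row)
    return parents, entropy(joint) - entropy(marginal)

def conditional_entropy_profile(grid, seed=0):
    """Conditional entropies from the finest to the coarsest resolution."""
    rng = random.Random(seed)
    profile = []
    while len(grid) >= 2:
        grid, h = coarsen_and_conditional_entropy(grid, rng)
        profile.append(h)
    return profile

# Hypothetical 16x16 maps: a single-category map and a two-category noisy map.
uniform_map = [[0] * 16 for _ in range(16)]
noise = random.Random(1)
noisy_map = [[noise.randrange(2) for _ in range(16)] for _ in range(16)]
uniform_profile = conditional_entropy_profile(uniform_map)
noisy_profile = conditional_entropy_profile(noisy_map)
```

A single-category map gives a flat profile at zero, while a spatially noisy map gives positive conditional entropy at the finest resolution; plotting each profile against resolution gives the kind of graph the abstract describes.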