Christian Posse
Stanford University
Publication
Featured research published by Christian Posse.
Data Mining and Knowledge Discovery | 2002
David Madigan; Nandini Raghavan; William DuMouchel; Martha Nason; Christian Posse; Greg Ridgeway
Squashing is a lossy data compression technique that preserves statistical information. Specifically, squashing compresses a massive dataset to a much smaller one so that outputs from statistical analyses carried out on the smaller (squashed) dataset reproduce outputs from the same statistical analyses carried out on the original dataset. Likelihood-based data squashing (LDS) differs from a previously published squashing algorithm insofar as it uses a statistical model to squash the data. The results show that LDS provides excellent squashing performance even when the target statistical analysis departs from the model used to squash the data.
Journal of Computational and Graphical Statistics | 2001
Christian Posse
In recent years, hierarchical model-based clustering has provided promising results in a variety of applications. However, its use with large datasets has been hindered by time and memory complexities that are at least quadratic in the number of observations. To overcome this difficulty, this article proposes to start the hierarchical agglomeration from an efficient classification of the data into many classes rather than from the usual set of singleton clusters. This initial partition is derived from a subgraph of the minimum spanning tree associated with the data. To this end, we develop graphical tools that assess the presence of clusters in the data and uncover observations that are difficult to classify. We use this approach to analyze two large, real datasets: a multiband MRI image of the human brain and data on global precipitation climatology. We use the real datasets to discuss ways of integrating the spatial information in the clustering analysis. We focus on two-stage methods, in which a second stage of processing using established methods is applied to the output from the algorithm presented in this article, viewed as a first stage.
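The idea of deriving an initial partition from a subgraph of the minimum spanning tree can be sketched as follows. This is only an illustration under stated assumptions, not the article's procedure: the abstract does not say how the subgraph is chosen, so the sketch simply drops every tree edge longer than a hypothetical threshold `max_edge` and treats the remaining connected components as the starting classes. The function names `mst_edges` and `initial_partition` are likewise invented for this example.

```python
import math


def mst_edges(points):
    """Prim's algorithm on the complete Euclidean graph.

    Returns the tree as a list of (length, i, j) tuples.
    """
    n = len(points)
    if n == 0:
        return []
    in_tree = [False] * n
    best = [math.inf] * n  # shortest known distance from the tree to each point
    parent = [-1] * n
    best[0] = 0.0
    edges = []
    for _ in range(n):
        # Pick the point outside the tree that is closest to it.
        u = min((i for i in range(n) if not in_tree[i]), key=lambda i: best[i])
        in_tree[u] = True
        if parent[u] >= 0:
            edges.append((best[u], parent[u], u))
        # Update the distances of the remaining points to the grown tree.
        for v in range(n):
            if not in_tree[v]:
                d = math.dist(points[u], points[v])
                if d < best[v]:
                    best[v] = d
                    parent[v] = u
    return edges


def initial_partition(points, max_edge):
    """Assumed rule: keep MST edges of length <= max_edge, label the components."""
    n = len(points)
    root = list(range(n))

    def find(i):
        # Union-find lookup with path halving.
        while root[i] != i:
            root[i] = root[root[i]]
            i = root[i]
        return i

    for length, i, j in mst_edges(points):
        if length <= max_edge:
            root[find(i)] = find(j)

    # Relabel component representatives as 0, 1, 2, ... in order of appearance.
    labels = {}
    return [labels.setdefault(find(i), len(labels)) for i in range(n)]
```

The resulting labels could then seed a hierarchical agglomeration in place of singleton clusters. This plain-Python version of Prim's algorithm is itself quadratic in the number of points, so it illustrates the partitioning step only, not an efficient implementation for large datasets.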
Journal of Computational and Graphical Statistics | 1995
Christian Posse
In this article we examine the current status of exploratory projection pursuit. This involves a large comparison of all two-dimensional projection indexes and optimization algorithms that have been proposed in the literature. We show that the efficacy of the exploration depends to a large extent on the optimization routine. We also stress the importance of studying the behavior of the empirical projection indexes rather than the underlying theoretical distances they estimate. Indeed, indexes based on orthonormal polynomial expansions differ greatly in their behavior from the theoretical performance of the weighted L2-distances they estimate. This study reveals three universal indexes, namely the Legendre index (Friedman), the Hermite index (Hall), and the chi-squared index (Posse), which are sensitive to any kind of departure from normality in the core of the distribution, and two indexes ideal for catching clusters, namely the Laguerre-Fourier index (Morton) and the Natural Hermite index (Cook...
Computational Statistics & Data Analysis | 1995
Christian Posse
Posse (1990) presented a projection pursuit technique, based on a global optimization algorithm and on a chi-squared projection index, for finding the plane in which the data are the most interesting. This paper extends and improves this algorithm, providing an exploratory data analysis by projection pursuit that has important advantages over its competitors. The global optimization algorithm, when combined with a structure removal procedure due to Friedman (1987), allows a sequential identification of interesting bidimensional views of decreasing importance. The modified chi-squared index satisfies the five basic demands for a projection index. It is (1) uniquely minimized at the bivariate normal distribution, (2) approximately affine invariant, (3) consistent, (4) resistant to features in the tail of the distribution, and (5) simple enough to permit quick computation even for large data sets. The paper gives simple rules for judging the significance of a structure found by this algorithm. These rules define a stopping criterion for the search process. They are based on theoretical (asymptotic) arguments and are well supported by simulations. The efficacy of the new algorithm is illustrated through several studies of real and simulated data.
Journal of Computational and Graphical Statistics | 2005
Igor Perisic; Christian Posse
Exploratory projection pursuit is a technique for finding interesting low-dimensional projections of multivariate data. To reach this goal, one optimizes an index, assigned to every projection, that characterizes the structure present in the projection. Most indices require the estimation of the marginal density of the projected data. Such estimations involve a tuning parameter that greatly influences the behavior of the index. However, the optimization is usually performed with an ad hoc, often fixed value for this parameter. This article proposes indices based on the empirical distribution function that require no tuning. This allows users to apply exploratory projection pursuit without having to predetermine an “esoteric” tuning parameter.
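To make the notion of a tuning-free index concrete, here is a minimal sketch of an index built on the empirical distribution function. It is an assumption-laden stand-in, not one of the indices proposed in the article, which the abstract does not define: it projects the data onto a single direction, standardizes the result, and returns the Kolmogorov-Smirnov distance between the empirical distribution function and the standard normal distribution function. The name `edf_index` is hypothetical, and the sketch is one-dimensional for brevity.

```python
import math
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal, used as the "uninteresting" reference


def edf_index(data, direction):
    """Kolmogorov-Smirnov distance from normality of the data projected on `direction`.

    Assumes the projected data are not constant. Larger values indicate a
    stronger departure from normality. No bandwidth or bin count is needed,
    because the index uses the empirical distribution function directly.
    """
    norm = math.sqrt(sum(a * a for a in direction))
    proj = [sum(x * a for x, a in zip(row, direction)) / norm for row in data]

    # Standardize the projection to zero mean and unit variance.
    n = len(proj)
    mean = sum(proj) / n
    sd = math.sqrt(sum((p - mean) ** 2 for p in proj) / n)
    z = sorted((p - mean) / sd for p in proj)

    # Largest gap between the empirical step function and the normal CDF,
    # checked just before and just after each jump.
    return max(
        max((i + 1) / n - _STD_NORMAL.cdf(v), _STD_NORMAL.cdf(v) - i / n)
        for i, v in enumerate(z)
    )
```

A projection pursuit search would maximize such an index over directions. A call like `edf_index(data, (1.0, 0.0))` scores the first coordinate axis of two-dimensional data, and the result does not depend on the length of the direction vector.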
Archive | 2001
David Madigan; Nandini Raghavan; William DuMouchel; Martha Nason; Christian Posse; Greg Ridgeway
Archive | 2002
Igor Perisic; Christian Posse
Archive | 1999
Michelle Keim Condliff; David Lewis; David Madigan; Christian Posse
American Medical Informatics Association Annual Symposium | 2000
Christian Posse; Kerry Meyer; Martha Nason; Peter J. Dunbar; Keith Kelley; Nathalie Rosenblatt; J. White
American Medical Informatics Association Annual Symposium | 2000
Kerry Meyer; Christian Posse; David Masuda; Nathalie Rosenblatt; P. Macko; Peter J. Dunbar