Charles E. Heckler
Eastman Kodak Company
Publications
Featured research published by Charles E. Heckler.
Journal of Chemometrics | 2000
Willem Windig; Brian Antalek; Mark J. Robbins; Nicholas Zumbulyadis; Charles E. Heckler
DECRA (direct exponential curve resolution algorithm) is a fast multivariate method used to resolve spectral data whose concentration profiles are linear combinations of exponential functions. DECRA has previously been applied to a wide variety of spectroscopies. Results are presented in this paper for two new application areas: solid-state nuclear magnetic resonance spectra of polymorphic crystal mixtures and mid-infrared spectroscopy of chemical reactions. Furthermore, the paper shows how the way in which the data set is split, a step that is part of the algorithm, affects the results.
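The splitting step mentioned in the abstract is the heart of DECRA: because exponential profiles decay by a constant factor between equally spaced measurements, two sub-matrices of the data shifted by one step share the same pure spectra and differ only by a diagonal matrix of decay ratios, which a small eigenvalue problem can recover. The following is a rough, hypothetical numpy sketch of that GRAM-style core (not the authors' implementation), assuming equally spaced sampling of the decaying variable and a known number of components:

```python
import numpy as np

def decra_sketch(D, n_components, dt=1.0):
    """Schematic DECRA/GRAM-style resolution of data whose concentration
    profiles decay exponentially along the rows of D.

    D            : (m, n) array; m spectra at equally spaced steps, n channels
    n_components : assumed number of exponentially decaying components
    dt           : step size of the decaying variable
    """
    # 1. Split the data into two sub-matrices shifted by one step; for pure
    #    exponentials the second equals the first times a diagonal decay matrix.
    A, B = D[:-1, :], D[1:, :]

    # 2. Truncated SVD of the first sub-matrix.
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    U, s, Vt = U[:, :n_components], s[:n_components], Vt[:n_components, :]

    # 3. Small p x p matrix whose eigenvalues are the per-step decay ratios.
    T = np.diag(1.0 / s) @ U.T @ B @ Vt.T
    ratios, W = np.linalg.eig(T)
    rates = -np.log(ratios.real) / dt        # first-order decay constants

    # 4. Pure spectra (rows, up to scale) and concentration profiles.
    spectra = (np.linalg.inv(W) @ Vt).real
    profiles = np.linalg.lstsq(spectra.T, D.T, rcond=None)[0].T
    return rates, spectra, profiles
```

In this sketch the split is the simplest one-step shift; the paper's point is that other ways of splitting the data change the quality of the resolved spectra and profiles.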
Journal of Statistical Planning and Inference | 1997
Poduri S. R. S. Rao; Charles E. Heckler
Estimation of the variance components and the mean of the balanced and unbalanced threefold nested design is considered. The relative merits of the following procedures are evaluated: analysis of variance (ANOVA), maximum likelihood (ML), restricted maximum likelihood (REML), and the minimum variance quadratic unbiased estimator (MIVQUE). A new procedure, the weighted analysis of means (WAM) estimator, which utilizes prior information on the variance components, is proposed. It is found to have optimum properties similar to those of REML and MIVQUE, and it is also computationally simpler. For the mean, the overall sample average, grand mean, unweighted mean, and generalized least-squares (GLS) estimator, with weights obtained from the above estimators of the variance components, are considered. Comparisons of the above procedures for the variance components and the mean are made from exact expressions for the biases and mean square errors (MSEs) of the estimators and from empirical investigations.
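As context for the ANOVA (method-of-moments) approach compared in the paper, here is a minimal, hypothetical numpy sketch for the simpler balanced two-stage nested model y_ijk = mu + a_i + b_ij + e_ijk. It is not the paper's unbalanced threefold setting, nor the proposed WAM estimator, but it shows how the expected-mean-square equations are inverted to obtain variance component estimates:

```python
import numpy as np

def anova_nested_vc(y):
    """ANOVA (method-of-moments) variance components for the balanced
    nested random model y_ijk = mu + a_i + b_ij + e_ijk.

    y : array of shape (a, b, n) -- a groups, b subgroups per group,
        n replicates per subgroup.
    """
    a, b, n = y.shape
    grand = y.mean()
    group = y.mean(axis=(1, 2))        # ybar_i..
    subgroup = y.mean(axis=2)          # ybar_ij.

    ms_a = b * n * np.sum((group - grand) ** 2) / (a - 1)
    ms_b = n * np.sum((subgroup - group[:, None]) ** 2) / (a * (b - 1))
    ms_e = np.sum((y - subgroup[:, :, None]) ** 2) / (a * b * (n - 1))

    # Expected mean squares:
    #   E[MS_E] = s_e^2,  E[MS_B] = s_e^2 + n s_b^2,
    #   E[MS_A] = s_e^2 + n s_b^2 + b n s_a^2
    s2_e = ms_e
    s2_b = max((ms_b - ms_e) / n, 0.0)        # truncate negative estimates at 0
    s2_a = max((ms_a - ms_b) / (b * n), 0.0)
    return s2_a, s2_b, s2_e

# Quick check on simulated data with known components (4.0, 1.0, 0.25):
rng = np.random.default_rng(0)
y = (10.0
     + rng.normal(0, 2.0, size=(6, 1, 1))
     + rng.normal(0, 1.0, size=(6, 4, 1))
     + rng.normal(0, 0.5, size=(6, 4, 5)))
print(anova_nested_vc(y))
```

The truncation at zero illustrates a known drawback of ANOVA estimators that motivates REML, MIVQUE, and the WAM procedure studied in the paper.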
Technometrics | 2005
Charles E. Heckler
Cressie, N. A. C. (1993), Statistics for Spatial Data (rev. ed.), New York: Wiley.
Diggle, P. J. (1983), Statistical Analysis of Spatial Point Patterns, London: Academic Press.
Katti, S. K. (1986), Review of Statistical Analysis of Spatial Point Patterns and Spatial Statistics, by P. J. Diggle and B. D. Ripley, Journal of the American Statistical Association, 81, 263–264.
Ripley, B. D. (1981), Spatial Statistics, New York: Wiley.
Ripley, B. D. (1986), Review of Statistical Analysis of Spatial Point Patterns, by P. J. Diggle, Mathematical Geology, 18, 353–354.
Rowlingson, B. S., and Diggle, P. J. (1993), "SPLANCS: Spatial Point Pattern Analysis Code in S-PLUS," Computers & Geosciences, 19, 627–655.
Stoyan, D. (1985), Review of Statistical Analysis of Spatial Point Patterns, by P. J. Diggle, Biometrical Journal, 27, 690 (in German).
Symanzik, J. (2001), Review of A Casebook for Spatial Statistical Data Analysis, by D. A. Griffith and L. J. Layne, Technometrics, 43, 375–376.
Technometrics | 2008
Charles E. Heckler
with many easy-to-understand examples for illustration. I was pleased to see a large number of figures to give visual cues in the explanation. As a minor concern, I did find some figures included with no explanation and some that could have been better labeled to ease interpretation. The author does a fine job emphasizing the practical utility of any analysis result; in other words, there should be some assessment of the practicality of any result obtained, rather than blind acceptance of the analysis results. To get the most out of this book, the reader would need some knowledge of basic statistical methods and perhaps even some previous exposure to decision trees. A reader who has previous experience with decision trees should have no difficulty relating to the examples given here. Throughout the book, the author does a good job of comparing and contrasting various approaches (e.g., CHAID vs. CRT) for building decision trees.

The book consists of six chapters. Following an introduction to generic examples of decision trees in Chapter 1, Chapter 2 delves into a little of the historical development of decision trees. The author clearly notes the limitations of the earlier approaches, while using the history to note different features and uses of decision trees. Chapter 3, the longest chapter, comprising nearly a third of the book, gives a detailed six-step process for creating a decision tree. As part of explaining the different steps, the chapter covers such topics as data cleaning and handling missing data. Chapter 4 compares decision trees with two other business tools: multidimensional cubes and regression. Again, the author clearly notes the advantages and disadvantages in his comparison of the different tools. Chapter 5 covers the importance of matching decision tree results with conceptual explanations that make physical sense and presents more recent developments on how results from multiple decision trees can be combined to give improved results. The last chapter relates decision trees to other data mining approaches. A glossary, a list of references, and an index are included at the end of the book.

Examples of views and output from Enterprise Miner are numerous throughout the book. However, the reader looking for a help manual will be disappointed by the lack of direction for obtaining those views and outputs. Consequently, I found the title to be a little misleading, in that this book does not go into the details of implementing decision trees for a specific problem, nor does it give step-by-step instructions on how to do the analysis using the software. Nevertheless, I found the book to be very useful for learning more about the concepts behind decision trees, but not so much for how to actually start building one from real data. A reader desiring more direction on how to implement decision trees using Enterprise Miner would be better off using the book by Matignon (2007) or the Enterprise Miner help files available from SAS. Although screen captures of various data sets are shown throughout, the data sets are not available for the reader to analyze for practice. The lack of problems in the chapters would make this difficult to use as a textbook. It would be better as a supplemental text or as a book for someone with little or no experience to get some exposure to the language, history, and concepts associated with decision trees.
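The review stays at the conceptual level and is tied to SAS Enterprise Miner. Purely as a hypothetical illustration of the CRT/CART-style trees the review discusses, and not an example from the book or from Enterprise Miner, a small classification tree can be grown and inspected with scikit-learn:

```python
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier, export_text

# Stand-in data set; the book's own examples rely on Enterprise Miner.
X, y = load_breast_cancer(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# A CART-style tree; max_depth and min_samples_leaf guard against overfitting.
tree = DecisionTreeClassifier(max_depth=3, min_samples_leaf=20, random_state=0)
tree.fit(X_tr, y_tr)

print("holdout accuracy:", tree.score(X_te, y_te))
print(export_text(tree))   # text rendering of the fitted splitting rules
```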
Technometrics | 2007
Charles E. Heckler
This book is an introduction to a form of multivariate data analysis developed in the late 1970s and 1980s by Jean-Paul Benzécri. Much of it was documented in French, and it is often termed the French School. In general, the process consists of (1) coding data into a form conducive to the generation of a contingency table, (2) creating a distance matrix based on a chi-square metric, (3) reducing the dimensionality of the information in the distance matrix with the singular value decomposition (or other equivalent methods), (4) depicting the projected data graphically in a biplot-like presentation, and (5) performing hierarchical cluster analysis to provide alternate views of the data. Steps (2)–(4) are a form of multidimensional scaling and are commonly referred to as correspondence analysis (a minimal sketch of these steps appears at the end of this review), but the book deals with all five steps, with plenty of emphasis on the coding and clustering steps. The author studied with Benzécri, and the book aims to be a summary of Benzécri's work.

Let us get one thing out of the way at the outset. Although the title might imply that there is material about R (1996) and Java per se, the reader will not learn much about programming in either language. R code listings throughout the book are used as the main vehicle to define the details of the procedures. This is very valuable. Readers familiar with R (or S) will have no difficulty reading the code and will know exactly what the author is doing. The coding style is rudimentary, so readers need not be highly experienced in the R language to understand the code. Many of the procedures are also written in Java. Appropriately, the Java source code is not listed in the book (for a given amount of code, a lot more can be accomplished with R), but it is available, along with the R code, some C code, and selected datasets, on the author's website, www.correspondences.info.

There are six chapters, a preface, and a foreword. The foreword by Jean-Paul Benzécri (in French and English) is interesting and entertaining, with reminiscences of the precomputer days and the intellectual challenges faced by data analysts now that computational burden is no longer much of an issue. Chapter 1 (Introduction) starts with a lively, well-written history of data analysis, leading to Benzécri's process of data analysis. Next, the ideas behind chi-square distance and correspondence analysis are explained and contrasted with principal components analysis. There is an introduction to data coding, where (among other things) continuous data can be transformed into a form suitable for correspondence analysis. Hierarchical cluster analysis is introduced as another view of the information in either the raw data or its projections. The chapter ends with annotated listings of the R software.

Chapter 2 (Theory of Correspondence Analysis) is claimed to be "not essential for reading later chapters" and may be used as a reference as needed. This chapter is not light reading! It is highly mathematical. It is very well-written mathematics, but it comes as a shock after reading an otherwise applied and practical book. Readers well grounded in real analysis and tensor calculus should do fine. Others may find it tough going. The other bad news is that some key information is buried in this chapter, including multiple correspondence analysis, the Burt table, different methods of hierarchical cluster analysis, and, importantly, notation used later on. Much of the material belongs in an appendix, and the rest should have been merged with the other chapters.
Chapter 3 is dedicated to data coding, where numerous (some very clever) methods are used to transform and weight raw data to put it into a form suitable for correspondence analysis. The methods of doubling, complete disjunctive coding, fuzzy coding, and additional methods are defined and illustrated. The presentation is excellent. It gives plenty of motivation to think hard about the objectives of the analysis and whether the form of the raw data fulfills those objectives. A skeptic might think that the coding methods are intended to allow virtually any data to be analyzed via correspondence analysis, but the author convincingly shows the value of at least considering data coding, whether one uses correspondence analysis or not. Also, I can see the possibility of these tools being used by trial and error until the researcher sees the answer they seek. Moreover, discretizing continuous data that may already be in a form perfect for analysis (e.g., by principal components) may be both unnecessary and misleading: if the aim is to infer something about a continuous distribution, discretization can badly distort the picture. To avoid these pitfalls, the reader needs to understand the guidance given in the beginning pages of this chapter.

Chapter 4 is dedicated to examples. Several comprise size and shape data, and the rest are financial and economic examples. Each one is thoroughly explained, and the software used to do the coding and analysis is presented. The general theme in these examples is the discovery of structure in the data. Using the powerful methods described in the book, detailed structure is found and described. It is very interesting reading, but I was left wondering how much of this structure is due to the manner in which the data were collected rather than to fundamental population properties. (A statistician should ask this question.)

The last chapter, a long one, is dedicated to text analysis, where text files are parsed into a form suitable for correspondence analysis (e.g., contingency tables of counts of certain types of words versus literary works). The introductory material in this chapter gives enough background and terminology for readers unfamiliar with this kind of application to at least superficially understand the rest of the chapter. However, those unfamiliar with this application area, like myself, will find much of the material ponderously detailed and difficult to grasp at a deeper level. This is particularly true of most of the first half of the chapter, where a review of applications, primarily by Benzécri, is given without showing any exemplifying data. Luckily, the chapter ends with several detailed, and very interesting, case studies. (How can one resist an application entitled Eight Hypotheses of Parmenides Regarding the One?)

It is obligatory to ask whether this book will appeal to statistical practitioners in the engineering and physical sciences. For those who routinely apply multivariate techniques and are curious about the "French School" of data analysis, it is worth reading, despite its flaws here and there. It gives a convincing argument in favor of carefully thinking about the form the data should assume (i.e., data coding) prior to the use of any multivariate technique. It describes applications that are "outside the box" for engineering and physical science purposes, but that is a good thing: many methods used today were not originally applied to engineering and physical science data.
It has considerable material of a historical and philosophical nature, and that is also a good thing. Be forewarned that the writing is often densely packed with details, and it is a little uneven in style. Readers wanting "just the facts" about correspondence analysis may be more satisfied with Chapter 10 of Jackson (1991) or with Greenacre (1993).
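For readers who want to see steps (2)–(4) of the process described at the start of this review in concrete form, here is a minimal numpy sketch of simple correspondence analysis of a two-way contingency table. It is an illustrative assumption of mine, not the author's R or Java code, and it omits the coding and clustering steps:

```python
import numpy as np

def correspondence_analysis(N, n_dims=2):
    """Simple correspondence analysis of a two-way contingency table N:
    chi-square standardization, SVD, and principal coordinates for a
    biplot-like map (steps (2)-(4) of the process described in the review)."""
    N = np.asarray(N, dtype=float)
    P = N / N.sum()                      # correspondence matrix
    r = P.sum(axis=1)                    # row masses
    c = P.sum(axis=0)                    # column masses

    # Standardized residuals under row-column independence (chi-square metric).
    S = (P - np.outer(r, c)) / np.sqrt(np.outer(r, c))

    U, sv, Vt = np.linalg.svd(S, full_matrices=False)

    # Principal coordinates for rows and columns on the first n_dims axes.
    rows = (U[:, :n_dims] * sv[:n_dims]) / np.sqrt(r)[:, None]
    cols = (Vt[:n_dims, :].T * sv[:n_dims]) / np.sqrt(c)[:, None]
    inertia = sv[:n_dims] ** 2           # principal inertia per axis
    return rows, cols, inertia
```

Plotting the row and column coordinates on the same axes gives the biplot-like display discussed in the book; the coding step determines what goes into N in the first place.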
Archive | 1992
Paul A. Kildal-Brandt; Thomas A. Weber; Charles E. Heckler
Technometrics | 2005
Charles E. Heckler
Archive | 2003
Joan Christine Potenza; Charles E. Heckler; Xiaoru Wang; Huijuan D. Chen
Archive | 2003
Louis E. Friedrich; Philip A. Allway; Judith A. Bogie; Bernard A. Clark; Charles E. Heckler; Stephen Paul Singer
Technometrics | 1996
Charles E. Heckler