Publication


Featured research published by William S. Cleveland.


Journal of the American Statistical Association | 1979

Robust Locally Weighted Regression and Smoothing Scatterplots

William S. Cleveland

The visual information on a scatterplot can be greatly enhanced, with little additional cost, by computing and plotting smoothed points. Robust locally weighted regression is a method for smoothing a scatterplot, (x_i, y_i), i = 1, …, n, in which the fitted value at x_k is the value of a polynomial fit to the data using weighted least squares, where the weight for (x_i, y_i) is large if x_i is close to x_k and small if it is not. A robust fitting procedure is used that guards against deviant points distorting the smoothed points. Visual, computational, and statistical issues of robust locally weighted regression are discussed. Several examples, including data on lead intoxication, are used to illustrate the methodology.
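The smoothing procedure described in this abstract can be sketched as follows. This is a minimal illustration, not the paper's implementation: it assumes a local straight-line fit, tricube neighbourhood weights, and bisquare robustness weights scaled by six times the median absolute residual. The abstract itself specifies none of these choices, and all function names are hypothetical.

```python
import math
from statistics import median


def tricube(u):
    """Neighbourhood weight: close to 1 near u = 0, exactly 0 for |u| >= 1."""
    u = abs(u)
    return (1.0 - u ** 3) ** 3 if u < 1.0 else 0.0


def bisquare(u):
    """Robustness weight: shrinks toward 0 as the scaled residual grows."""
    return (1.0 - u * u) ** 2 if abs(u) < 1.0 else 0.0


def weighted_line_value(xs, ys, ws, x0):
    """Fit y = a + b*x by weighted least squares and return the fit at x0."""
    sw = sum(ws)
    mx = sum(w * x for w, x in zip(ws, xs)) / sw
    my = sum(w * y for w, y in zip(ws, ys)) / sw
    sxx = sum(w * (x - mx) ** 2 for w, x in zip(ws, xs))
    if sxx < 1e-12:  # all weight sits on one x value: use the weighted mean
        return my
    sxy = sum(w * (x - mx) * (y - my) for w, x, y in zip(ws, xs, ys))
    return my + (sxy / sxx) * (x0 - mx)


def lowess(xs, ys, frac=0.5, robust_iters=2):
    """Smooth (xs, ys); frac is the share of points in each neighbourhood."""
    n = len(xs)
    q = min(n, max(2, math.ceil(frac * n)))
    delta = [1.0] * n  # robustness weights, all 1 on the first pass
    fitted = [0.0] * n
    for iteration in range(robust_iters + 1):
        for k in range(n):
            dists = sorted(abs(x - xs[k]) for x in xs)
            h = max(dists[q - 1], 1e-12)  # distance to the q-th nearest point
            near = [tricube((x - xs[k]) / h) for x in xs]
            ws = [d * w for d, w in zip(delta, near)]
            if sum(ws) <= 0.0:  # every neighbour was down-weighted to zero
                ws = near
            fitted[k] = weighted_line_value(xs, ys, ws, xs[k])
        if iteration == robust_iters:
            break
        resid = [y - f for y, f in zip(ys, fitted)]
        s = median(abs(r) for r in resid)
        if s < 1e-12:  # the fit is already essentially exact
            break
        delta = [bisquare(r / (6.0 * s)) for r in resid]
    return fitted
```

Calling `lowess(xs, ys, robust_iters=0)` gives the plain locally weighted fit. Raising `robust_iters` lets points with large residuals lose influence on later passes, which is the guard against deviant points that the abstract mentions.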


Journal of the American Statistical Association | 1988

Locally Weighted Regression: An Approach to Regression Analysis by Local Fitting

William S. Cleveland; Susan J. Devlin

Locally weighted regression, or loess, is a way of estimating a regression surface through a multivariate smoothing procedure, fitting a function of the independent variables locally and in a moving fashion analogous to how a moving average is computed for a time series. With local fitting we can estimate a much wider class of regression surfaces than with the usual classes of parametric functions, such as polynomials. The goal of this article is to show, through applications, how loess can be used for three purposes: data exploration, diagnostic checking of parametric models, and providing a nonparametric regression surface. Along the way, the following methodology is introduced: (a) a multivariate smoothing procedure that is an extension of univariate locally weighted regression; (b) statistical procedures that are analogous to those used in the least-squares fitting of parametric functions; (c) several graphical methods that are useful tools for understanding loess estimates and checking the a...


Journal of the American Statistical Association | 1984

Graphical Perception: Theory, Experimentation, and Application to the Development of Graphical Methods

William S. Cleveland; Robert McGill

The subject of graphical methods for data analysis and for data presentation needs a scientific foundation. In this article we take a few steps in the direction of establishing such a foundation. Our approach is based on graphical perception, the visual decoding of information encoded on graphs, and it includes both theory and experimentation to test the theory. The theory deals with a small but important piece of the whole process of graphical perception. The first part is an identification of a set of elementary perceptual tasks that are carried out when people extract quantitative information from graphs. The second part is an ordering of the tasks on the basis of how accurately people perform them. Elements of the theory are tested by experimentation in which subjects record their judgments of the quantitative information on graphs. The experiments validate these elements but also suggest that the set of elementary tasks should be expanded. The theory provides a guideline for graph construction...


Science | 1985

Graphical Perception and Graphical Methods for Analyzing Scientific Data

William S. Cleveland; Robert McGill

Graphical perception is the visual decoding of the quantitative and qualitative information encoded on graphs. Recent investigations have uncovered basic principles of human graphical perception that have important implications for the display of data. The computer graphics revolution has stimulated the invention of many graphical methods for analyzing and presenting scientific data, such as box plots, two-tiered error bars, scatterplot smoothing, dot charts, and graphing on a log base 2 scale.


Archive | 1996

Smoothing by Local Regression: Principles and Methods

William S. Cleveland; Clive R. Loader

Local regression is an old method for smoothing data, having origins in the graduation of mortality data and the smoothing of time series in the late 19th century and the early 20th century. Still, new work in local regression continues at a rapid pace. We review the history of local regression. We discuss four of its basic components that must be chosen in using local regression in practice: the weight function, the parametric family that is fitted locally, the bandwidth, and the assumptions about the distribution of the response. A major theme of the paper is that these choices represent a modeling of the data; different data sets deserve different choices. We describe polynomial mixing, a method for enlarging polynomial parametric families. We introduce an approach to adaptive fitting, assessment of parametric localization. We describe the use of this approach to design two adaptive procedures: one automatically chooses the mixing degree of mixing polynomials at each x using cross-validation, and the other chooses the bandwidth at each x using C_p. Finally, we comment on the efficacy of using asymptotics to provide guidance for methods of local regression.


Statistics and Computing | 1991

Computational methods for local regression

William S. Cleveland; E. Grosse

Local regression is a nonparametric method in which the regression surface is estimated by fitting parametric functions locally in the space of the predictors using weighted least squares in a moving fashion similar to the way that a time series is smoothed by moving averages. Three computational methods for local regression are presented. First, fast surface fitting and evaluation is achieved by building a k-d tree in the space of the predictors, evaluating the surface at the corners of the tree, and then interpolating elsewhere by blending functions. Second, surfaces are made conditionally parametric in any proper subset of the predictors by a simple alteration of the weighting scheme. Third, degree-of-freedom quantities that would be extremely expensive to compute exactly are approximated, not by numerical methods, but through a statistical model that predicts the quantities from the trace of the hat matrix, which can be computed easily.
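To make the "trace of the hat matrix" concrete, the sketch below computes it directly for a one-predictor local-linear smoother. It is an illustration under stated assumptions only: a fixed bandwidth `h`, tricube weights, and a brute-force O(n²) loop. It does not reproduce the k-d tree, the blending-function interpolation, or the statistical approximation model from the paper, and `smoother_trace` is a hypothetical name.

```python
def tricube(u):
    """Neighbourhood weight: close to 1 near u = 0, exactly 0 for |u| >= 1."""
    u = abs(u)
    return (1.0 - u ** 3) ** 3 if u < 1.0 else 0.0


def smoother_trace(xs, h):
    """Trace of the operator L (fitted = L y) for local-linear fits with bandwidth h."""
    trace = 0.0
    for xk in xs:
        ws = [tricube((x - xk) / h) for x in xs]
        sw = sum(ws)
        mx = sum(w * x for w, x in zip(ws, xs)) / sw
        sxx = sum(w * (x - mx) ** 2 for w, x in zip(ws, xs))
        # Coefficient of y_k in the fit at x_k; its own weight is tricube(0) = 1.
        leverage = 1.0 / sw
        if sxx > 1e-12:
            leverage += (xk - mx) ** 2 / sxx
        trace += leverage
    return trace
```

Two limiting cases give a quick sanity check. With a very large bandwidth, every fit is the same global straight line and the trace approaches 2. With a bandwidth smaller than the spacing between points, the smoother interpolates and the trace equals the number of points.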


Measurement and Modeling of Computer Systems | 2001

On the nonstationarity of Internet traffic

Jin Cao; William S. Cleveland; D. L. Lin; Don X. Sun

Traffic variables on an uncongested Internet wire exhibit a pervasive nonstationarity. As the rate of new TCP connections increases, arrival processes (packet and connection) tend locally toward Poisson, and time series variables (packet sizes, transferred file sizes, and connection round-trip times) tend locally toward independent. The cause of the nonstationarity is superposition: the intermingling of sequences of connections between different source-destination pairs, and the intermingling of sequences of packets from different connections. We show this empirically by extensive study of packet traces for nine links coming from four packet header databases. We show it theoretically by invoking the mathematical theory of point processes and time series. If the connection rate on a link gets sufficiently high, the variables can be quite close to Poisson and independent; if major congestion occurs on the wire before the rate gets sufficiently high, then the progression toward Poisson and independent can be arrested for some variables.


Journal of the American Statistical Association | 1984

The Many Faces of a Scatterplot

William S. Cleveland; Robert McGill

The scatterplot is one of our most powerful tools for data analysis. Still, we can add graphical information to scatterplots to make them considerably more powerful. These graphical additions, faces of sorts, can enhance capabilities that scatterplots already have or can add whole new capabilities that faceless scatterplots do not have at all. The additions we discuss here, some new and some old, are (a) sunflowers, (b) category codes, (c) point cloud sizings, (d) smoothings for the dependence of y on x (middle smoothings, spread smoothings, and upper and lower smoothings), and (e) smoothings for the bivariate distribution of x and y (pairs of middle smoothings, sum-difference smoothings, scale-ratio smoothings, and polar smoothings). The development of these additions is based in part on a number of graphical principles that can be applied to the development of statistical graphics in general.


Science | 1974

Sunday and Workday Variations in Photochemical Air Pollutants in New Jersey and New York

William S. Cleveland; T. E. Graedel; Beat Kleiner; Jack L. Warner

Concentration distributions of air contaminants and meteorological variables in New Jersey and New York for workdays (Mondays through Fridays, omitting holidays) and Sundays are compared by means of quantile-quantile plots. The ozone distributions are slightly higher on Sundays, and the primary pollutant distributions are lower. These results raise serious questions about the validity of current concepts underlying ozone reduction in urban atmospheres.
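The quantile-quantile comparison used in this abstract can be sketched as follows. This is a minimal illustration, assuming plotting positions (i + 0.5)/m and linear interpolation when the two samples differ in size. Those conventions and the function names are assumptions, not details taken from the paper.

```python
def quantile(sorted_vals, p):
    """Interpolated quantile, treating the i-th sorted value as the (i + 0.5)/n quantile."""
    n = len(sorted_vals)
    pos = p * n - 0.5  # fractional 0-based index into the sorted sample
    if pos <= 0:
        return sorted_vals[0]
    if pos >= n - 1:
        return sorted_vals[-1]
    lo = int(pos)
    frac = pos - lo
    return sorted_vals[lo] + frac * (sorted_vals[lo + 1] - sorted_vals[lo])


def qq_pairs(a, b):
    """Return (quantile of a, quantile of b) pairs at matching probabilities."""
    a, b = sorted(a), sorted(b)
    m = min(len(a), len(b))  # one point per observation of the smaller sample
    probs = [(i + 0.5) / m for i in range(m)]
    return [(quantile(a, p), quantile(b, p)) for p in probs]
```

Plotting the pairs against the line y = x shows how the two distributions differ: if `a` holds workday values and `b` holds Sunday values, points lying above the line indicate that the Sunday distribution is higher at those quantiles.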


Technometrics | 1972

The Inverse Autocorrelations of a Time Series and Their Applications

William S. Cleveland

The inverse autocorrelations of a time series are defined to be the autocorrelations associated with the inverse of the spectral density of the series. They can be estimated by calculating the autocorrelations associated with the inverse of a spectral density estimate. Two different methods of estimating the inverse autocorrelations arise from two different methods of estimating the spectral density: autoregressive and periodogram smoothing. The estimates of the inverse autocorrelations are used to assist in identifying a parsimonious, moving-average, autoregressive model for the series and to provide rough initial estimates of the parameters for an iterative search for the maximum of the likelihood function. The techniques discussed are applied to chemical process concentration readings, wind velocity measurements, and moon seismic data.
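For the autoregressive route, the definition leads to a short calculation, sketched below. The sketch assumes an AR(p) model written as x_t = phi_1 x_{t-1} + … + phi_p x_{t-p} + e_t with coefficients already estimated. The inverse of its spectral density is then proportional to the spectral density of a moving-average process with coefficients (1, -phi_1, …, -phi_p), so the inverse autocorrelations are the autocorrelations of that moving-average process. The function name is hypothetical, and fitting the AR model is outside the scope of the sketch.

```python
def inverse_autocorrelations(phi, max_lag):
    """Inverse autocorrelations at lags 0..max_lag implied by AR coefficients phi."""
    c = [1.0] + [-p for p in phi]  # coefficients of the matching moving-average process

    def acov(k):
        # Lag-k autocovariance of that process, up to the noise variance.
        if k >= len(c):
            return 0.0
        return sum(c[j] * c[j + k] for j in range(len(c) - k))

    g0 = acov(0)
    return [acov(k) / g0 for k in range(max_lag + 1)]
```

Because the matching process has order p, the inverse autocorrelations of a pure AR(p) model vanish beyond lag p, which is one way they can help with the model identification described in the abstract.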

Collaboration


Dive into William S. Cleveland's collaborations.

Top Co-Authors

Ryan P. Hafen

Pacific Northwest National Laboratory
