Kenneth N. Berk | Researchain

Publication

Featured research published by Kenneth N. Berk.

Technometrics | 1978

Comparing Subset Regression Procedures

Kenneth N. Berk

Although it is generally felt that the all-subsets, or “best” subset, approach is better than forward selection and backward elimination, the sequential procedures are still widely used. To see what advantage there is in doing all-subsets, this paper gives both theoretical and empirical comparisons. It is shown that the difference in favor of all-subsets can be arbitrarily large in examples where there are predictors which do poorly alone but do very well together. Also, empirical comparisons on nine data sets show big differences favoring all-subsets, when the differences are measured on the data (sample values). However, fairer comparisons based on known population values show very small differences favoring all-subsets. The only exception is the one data set which has predictors which do well together but poorly alone.
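The phrase "predictors which do poorly alone but do very well together" can be made concrete with a small constructed example. The sketch below is not taken from the paper: the four data points are invented for illustration, and the helper names `pearson` and `r2_pair` are hypothetical. It uses the standard formula for the R² of a two-predictor regression in terms of pairwise correlations.

```python
from math import sqrt

def pearson(a, b):
    """Sample Pearson correlation between two equal-length lists."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    sab = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    saa = sum((x - ma) ** 2 for x in a)
    sbb = sum((y - mb) ** 2 for y in b)
    return sab / sqrt(saa * sbb)

def r2_pair(y, x1, x2):
    """R^2 of y regressed on x1 and x2 together, from pairwise correlations."""
    r1, r2, r12 = pearson(y, x1), pearson(y, x2), pearson(x1, x2)
    return (r1 ** 2 + r2 ** 2 - 2 * r1 * r2 * r12) / (1 - r12 ** 2)

# Invented data: x2 is x1 plus a small multiple of y, so y = 10 * (x2 - x1).
x1 = [-3.0, -1.0, 1.0, 3.0]
y = [1.0, -1.0, -1.0, 1.0]
x2 = [a + 0.1 * b for a, b in zip(x1, y)]

r2_x1 = pearson(y, x1) ** 2       # essentially 0: x1 alone explains nothing
r2_x2 = pearson(y, x2) ** 2       # about 0.002: x2 alone explains almost nothing
r2_both = r2_pair(y, x1, x2)      # essentially 1: together they fit y exactly

print(r2_x1, r2_x2, r2_both)
```

In a situation like this, a forward-selection step that ranks predictors one at a time has little reason to pick either x1 or x2 first, whereas an all-subsets search would find the pair.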

Journal of the American Statistical Association | 1977

Tolerance and Condition in Regression Computations

Kenneth N. Berk

Many regression programs include a tolerance test that does not allow a variable to enter the regression if its correlation with the previously entered variables exceeds a specified level. This is done to achieve computational stability by assuring that the correlation matrix C of the independent variables is not nearly singular. However, for any specified tolerance level, there is an example in which the entering variables pass the tolerance test but the computation is extremely unstable. A bound for the condition of C is p times the trace of C⁻¹, which can be monitored instead of tolerance to assure stability.
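As a rough illustration of the bound mentioned in the abstract, the sketch below computes p · trace(C⁻¹) for a small correlation matrix. The function names `inverse` and `condition_bound` are hypothetical, the Gauss-Jordan inversion is a generic textbook routine rather than anything described in the paper, and the example matrices are invented.

```python
def inverse(matrix):
    """Invert a square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(matrix)
    # Build the augmented matrix [A | I].
    aug = [
        [float(v) for v in row] + [1.0 if i == j else 0.0 for j in range(n)]
        for i, row in enumerate(matrix)
    ]
    for col in range(n):
        # Pick the row with the largest entry in this column as pivot.
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular to working precision")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        # Eliminate this column from every other row.
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def condition_bound(corr):
    """Return p * trace(C^-1), an upper bound on the condition number of C."""
    p = len(corr)
    inv = inverse(corr)
    return p * sum(inv[i][i] for i in range(p))

# Two predictors with correlation 0.6: eigenvalues are 1.6 and 0.4,
# so the condition number is 4, and the bound is 4 / (1 - 0.36) = 6.25.
bound_2 = condition_bound([[1.0, 0.6], [0.6, 1.0]])

# Uncorrelated predictors: C is the identity, so the bound is p * p.
bound_identity = condition_bound([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]])

print(bound_2, bound_identity)
```

The bound holds because the eigenvalues of a p × p correlation matrix sum to p, so the largest is at most p, while the reciprocal of the smallest is at most trace(C⁻¹).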

Journal of Quality Technology | 1991

Significance tests for saturated orthogonal arrays

Kenneth N. Berk; Richard R. Picard

Experimental designs used in industry often allow no degrees of freedom for the estimation of error. Nevertheless, analysis of variance results can (if used properly) be used to determine which factors are significant. We give a back-of-the-envelope cal..

Technometrics | 1995

Seeing a curve in multiple regression

Kenneth N. Berk; David E. Booth

Start with a multiple regression in which each predictor enters linearly. How can we tell if there is a curve so that the model is not valid? Possibly for one of the predictors an additional square or square-root term is needed. We focus on the case in which an additional term is needed rather than the monotonic case in which a power transformation or logarithm might be sufficient. Among the plots that have been used for diagnostic purposes, nine methods are applied here. All nine methods work fine when the predictors are not related to each other, but two of them are designed to work even when the predictors are arbitrary noisy functions of each other. These two are recent methods, Cook's CERES plot and the plot for an additive model with nonparametric smoothing applied to one predictor. Even these plots, however, can miss a curve in some cases and show a false curve in others. To give a measure of curve detection, the curve can be fitted nonparametrically, and this fit can be used in place of the predic...

Technometrics | 1984

Validating Regression Procedures With New Data

Kenneth N. Berk

The best way to validate the predictive ability of a statistical model is to apply it to new data. This article compares eight ways to form regression models by forming them with old data and then validating them with fresh data. One goal here is to study which methods will work as a function of the type of data. To some extent one can tell which methods will work well by looking at the data. Another goal is to study the quality of prediction when the regression is applied to new data. Prediction quality is determined in large part by the distance of the new data in relation to the old.

The American Statistician | 1987

Effective Microcomputer Statistical Software

Kenneth N. Berk

This article focuses on important aspects of microcomputer statistical software. These include documentation, control language, data entry, data listing and editing, data manipulation, graphics, statistical procedures, output, customizing, system environment, and support. The primary concern is that a package encourage good statistical practice.

Journal of Statistical Computation and Simulation | 1980

Forward and backward stepping in variable selection

Kenneth N. Berk

For stepwise regression and discriminant analysis the parameters F_in and F_out govern the inclusion and deletion of variables. The candidate variable with the biggest F-ratio is included if this exceeds F_in; the included variable with the smallest F-ratio is deleted if this is less than F_out. If F_in ≥ F_out, then return to a previous subset size implies improvement in the criterion measure. This result also holds for a generalization, stepwise multivariate analysis, which includes stepwise regression and discriminant analysis as special cases. Eliminations do not occur if forward regression and backward elimination yield the same sequence of subsets. Conversely, there is a more liberal stepping rule which always eliminates if the two sequences differ.
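The inclusion and deletion rule described in this abstract can be sketched as a single decision step. This is a minimal illustration, not the paper's algorithm: the function name `next_move` and the variable names are hypothetical, the F-ratios are assumed to be supplied by the caller, and checking deletion before inclusion is an assumption, since the abstract does not state the order.

```python
def next_move(f_to_enter, f_to_remove, f_in, f_out):
    """Decide one stepwise move from precomputed F-ratios.

    f_to_enter:  dict mapping each candidate variable to its F-ratio
    f_to_remove: dict mapping each included variable to its F-ratio
    Returns ("remove", name), ("add", name), or ("stop", None).
    """
    # Assumption: deletion is considered before inclusion.
    if f_to_remove:
        weakest = min(f_to_remove, key=f_to_remove.get)
        if f_to_remove[weakest] < f_out:
            return ("remove", weakest)
    if f_to_enter:
        strongest = max(f_to_enter, key=f_to_enter.get)
        if f_to_enter[strongest] > f_in:
            return ("add", strongest)
    return ("stop", None)

# Invented F-ratios, with F_in = 4.0 and F_out = 3.9 so that F_in >= F_out.
move_add = next_move({"x3": 5.2, "x4": 1.1}, {"x1": 9.0, "x2": 4.5}, 4.0, 3.9)
move_remove = next_move({"x3": 2.0}, {"x1": 9.0, "x2": 3.0}, 4.0, 3.9)
move_stop = next_move({"x3": 2.0}, {"x1": 9.0}, 4.0, 3.9)

print(move_add, move_remove, move_stop)
```

A full stepwise procedure would recompute the F-ratios after each move and call this step again until it returns "stop".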

Journal of the American Statistical Association | 1978

A Review of the Manuals for BMDP and SPSS

Kenneth N. Berk; Ivor Francis

SPSS and BMDP have much in common, but they have contrasting emphases. The SPSS manual is intended for an unsophisticated audience. It has low-level statistical explanations and carefully written directions for running the programs, but not much about computational procedures. In contrast, the BMDP manual is more sophisticated, with not much statistical explanation, brief explanation of the control language, and substantial discussion of algorithms. We summarize our review with a listing of qualities which we consider important in a manual and our ratings for SPSS and BMDP.

Archive | 2011

Inferences Based on Two Samples

Jay Devore; Kenneth N. Berk

Chapters 8 and 9 presented confidence intervals (CIs) and hypothesis testing procedures for a single mean μ, single proportion p, and a single variance σ². Here we extend these methods to situations involving the means, proportions, and variances of two different population distributions.

Archive | 2011

Regression and Correlation

Jay Devore; Kenneth N. Berk

The general objective of a regression analysis is to determine the relationship between two (or more) variables so that we can gain information about one of them through knowing values of the other(s). Much of mathematics is devoted to studying variables that are deterministically related. Saying that x and y are related in this manner means that once we are told the value of x, the value of y is completely specified.
