Kjell A. Doksum
University of Wisconsin-Madison
Publications
Featured research published by Kjell A. Doksum.
Journal of the American Statistical Association | 1981
Peter J. Bickel; Kjell A. Doksum
Abstract Following Box and Cox (1964), we assume that a transform Zi = h(Yi, λ) of our original data {Yi} satisfies a linear model. Consistency properties of the Box-Cox estimates (MLEs) of λ and the parameters in the linear model, as well as the asymptotic variances of these estimates, are considered. We find that in some structured models such as transformed linear regression with small to moderate error variances, the asymptotic variances of the estimates of the parameters in the linear model are much larger when the transformation parameter λ is unknown than when it is known. In some unstructured models such as transformed one-way analysis of variance with moderate to large error variances, the cost of not knowing λ is moderate to small. The case where the error distribution in the linear model is not normal but actually unknown is considered, and robust methods in the presence of transformations are introduced for this case. Asymptotics and simulation results for the transformed additive two-way ...
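A minimal sketch of the setup discussed above, assuming the standard Box-Cox power transform h(y, λ) = (y^λ − 1)/λ (log y at λ = 0) and a normal linear model on the transformed scale: λ is estimated by profile likelihood jointly with the regression coefficients on simulated data. Variable names and parameter values are illustrative, not taken from the paper.

```python
# Sketch: profile-likelihood estimation of the Box-Cox parameter lambda together
# with the linear-model coefficients, in the Box-Cox (1964) setup.
import numpy as np

def boxcox(y, lam):
    """Box-Cox transform h(y, lambda); requires y > 0."""
    return np.log(y) if lam == 0 else (y**lam - 1.0) / lam

def profile_loglik(lam, y, X):
    """Profile log-likelihood of lambda after maximizing over beta and sigma^2."""
    z = boxcox(y, lam)
    beta = np.linalg.lstsq(X, z, rcond=None)[0]
    resid = z - X @ beta
    n = len(y)
    sigma2 = resid @ resid / n
    # Jacobian term of the transformation: (lambda - 1) * sum(log y)
    return -0.5 * n * np.log(sigma2) + (lam - 1.0) * np.log(y).sum()

rng = np.random.default_rng(0)
n = 200
x = rng.uniform(1, 5, n)
X = np.column_stack([np.ones(n), x])
true_lam, beta0, beta1, sigma = 0.5, 1.0, 2.0, 0.3
z = beta0 + beta1 * x + sigma * rng.standard_normal(n)   # linear model holds on transformed scale
y = (true_lam * z + 1.0) ** (1.0 / true_lam)             # invert h to get the raw responses

lams = np.linspace(0.01, 1.5, 150)
lam_hat = lams[np.argmax([profile_loglik(l, y, X) for l in lams])]
beta_hat = np.linalg.lstsq(X, boxcox(y, lam_hat), rcond=None)[0]
print("lambda_hat:", round(lam_hat, 3), "beta_hat:", beta_hat.round(3))
```

The paper's point concerns the sampling variance of beta_hat when λ is estimated rather than known; the sketch only shows the estimation step itself.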
Technometrics | 1992
Kjell A. Doksum; Arnljot Høyland
Variable-stress accelerated life testing trials are experiments in which each of the units in a random sample of units of a product is run under increasingly severe conditions to get information quickly on its life distribution. We consider a fatigue failure model in which accumulated decay is governed by a continuous Gaussian process W(y) whose distribution changes at certain stress change points t0 < t1 < … < tk. Continuously increasing stress is also considered. Failure occurs the first time W(y) crosses a critical boundary ω. The distribution of time to failure for the models can be represented in terms of time-transformed inverse Gaussian distribution functions, and the parameters in models for experiments with censored data can be estimated using maximum likelihood methods. A common approach to the modeling of failure times for experimental units subject to increased stress at certain stress change points is to assume that the failure times follow a distribution that consists of segments of Weib...
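The first-passage structure can be illustrated with a small simulation: a Wiener (decay) process whose drift increases at each stress change point is run until it crosses the critical level ω, and the failure time is recorded or censored. The drift values, change points, and ω below are arbitrary illustration choices, not the paper's specification.

```python
# Sketch: failure time = first time a drifting Wiener decay process crosses omega,
# with the drift increasing at stress change points t1 < t2 (step-stress).
import numpy as np

rng = np.random.default_rng(1)
omega = 5.0                        # critical decay boundary
change_points = [0.0, 2.0, 4.0]    # stress raised at t = 2 and t = 4
drifts = [0.5, 1.5, 3.0]           # drift of the decay process on each stress interval
sigma, dt, t_max = 1.0, 0.005, 20.0

def first_passage_time():
    w, t = 0.0, 0.0
    while t < t_max:
        mu = drifts[np.searchsorted(change_points, t, side="right") - 1]
        w += mu * dt + sigma * np.sqrt(dt) * rng.standard_normal()
        t += dt
        if w >= omega:
            return t
    return np.nan                  # unit right-censored at t_max

times = np.array([first_passage_time() for _ in range(500)])
print("mean failure time:", np.nanmean(times).round(3),
      "censored fraction:", float(np.isnan(times).mean()))
# Under a single constant drift mu > 0, the first-passage time to omega is
# inverse Gaussian with mean omega/mu and shape omega^2/sigma^2.
```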
Journal of the American Statistical Association | 1988
Dorota M. Dabrowska; Kjell A. Doksum
Abstract The generalized odds ratio for a survival variable T is defined as Λ_T(t | c) = c^{-1}[1 − S^c(t)]/S^c(t) for c > 0 and −log S(t) for c = 0, where S(t) = Pr(T > t). This ratio coincides with the integrated hazard for c = 0 and the odds ratio for c = 1. When the distribution of T depends on a covariate vector, we assume that conditionally on the covariates, log Λ_T(t | c) is linear in the covariates. This model is a generalization of the proportional hazard model (PHM), which has an interpretation both as a PHM with random nuisance effects (Clayton and Cuzick 1986) and as a proportional odds-rate model with the odds rate defined from the response times of series systems. Harrington and Fleming (1982) and Bickel (1986a) considered rank tests for this semiparametric model; Clayton and Cuzick (1986) considered estimation. We use the odds-rate representation to define a class of estimates of the proportionality parameter in the two-sample case. We show that the estimates are consistent and asymptotica...
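Under the display as reconstructed above, the generalized odds rate interpolates between the cumulative hazard (c = 0) and the usual odds (c = 1); a small sketch of that reading, with illustrative survival values:

```python
# Sketch: generalized odds rate Lambda_T(t | c) = [1 - S(t)^c] / (c * S(t)^c) for c > 0,
# with limit -log S(t) at c = 0 (cumulative hazard) and the odds [1 - S]/S at c = 1.
import numpy as np

def generalized_odds_rate(surv, c):
    """surv = S(t) = Pr(T > t), evaluated at the time points of interest."""
    surv = np.asarray(surv, dtype=float)
    if c == 0:
        return -np.log(surv)
    return (1.0 - surv**c) / (c * surv**c)

S = np.array([0.9, 0.7, 0.5, 0.2])
print(generalized_odds_rate(S, 0.0))    # integrated hazard  -log S(t)
print(generalized_odds_rate(S, 1.0))    # odds  (1 - S)/S
print(generalized_odds_rate(S, 0.001))  # close to the c = 0 limit
```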
Archive | 2003
Bo Henry Lindqvist; Kjell A. Doksum
Contents: Reliability Theory in the Past and Present Centuries; General Aspects of Reliability Modelling; Reliability of Networks and Systems; Stochastic Modelling and Optimization in Reliability; Modelling in Survival and Reliability Analysis; Statistical Methods for Degradation Data; Statistical Methods for Maintained Systems; Statistical Inference in Survival Analysis; Software Reliability Methods.
Lifetime Data Analysis | 1995
Kjell A. Doksum; Sharon-Lise T. Normand
We present two stochastic models that describe the relationship between biomarker process values at random time points, event times, and a vector of covariates. In both models the biomarker processes are degradation processes that represent the decay of systems over time. In the first model the biomarker process is a Wiener process whose drift is a function of the covariate vector. In the second model the biomarker process is taken to be the difference between a stationary Gaussian process and a time drift whose drift parameter is a function of the covariates. For both models we present statistical methods for estimation of the regression coefficients. The first model is useful for predicting the residual time from study entry to the time a critical boundary is reached while the second model is useful for predicting the latency time from the infection until the time the presence of the infection is detected. We present our methods principally in the context of conducting inference in a population of HIV infected individuals.
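A rough sketch of the first model's prediction idea, assuming a Wiener degradation process with covariate-dependent drift μ(x) = exp(β'x) and known diffusion σ: given the current marker value below the critical boundary ω, the remaining distance is ω − w(t), and under constant positive drift the residual time to crossing follows the inverse Gaussian first-passage law with mean (ω − w(t))/μ(x). The exponential link and all parameter values are illustration choices, not the paper's specification.

```python
# Sketch: predicted residual time for a Wiener degradation model with
# covariate-dependent drift mu(x) = exp(beta' x) and critical boundary omega.
import numpy as np

def residual_time_summary(w_now, omega, x, beta, sigma=1.0):
    """Mean and variance of the residual first-passage time from marker level
    w_now to the boundary omega, under constant drift mu(x) = exp(beta @ x).
    (Inverse Gaussian first-passage law for a Wiener process with drift.)"""
    mu = np.exp(np.dot(beta, x))
    a = omega - w_now                  # remaining distance to the boundary
    mean = a / mu                      # IG mean
    var = a * sigma**2 / mu**3         # IG variance for first passage to level a
    return mean, var

beta = np.array([-0.5, 0.8])           # illustrative regression coefficients
x = np.array([1.0, 0.3])               # covariate vector (e.g., intercept + marker)
mean, var = residual_time_summary(w_now=2.0, omega=5.0, x=x, beta=beta, sigma=1.0)
print("predicted residual time: mean %.2f, sd %.2f" % (mean, np.sqrt(var)))
```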
Journal of the American Statistical Association | 1983
Kjell A. Doksum; Chi-Wing Wong
Abstract The problem of testing hypotheses in linear models when the original data have been transformed is considered. It is assumed that the transformation involves an unknown parameter λ that has to be estimated from the data. For certain important testing problems it is found that the asymptotic level and power are as if λ had been assumed known. Asymptotic efficiency results show that when the Box-Cox transformation is used, tests based on transformed data have good power properties. Simulation results for transformed two-sample and linear regression testing problems show this to be true for moderate to small sample sizes as well. In particular, an α-trimmed t test based on averages of trimmed transformed variables performs very well in both light-tailed and heavy-tailed skew models when compared with the usual t test and rank tests.
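A small sketch of the kind of test described above, with λ fixed for simplicity (the paper estimates it from the data): both samples are Box-Cox transformed and the α-trimmed means are compared with a permutation null distribution, which sidesteps the choice of a variance formula. scipy's trim_mean is used; everything else is illustrative.

```python
# Sketch: two-sample comparison of alpha-trimmed means on Box-Cox transformed data,
# assessed by permutation (lambda held fixed here; the paper estimates it).
import numpy as np
from scipy.stats import trim_mean

def boxcox(y, lam):
    return np.log(y) if lam == 0 else (y**lam - 1.0) / lam

def trimmed_perm_test(y1, y2, lam=0.5, alpha=0.10, n_perm=5000, seed=0):
    rng = np.random.default_rng(seed)
    z1, z2 = boxcox(y1, lam), boxcox(y2, lam)
    obs = trim_mean(z1, alpha) - trim_mean(z2, alpha)
    pooled = np.concatenate([z1, z2])
    n1 = len(z1)
    count = 0
    for _ in range(n_perm):
        perm = rng.permutation(pooled)
        stat = trim_mean(perm[:n1], alpha) - trim_mean(perm[n1:], alpha)
        count += abs(stat) >= abs(obs)
    return obs, (count + 1) / (n_perm + 1)   # two-sided permutation p-value

rng = np.random.default_rng(2)
y1 = rng.lognormal(mean=0.0, sigma=0.6, size=40)   # skewed samples
y2 = rng.lognormal(mean=0.4, sigma=0.6, size=40)
print("trimmed difference %.3f, p = %.4f" % trimmed_perm_test(y1, y2))
```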
Journal of the American Statistical Association | 1994
Kjell A. Doksum; Stephen Blyth; Eric T. Bradlow; Xiao-Li Meng; Hongyu Zhao
Abstract We call (a model for) an experiment heterocorrelatious if the strength of the relationship between a response variable Y and a covariate X is different in different regions of the covariate space. For such experiments we introduce a correlation curve that measures heterocorrelaticity in terms of the variance explained by regression locally at each covariate value. More precisely, the squared correlation curve is obtained by first expressing the usual linear model “variance explained to total variance” formula in terms of the residual variance and the regression slope and then replacing these by the conditional residual variance depending on x and the slope of the conditional mean of Y given X = x. The correlation curve ρ(x) satisfies the invariance properties of correlation, it reduces to the Galton-Pearson correlation ρ in linear models, it is between −1 and 1, it is 0 when X and Y are independent, and it is ±1 when Y is a function of X. We introduce estimates of the correlation curve based on ...
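A rough sketch of the construction described above: the linear-model identity ρ² = β²σ_X²/(β²σ_X² + σ²) is localized by replacing β with the local slope of the conditional mean and σ² with the conditional residual variance at x. Here both are estimated crudely with a Gaussian-kernel smoother; the bandwidth, the smoother, and the finite-difference slope are illustration choices, not the estimators studied in the paper.

```python
# Sketch: a crude correlation-curve estimate rho(x), localizing the linear-model
# "variance explained" identity with kernel estimates of the conditional mean,
# its slope, and the conditional residual variance.
import numpy as np

def kernel_mean(x0, x, y, h):
    w = np.exp(-0.5 * ((x - x0) / h) ** 2)
    return np.sum(w * y) / np.sum(w)

def correlation_curve(grid, x, y, h=0.3, eps=1e-3):
    sx = np.std(x)
    rho = []
    for x0 in grid:
        m = kernel_mean(x0, x, y, h)
        slope = (kernel_mean(x0 + eps, x, y, h) - kernel_mean(x0 - eps, x, y, h)) / (2 * eps)
        resid_var = kernel_mean(x0, x, (y - m) ** 2, h)   # local residual variance
        rho.append(sx * slope / np.sqrt(sx**2 * slope**2 + resid_var))
    return np.array(rho)

rng = np.random.default_rng(3)
x = rng.uniform(-2, 2, 500)
y = np.sin(1.5 * x) + 0.3 * rng.standard_normal(500)   # association strong where sin is steep
grid = np.linspace(-1.5, 1.5, 7)
print(np.round(correlation_curve(grid, x, y), 2))
```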
Computational Statistics & Data Analysis | 2000
Kjell A. Doksum; Ja-Yong Koo
The quantile regression function gives the quantile in the conditional distribution of a response variable given the value of a covariate. It can be used to measure the effect of covariates not only in the center of a population, but also in the upper and lower tails. Moreover, it provides prediction intervals that do not rely on normality or other distributional assumptions. In a nonparametric setting, we explore a class of quantile regression spline estimators of the quantile regression function. We consider an automatic knot selection procedure involving a linear programming method, stepwise knot addition using a modified AIC, and stepwise knot deletion using a modified BIC. Because the methods estimate quantile regression functions, they possess an inherent robustness to extreme observations in the response values. We investigate the performance of prediction intervals based on automatic quantile regression splines and find that the loss of efficiency of this procedure is minimal in the normal linear homoscedastic model. In heteroscedastic linear models, it outperforms the classical normal theory prediction interval. A data example is provided to illustrate the use of the proposed methods.
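A minimal sketch of the fitting step, assuming a fixed set of knots rather than the paper's automatic knot addition and deletion: the τ-th quantile regression function is fit on a truncated-power spline basis by minimizing the check (pinball) loss, written as a linear program and solved with scipy.optimize.linprog. Basis, knots, and data are illustrative.

```python
# Sketch: tau-quantile regression on a fixed-knot linear spline basis via the
# standard LP formulation of check-loss minimization.
import numpy as np
from scipy.optimize import linprog

def spline_basis(x, knots):
    """Truncated-power basis of degree 1: [1, x, (x - k)_+ for each knot]."""
    cols = [np.ones_like(x), x] + [np.maximum(x - k, 0.0) for k in knots]
    return np.column_stack(cols)

def quantile_spline_fit(x, y, knots, tau=0.9):
    B = spline_basis(x, knots)
    n, p = B.shape
    # minimize tau*sum(u) + (1 - tau)*sum(v)  subject to  B@beta + u - v = y,  u, v >= 0
    c = np.concatenate([np.zeros(p), tau * np.ones(n), (1 - tau) * np.ones(n)])
    A_eq = np.hstack([B, np.eye(n), -np.eye(n)])
    bounds = [(None, None)] * p + [(0, None)] * (2 * n)
    res = linprog(c, A_eq=A_eq, b_eq=y, bounds=bounds, method="highs")
    return res.x[:p]

rng = np.random.default_rng(4)
x = np.sort(rng.uniform(0, 10, 300))
y = np.sin(x) + (0.2 + 0.1 * x) * rng.standard_normal(300)   # heteroscedastic noise
knots = np.linspace(1, 9, 5)
beta = quantile_spline_fit(x, y, knots, tau=0.9)
pred = (spline_basis(np.array([5.0]), knots) @ beta)[0]
print("fitted 0.9-quantile at x = 5: %.3f" % pred)
```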
Computational Statistics & Data Analysis | 2012
S. M. Enayetur Raheem; S. Ejaz Ahmed; Kjell A. Doksum
In the context of a partially linear regression model, shrinkage semiparametric estimation is considered based on the Stein-rule. In this framework, the coefficient vector is partitioned into two sub-vectors: the first sub-vector gives the coefficients of interest, i.e., main effects (for example, treatment effects), and the second sub-vector is for variables that may or may not need to be controlled. When estimating the first sub-vector, the best estimate may be obtained using either the full model that includes both sub-vectors, or the reduced model which leaves out the second sub-vector. It is demonstrated that shrinkage estimators which combine two semiparametric estimators computed for the full model and the reduced model outperform the semiparametric estimator for the full model. Using the semiparametric estimate for the reduced model is best when the second sub-vector is the null vector, but this estimator suffers seriously from bias otherwise. The relative dominance picture of suggested estimators is investigated. In particular, suitability of estimating the nonparametric component based on the B-spline basis function is explored. Further, the performance of the proposed estimators is compared with an absolute penalty estimator through Monte Carlo simulation. Lasso and adaptive lasso were implemented for simultaneous model selection and parameter estimation. A real data example is given to compare the proposed estimators with lasso and adaptive lasso estimators.
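A toy sketch of the Stein-rule idea in a plain linear model (the paper's setting is partially linear, with the nonparametric part handled by B-splines): the estimate of the main-effect coefficients shrinks the full-model estimate toward the reduced-model estimate by a factor driven by a test statistic for the nuisance sub-vector being null. The positive-part combination below is the generic James-Stein-type form, not the paper's exact semiparametric estimator.

```python
# Toy sketch: Stein-type shrinkage of the full-model estimate of beta1 toward the
# reduced-model estimate, controlled by a Wald statistic for beta2 = 0.
import numpy as np

rng = np.random.default_rng(5)
n, p1, p2 = 200, 3, 6
X1 = rng.standard_normal((n, p1))            # covariates of interest (main effects)
X2 = rng.standard_normal((n, p2))            # covariates that may or may not matter
beta1, beta2 = np.array([1.0, -0.5, 2.0]), np.zeros(p2)   # reduced model holds here
y = X1 @ beta1 + X2 @ beta2 + rng.standard_normal(n)

def ols(X, y):
    return np.linalg.lstsq(X, y, rcond=None)[0]

X_full = np.hstack([X1, X2])
b_full = ols(X_full, y)                      # full-model estimate
b1_full, b2_full = b_full[:p1], b_full[p1:]
b1_red = ols(X1, y)                          # reduced-model estimate (drops X2)

# Wald statistic for H0: beta2 = 0
resid = y - X_full @ b_full
sigma2 = resid @ resid / (n - p1 - p2)
cov_full = sigma2 * np.linalg.inv(X_full.T @ X_full)
T = b2_full @ np.linalg.inv(cov_full[p1:, p1:]) @ b2_full

# positive-part James-Stein-type combination of the two estimates
shrink = max(0.0, 1.0 - (p2 - 2) / T)
b1_shrink = b1_red + shrink * (b1_full - b1_red)
print("full:", b1_full.round(2), "reduced:", b1_red.round(2), "shrinkage:", b1_shrink.round(2))
```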
Journal of the American Statistical Association | 2008
Kjell A. Doksum; Shijie Tang; Kam-Wah Tsui
We consider regression experiments involving a response variable Y and a large number of predictor variables X1, …, Xd, many of which may be irrelevant for the prediction of Y and thus must be removed before Y can be predicted from the X’s. We consider two procedures that select variables by using importance scores that measure the strength of the relationship between predictor variables and a response and keep those variables whose importance scores exceed a threshold. In the first of these procedures, scores are obtained by randomly drawing subregions (tubes) of the predictor space that constrain all but one predictor and in each subregion computing a signal-to-noise ratio (efficacy) based on a nonparametric univariate regression of Y on the unconstrained variable. The subregions are adapted to boost weak variables iteratively by searching (hunting) for the subregions in which the efficacy is maximized. The efficacy can be viewed as an approximation to a one-to-one function of the probability of identifying features. By using importance scores based on averages of maximized efficacies, we develop a variable selection algorithm called EARTH (efficacy adaptive regression tube hunting) based on examining the conditional expectation of the response given all but one of the predictor variables for a collection of randomly, adaptively, and iteratively selected regions. The second importance score method (RFVS) is based on using random forest importance values to select variables. Computer simulations show that EARTH and RFVS are successful variable selection methods compared with other procedures in nonparametric situations with a large number of irrelevant predictor variables, and that when each is combined with the model selection and prediction procedure MARS, the tree-based prediction procedure GUIDE, and the random forest method, the combinations lead to improved prediction accuracy for certain models with many irrelevant variables. We give conditions under which a version of the EARTH algorithm selects the correct model with probability tending to 1 as the sample size n tends to infinity even if d → ∞ as n → ∞. We include the analysis of a real data set in which we show how a training set can be used to find a threshold for the EARTH importance scores.
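A greatly simplified sketch of the tube idea (not the EARTH algorithm itself, which adapts the tubes iteratively and maximizes the efficacy): for each predictor, random subregions are drawn by constraining the other predictors to random intervals, a crude signal-to-noise score is computed from a one-dimensional smooth of Y on the free predictor inside each tube, and the scores are averaged; variables whose average score exceeds a threshold are kept. The binned-mean smoother, tube widths, and threshold are all illustration choices.

```python
# Greatly simplified sketch of tube-based importance scores: constrain all but one
# predictor to a random box, smooth Y on the free predictor inside the box, and use
# an explained-to-residual variance ratio as a crude signal-to-noise score.
import numpy as np

rng = np.random.default_rng(6)
n, d = 1000, 8
X = rng.uniform(0, 1, (n, d))
y = 3 * np.sin(4 * X[:, 0]) + 2 * X[:, 1] ** 2 + 0.5 * rng.standard_normal(n)  # only X0, X1 matter

def tube_score(j, n_tubes=50, width=0.8, n_bins=8, min_pts=50):
    others = [k for k in range(d) if k != j]
    scores = []
    for _ in range(n_tubes):
        lo = rng.uniform(0, 1 - width, len(others))      # random box for the other predictors
        inside = np.all((X[:, others] >= lo) & (X[:, others] <= lo + width), axis=1)
        if inside.sum() < min_pts:
            continue
        xs, ys = X[inside, j], y[inside]
        bins = np.clip((xs * n_bins).astype(int), 0, n_bins - 1)
        means = np.array([ys[bins == b].mean() if np.any(bins == b) else ys.mean()
                          for b in range(n_bins)])
        fitted = means[bins]                             # crude binned-mean smooth of Y on X_j
        resid_var = max(np.var(ys - fitted), 1e-12)
        scores.append(np.var(fitted) / resid_var)        # crude signal-to-noise ratio
    return np.mean(scores) if scores else 0.0

importance = np.array([tube_score(j) for j in range(d)])
keep = np.where(importance > 2 * np.median(importance))[0]   # ad hoc threshold
print("scores:", importance.round(2), "selected:", keep)
```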