Kai Ding
University of Oklahoma Health Sciences Center
Publication
Featured research published by Kai Ding.
Archive | 2016
George J. Knafl; Kai Ding
This book addresses how to incorporate nonlinearity in one or more predictor (or explanatory or independent) variables in regression models for different types of outcome (or response or dependent) variables. Such nonlinear dependence is often not considered in applied research. While relationships can reasonably be treated as linear in some cases, it is not unusual for them to be distinctly nonlinear. A standard linear analysis in the latter cases can produce misleading conclusions while a nonlinear analysis can provide novel insights into data not otherwise possible. A variety of examples of the benefits of modeling nonlinear relationships are presented throughout the book. Methods are needed for deciding whether relationships are linear or nonlinear and for fitting appropriate models when they are nonlinear. Methods for these purposes are covered in this book using what are called fractional polynomials based on power transformations of primary predictor variables with real-valued (and so possibly fractional) powers. An adaptive approach is used to construct fractional polynomial models based on heuristic (or rule-based) searches through power transforms of primary predictor variables. The book covers how to formulate and conduct such adaptive fractional polynomial modeling in a variety of contexts including adaptive regression of continuous outcomes, adaptive logistic regression of discrete outcomes with two or more values, and adaptive Poisson regression of count outcomes, possibly adjusted into rate outcomes with offsets. Modeling of variances/dispersions with fractional polynomials is covered as well. The book also covers alternative approaches for modeling nonlinear relationships including standard polynomials, generalized additive models computed using local regression (loess) and spline smoothing approaches, and multivariate adaptive regression splines (MARS).
Direct support for adaptive regression modeling based on fractional polynomials is not currently available in standard statistical software tools like SAS® version 9.4. Consequently, SAS macros have been developed for these purposes. Detailed descriptions of how to use these macros and of their output are provided. A working knowledge of SAS is assumed, so the book does not provide an introduction to the use of SAS.
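As a rough illustration of the idea of choosing a power transform by cross-validation, the following Python sketch scores a single-predictor fractional polynomial model with a k-fold likelihood cross-validation (LCV) score. It is not the book's SAS macros and does not reproduce their heuristic search: the function names, the fixed grid of candidate powers, the random fold assignment, and the definition of the score as the geometric mean of held-out normal likelihoods are all assumptions made for this example.

```python
import numpy as np

def power_transform(x, p):
    """Return x**p, using log(x) for p == 0 (assumes x > 0)."""
    x = np.asarray(x, dtype=float)
    return np.log(x) if p == 0 else x ** p

def lcv_score(x, y, p, k=5, seed=0):
    """k-fold LCV score for the model y = b0 + b1 * x^p + normal error.

    Assumed definition: exp(sum of held-out log-likelihoods / n),
    with parameters estimated on the other k-1 folds.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = len(y)
    folds = np.random.default_rng(seed).integers(0, k, size=n)
    X = np.column_stack([np.ones(n), power_transform(x, p)])
    loglik = 0.0
    for h in range(k):
        test = folds == h
        if not test.any():
            continue
        train = ~test
        # Least-squares coefficients and ML variance from the training folds
        beta, *_ = np.linalg.lstsq(X[train], y[train], rcond=None)
        sigma2 = np.mean((y[train] - X[train] @ beta) ** 2)
        # Normal log-likelihood of the held-out fold
        resid = y[test] - X[test] @ beta
        loglik += np.sum(-0.5 * np.log(2 * np.pi * sigma2)
                         - resid ** 2 / (2 * sigma2))
    return np.exp(loglik / n)

def best_power(x, y, powers=(-2, -1, -0.5, 0, 0.5, 1, 2, 3)):
    """Grid search (not the book's heuristic search) for the best-scoring power."""
    scores = {p: lcv_score(x, y, p) for p in powers}
    return max(scores, key=scores.get), scores
```

Calling `best_power(x, y)` returns the candidate power with the largest score together with all scores; larger scores indicate better held-out fit, so a linear model (power 1) can be compared directly against nonlinear alternatives.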
Archive | 2016
George J. Knafl; Kai Ding
This chapter provides a general formulation for adaptive regression modeling of nonlinear relationships. Since formulations for special cases have been provided earlier, only overviews are presented for alternative types of regression models and alternative cross-validation scoring approaches. A detailed formulation for the adaptive regression modeling process used by the genreg macro is provided, which has only been generally described earlier.
Archive | 2016
George J. Knafl; Kai Ding
This chapter provides a description of how to use PROC GAM for generating generalized additive models (GAMs) for univariate continuous and dichotomous outcomes as well as how to evaluate and compare GAMs with likelihood cross-validation (LCV) scores. Comparison of GAMs to adaptive fractional polynomial models on the basis of LCV scores is also covered. Example code is provided for generating models for predicting the univariate continuous outcome of death rate per 100,000 in terms of available predictors, as also addressed in Chaps. 2, 3, 6, 7 and 16, as well as models for predicting the univariate dichotomous outcome of a high mercury level in fish (over 1.0 ppm) versus a lower level in terms of available predictors, as also addressed in Chaps. 8, 9 and 16.
Archive | 2016
George J. Knafl; Kai Ding
This chapter provides a description of how to use PROC ADAPTIVEREG for generating multivariate adaptive regression splines (MARS) models for univariate continuous and dichotomous outcomes as well as how to evaluate and compare MARS models with likelihood cross-validation (LCV) scores. Comparison of MARS models to adaptive fractional polynomial models on the basis of LCV scores is also covered, as is how to adaptively transform MARS models. Example code is provided for generating models for predicting the univariate continuous outcome of death rate per 100,000 in terms of available predictors, as also addressed in Chaps. 2, 3, 16 and 17, as well as models for predicting the univariate dichotomous outcome of a high mercury level in fish (over 1.0 ppm) versus a lower level in terms of available predictors, as also addressed in Chaps. 8, 9, 16 and 17.
Archive | 2016
George J. Knafl; Kai Ding
This chapter presents analyses of several data sets with positive-valued univariate or multivariate continuous outcomes, addressing the need for power transformation of those outcomes along with power transformation of predictors for those outcomes. The outcome variables include those analyzed in Chaps. 2–5 as well as a new data set on plasma levels of beta-carotene in humans in terms of their fiber intake and vitamin usage. The chapter also provides a formulation for power-adjusted likelihood cross-validation (LCV) scores that can be maximized to choose a real-valued power for transforming an outcome.
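The abstract does not give the chapter's formula, so the following Python sketch only illustrates one plausible reading of a power-adjusted LCV score. It assumes that the adjustment is the standard change-of-variables (Jacobian) term, which puts likelihoods for different outcome powers on the original outcome scale so that they can be compared. The function names, the fold assignment, and the normal linear model are hypothetical choices for this example.

```python
import numpy as np

def transform_outcome(y, lam):
    """Return y**lam, using log(y) for lam == 0 (assumes y > 0)."""
    return np.log(y) if lam == 0 else y ** lam

def log_jacobian(y, lam):
    """Elementwise log |d/dy transform_outcome(y, lam)|."""
    if lam == 0:
        return -np.log(y)
    return np.log(abs(lam)) + (lam - 1.0) * np.log(y)

def power_adjusted_lcv(X, y, lam, k=5, seed=0):
    """k-fold LCV score for a normal linear model of the transformed outcome,
    adjusted by the Jacobian so that scores are comparable across powers."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n = len(y)
    z = transform_outcome(y, lam)
    folds = np.random.default_rng(seed).integers(0, k, size=n)
    total = 0.0
    for h in range(k):
        test = folds == h
        if not test.any():
            continue
        train = ~test
        # Fit on the training folds using the transformed outcome
        beta, *_ = np.linalg.lstsq(X[train], z[train], rcond=None)
        sigma2 = np.mean((z[train] - X[train] @ beta) ** 2)
        resid = z[test] - X[test] @ beta
        # Held-out normal log-likelihood plus the log-Jacobian adjustment
        total += np.sum(-0.5 * np.log(2 * np.pi * sigma2)
                        - resid ** 2 / (2 * sigma2)
                        + log_jacobian(y[test], lam))
    return np.exp(total / n)
```

Evaluating `power_adjusted_lcv` over a set of candidate powers and keeping the maximizer mimics, in a simplified way, choosing an outcome power by maximizing the score; without the Jacobian term, scores for different powers would not be on a common scale.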
Archive | 2016
George J. Knafl; Kai Ding
This chapter formulates and demonstrates generalized additive models (GAMs) for means of continuous outcomes treated as independent and normally distributed with constant variances as in linear regression and for logits (log odds) of means of dichotomous discrete outcomes with unit dispersions as in logistic regression. GAMs provide an alternative to fractional polynomial models for modeling nonlinear relationships between univariate outcomes and predictors, and so GAMs for these two cases are also compared to adaptive fractional polynomial models. Poisson regression is not considered for brevity. Example analyses are provided of the univariate continuous outcome of death rate per 100,000 in terms of available predictors, as also addressed in Chaps. 2, 3, 6 and 7, as well as the univariate dichotomous outcome of a high mercury level in fish (over 1.0 ppm) versus a lower level in terms of available predictors, as also addressed in Chaps. 8 and 9.
Archive | 2016
George J. Knafl; Kai Ding
This chapter formulates and demonstrates adaptive regression modeling of means and variances for repeatedly measured continuous outcomes treated as multivariate normal. Analyses are presented of dental measurements of the distance in mm from the center of the pituitary to the pterygomaxillary fissure in terms of the age and gender of the child while accounting for dependence of dental measurements for the same child. These are example analyses of data with no missing outcome values. Analyses are also presented of strength in terms of time and type of weightlifting program while accounting for dependence of strength measurements for the same subject. These are example analyses of data with missing outcome values. Analyses of these data sets use marginal models based on order 1 autoregressive (AR1) correlations and exchangeable correlations (EC) and estimated with maximum likelihood (ML) or generalized estimating equations (GEE). They also use transition models, with the current outcome value a function of prior outcome values, and general conditional models, with the current outcome value a function of other, past as well as prior, outcome values. The issue of moderation is addressed, that is, how the effect of a predictor on an outcome can change with values of a moderator variable, for example, how the effect of age on the child's dental measurements can change with the gender of the child. Moderation analyses are commonly based on interactions, but can be more generally based on geometric combinations, that is, products of power transforms of primary predictors using possibly different powers.
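The two correlation structures named above can be written down directly. The Python sketch below builds AR1 and exchangeable correlation matrices for one subject's repeated measurements and evaluates a multivariate normal log-likelihood under a constant variance. The function names and the constant-variance simplification are assumptions for this example, not the chapter's formulation.

```python
import numpy as np

def ar1_corr(m, rho):
    """AR1 correlation matrix: entry (i, j) is rho**|i - j|."""
    idx = np.arange(m)
    return rho ** np.abs(idx[:, None] - idx[None, :])

def exchangeable_corr(m, rho):
    """Exchangeable correlation matrix: 1 on the diagonal, rho elsewhere."""
    return np.full((m, m), rho) + (1.0 - rho) * np.eye(m)

def mvn_loglik(y, mean, sigma2, corr):
    """Multivariate normal log-likelihood for one subject's measurements,
    assuming covariance sigma2 * corr."""
    y = np.asarray(y, dtype=float)
    mean = np.asarray(mean, dtype=float)
    cov = sigma2 * corr
    resid = y - mean
    _, logdet = np.linalg.slogdet(cov)
    quad = resid @ np.linalg.solve(cov, resid)
    return -0.5 * (len(y) * np.log(2 * np.pi) + logdet + quad)
```

Under AR1 the correlation decays with the distance between measurement occasions, whereas under EC every pair of occasions is equally correlated; summing `mvn_loglik` over subjects gives the kind of likelihood on which an ML fit or a likelihood-based score could be built.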
Archive | 2016
George J. Knafl; Kai Ding
This chapter formulates and demonstrates adaptive fractional polynomial modeling of means and dispersions for repeatedly measured dichotomous and polytomous outcomes with two or more values. Marginal modeling extends from the multivariate normal outcome context to the multivariate dichotomous and polytomous outcome context. However, due to the complexity in general of computing likelihoods and quasi-likelihoods (as needed to account for non-unit dispersions) for general multivariate marginal modeling, generalized estimating equations (GEE) techniques are often used instead, thereby avoiding computation of likelihoods and quasi-likelihoods. This complicates the extension of adaptive modeling to the GEE context since it is based on cross-validation (CV) scores computed from likelihoods or likelihood-like functions, but a readily computed extended likelihood is formulated for use in adaptive GEE-based modeling of multivariate dichotomous and polytomous outcomes. Conditional modeling also extends to the multivariate dichotomous and polytomous outcome context, both transition modeling and general conditional modeling. In contrast to marginal GEE-based modeling, conditional modeling of means for multivariate dichotomous and polytomous outcomes with unit dispersions is based on pseudolikelihoods that can be used to compute pseudolikelihood CV (PLCV) scores on which to base adaptive transition and general conditional modeling of multivariate dichotomous and polytomous outcomes. These marginal and conditional models can be extended to model dispersions as well as means. Example analyses of these kinds are presented of post-baseline respiratory status over time for patients with respiratory disorder in terms of the baseline respiratory status, time, and being on an active as opposed to a placebo treatment.
Archive | 2016
George J. Knafl; Kai Ding
This chapter formulates and demonstrates adaptive fractional polynomial modeling of means and dispersions for repeatedly measured count outcomes, possibly converted to rates using offsets. Marginal modeling extends from the multivariate normal outcome context to the multivariate count/rate outcome context. However, due to the complexity in general of computing likelihoods and quasi-likelihoods (as needed to account for non-unit dispersions) for general multivariate marginal modeling, generalized estimating equations (GEE) techniques are often used instead, thereby avoiding computation of likelihoods and quasi-likelihoods. This complicates the extension of adaptive modeling to the GEE context since it is based on cross-validation (CV) scores computed from likelihoods or likelihood-like functions, but a readily computed extended likelihood is formulated for use in adaptive GEE-based modeling of multivariate count/rate outcomes. Conditional modeling also extends to the multivariate count/rate outcome context, both transition modeling and general conditional modeling. In contrast to marginal GEE-based modeling, conditional modeling of means for multivariate count/rate outcomes with unit dispersions is based on pseudolikelihoods that can be used to compute pseudolikelihood CV (PLCV) scores on which to base adaptive transition and general conditional modeling of multivariate count/rate outcomes. These marginal and conditional models can be extended to model dispersions as well as means. Example analyses of these kinds are presented of the post-baseline seizure rates per day over time for patients with epilepsy in terms of the baseline seizure rate, clinic visit, and treatment group (prescribed the drug progabide versus a placebo).
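To make the role of an offset concrete, the following Python sketch fits a Poisson log-linear model for counts with log(exposure) as an offset, so that the coefficients describe rates per unit of exposure (for example, seizures per day). It uses plain Newton-Raphson iterations and treats observations as independent, so it ignores the within-patient dependence that the chapter's marginal and conditional models address. The function name and arguments are hypothetical.

```python
import numpy as np

def fit_poisson_rate(X, counts, exposure, n_iter=50, tol=1e-10):
    """Fit log(E[count]) = X @ beta + log(exposure) by Newton-Raphson.

    Assumes independent observations and unit dispersion.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(counts, dtype=float)
    offset = np.log(np.asarray(exposure, dtype=float))
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        mu = np.exp(X @ beta + offset)     # expected counts
        grad = X.T @ (y - mu)              # score vector
        hess = X.T @ (X * mu[:, None])     # Fisher information
        step = np.linalg.solve(hess, grad)
        beta += step
        if np.max(np.abs(step)) < tol:
            break
    return beta
```

With an intercept-only design matrix, `np.exp(beta[0])` is the total count divided by the total exposure, which is the pooled rate; adding columns for baseline rate, visit, or treatment group would model how that rate varies.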
Archive | 2016
George J. Knafl; Kai Ding
This chapter describes how to use the ypower macro for adaptive regression modeling accounting for fractional polynomial transformation of positive valued univariate and multivariate continuous outcomes as well as their predictors as also covered in Chap. 6. Example code and output are provided for analyzing the univariate outcome plasma beta-carotene levels for 314 subjects in terms of their fiber intake and vitamin usage and the multivariate outcome dental measurements for 27 children in terms of their age and gender. Practice exercises are also provided for conducting analyses similar to those presented in Chaps. 6 and 7.