Publication


Featured research published by Kent A. Lorenz.


Archive | 2015

Short History of the Logistic Regression Model

Jeffrey R. Wilson; Kent A. Lorenz

The logistic regression model, as compared to the probit, Tobit, and complementary log–log models, is worth revisiting based upon the work of Cramer (http://ssrn.com/abstract=360300 or http://dx.doi.org/10.2139/ssrn.360300; Logit models from economics and other fields, Cambridge University Press, Cambridge, England, 2003, pp. 149–158). The ability to model the odds has made the logistic regression model a popular method of statistical analysis. The logistic regression model can be used for prospective, retrospective, or cross-sectional data, while the probit, Tobit, and complementary log–log models can only be used with prospective data because they model the probability of the event. This chapter provides a summary of that work.
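The point about modeling the odds can be illustrated with a small sketch. It is not taken from the chapter: the counts and the function names `odds_ratio` and `risk_ratio` are hypothetical. It shows that the odds ratio from a 2x2 table is the same whether the table is read prospectively or retrospectively, and that it survives outcome-based sampling, while a ratio of probabilities does not.

```python
def odds_ratio(a, b, c, d):
    """Odds ratio for a 2x2 table [[a, b], [c, d]]."""
    return (a * d) / (b * c)


def risk_ratio(a, b, c, d):
    """Ratio of event probabilities between the two rows of the table."""
    return (a / (a + b)) / (c / (c + d))


# Hypothetical population counts (illustrative only).
# Rows: exposed, unexposed. Columns: event, no event.
exposed_event, exposed_no_event = 30, 70
unexposed_event, unexposed_no_event = 10, 90

# Prospective reading: odds of the event given exposure.
prospective_or = odds_ratio(exposed_event, exposed_no_event,
                            unexposed_event, unexposed_no_event)

# Retrospective reading: odds of exposure given the event (transposed table).
retrospective_or = odds_ratio(exposed_event, unexposed_event,
                              exposed_no_event, unexposed_no_event)

# Outcome-based sampling: keep every event but only 1 in 10 non-events.
sampled_or = odds_ratio(exposed_event, exposed_no_event / 10,
                        unexposed_event, unexposed_no_event / 10)

# The risk ratio is distorted by the same sampling scheme.
full_rr = risk_ratio(exposed_event, exposed_no_event,
                     unexposed_event, unexposed_no_event)
sampled_rr = risk_ratio(exposed_event, exposed_no_event / 10,
                        unexposed_event, unexposed_no_event / 10)

print(prospective_or, retrospective_or, sampled_or)
print(full_rr, sampled_rr)
```

The three odds ratios agree, which is the property that lets a logit model be fitted to retrospective data; the two risk ratios differ.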


Archive | 2015

Exact Logistic Regression Model

Jeffrey R. Wilson; Kent A. Lorenz

As computers have become more capable of tedious calculations, the use of exact logistic regression models has become increasingly popular in healthcare, banking, and other industries. Traditional methods, which are based on asymptotic theory, are usually not reliable when used to analyze small, skewed, or sparse datasets. When sample sizes are small or the data are sparse or skewed, exact conditional inference is necessary and applicable (Derr, 2000). We enumerate the exact distributions of certain statistics to obtain estimates for the parameters of interest in a logistic regression model, conditioned on the remaining parameters. This is a method of testing and estimation that uses conditional methods to obtain exact tests of parameters in binary and nominal logistic models. Exact methods are appropriate for small-sample or sparse data situations that often result in the failure (nonconvergence or separation) of the usual unconditional maximum likelihood estimation method. However, exact methods can take a great deal of time and memory as sample or model sizes increase. For sample sizes too large for the default exact method, a Monte Carlo method is provided. The chapter uses the EXACT statement in PROC LOGISTIC or PROC GENMOD, and we also fit models in SAS, C+, and R. Our methods are based on: Troxler, S., Lalonde, T. L., & Wilson, J. R. (2011). Exact logistic models for nested binary data. Statistics in Medicine, 30(8).
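As a minimal sketch of the conditioning idea, and not the chapter's SAS procedure, consider the simplest case of one binary covariate. Conditioning on the total number of events removes the intercept, and under the null hypothesis of a zero slope the remaining cell count follows a hypergeometric distribution. The function name `exact_conditional_pvalue` and the data are hypothetical.

```python
from math import comb


def exact_conditional_pvalue(a, b, c, d):
    """Two-sided exact p-value for a 2x2 table [[a, b], [c, d]].

    Rows are covariate levels, columns are (event, no event). With both
    margins fixed, the count `a` is hypergeometric under the null.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    total = row1 + row2
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)

    def prob(k):
        return comb(row1, k) * comb(row2, col1 - k) / comb(total, col1)

    observed = prob(a)
    # Sum over all tables that are no more likely than the observed one.
    # The small tolerance guards against floating-point ties.
    return sum(prob(k) for k in range(lowest, highest + 1)
               if prob(k) <= observed * (1 + 1e-9))


# Hypothetical sparse data: all 3 exposed subjects had the event,
# and 1 of 4 unexposed subjects did. The zero cell makes the usual
# maximum likelihood slope estimate infinite.
p_value = exact_conditional_pvalue(3, 0, 1, 3)
print(p_value)
```

With more covariates the conditional distribution is no longer a simple hypergeometric and has to be enumerated, which is where the time and memory cost mentioned in the abstract comes from.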


Archive | 2015

Overdispersed Logistic Regression Model

Jeffrey R. Wilson; Kent A. Lorenz

When binary data are obtained through simple random sampling, the covariance of the responses follows the binomial model (two possible outcomes from independent observations with constant probability). However, when the data are obtained under other circumstances, the covariances of the responses can differ substantially from the binomial case. For example, clustering effects or subject effects in repeated-measures experiments can cause the variance of the observed proportions to be much larger than the variance expected under the binomial assumption. This phenomenon is generally referred to as overdispersion or extra variation. The presence of overdispersion can affect the standard errors and therefore also the conclusions made about the significance of the predictors. This chapter presents a method of analysis based on work presented in:
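One common way to check for the extra variation described in this abstract is the Pearson chi-square statistic divided by its degrees of freedom. The sketch below is illustrative only and is not the method of the chapter: it assumes an intercept-only binomial model, hypothetical cluster counts, and a quasi-likelihood style correction that scales the binomial standard error by the square root of the dispersion estimate.

```python
def pearson_dispersion(successes, trials):
    """Pearson chi-square over degrees of freedom for grouped binary data.

    A single common probability is fitted, so the degrees of freedom are
    the number of clusters minus one. Values well above 1 suggest more
    variation than the binomial model allows.
    """
    p_hat = sum(successes) / sum(trials)
    chi_square = sum(
        (y - n * p_hat) ** 2 / (n * p_hat * (1 - p_hat))
        for y, n in zip(successes, trials)
    )
    return chi_square / (len(trials) - 1)


# Hypothetical clustered data: events among 10 subjects in each of 6 clusters.
events = [1, 9, 2, 8, 0, 10]
sizes = [10] * 6

phi = pearson_dispersion(events, sizes)

# Standard error of the pooled proportion under the binomial assumption,
# and the same quantity inflated by sqrt(phi).
p_hat = sum(events) / sum(sizes)
binomial_se = (p_hat * (1 - p_hat) / sum(sizes)) ** 0.5
adjusted_se = binomial_se * phi ** 0.5

print(phi, binomial_se, adjusted_se)
```

In this made-up example the clusters are far more variable than independent binomial draws would be, so the adjusted standard error is much wider than the naive one.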


Archive | 2015

Generalized Method of Moments Logistic Regression Model

Jeffrey R. Wilson; Kent A. Lorenz

When analyzing longitudinal binary data, it is essential to account for both the correlation inherent in the repeated measures of the responses and the correlation arising from the feedback between the responses at a particular time and the covariates at other times. Ignoring either of these correlations can lead to invalid conclusions, as happens when the covariates are time dependent and the standard logistic regression model is used. There are thus two types of correlation, and we need a model that addresses both. The first is the correlation among the responses. The second is the correlation between response and covariate, which arises when responses at time \( t \) impact the covariates at time \( t + s \), and when the covariates at time \( t \) impact the responses at time \( t + s \). These correlations regarding feedback from \( Y_t \) on to the future \( X_{t+s} \) and vice versa are important in obtaining the estimates of the regression coefficients. This chapter provides a means of modeling repeated responses with time-dependent and time-independent covariates. The coefficients are obtained using the generalized method of moments. We fit these data with a SAS macro (How to use SAS® for GMM logistic regression models for longitudinal data with time-dependent covariates, SUGI Paper 3252-2015). Our methods are based on:


Archive | 2015

Hierarchical Logistic Regression Models

Jeffrey R. Wilson; Kent A. Lorenz

This chapter extends the results in Chap. 9. It is common to encounter data that have a hierarchical or clustered structure. Examples include patients within a hospital, students within a class, factories within an industry, or families within a neighborhood. In such cases, there is variability between the clusters, as well as variability between the units nested within the clusters. Hierarchical models take into account the variability at each level of the hierarchy, and thus allow the cluster effects at different levels to be analyzed within the models (The Annals of Thoracic Surgery 72(6):2155–2168, 2001). This chapter shows how one can use the information from different levels to produce a subject-specific model. The design considered is a three-level nested design; it can be expanded to higher levels, though readily available computing may be a challenge.
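A three-level nested design of the kind described here is often written as a random-intercept logistic model. The form below is a generic sketch rather than the chapter's own notation: the indices (top-level cluster \( i \), nested cluster \( j \), unit \( k \)) and the symbols \( u_i \), \( v_{ij} \), \( \sigma_u^2 \), \( \sigma_v^2 \) are assumptions introduced for illustration.

```latex
\[
\operatorname{logit} P\left(Y_{ijk} = 1 \mid u_i, v_{ij}\right)
  = \mathbf{x}_{ijk}^{\top} \boldsymbol{\beta} + u_i + v_{ij},
\qquad
u_i \sim N\left(0, \sigma_u^2\right),
\quad
v_{ij} \sim N\left(0, \sigma_v^2\right).
\]
```

Because the coefficients are interpreted conditionally on \( u_i \) and \( v_{ij} \), the model is subject-specific, and each added level contributes one more random term and one more variance component to estimate.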


Archive | 2015

Standard Binary Logistic Regression Model

Jeffrey R. Wilson; Kent A. Lorenz

The logistic regression model is a type of predictive model that can be used when the response variable is binary, for example: live/die; disease/no disease; purchase/no purchase; win/lose. In short, we want to model the probability of getting a certain outcome, in effect modeling the mean of the variable (which is the same as the probability in the case of binary variables). A logistic regression model can be applied to response variables with more than two categories; however, those cases, though mentioned in this text, are less common. This chapter also addresses the fact that the logistic regression model is more effective and accurate than simple linear regression when analyzing binary data. We present three significant problems that one may encounter if the linear regression model were fitted to binary data: 1. There are no limits on the values predicted by a linear regression, so the predicted response (mean) might be less than 0 or greater than 1, which is clearly outside the realm of possible values for a response probability. 2. The variance for each subpopulation is different and therefore not constant. Since the variance of a binary response is a function of the mean, if the mean changes from subpopulation to subpopulation, the variance will also change. 3. Usually, the response is not a linear function of the input variable on the data scale. We have come to rely heavily on linear relationships, but they are not appropriate in these cases.
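The first two problems listed in this abstract can be seen in a small sketch. It is illustrative only: the data are hypothetical, and `fit_linear` and `fit_logistic` are hand-written helpers (closed-form least squares and a plain Newton-Raphson fit), not the SAS, SPSS, or R routines used in the book.

```python
from math import exp

# Hypothetical data: one predictor and a binary response.
x = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
y = [0, 0, 0, 1, 0, 0, 1, 1, 1, 1]


def fit_linear(x, y):
    """Ordinary least squares with one predictor; returns (intercept, slope)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def fit_logistic(x, y, iterations=25):
    """Newton-Raphson maximum likelihood for logit(p) = b0 + b1 * x."""
    b0 = b1 = 0.0
    for _ in range(iterations):
        p = [1 / (1 + exp(-(b0 + b1 * xi))) for xi in x]
        w = [pi * (1 - pi) for pi in p]
        # Score vector (gradient of the log-likelihood).
        g0 = sum(yi - pi for yi, pi in zip(y, p))
        g1 = sum((yi - pi) * xi for xi, yi, pi in zip(x, y, p))
        # Information matrix entries and its determinant.
        h00 = sum(w)
        h01 = sum(wi * xi for wi, xi in zip(w, x))
        h11 = sum(wi * xi * xi for wi, xi in zip(w, x))
        det = h00 * h11 - h01 ** 2
        # Newton step: solve the 2x2 system explicitly.
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1


a0, a1 = fit_linear(x, y)
b0, b1 = fit_logistic(x, y)

linear_pred = [a0 + a1 * xi for xi in x]
logistic_pred = [1 / (1 + exp(-(b0 + b1 * xi))) for xi in x]

# Problem 1: linear predictions fall outside [0, 1] at the extremes.
out_of_range = [p for p in linear_pred if p < 0 or p > 1]

# Problem 2: the binary variance p * (1 - p) changes with the mean.
variances = [p * (1 - p) for p in logistic_pred]

print(out_of_range)
print(min(variances), max(variances))
```

The logistic predictions stay strictly between 0 and 1 by construction, and the spread of `variances` shows why a constant-variance assumption is untenable for a binary response.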


Archive | 2015

Two-Level Nested Logistic Regression Model

Jeffrey R. Wilson; Kent A. Lorenz

Studies including repeated measures are expected to give rise to correlated data. Such data are common in many disciplines including healthcare, banking, poll tracking, and education. Subjects or units are followed over time and are repeatedly observed under different experimental conditions, or are observed in clusters. Often, such data are available in hierarchical structures consisting of a subset of a population of units at several levels. We review methods that include the clustering directly in the model (systematic component) as opposed to methods within the random component. These methods belong to the class of generalized linear mixed models. The basic idea behind generalized linear mixed models is conceptually straightforward (NSF-CBMS Regional Conference Series in Probability and Statistics. Institute of Mathematical Statistics and the American Statistical Association, Bethesda, MD, pp. 1–84, 2003): random effects are incorporated into the systematic component of a generalized linear model to account for the correlation. Such approaches are most useful when researchers wish to account for both fixed and random effects in the model. Including random effects in a logistic model makes it a subject-specific model. This is a conditional model that can also be used to model longitudinal or repeated measures data. We fit this model in SAS, SPSS, and R. Our method of modeling is based on:


Archive | 2015

Introduction to Binary Logistic Regression

Jeffrey R. Wilson; Kent A. Lorenz

Statistical inference with binary data presents many challenges, whether the observations are dependent or independent. Studies involving dependent observations tend to be longitudinal or clustered in nature, and therefore provide inefficient estimates if the correlation in the data is ignored. This chapter reviews binary data under the assumption that the observations are independent. It provides an overview of the issues to be addressed in the book, as well as the different types of binary correlated data. It introduces SAS, SPSS, and R as the statistical programs used to analyze the data throughout the book and concludes with general recommendations.


Archive | 2015

Generalized Estimating Equations Logistic Regression

Jeffrey R. Wilson; Kent A. Lorenz

Many fields of study use longitudinal datasets, which usually consist of repeated measurements of a response variable, often accompanied by a set of covariates for each of the subjects/units. However, longitudinal datasets are problematic because they inherently show correlation due to a subject’s repeated set of measurements. For example, one might expect a correlation to exist when looking at a patient’s health status over time or a student’s performance over time. In those cases, when the responses are correlated, we cannot readily obtain the underlying joint distribution; hence, there is no closed-form joint likelihood function to present, as there is with the standard logistic regression model. One remedy is to fit a generalized estimating equations (GEE) logistic regression model for the data, which is explored in this chapter. This chapter addresses repeated measures of the sampling unit, showing how the GEE method allows missing values within a subject without losing all the data from the subject, and allows time-varying predictors to appear in the model. The method requires a large number of subjects and provides estimates of the marginal model parameters. We fit this model in SAS, SPSS, and R, basing our work on the variance–mean relationship methods of Zeger and Liang (Biometrics 42:121–130, 1986) and Liang and Zeger (Biometrika 73:13–22, 1986).
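For reference, the estimating equations of Liang and Zeger can be written in their standard form as follows. The notation is the conventional one and is an assumption here, not copied from the chapter: \( N \) subjects, response vector \( \mathbf{y}_i \) with marginal mean \( \boldsymbol{\mu}_i(\boldsymbol{\beta}) \), and a working correlation matrix \( \mathbf{R}(\alpha) \).

```latex
\[
\sum_{i=1}^{N} \mathbf{D}_i^{\top} \mathbf{V}_i^{-1}
  \left( \mathbf{y}_i - \boldsymbol{\mu}_i(\boldsymbol{\beta}) \right) = \mathbf{0},
\qquad
\mathbf{D}_i = \frac{\partial \boldsymbol{\mu}_i}{\partial \boldsymbol{\beta}^{\top}},
\qquad
\mathbf{V}_i = \mathbf{A}_i^{1/2} \, \mathbf{R}(\alpha) \, \mathbf{A}_i^{1/2},
\]
\[
\mathbf{A}_i = \operatorname{diag}\left\{ \mu_{ij} \left( 1 - \mu_{ij} \right) \right\},
\qquad
\operatorname{logit}(\mu_{ij}) = \mathbf{x}_{ij}^{\top} \boldsymbol{\beta}.
\]
```

Only the mean and the variance–mean relationship in \( \mathbf{A}_i \) are specified, so no joint likelihood is needed, and a subject with missing visits simply contributes shorter vectors to the sum.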


Archive | 2015

Fixed Effects Logistic Regression Model

Jeffrey R. Wilson; Kent A. Lorenz

If a researcher wants to know whether watching violent television has an impact on juvenile delinquency, that researcher could compare a student’s delinquency rate when he/she is watching violent television with his/her delinquency rate when not watching. The difference in delinquency rates between the two periods is an estimate of the violent television effect for that student. Similarly, a researcher might want to know how a child’s performance in school differs depending on how much time he/she spends playing video games. The researcher could compare how the child does when spending significant time playing video games versus when he/she does not. Fixed effects logistic regression models are presented for both of these scenarios. These models treat each measurement on each subject as a separate observation, and the set of subject coefficients that would appear in an unconditional model are eliminated by conditional methods. This is a conditional, subject-specific model (as opposed to a population-averaged model like the GEE model). We fit this model in SAS, SPSS, and R. An excellent discussion with examples can be found in Allison (Fixed effects regression methods for longitudinal data using SAS, SAS Institute, Cary, NC, 2005).
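The way conditioning eliminates the subject coefficients can be sketched for the simplest case of two periods per subject, one exposed and one unexposed. This is an illustration under stated assumptions, not the chapter's code: the data are hypothetical, and `conditional_log_odds_ratio` is a made-up helper. Conditioning on each subject's total number of events removes that subject's intercept, so only subjects whose outcome differs between periods carry information, and the conditional maximum likelihood estimate of the odds ratio is the ratio of the two kinds of discordant subjects.

```python
from math import exp, log

# Hypothetical two-period data, one tuple per subject:
# (outcome in the exposed period, outcome in the unexposed period).
subjects = [(1, 0)] * 12 + [(0, 1)] * 4 + [(1, 1)] * 5 + [(0, 0)] * 9


def conditional_log_odds_ratio(pairs):
    """Conditional ML estimate of the exposure effect on the logit scale.

    Subjects with the same outcome in both periods drop out of the
    conditional likelihood; the estimate uses discordant subjects only.
    """
    only_exposed = sum(1 for e, u in pairs if e == 1 and u == 0)
    only_unexposed = sum(1 for e, u in pairs if e == 0 and u == 1)
    return log(only_exposed / only_unexposed)


beta_hat = conditional_log_odds_ratio(subjects)

# Adding subjects whose outcome never changes leaves the estimate untouched.
beta_hat_more = conditional_log_odds_ratio(subjects + [(1, 1)] * 50)

print(beta_hat, exp(beta_hat))
```

The price of removing the subject intercepts is that subjects with no within-subject variation in the outcome contribute nothing, which is why such designs need enough discordant subjects.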
