Alan Agresti
University of Florida
Publication
Featured research published by Alan Agresti.
The American Statistician | 1998
Alan Agresti; Brent A. Coull
Abstract For interval estimation of a proportion, coverage probabilities tend to be too large for “exact” confidence intervals based on inverting the binomial test and too small for the interval based on inverting the Wald large-sample normal test (i.e., sample proportion ± z-score × estimated standard error). Wilson's suggestion of inverting the related score test with null rather than estimated standard error yields coverage probabilities close to nominal confidence levels, even for very small sample sizes. The 95% score interval behaves similarly to the adjusted Wald interval obtained after adding two “successes” and two “failures” to the sample. In elementary courses, with the score and adjusted Wald methods it is unnecessary to provide students with awkward sample size guidelines.
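The three intervals compared in this abstract can be sketched in a few lines of Python. This is a minimal illustration, not code from the paper: the function names are hypothetical, z = 1.96 is assumed for a nominal 95% level, and the endpoints are not truncated to [0, 1].

```python
import math

def wald_interval(x, n, z=1.96):
    """Wald interval: sample proportion +/- z * estimated standard error."""
    p = x / n
    half = z * math.sqrt(p * (1 - p) / n)
    return p - half, p + half

def wilson_interval(x, n, z=1.96):
    """Score (Wilson) interval: inverts the test that uses the null standard error."""
    p = x / n
    denom = 1 + z**2 / n
    center = (p + z**2 / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return center - half, center + half

def adjusted_wald_interval(x, n, z=1.96):
    """Adjusted Wald interval: add two successes and two failures, then apply Wald."""
    return wald_interval(x + 2, n + 4, z)

# Example: 0 successes in 10 trials (illustrative numbers, not from the paper).
for name, fn in [("Wald", wald_interval),
                 ("score", wilson_interval),
                 ("adjusted Wald", adjusted_wald_interval)]:
    lo, hi = fn(0, 10)
    print(f"{name}: ({lo:.3f}, {hi:.3f})")
```

With zero observed successes the Wald interval collapses to a single point, while the score and adjusted Wald intervals still have positive width, which is one way to see why the Wald interval tends to undercover.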
The American Statistician | 2000
Alan Agresti; Brian S Caffo
Abstract The standard confidence intervals for proportions and their differences used in introductory statistics courses have poor performance, the actual coverage probability often being much lower than intended. However, simple adjustments of these intervals based on adding four pseudo observations, half of each type, perform surprisingly well even for small samples. To illustrate, for a broad variety of parameter settings with 10 observations in each sample, a nominal 95% interval for the difference of proportions has actual coverage probability below .93 in 88% of the cases with the standard interval but in only 1% with the adjusted interval; the mean distance between the nominal and actual coverage probabilities is .06 for the standard interval, but .01 for the adjusted one. In teaching with these adjusted intervals, one can bypass awkward sample size guidelines and use the same formulas with small and large samples.
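A minimal Python sketch of the adjusted interval for a difference of two proportions follows. It assumes the four pseudo observations are allocated as one success and one failure in each sample; the function name and the `pseudo` parameter are hypothetical, and z = 1.96 is assumed for a nominal 95% level.

```python
import math

def diff_interval(x1, n1, x2, n2, z=1.96, pseudo=0):
    """Wald-type interval for p1 - p2.

    pseudo=0 gives the standard interval; pseudo=1 adds one success and one
    failure to each sample (four pseudo observations in total).
    """
    m1, m2 = n1 + 2 * pseudo, n2 + 2 * pseudo
    p1 = (x1 + pseudo) / m1
    p2 = (x2 + pseudo) / m2
    se = math.sqrt(p1 * (1 - p1) / m1 + p2 * (1 - p2) / m2)
    d = p1 - p2
    return d - z * se, d + z * se

# Example with 10 observations per sample (illustrative counts only).
print(diff_interval(10, 10, 10, 10))            # standard interval
print(diff_interval(10, 10, 10, 10, pseudo=1))  # adjusted interval
```

When both samples consist entirely of successes, the standard interval degenerates to the single point 0, whereas the adjusted interval keeps a positive width; the same formula is used regardless of sample size.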
Statistics in Medicine | 2000
Beiyao Zheng; Alan Agresti
This paper studies summary measures of the predictive power of a generalized linear model, paying special attention to a generalization of the multiple correlation coefficient from ordinary linear regression. The population value is the correlation between the response and its conditional expectation given the predictors, and the sample value is the correlation between the observed response and the model predicted value. We compare four estimators of the measure in terms of bias, mean squared error and behaviour in the presence of overparameterization. The sample estimator and a jack-knife estimator usually behave adequately, but a cross-validation estimator has a large negative bias with large mean squared error. One can use bootstrap methods to construct confidence intervals for the population value of the correlation measure and to estimate the degree to which a model selection procedure may provide an overly optimistic measure of the actual predictive power.
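The sample value of the measure, the correlation between the observed response and the model-predicted value, can be sketched as follows. This is only an illustration under simplifying assumptions: it uses simulated data and the simplest generalized linear model (normal response, identity link, one predictor), for which the squared measure coincides with the usual R-squared. All names are hypothetical, and the jackknife, cross-validation, and bootstrap estimators discussed in the paper are not shown.

```python
import math
import random

def pearson(u, v):
    """Pearson correlation between two equal-length sequences."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    suv = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    suu = sum((a - mu) ** 2 for a in u)
    svv = sum((b - mv) ** 2 for b in v)
    return suv / math.sqrt(suu * svv)

# Simulated data (assumption for illustration): y = 1 + 2x + noise.
rng = random.Random(0)
x = [rng.gauss(0, 1) for _ in range(200)]
y = [1.0 + 2.0 * xi + rng.gauss(0, 1) for xi in x]

# Least-squares fit of the identity-link model with an intercept.
x_bar, y_bar = sum(x) / len(x), sum(y) / len(y)
slope = (sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
         / sum((xi - x_bar) ** 2 for xi in x))
intercept = y_bar - slope * x_bar
fitted = [intercept + slope * xi for xi in x]

# Sample estimator: correlation between observed and fitted values.
r = pearson(y, fitted)

# For this special case, r squared equals the ordinary R-squared.
sse = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
sst = sum((yi - y_bar) ** 2 for yi in y)
r_squared = 1 - sse / sst
print(round(r, 4), round(r_squared, 4))
```

For other generalized linear models the fitted values would come from the corresponding model fit, and only the `pearson(y, fitted)` step carries over unchanged.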
Sociological Methodology | 1978
Alan Agresti; Barbara F. Agresti
Many variables of interest in the social sciences are measurable only at the nominal level. That is, they represent types of phenomena such as race, ethnicity, religious affiliation, or political party preference. It is sometimes of interest to measure the amount of variation, or heterogeneity, within a population with respect to one or more of these variables. By a measure of variation for a qualitative variable, we mean a description of the dispersion of the population over a number of nominal categories. We shall refer to such measures as indices of qualitative variation or diversity. The ecological sciences have made considerable use of
Statistical Modelling | 2005
Yongyi Min; Alan Agresti
For count responses, the situation of excess zeros (relative to what standard models allow) often occurs in biomedical and sociological applications. Modeling repeated measures of zero-inflated count data presents special challenges. This is because in addition to the problem of extra zeros, the correlation between measurements upon the same subject at different occasions needs to be taken into account. This article discusses random effect models for repeated measurements on this type of response variable. A useful model is the hurdle model with random effects, which separately handles the zero observations and the positive counts. In maximum likelihood model fitting, we consider both a normal distribution and a nonparametric approach for the random effects. A special case of the hurdle model can be used to test for zero inflation. Random effects can also be introduced in a zero-inflated Poisson or negative binomial model, but such a model may encounter fitting problems if there is zero deflation at any settings of the explanatory variables. A simple alternative approach adapts the cumulative logit model with random effects, which has a single set of parameters for describing effects. We illustrate the proposed methods with examples.
Journal of the American Statistical Association | 1994
Joseph B. Lang; Alan Agresti
Abstract We discuss model-fitting methods for analyzing simultaneously the joint and marginal distributions of multivariate categorical responses. The models are members of a broad class of generalized logit and loglinear models. We fit them by improving a maximum likelihood algorithm that uses Lagrange's method of undetermined multipliers and a Newton-Raphson iterative scheme. We also discuss goodness-of-fit tests and adjusted residuals, and give asymptotic distributions of model parameter estimators. For this class of models, inferences are equivalent for Poisson and multinomial sampling assumptions. Simultaneous models for joint and marginal distributions may be useful in a variety of applications, including studies dealing with longitudinal data, multiple indicators in opinion research, cross-over designs, social mobility, and inter-rater agreement. The models are illustrated for one such application, using data from a recent General Social Survey regarding opinions about various types of government s...
Contemporary Sociology | 1995
Alan Agresti; Clifford C. Clogg; Edward S. Shihadeh
Preliminaries; The Linear-by-Linear Interaction Model; Association Models for Two-Way Tables; The ANOAS Approach; Other Models for Two-Way Tables; Symmetry-Type Models; Multiple Dimensions of Association; Bivariate Association in Multiple Groups; Logit-Type Regression Models for Ordinal Dependent Variables
Test | 2005
Ivy Liu; Alan Agresti
This article reviews methodologies used for analyzing ordered categorical (ordinal) response variables. We begin by surveying models for data with a single ordinal response variable. We also survey recently proposed strategies for modeling ordinal response variables when the data have some type of clustering or when repeated measurement occurs at various occasions for each subject, such as in longitudinal studies. Primary models in that case include marginal models and cluster-specific (conditional) models for which effects apply conditionally at the cluster level. Related discussion refers to multi-level and transitional models. The main emphasis is on maximum likelihood inference, although we indicate certain models (e.g., marginal models, multi-level models) for which this can be computationally difficult. The Bayesian approach has also received considerable attention for categorical data in the past decade, and we survey recent Bayesian approaches to modeling ordinal response variables. Alternative, non-model-based, approaches are also available for certain types of inference.
Psychometrika | 1979
Alan Agresti; Dennis Wackerly; James M. Boyett
A procedure is proposed for approximating attained significance levels of exact conditional tests. The procedure utilizes a sampling from the null distribution of tables having the same marginal frequencies as the observed table. Application of the approximation through a computer subroutine yields precise approximations for practically any table dimensions and sample size.
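The idea of approximating an exact conditional significance level by sampling tables with the observed margins can be sketched in Python. This is a hedged illustration rather than the paper's subroutine: it assumes the Pearson chi-square statistic as the test criterion, samples tables by randomly re-pairing row and column labels (which keeps both margins fixed), and requires every row and column total to be positive. The function names are hypothetical.

```python
import random

def chi_square(table):
    """Pearson chi-square statistic for a two-way table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    return stat

def monte_carlo_p_value(table, n_samples=10000, seed=0):
    """Approximate the exact conditional p-value by sampling tables
    that share the observed row and column totals."""
    rng = random.Random(seed)
    n_rows, n_cols = len(table), len(table[0])
    # One (row label, column label) pair per observation in the table.
    row_labels = [i for i, row in enumerate(table) for c in row for _ in range(c)]
    col_labels = [j for row in table for j, c in enumerate(row) for _ in range(c)]
    observed_stat = chi_square(table)
    hits = 0
    for _ in range(n_samples):
        rng.shuffle(col_labels)  # random re-pairing preserves both margins
        sampled = [[0] * n_cols for _ in range(n_rows)]
        for i, j in zip(row_labels, col_labels):
            sampled[i][j] += 1
        if chi_square(sampled) >= observed_stat - 1e-12:
            hits += 1
    return hits / n_samples

# Illustrative 2x2 table (not data from the paper).
print(monte_carlo_p_value([[3, 1], [1, 3]]))
```

The precision of the approximation is governed by `n_samples`: the standard error of the estimated p-value is at most about 0.5 divided by the square root of the number of sampled tables.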
Sociological Methodology | 2000
Alan Agresti; James G. Booth; James P. Hobert; Brian S Caffo
In many applications observations have some type of clustering, with observations within clusters tending to be correlated. A common instance of this occurs when each subject in the sample undergoes repeated measurement, in which case a cluster consists of the set of observations for the subject. One approach to modeling clustered data introduces cluster-level random effects into the model. The use of random effects in linear models for normal responses is well established. By contrast, random effects have only recently seen much use in models for categorical data. This chapter surveys a variety of potential social science applications of random effects modeling of categorical data. Applications discussed include repeated measurement for binary or ordinal responses, shrinkage to improve multiparameter estimation of a set of proportions or rates, multivariate latent variable modeling, hierarchically structured modeling, and cluster sampling. The models discussed belong to the class of generalized linear mixed models (GLMMs), an extension of ordinary linear models that permits nonnormal response variables and both fixed and random effects in the predictor term. The models are GLMMs for either binomial or Poisson response variables, although we also present extensions to multicategory (nominal or ordinal) responses. We also summarize some of the technical issues of model-fitting that complicate the fitting of GLMMs even with existing software.