Andrew Gelman
Columbia University
                                 Network
                            
                            Latest external collaboration on country level. Dive into details by clicking on the dots.
                                 Publication
                            
                            Featured researches published by Andrew Gelman.
Journal of Computational and Graphical Statistics | 1998
Stephen P. Brooks; Andrew Gelman
We generalize the method proposed by Gelman and Rubin (1992a) for monitoring the convergence of iterative simulations by comparing between and within variances of multiple chains, in order to obtain a family of tests for convergence. We review methods of inference from simulations in order to develop convergence-monitoring summaries that are relevant for the purposes for which the simulations are used. We recommend applying a battery of tests for mixing based on the comparison of inferences from individual sequences and from the mixture of sequences. Finally, we discuss multivariate analogues, for assessing convergence of several parameters simultaneously.
The Annals of Applied Statistics | 2008
Andrew Gelman; Aleks Jakulin; Maria Grazia Pittau; Yu-Sung Su
We propose a new prior distribution for classical (nonhierarchical) logistic regression models, constructed by first scaling all nonbinary variables to have mean 0 and standard deviation 0.5, and then placing independent Student-t prior distributions on the coefficients. As a default choice, we recommend the Cauchy distribution with center 0 and scale 2.5, which in the simplest setting is a longer-tailed version of the distribution attained by assuming one-half additional success and one-half additional failure in a logistic regression. Cross-validation on a corpus of datasets shows the Cauchy class of prior distributions to outperform existing implementations of Gaussian and Laplace priors. We recommend this prior distribution as a default choice for routine applied use. It has the advantage of always giving answers, even when there is complete separation in logistic regression (a common problem, even when the sample size is large and the number of predictors is small), and also automatically applying more shrinkage to higher-order interactions. This can be useful in routine data analysis as well as in automated procedures such as chained equations for missing-data imputation. We implement a procedure to fit generalized linear models in R with the Student-t prior distribution by incorporating an approximate EM algorithm into the usual iteratively weighted least squares. We illustrate with several applications, including a series of logistic regressions predicting voting preferences, a small bioassay experiment, and an imputation model for a public health data set.
The American Statistician | 2006
Andrew Gelman; Hal Stern
It is common to summarize statistical comparisons by declarations of statistical significance or nonsignificance. Here we discuss one problem with such declarations, namely that changes in statistical significance are often not themselves statistically significant. By this, we are not merely making the commonplace observation that any particular threshold is arbitrary—for example, only a small change is required to move an estimate from a 5.1% significance level to 4.9%, thus moving it into statistical significance. Rather, we are pointing out that even large changes in significance levels can correspond to small, nonsignificant changes in the underlying quantities. The error we describe is conceptually different from other oft-cited problems—that statistical significance is not the same as practical importance, that dichotomization into significant and nonsignificant results encourages the dismissal of observed differences in favor of the usually less interesting null hypothesis of no difference, and that any particular threshold for declaring significance is arbitrary. We are troubled by all of these concerns and do not intend to minimize their importance. Rather, our goal is to bring attention to this additional error of interpretation. We illustrate with a theoretical example and two applied examples. The ubiquity of this statistical error leads us to suggest that students and practitioners be made more aware that the difference between “significant” and “not significant” is not itself statistically significant.
American Journal of Political Science | 1990
Andrew Gelman; Gary King
In this paper we prove theoretically and demonstrate empirically that all existing measures of incumbency advantage in the congressional elections literature are biased or inconsistent. We then provide an unbiased estimator based on a very simple linear regression model. We apply this new method to congressional elections since 1900, providing the first evidence of a positive incumbency advantage in the first half of the century.
Journal of Research on Educational Effectiveness | 2012
Andrew Gelman; Jennifer Hill; Masanao Yajima
Abstract Applied researchers often find themselves making statistical inferences in settings that would seem to require multiple comparisons adjustments. We challenge the Type I error paradigm that underlies these corrections. Moreover we posit that the problem of multiple comparisons can disappear entirely when viewed from a hierarchical Bayesian perspective. We propose building multilevel models in the settings where multiple comparisons arise. Multilevel models perform partial pooling (shifting estimates toward each other), whereas classical procedures typically keep the centers of intervals stationary, adjusting for multiple comparisons by making the intervals wider (or, equivalently, adjusting the p values corresponding to intervals of fixed width). Thus, multilevel models address the multiple comparisons problem and also yield more efficient estimates, especially in settings with low group-level variation, which is where multiple comparisons are a particular concern.
The American Statistician | 1998
Robert E. Kass; Bradley P. Carlin; Andrew Gelman; Radford M. Neal
Abstract Markov chain Monte Carlo (MCMC) methods make possible the use of flexible Bayesian models that would otherwise be computationally infeasible. In recent years, a great variety of such applications have been described in the literature. Applied statisticians who are new to these methods may have several questions and concerns, however: How much effort and expertise are needed to design and use a Markov chain sampler? How much confidence can one have in the answers that MCMC produces? How does the use of MCMC affect the rest of the model-building process? At the Joint Statistical Meetings in August, 1996, a panel of experienced MCMC users discussed these and other issues, as well as various “tricks of the trade” This article is an edited recreation of that discussion. Its purpose is to offer advice and guidance to novice users of MCMC—and to not-so-novice users as well. Topics include building confidence in simulation results, methods for speeding and assessing convergence, estimating standard error...
Statistics and Computing | 2014
Andrew Gelman; Jessica Hwang; Aki Vehtari
We review the Akaike, deviance, and Watanabe-Akaike information criteria from a Bayesian perspective, where the goal is to estimate expected out-of-sample-prediction error using a bias-corrected adjustment of within-sample error. We focus on the choices involved in setting up these measures, and we compare them in three simple examples, one theoretical and two applied. The contribution of this paper is to put all these information criteria into a Bayesian predictive context and to better understand, through small examples, how these methods can apply in practice.
Annals of Statistics | 2005
Andrew Gelman
Analysis of variance (ANOVA) is an extremely important method in exploratory and confirmatory data analysis. Unfortunately, in complex problems (e.g., split-plot designs), it is not always easy to set up an appropriate ANOVA. We propose a hierarchical analysis that automatically gives the correct ANOVA comparisons even in complex scenarios. The inferences for all means and variances are performed under a model with a separate batch of effects for each row of the ANOVA table. We connect to classical ANOVA by working with finite-sample variance components: fixed and random effects models are characterized by inferences about existing levels of a factor and new levels, respectively. We also introduce a new graphical display showing inferences about the standard deviations of each batch of effects. We illustrate with two examples from our applied data analysis, first illustrating the usefulness of our hierarchical computations and displays, and second showing how the ideas of ANOVA are helpful in understanding a previously fit hierarchical model.Bayesian inference and fixed and random effects. Professor Gelman writes “Bayesians see analysis of variance as an inflexible classical method.” He adopts a hierarchical Bayesian framework to “identify ANOVA with the structuring of parameters into batches.” In this framework he sidesteps “the overloaded terms fixed and random” and defines effects “as constant if they are identical for all groups in a population and varying if they are allowed to differ from group to group.” Applying this approach to his first example (a Latin square with five treatments randomized to a 5×5 array of plots), variance components have to be estimated for row, column and treatment effects. In our opinion, his approach provides an insightful connection between analysis of variance and hierarchical modeling. It renders an informative and easy to interpret display of variance components that is a nice alternative for traditional analysis of variance. However, we wonder whether sidestepping the terms fixed and random is always wise. Furthermore, currently his approach is rather descriptive, and does not contain truly Bayesian inference. Both points will be briefly discussed in the sequel. To look into the question of fixed versus random and the use of hierarchical modeling, we carried out a small experiment. We constructed a dataset for the example in Section 2.2.2: 20 machines randomly divided into four treatment groups, with six outcome measures for each machine. We asked a statistician who is very skilled in multilevel analysis to analyze these data. The result was a hierarchical multivariate data structure with six outcomes nested within 20 machines, and the treatments coded as dummy variables at the machine level. Variance components were estimated for machines and measures. The treatment effects were tested by constraining all treatments to be equal and using a likelihood-ratio test. Comparing this procedure with the discussion of this example in Gelman’s paper shows that this is not what he had in mind. It certainly contradicts
Journal of the American Statistical Association | 2007
Andrew Gelman; Jeffrey Fagan; Alex Kiss
Recent studies by police departments and researchers confirm that police stop persons of racial and ethnic minority groups more often than whites relative to their proportions in the population. However, it has been argued that stop rates more accurately reflect rates of crimes committed by each ethnic group, or that stop rates reflect elevated rates in specific social areas, such as neighborhoods or precincts. Most of the research on stop rates and police–citizen interactions has focused on traffic stops, and analyses of pedestrian stops are rare. In this article we analyze data from 125,000 pedestrian stops by the New York Police Department over a 15-month period. We disaggregate stops by police precinct and compare stop rates by racial and ethnic group, controlling for previous race-specific arrest rates. We use hierarchical multilevel models to adjust for precinct-level variability, thus directly addressing the question of geographic heterogeneity that arises in the analysis of pedestrian stops. We find that persons of African and Hispanic descent were stopped more frequently than whites, even after controlling for precinct variability and race-specific estimates of crime participation.
Journal of the American Statistical Association | 1996
Andrew Gelman; Frédéric Y. Bois; Jiming Jiang
Abstract We describe a general approach using Bayesian analysis for the estimation of parameters in physiological pharmacokinetic models. The chief statistical difficulty in estimation with these models is that any physiological model that is even approximately realistic will have a large number of parameters, often comparable to the number of observations in a typical pharmacokinetic experiment (e.g., 28 measurements and 15 parameters for each subject). In addition, the parameters are generally poorly identified, akin to the well-known ill-conditioned problem of estimating a mixture of declining exponentials. Our modeling includes (a) hierarchical population modeling, which allows partial pooling of information among different experimental subjects; (b) a pharmacokinetic model including compartments for well-perfused tissues, poorly perfused tissues, fat, and the liver; and (c) informative prior distributions for population parameters, which is possible because the parameters represent real physiological...
