Network


Latest external collaborations at the country level.

Hotspot


Dive into the research topics where James P. Hobert is active.

Publication


Featured research published by James P. Hobert.


Journal of the Royal Statistical Society: Series B (Statistical Methodology) | 1999

Maximizing generalized linear mixed model likelihoods with an automated Monte Carlo EM algorithm

James G. Booth; James P. Hobert

Summary. Two new implementations of the EM algorithm are proposed for maximum likelihood fitting of generalized linear mixed models. Both methods use random (independent and identically distributed) sampling to construct Monte Carlo approximations at the E-step. One approach involves generating random samples from the exact conditional distribution of the random effects (given the data) by rejection sampling, using the marginal distribution as a candidate. The second method uses a multivariate t importance sampling approximation. In many applications the two methods are complementary. Rejection sampling is more efficient when sample sizes are small, whereas importance sampling is better with larger sample sizes. Monte Carlo approximation using random samples allows the Monte Carlo error at each iteration to be assessed by using standard central limit theory combined with Taylor series methods. Specifically, we construct a sandwich variance estimate for the maximizer at each approximate E-step. This suggests a rule for automatically increasing the Monte Carlo sample size after iterations in which the true EM step is swamped by Monte Carlo error. In contrast, techniques for assessing Monte Carlo error have not been developed for use with alternative implementations of Monte Carlo EM algorithms utilizing Markov chain Monte Carlo E-step approximations. Three different data sets, including the infamous salamander data of McCullagh and Nelder, are used to illustrate the techniques and to compare them with the alternatives. The results show that the methods proposed can be considerably more efficient than those based on Markov chain Monte Carlo algorithms. However, the methods proposed may break down when the intractable integrals in the likelihood function are of high dimension.
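To make the E-step approximation concrete, here is a minimal Python sketch of the importance-sampling variant for a single cluster: a t proposal centered near the conditional mode, self-normalized weights, a Monte Carlo standard error for the weighted mean, and a rule that doubles the Monte Carlo sample size until that error is small. The toy model, proposal scale, and stopping threshold are illustrative assumptions, not the paper's implementation.

```python
import numpy as np
from scipy import stats

# Toy setting: one cluster of a random-intercept logistic GLMM, where the
# conditional density of the random effect u given the data y is known only
# up to proportionality: p(u | y) ∝ p(y | u) p(u).
rng = np.random.default_rng(0)
y = np.array([1, 0, 1, 1, 0])        # binary responses in one cluster
eta0, sigma2 = 0.2, 1.0              # current fixed effect and variance

def log_joint(u):
    """log p(y | u) + log p(u), the unnormalized conditional of u."""
    eta = eta0 + u
    return np.sum(y * eta - np.log1p(np.exp(eta))) + \
        stats.norm.logpdf(u, 0.0, np.sqrt(sigma2))

def e_step(m, df=4, scale=1.0):
    """Approximate E[u | y] by self-normalized t importance sampling."""
    grid = np.linspace(-5, 5, 2001)              # crude mode search
    mode = grid[np.argmax([log_joint(g) for g in grid])]
    u = mode + scale * rng.standard_t(df, size=m)
    logw = np.array([log_joint(ui) for ui in u]) \
        - stats.t.logpdf(u, df, loc=mode, scale=scale)
    w = np.exp(logw - logw.max())
    w /= w.sum()
    est = np.sum(w * u)
    mcse = np.sqrt(np.sum(w**2 * (u - est) ** 2))  # MC error of the estimate
    return est, mcse

# Automation rule (illustrative): double m until the Monte Carlo error is
# small enough that it will not swamp the EM update.
m, (est, mcse) = 100, e_step(100)
while mcse > 0.01:
    m *= 2
    est, mcse = e_step(m)
print(f"E[u | y] ≈ {est:.3f}  (MCSE {mcse:.3f}, m = {m})")
```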


Journal of the American Statistical Association | 1996

The Effect of Improper Priors on Gibbs Sampling in Hierarchical Linear Mixed Models

James P. Hobert; George Casella

Abstract Often, either from a lack of prior information or simply for convenience, variance components are modeled with improper priors in hierarchical linear mixed models. Although the posterior distributions for these models are rarely available in closed form, the usual conjugate structure of the prior specification allows for painless calculation of the Gibbs conditionals. Thus the Gibbs sampler may be used to explore the posterior distribution without ever having established propriety of the posterior. An example is given showing that the output from a Gibbs chain corresponding to an improper posterior may appear perfectly reasonable. Thus one cannot expect the Gibbs output to provide a “red flag,” informing the user that the posterior is improper. The user must demonstrate propriety before a Markov chain Monte Carlo technique is used. A theorem is given that classifies improper priors according to the propriety of the resulting posteriors. Applications concerning Bayesian analysis of animal breeding...
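The phenomenon is easy to reproduce. Below is a hedged sketch, not the paper's code, of a Gibbs sampler for a one-way random-effects model under improper priors of the kind the theorem classifies. Every conditional is a proper normal or inverse-gamma distribution, so the chain runs without complaint even when the posterior itself is improper; the data, prior choices, and chain length are illustrative.

```python
import numpy as np

rng = np.random.default_rng(1)

# Simulated one-way random-effects data: y_ij = beta + u_i + e_ij.
k, n = 6, 5
y = 2.0 + rng.normal(0, 1.0, k)[:, None] + rng.normal(0, 1.0, (k, n))

# Improper priors: p(beta) ∝ 1, p(sig2_u) ∝ 1/sig2_u, p(sig2_e) ∝ 1/sig2_e.
# Every Gibbs conditional below is proper, so nothing in the output flags
# that the 1/sig2_u prior can render the posterior itself improper.
beta, sig2_u, sig2_e = 0.0, 1.0, 1.0
draws = []
for it in range(5000):
    # u_i | rest: normal, shrinking the cluster residual mean toward 0
    prec = n / sig2_e + 1.0 / sig2_u
    u = rng.normal(((y - beta).sum(axis=1) / sig2_e) / prec,
                   np.sqrt(1.0 / prec))
    # beta | rest: flat prior gives a normal around the residual grand mean
    beta = rng.normal((y - u[:, None]).mean(), np.sqrt(sig2_e / y.size))
    # variance components | rest: inverse-gamma, sampled as 1/Gamma
    sig2_u = 1.0 / rng.gamma(k / 2.0, 2.0 / np.sum(u**2))
    sig2_e = 1.0 / rng.gamma(y.size / 2.0,
                             2.0 / np.sum((y - beta - u[:, None]) ** 2))
    draws.append((beta, sig2_u, sig2_e))

# The trace looks stable and these "posterior means" look plausible --
# exactly the missing red flag the paper warns about.
print(np.array(draws[1000:]).mean(axis=0))
```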


Sociological Methodology | 2000

Random-effects modeling of categorical response data

Alan Agresti; James G. Booth; James P. Hobert; Brian S. Caffo

In many applications observations have some type of clustering, with observations within clusters tending to be correlated. A common instance of this occurs when each subject in the sample undergoes repeated measurement, in which case a cluster consists of the set of observations for the subject. One approach to modeling clustered data introduces cluster-level random effects into the model. The use of random effects in linear models for normal responses is well established. By contrast, random effects have only recently seen much use in models for categorical data. This chapter surveys a variety of potential social science applications of random effects modeling of categorical data. Applications discussed include repeated measurement for binary or ordinal responses, shrinkage to improve multiparameter estimation of a set of proportions or rates, multivariate latent variable modeling, hierarchically structured modeling, and cluster sampling. The models discussed belong to the class of generalized linear mixed models (GLMMs), an extension of ordinary linear models that permits nonnormal response variables and both fixed and random effects in the predictor term. The models are GLMMs for either binomial or Poisson response variables, although we also present extensions to multicategory (nominal or ordinal) responses. We also summarize some of the technical issues of model-fitting that complicate the fitting of GLMMs even with existing software.
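As a concrete instance of the class of models surveyed, the following sketch simulates clustered binary data from a random-intercept logit GLMM; the shared cluster intercept is what induces the within-cluster correlation described above. All names and parameter values are illustrative.

```python
import numpy as np

rng = np.random.default_rng(2)
n_clusters, n_per = 50, 8
sigma_u = 1.2                        # sd of the cluster random intercepts
beta0, beta1 = -0.5, 1.0             # fixed effects

u = rng.normal(0, sigma_u, n_clusters)       # cluster-level random effects
x = rng.normal(size=(n_clusters, n_per))     # within-cluster covariate
eta = beta0 + beta1 * x + u[:, None]         # linear predictor on logit scale
y = rng.binomial(1, 1 / (1 + np.exp(-eta)))  # binary, correlated within clusters

# The shared intercept spreads cluster response rates apart:
print("between-cluster variance of response rates:", y.mean(axis=1).var())
```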


Journal of the American Statistical Association | 1998

Standard Errors of Prediction in Generalized Linear Mixed Models

James G. Booth; James P. Hobert

Abstract The unconditional mean squared error of prediction (UMSEP) is widely used as a measure of prediction variance for inferences concerning linear combinations of fixed and random effects in the classical normal theory mixed model. But the UMSEP is inappropriate for generalized linear mixed models where the conditional variance of the random effects depends on the data. When the random effects describe variation between independent small domains and domain-specific prediction is of interest, we propose a conditional mean squared error of prediction (CMSEP) as a general measure of prediction variance. The CMSEP is shown to be the sum of the conditional variance and a positive correction that accounts for the sampling variability of parameter estimates. We derive a second-order-correct estimate of the CMSEP that consists of three components: (a) a plug-in estimate of the conditional variance, (b) a plug-in estimate of a Taylor series approximation to the correction term, and (c) a bootstrap estimate of...
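Schematically, and in notation that is ours rather than the paper's (η for the predictand, θ̂ for the parameter estimate, c(y) for the correction), the decomposition and its three-component estimate read:

```latex
\[
  \mathrm{CMSEP}(y)
    \;=\;
    \underbrace{\operatorname{Var}\!\left(\eta \mid y;\, \theta\right)}_{\text{conditional variance}}
    \;+\;
    \underbrace{c(y)}_{\text{correction for estimating } \theta}
\]
\[
  \widehat{\mathrm{CMSEP}}(y)
    \;=\;
    \operatorname{Var}\!\left(\eta \mid y;\, \hat\theta\right)
    \;+\; \hat c_{\mathrm{Taylor}}(y)
    \;+\; \hat c_{\mathrm{boot}}(y)
    \qquad \text{(second-order correct)}.
\]
```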


Statistical Modelling | 2003

Negative binomial loglinear mixed models

James G. Booth; George Casella; Herwig Friedl; James P. Hobert

The Poisson loglinear model is a common choice for explaining variability in counts. However, in many practical circumstances the restriction that the mean and variance are equal is not realistic. Overdispersion with respect to the Poisson distribution can be modeled explicitly by integrating with respect to a mixture distribution, and use of the conjugate gamma mixing distribution leads to a negative binomial loglinear model. This paper extends the negative binomial loglinear model to the case of dependent counts, where dependence among the counts is handled by including linear combinations of random effects in the linear predictor. If we assume that the vector of random effects is multivariate normal, then complex forms of dependence can be modelled by appropriate specification of the covariance structure. Although the likelihood function for the resulting model is not tractable, maximum likelihood estimates (and standard errors) can be found using the NLMIXED procedure in SAS or, in more complicated examples, using a Monte Carlo EM algorithm. An alternate approach is to leave the random effects completely unspecified and attempt to estimate them using nonparametric maximum likelihood. The methodologies are illustrated with several examples.
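The mixture construction is easy to verify by simulation. The sketch below, with illustrative parameter values, draws gamma-mixed Poisson counts and checks the negative binomial mean-variance relationship Var(y) = μ + μ²/α.

```python
import numpy as np

rng = np.random.default_rng(3)
mu, alpha, n = 4.0, 2.0, 200_000             # illustrative values

lam = rng.gamma(alpha, mu / alpha, size=n)   # conjugate gamma mixing, E[lam] = mu
y = rng.poisson(lam)                         # marginally negative binomial

print("mean:", y.mean(), " theory:", mu)
print("var :", y.var(), " theory:", mu + mu**2 / alpha)
```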


Statistical Modelling | 2001

A survey of Monte Carlo algorithms for maximizing the likelihood of a two-stage hierarchical model

James G. Booth; James P. Hobert; Wolfgang Jank

Likelihood inference with hierarchical models is often complicated by the fact that the likelihood function involves intractable integrals. Numerical integration (e.g. quadrature) is an option if the dimension of the integral is low but quickly becomes unreliable as the dimension grows. An alternative approach is to approximate the intractable integrals using Monte Carlo averages. Several different algorithms based on this idea have been proposed. In this paper we discuss the relative merits of simulated maximum likelihood, Monte Carlo EM, Monte Carlo Newton-Raphson and stochastic approximation.
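As a concrete instance of one surveyed method, here is a minimal sketch of simulated maximum likelihood for a toy two-stage model (Poisson counts sharing a normal random effect): the intractable integral is replaced by an importance-sampling average over a fixed set of draws, and the resulting approximate likelihood is maximized over a grid. The model, proposal, and grid are illustrative assumptions.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(4)
y = np.array([3, 1, 4, 2, 6])            # one cluster of counts
u_draws = rng.normal(0.0, 2.0, 5000)     # fixed N(0, 2^2) proposal draws

def loglik_hat(theta):
    """Monte Carlo log-likelihood of theta = (mu, sigma_u)."""
    mu, sig = theta
    lam = np.exp(mu + u_draws)                            # Poisson means
    logpy = sum(stats.poisson.logpmf(yi, lam) for yi in y)
    logw = stats.norm.logpdf(u_draws, 0.0, sig) \
        - stats.norm.logpdf(u_draws, 0.0, 2.0)            # importance weights
    a = logpy + logw
    return np.log(np.mean(np.exp(a - a.max()))) + a.max() # stable log-mean-exp

# Maximize the fixed-draws approximation over a small grid of (mu, sigma_u);
# reusing the same draws across theta keeps the surface smooth in theta.
grid = [(mu, sig) for mu in np.linspace(0.0, 2.0, 21)
        for sig in np.linspace(0.3, 2.0, 18)]
print("simulated-ML estimate (mu, sigma_u):", max(grid, key=loglik_hat))
```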


Journal of Computational and Graphical Statistics | 1998

Functional Compatibility, Markov Chains and Gibbs Sampling with Improper Posteriors

James P. Hobert; George Casella

Abstract The members of a set of conditional probability density functions are called compatible if there exists a joint probability density function that generates them. We generalize this concept by calling the conditionals functionally compatible if there exists a non-negative function that behaves like a joint density as far as generating the conditionals according to the probability calculus, but whose integral over the whole space is not necessarily finite. A necessary and sufficient condition for functional compatibility is given that provides a method of calculating this function, if it exists. A Markov transition function is then constructed using a set of functionally compatible conditional densities and it is shown, using the compatibility results, that the associated Markov chain is positive recurrent if and only if the conditionals are compatible. A Gibbs Markov chain, constructed via “Gibbs conditionals” from a hierarchical model with an improper posterior, is a special case. Therefore, the ...
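A standard illustration of the distinction (our example, not necessarily the paper's) uses exponential conditionals:

```latex
\[
  f(x \mid y) = y\,e^{-yx}, \qquad f(y \mid x) = x\,e^{-xy}, \qquad x, y > 0.
\]
% Both conditionals are generated by g(x, y) = e^{-xy}, so they are
% functionally compatible; but
\[
  \int_0^\infty\!\!\int_0^\infty e^{-xy}\,dx\,dy
  \;=\; \int_0^\infty \frac{dy}{y} \;=\; \infty,
\]
% so no proper joint density exists: the conditionals are functionally
% compatible without being compatible, and the associated Gibbs chain is
% not positive recurrent.
```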


Journal of the American Statistical Association | 2004

Geometric Ergodicity of van Dyk and Meng's Algorithm for the Multivariate Student's t Model

Dobrin Marchev; James P. Hobert

Let π denote the posterior distribution that results when a random sample of size n from a d-dimensional location-scale Student's t distribution (with ν degrees of freedom) is combined with the standard noninformative prior. van Dyk and Meng developed an efficient Markov chain Monte Carlo (MCMC) algorithm for sampling from π and provided considerable empirical evidence to suggest that their algorithm converges to stationarity much faster than the standard data augmentation algorithm. In addition to its practical importance, this algorithm is interesting from a theoretical standpoint because it is based upon a Markov chain that is not positive recurrent. In this article, we formally analyze the relevant sub-Markov chain underlying van Dyk and Meng's algorithm. In particular, we establish drift and minorization conditions that show that, for many (d, ν, n) triples, the sub-Markov chain is geometrically ergodic. This is the first general, rigorous analysis of an MCMC algorithm based upon a Markov chain that is not positive recurrent. Moreover, our results are important from a practical standpoint because (1) geometric ergodicity guarantees the existence of central limit theorems that can be used to construct Monte Carlo standard errors and (2) the drift and minorization conditions themselves allow for the calculation of exact upper bounds on the total variation distance to stationarity. The results are illustrated using a simple numerical example.
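Point (1) is what makes error assessment routine in practice: under a Markov chain CLT, a Monte Carlo standard error can be attached to any ergodic average. The sketch below implements the generic batch-means estimator, a standard technique rather than the paper's method, on an autocorrelated toy chain standing in for MCMC output.

```python
import numpy as np

def batch_means_mcse(chain, n_batches=30):
    """MCSE of the chain's mean via non-overlapping batch means."""
    n = len(chain) // n_batches * n_batches
    batch_means = chain[:n].reshape(n_batches, -1).mean(axis=1)
    # under a Markov chain CLT the batch means are approximately iid normal
    return batch_means.std(ddof=1) / np.sqrt(n_batches)

rng = np.random.default_rng(5)
chain = np.empty(100_000)
x, rho = 0.0, 0.9                     # AR(1) toy chain, stationary mean 0
for t in range(chain.size):
    x = rho * x + rng.normal()
    chain[t] = x

print(f"ergodic average {chain.mean():.4f} "
      f"± {2 * batch_means_mcse(chain):.4f}")
```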


Electronic Journal of Statistics | 2013

The Polya-Gamma Gibbs sampler for Bayesian logistic regression is uniformly ergodic

Hee Min Choi; James P. Hobert

One of the most widely used data augmentation algorithms is Albert and Chib’s (1993) algorithm for Bayesian probit regression. Polson, Scott and Windle (2013) recently introduced an analogous algorithm for Bayesian logistic regression. The main difference between the two is that Albert and Chib’s (1993) truncated normals are replaced by so-called Polya-Gamma random variables. In this note, we establish that the Markov chain underlying Polson et al.’s (2013) algorithm is uniformly ergodic. This theoretical result has important practical benefits. In particular, it guarantees the existence of central limit theorems that can be used to make an informed decision about how long the simulation should be run.
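A minimal sketch of the two-block sampler described above, for y_i ~ Bernoulli(logit⁻¹(x_iᵀβ)) with a normal prior on β. The PG(1, c) draw uses a truncated form of the infinite-series representation in Polson et al. (2013); production implementations use their exact sampler. Data and settings are illustrative.

```python
import numpy as np

rng = np.random.default_rng(6)

def pg_draw(c, trunc=200):
    """Approximate PG(1, c) draw: truncated sum-of-gammas representation."""
    k = np.arange(1, trunc + 1)
    g = rng.gamma(1.0, 1.0, size=trunc)               # g_k ~ Gamma(1, 1)
    return np.sum(g / ((k - 0.5) ** 2
                       + (c / (2 * np.pi)) ** 2)) / (2 * np.pi ** 2)

# Toy logistic-regression data.
n, p = 100, 2
X = np.column_stack([np.ones(n), rng.normal(size=n)])
beta_true = np.array([-0.5, 1.0])
y = rng.binomial(1, 1.0 / (1.0 + np.exp(-X @ beta_true)))

B_inv = np.eye(p) / 10.0      # prior precision: beta ~ N(0, 10 I)
kappa = y - 0.5
beta, draws = np.zeros(p), []
for it in range(1000):
    # Block 1: omega_i | beta ~ PG(1, |x_i' beta|)
    omega = np.array([pg_draw(abs(c)) for c in X @ beta])
    # Block 2: beta | omega, y ~ N(m, V), V = (X' diag(omega) X + B^-1)^-1
    V = np.linalg.inv(X.T @ (omega[:, None] * X) + B_inv)
    m = V @ (X.T @ kappa)
    beta = rng.multivariate_normal(m, V)
    draws.append(beta)
print("posterior mean of beta ≈", np.mean(draws[500:], axis=0))
```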


Journal of the American Statistical Association | 2000

Hierarchical Models: A Current Computational Perspective

James P. Hobert


Collaboration


Dive into James P. Hobert's collaborations.

Top Co-Authors

Qian Qin

University of Florida
