Network


Latest external collaboration on country level. Dive into details by clicking on the dots.

Hotspot


Dive into the research topics where Joseph L. Schafer is active.

Publication


Featured researches published by Joseph L. Schafer.


Statistical Science | 2007

Demystifying Double Robustness: A Comparison of Alternative Strategies for Estimating a Population Mean from Incomplete Data

Joseph Kang; Joseph L. Schafer

When outcomes are missing for reasons beyond an investigators control, there are two different ways to adjust a parameter estimate for covariates that may be related both to the outcome and to missingness. One approach is to model the relationships between the covariates and the outcome and use those relationships to predict the missing values. Another is to model the probabilities of missingness given the covariates and incorporate them into a weighted or stratified estimate. Doubly robust (DR) procedures apply both types of model simultaneously and produce a consistent estimate of the parameter if either of the two models has been correctly specified. In this article, we show that DR estimates can be constructed in many ways. We compare the performance of various DR and non-DR estimates of a population mean in a simulated example where both models are incorrect but neither is grossly misspecified. Methods that use inverse-probabilities as weights, whether they are DR or not, are sensitive to misspecification of the propensity model when some estimated propensities are small. Many DR methods perform better than simple inverse-probability weighting. None of the DR methods we tried, however, improved upon the performance of simple regression-based prediction of the missing values. This study does not represent every missing-data problem that will arise in practice. But it does demonstrate that, in at least some settings, two wrong models are not better than one.We congratulate Drs. Kang and Schafer (KS henceforth) for a careful and thought-provoking contribution to the literature regarding the so-called “double robustness” property, a topic that still engenders some confusion and disagreement. The authors’ approach of focusing on the simplest situation of estimation of the population mean μ of a response y when y is not observed on all subjects according to a missing at random (MAR) mechanism (equivalently, estimation of the mean of a potential outcome in a causal model under the assumption of no unmeasured confounders) is commendable, as the fundamental issues can be explored without the distractions of the messier notation and considerations required in more complicated settings. Indeed, as the article demonstrates, this simple setting is sufficient to highlight a number of key points. As noted eloquently by Molenberghs (2005), in regard to how such missing data/causal inference problems are best addressed, two “schools” may be identified: the “likelihood-oriented” school and the “weighting-based” school. As we have emphasized previously (Davidian, Tsiatis and Leon, 2005), we prefer to view inference from the vantage point of semi-parametric theory, focusing on the assumptions embedded in the statistical models leading to different “types” of estimators (i.e., “likelihood-oriented” or “weighting-based”) rather than on the forms of the estimators themselves. In this discussion, we hope to complement the presentation of the authors by elaborating on this point of view. Throughout, we use the same notation as in the paper.


Structural Equation Modeling | 2007

PROC LCA: A SAS Procedure for Latent Class Analysis

Stephanie T. Lanza; Linda M. Collins; David R. Lemmon; Joseph L. Schafer

Latent class analysis (LCA) is a statistical method used to identify a set of discrete, mutually exclusive latent classes of individuals based on their responses to a set of observed categorical variables. In multiple-group LCA, both the measurement part and structural part of the model can vary across groups, and measurement invariance across groups can be empirically tested. LCA with covariates extends the model to include predictors of class membership. In this article, we introduce PROC LCA, a new SAS procedure for conducting LCA, multiple-group LCA, and LCA with covariates. The procedure is demonstrated using data on alcohol use behavior in a national sample of high school seniors.


Psychological Methods | 2008

Average Causal Effects from Nonrandomized Studies: A Practical Guide and Simulated Example.

Joseph L. Schafer; Joseph Kang

In a well-designed experiment, random assignment of participants to treatments makes causal inference straightforward. However, if participants are not randomized (as in observational study, quasi-experiment, or nonequivalent control-group designs), group comparisons may be biased by confounders that influence both the outcome and the alleged cause. Traditional analysis of covariance, which includes confounders as predictors in a regression model, often fails to eliminate this bias. In this article, the authors review Rubins definition of an average causal effect (ACE) as the average difference between potential outcomes under different treatments. The authors distinguish an ACE and a regression coefficient. The authors review 9 strategies for estimating ACEs on the basis of regression, propensity scores, and doubly robust methods, providing formulas for standard errors not given elsewhere. To illustrate the methods, the authors simulate an observational study to assess the effects of dieting on emotional distress. Drawing repeated samples from a simulated population of adolescent girls, the authors assess each method in terms of bias, efficiency, and interval coverage. Throughout the article, the authors offer insights and practical guidance for researchers who attempt causal inference with observational data.


arXiv: Methodology | 2006

Demystifying double robustness: A comparison of alternative strategies for estimating a population mean from incomplete data

Joseph Kang; Joseph L. Schafer

When outcomes are missing for reasons beyond an investigators control, there are two different ways to adjust a parameter estimate for covariates that may be related both to the outcome and to missingness. One approach is to model the relationships between the covariates and the outcome and use those relationships to predict the missing values. Another is to model the probabilities of missingness given the covariates and incorporate them into a weighted or stratified estimate. Doubly robust (DR) procedures apply both types of model simultaneously and produce a consistent estimate of the parameter if either of the two models has been correctly specified. In this article, we show that DR estimates can be constructed in many ways. We compare the performance of various DR and non-DR estimates of a population mean in a simulated example where both models are incorrect but neither is grossly misspecified. Methods that use inverse-probabilities as weights, whether they are DR or not, are sensitive to misspecification of the propensity model when some estimated propensities are small. Many DR methods perform better than simple inverse-probability weighting. None of the DR methods we tried, however, improved upon the performance of simple regression-based prediction of the missing values. This study does not represent every missing-data problem that will arise in practice. But it does demonstrate that, in at least some settings, two wrong models are not better than one.We congratulate Drs. Kang and Schafer (KS henceforth) for a careful and thought-provoking contribution to the literature regarding the so-called “double robustness” property, a topic that still engenders some confusion and disagreement. The authors’ approach of focusing on the simplest situation of estimation of the population mean μ of a response y when y is not observed on all subjects according to a missing at random (MAR) mechanism (equivalently, estimation of the mean of a potential outcome in a causal model under the assumption of no unmeasured confounders) is commendable, as the fundamental issues can be explored without the distractions of the messier notation and considerations required in more complicated settings. Indeed, as the article demonstrates, this simple setting is sufficient to highlight a number of key points. As noted eloquently by Molenberghs (2005), in regard to how such missing data/causal inference problems are best addressed, two “schools” may be identified: the “likelihood-oriented” school and the “weighting-based” school. As we have emphasized previously (Davidian, Tsiatis and Leon, 2005), we prefer to view inference from the vantage point of semi-parametric theory, focusing on the assumptions embedded in the statistical models leading to different “types” of estimators (i.e., “likelihood-oriented” or “weighting-based”) rather than on the forms of the estimators themselves. In this discussion, we hope to complement the presentation of the authors by elaborating on this point of view. Throughout, we use the same notation as in the paper.


Statistics in Medicine | 2007

Robustness of a multivariate normal approximation for imputation of incomplete binary data.

Coen Bernaards; Thomas R. Belin; Joseph L. Schafer


Journal of The Royal Statistical Society Series A-statistics in Society | 2006

Latent class logistic regression: application to marijuana use and attitudes among high school seniors

Hwan Chung; Brian P. Flaherty; Joseph L. Schafer


Statistical Science | 1994

[Multiple-Imputation Inferences with Uncongenial Sources of Input]: Comment

Joseph L. Schafer


Archive | 2008

Individual and contextual differences in treatment effects: New Methodologies for experimental and observational research

Joseph L. Schafer; Joseph Kang


Archive | 2008

Marginal causal modeling when the treatment is measured with error

Joseph L. Schafer; Joseph Kang


Poster presented at the Society for Prevention Research conference | 2007

Using latent class variables as predictors

H. Jung; Joseph L. Schafer; Ty A. Ridenour; Stephanie T. Lanza

Collaboration


Dive into the Joseph L. Schafer's collaboration.

Top Co-Authors

Avatar

Joseph Kang

Northwestern University

View shared research outputs
Top Co-Authors

Avatar

David R. Lemmon

Pennsylvania State University

View shared research outputs
Top Co-Authors

Avatar

Stephanie T. Lanza

Pennsylvania State University

View shared research outputs
Top Co-Authors

Avatar

Linda M. Collins

Pennsylvania State University

View shared research outputs
Top Co-Authors

Avatar
Top Co-Authors

Avatar
Top Co-Authors

Avatar
Top Co-Authors

Avatar

Ty A. Ridenour

University of Pittsburgh

View shared research outputs
Top Co-Authors

Avatar
Researchain Logo
Decentralizing Knowledge