Publication


Featured research published by Daniel W. Schafer.


Journal of the American Statistical Association | 1986

Residuals in Generalized Linear Models

Donald A. Pierce; Daniel W. Schafer

Abstract Generalized linear models are regression-type models for data not normally distributed, appropriately fitted by maximum likelihood rather than least squares. Typical examples are models for binomial or Poisson data, with a linear regression model for a given, ordinarily nonlinear, function of the expected values of the observations. Use of such models has become very common in recent years, and there is a clear need to study the issue of appropriate residuals to be used for diagnostic purposes. Several definitions of residuals are possible for generalized linear models. The statistical package GLIM (Baker and Nelder 1978) routinely prints out residuals $(y_i - \hat{\mu}_i)/\sqrt{V(\hat{\mu}_i)}$, where $V(\mu)$ is the function relating the variance to the mean of y and $\hat{\mu}_i$ is the maximum likelihood estimate of the ith mean as fitted to the regression model. These residuals are the signed square roots of the contributions to the Pearson goodness-of-fit statistic. Another choice of residual is the signed square root of the contribution to the devia...
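As a rough illustration of the two residual types named in this abstract, the sketch below computes Pearson and deviance residuals for the Poisson case, where V(μ) = μ. It uses the standard textbook definitions and is not taken from the paper; the function names are hypothetical, and the fitted means `mu` are assumed to come from a model that has already been fitted elsewhere.

```python
import numpy as np


def poisson_pearson_residuals(y, mu):
    """Pearson residuals (y - mu) / sqrt(V(mu)), with V(mu) = mu for Poisson data."""
    y = np.asarray(y, dtype=float)
    mu = np.asarray(mu, dtype=float)
    return (y - mu) / np.sqrt(mu)


def poisson_deviance_residuals(y, mu):
    """Signed square roots of the per-observation Poisson deviance contributions."""
    y = np.asarray(y, dtype=float)
    mu = np.asarray(mu, dtype=float)
    # Unit deviance: 2 * [y * log(y / mu) - (y - mu)], with y * log(y / mu) = 0 when y = 0.
    ratio = np.where(y > 0, y / mu, 1.0)
    unit_deviance = 2.0 * (y * np.log(ratio) - (y - mu))
    # Guard against tiny negative values caused by floating-point rounding.
    return np.sign(y - mu) * np.sqrt(np.maximum(unit_deviance, 0.0))


# Hypothetical counts and fitted means, for illustration only.
y_obs = np.array([0.0, 2.0, 5.0])
mu_hat = np.array([1.0, 2.0, 3.0])
pearson = poisson_pearson_residuals(y_obs, mu_hat)
deviance = poisson_deviance_residuals(y_obs, mu_hat)
```

Summing the squares of `pearson` gives the Pearson goodness-of-fit statistic, and summing the squares of `deviance` gives the deviance, which is why each residual is described as a signed square root of a contribution.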


Journal of the American Statistical Association | 1992

The Errors-in-Variables Problem: Considerations Provided by Radiation Dose-Response Analyses of the A-Bomb Survivor Data

Donald A. Pierce; Daniel O. Stram; Michael Væth; Daniel W. Schafer

Abstract Some basic issues in the errors-in-variables problem are discussed, in terms of considerations that arose in analyses of radiation effects on atomic bomb survivors. The setting essentially involves generalized linear models for the response variables, a very nonnormal distribution of the true covariable, and multiplicative errors in the observed covariable. Consideration is given to distinctions between structural and functional modeling. It is argued that careful attention to the apparent distribution of true covariables is critical in either case, and a quasi-structural approach to functional models is suggested. The focus is on the case in which the expected response is linear in the true covariable and strong assumptions are tentatively made about the model for covariate errors. For settings such as just described, which differ from that of much of the classical work in the area, it is emphasized that an attractive approach is based on weighted regression of the response on the expected value...


PLOS ONE | 2011

GENE-Counter: A Computational Pipeline for the Analysis of RNA-Seq Data for Gene Expression Differences

Jason S. Cumbie; Jeffrey A. Kimbrel; Yanming Di; Daniel W. Schafer; Larry J. Wilhelm; Samuel E. Fox; Christopher M. Sullivan; Aron D. Curzon; James C. Carrington; Todd C. Mockler; Jeff H. Chang

GENE-counter is a complete Perl-based computational pipeline for analyzing RNA-Sequencing (RNA-Seq) data for differential gene expression. In addition to its use in studying transcriptomes of eukaryotic model organisms, GENE-counter is applicable for prokaryotes and non-model organisms without an available genome reference sequence. For alignments, GENE-counter is configured for CASHX, Bowtie, and BWA, but an end user can use any Sequence Alignment/Map (SAM)-compliant program of preference. To analyze data for differential gene expression, GENE-counter can be run with any one of three statistics packages that are based on variations of the negative binomial distribution. The default method is a new and simple statistical test we developed based on an over-parameterized version of the negative binomial distribution. GENE-counter also includes three different methods for assessing differentially expressed features for enriched gene ontology (GO) terms. Results are transparent and data are systematically stored in a MySQL relational database to facilitate additional analyses as well as quality assessment. We used next generation sequencing to generate a small-scale RNA-Seq dataset derived from the heavily studied defense response of Arabidopsis thaliana and used GENE-counter to process the data. Collectively, the support from analysis of microarrays as well as the observed and substantial overlap in results from each of the three statistics packages demonstrates that GENE-counter is well suited for handling the unique characteristics of small sample sizes and high variability in gene counts.


Radiation Research | 2006

Some Statistical Implications of Dose Uncertainty in Radiation Dose–Response Analyses

Daniel W. Schafer; Ethel S. Gilbert

Abstract Schafer, D. W. and Gilbert, E. S. Some Statistical Implications of Dose Uncertainty in Radiation Dose–Response Analyses. Radiat. Res. 166, 303–312 (2006). Statistical dose–response analyses in radiation epidemiology can produce misleading results if they fail to account for radiation dose uncertainties. While dosimetries may differ substantially depending on the ways in which the subjects were exposed, the statistical problems typically involve a predominantly linear dose–response curve, multiple sources of uncertainty, and uncertainty magnitudes that are best characterized as proportional rather than additive. We discuss some basic statistical issues in this setting, including the bias and shape distortion induced by classical and Berkson uncertainties, the effect of uncertain dose-prediction model parameters on estimated dose–response curves, and some notes on statistical methods for dose–response estimation in the presence of radiation dose uncertainties.
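The contrast between classical and Berkson uncertainty mentioned in this abstract can be seen in a small simulation. The sketch below is an illustration under simplifying assumptions and is not the authors' analysis: it uses additive normal errors (the abstract stresses that proportional errors are more realistic), a linear dose-response with no intercept, and hypothetical names and parameter values throughout.

```python
import numpy as np


def ols_slope(x, y):
    """Least-squares slope of y on x."""
    xc = x - x.mean()
    return float(np.dot(xc, y - y.mean()) / np.dot(xc, xc))


def simulate_slopes(n=200_000, beta=2.0, seed=0):
    """Return naive slopes under classical and Berkson error (all variances set to 1)."""
    rng = np.random.default_rng(seed)

    # Classical error: the observed dose is the true dose plus independent noise.
    x_true = rng.normal(0.0, 1.0, n)
    y = beta * x_true + rng.normal(0.0, 1.0, n)
    w_observed = x_true + rng.normal(0.0, 1.0, n)
    classical = ols_slope(w_observed, y)

    # Berkson error: the true dose is the assigned dose plus independent noise.
    z_assigned = rng.normal(0.0, 1.0, n)
    x_berkson = z_assigned + rng.normal(0.0, 1.0, n)
    y_berkson = beta * x_berkson + rng.normal(0.0, 1.0, n)
    berkson = ols_slope(z_assigned, y_berkson)

    return classical, berkson


classical_slope, berkson_slope = simulate_slopes()
```

With equal variances for true dose and error, the classical-error slope should land near half the true slope (attenuation by var(x) / (var(x) + var(u))), while the Berkson-error slope should stay close to the true value in this linear setting.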


Journal of the American Statistical Association | 1992

Diagnostics for Overdispersion

Lisa M. Ganio; Daniel W. Schafer

Abstract Diagnostic tools are proposed for assessing the dependence of extrabinomial or extra-Poisson variation on explanatory variables and for comparing several common models for overdispersion. These tools are based on tests for regression terms in the dispersion parameter of a generalized linear model, using double exponential family and “pseudolikelihood” formulations. Score tests do not require the full fitting of models for variation and lead to easy graphical and numerical procedures based on squared residuals (deviance or Pearson). Robust modifications of these are motivated by Levene's test in linear models. The diagnostic tools are intended primarily to ensure prudent modeling of the variance to make correct inferences about parameters in the mean.


Radiation Research | 2004

A reanalysis of thyroid neoplasms in the Israeli tinea capitis study accounting for dose uncertainties

Jay H. Lubin; Daniel W. Schafer; Elaine Ron; Marilyn Stovall; Raymond J. Carroll

Abstract Lubin, J. H., Schafer, D. W., Ron, E., Stovall, M. and Carroll, R. J. A Reanalysis of Thyroid Neoplasms in the Israeli Tinea Capitis Study Accounting for Dose Uncertainties. Radiat. Res. 161, 359–368 (2004). In the 1940s and 1950s, children in Israel were treated for tinea capitis by irradiation to the scalp to induce epilation. Follow-up studies of these patients and of other radiation-exposed populations show an increased risk of malignant and benign thyroid tumors. Those analyses, however, assume that thyroid dose for individuals is estimated precisely without error. Failure to account for uncertainties in dosimetry may affect standard errors and bias dose–response estimates. For the Israeli tinea capitis study, we discuss sources of uncertainties and adjust dosimetry for uncertainties in the prediction of true dose from X-ray treatment parameters. We also account for missing ages at exposure for patients with multiple X-ray treatments, since only ages at first treatment are known, and for missing data on treatment center, which investigators use to define exposure. Our reanalysis of the dose response for thyroid cancer and benign thyroid tumors indicates that uncertainties in dosimetry have minimal effects on dose–response estimation and for inference on the modifying effects of age at first exposure, time since exposure, and other factors. Since the components of the dose uncertainties we describe are likely to be present in other epidemiological studies of patients treated with radiation, our analysis may provide a model for considering the potential role of these uncertainties.


Computational Statistics & Data Analysis | 2001

Maximum likelihood computations for regression with measurement error

Roger Higdon; Daniel W. Schafer

Abstract This paper presents a general computational method for maximum likelihood analysis for generalized regression with measurement error in a single explanatory variable. The method is the EM algorithm with Gauss–Hermite quadrature in the E-step. Although computationally intensive, this method provides maximum likelihood estimation under a broad range of distributional assumptions. This is important because maximum likelihood estimators can be more efficient than commonly used moment estimators and likelihood ratio tests and confidence intervals can be substantially superior to those based on asymptotic normality with approximate standard errors.


Computational Statistics & Data Analysis | 2008

Likelihood analysis of the multivariate ordinal probit regression model for repeated ordinal responses

Yonghai Li; Daniel W. Schafer

We consider the analysis of longitudinal ordinal data, meaning regression-like analysis when the response variable is categorical with ordered categories, and is measured repeatedly over time (or space) on the experimental or sampling units. Particular attention is given to the multivariate ordinal probit regression model, in which the correlation between ordered categorical responses on the same unit at different times (or locations) is modeled with a latent variable that has a multivariate normal distribution. An algorithm for maximum likelihood analysis of this model is proposed and the analysis is demonstrated on an example. Simulations clarify the extent to which maximum likelihood estimators can be more efficient than generalized estimating equations (GEE) estimators of regression coefficients and the extent to which likelihood ratio tests can be more accurate than tests based on standard errors and approximate normality of GEE estimators.


Journal of Statistical Computation and Simulation | 2002

Likelihood Analysis and Flexible Structural Modeling for Measurement Error Model Regression

Daniel W. Schafer

A computational approach is presented for likelihood analysis of regression models with measurement errors in explanatory variables. If y, x, and w represent the response, an unobservable true value of an explanatory variable, and an observable measurement of x, then the likelihood function is based on the density of the observable variables: ƒ(y,w) = ∫ƒ(y,w|x)ƒ(x)dx. For realistic model specifications the integral must be approximated numerically. While one could conceivably use a general-purpose optimization routine for finding estimates that maximize the approximate likelihood, that tends not to work very well. The approximate density, however, has the form of a finite mixture model so that the standard EM algorithm for that problem can be applied. The resulting approach is practically important since it easily permits realistic distributional modeling and can be accomplished through iterative application of readily available routines.
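To make the integral ƒ(y,w) = ∫ƒ(y,w|x)ƒ(x)dx concrete, the sketch below approximates it by Gauss-Hermite quadrature in an all-normal special case where a closed form exists for comparison. The model choice (normal x, normal response error, normal measurement error, y and w independent given x), the function names, and the parameter values are all assumptions made for illustration; the abstract does not state which quadrature rule the paper uses.

```python
import numpy as np


def normal_pdf(z, mean, var):
    """Density of a normal distribution with the given mean and variance."""
    return np.exp(-0.5 * (z - mean) ** 2 / var) / np.sqrt(2.0 * np.pi * var)


def marginal_density_gh(y, w, b0, b1, var_e, var_u, mu_x, var_x, n_nodes=40):
    """Approximate f(y, w) = integral of f(y|x) f(w|x) f(x) dx with x ~ N(mu_x, var_x)."""
    t, weights = np.polynomial.hermite.hermgauss(n_nodes)
    # Change of variables x = mu_x + sqrt(2 * var_x) * t maps the N(mu_x, var_x)
    # density onto the Gauss-Hermite weight function exp(-t^2).
    x_nodes = mu_x + np.sqrt(2.0 * var_x) * t
    conditional = normal_pdf(y, b0 + b1 * x_nodes, var_e) * normal_pdf(w, x_nodes, var_u)
    # The result is a weighted sum over support points x_nodes: a finite mixture.
    return float(np.sum(weights * conditional) / np.sqrt(np.pi))


def marginal_density_exact(y, w, b0, b1, var_e, var_u, mu_x, var_x):
    """Closed-form bivariate normal density of (y, w) for the all-normal model."""
    diff = np.array([y - (b0 + b1 * mu_x), w - mu_x])
    cov = np.array([[b1 ** 2 * var_x + var_e, b1 * var_x],
                    [b1 * var_x, var_x + var_u]])
    quad_form = diff @ np.linalg.solve(cov, diff)
    return float(np.exp(-0.5 * quad_form) / (2.0 * np.pi * np.sqrt(np.linalg.det(cov))))


# Hypothetical parameter values, for illustration only.
params = dict(b0=0.0, b1=1.0, var_e=4.0, var_u=2.0, mu_x=0.0, var_x=1.0)
approx_density = marginal_density_gh(1.0, 0.5, **params)
exact_density = marginal_density_exact(1.0, 0.5, **params)
```

The quadrature sum has mixing weights `weights / sqrt(pi)` at the support points `x_nodes`, which is the finite-mixture form the abstract refers to. In non-normal models there is no closed form, and only the quadrature version would be available.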


Journal of Business & Economic Statistics | 1987

Measurement-Error Diagnostics and the Sex Discrimination Problem

Daniel W. Schafer

Numerical and graphical diagnostic tools are presented for studying the effect of explanatory-variable measurement errors on regression results when very rough knowledge about the measurement-error variances or reliabilities is available. These methods are exhibited on an example of sex discrimination in which male and female salaries are compared after adjustment for individual qualifications but the observed qualification variables are only proxies for what is actually desired. A measurement-error trace is suggested, much like a ridge trace, to exhibit effects of measurement errors of various sizes on the estimate of the adjusted sex difference in log salaries. An approximate Bayesian analysis for the simple functional model is also presented.

Collaboration

Top Co-Authors

Yanming Di
Oregon State University

Elaine Ron
National Institutes of Health

Jay H. Lubin
National Institutes of Health

Marilyn Stovall
University of Texas MD Anderson Cancer Center

Richard S. Bennett
United States Environmental Protection Agency