Publication


Featured research published by David C. Wheeler.


Environment and Planning A | 2007

Diagnostic Tools and a Remedial Method for Collinearity in Geographically Weighted Regression

David C. Wheeler

Geographically weighted regression (GWR) is drawing attention as a statistical method to estimate regression models with spatially varying relationships between explanatory variables and a response variable. Local collinearity in weighted explanatory variables leads to GWR coefficient estimates that are correlated locally and across space, have inflated variances, and are at times counterintuitive and contradictory in sign to the global regression estimates. The presence of local collinearity in the absence of global collinearity necessitates the use of diagnostic tools in the local regression model building process to highlight areas in which the results are not reliable for statistical inference. The method of ridge regression can also be integrated into the GWR framework to constrain and stabilize regression coefficients and lower prediction error. This paper presents numerous diagnostic tools and ridge regression in GWR and demonstrates the utility of these techniques with an example using the Columbus crime dataset.
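As an illustration of the two ideas in this abstract, the sketch below computes a local collinearity diagnostic (the condition number of the kernel-weighted design matrix) and a ridge-constrained local fit at a single location. It is a minimal sketch, not the paper's implementation: the Gaussian kernel, the function names, the bandwidth, and the ridge parameter `lam` are all assumptions, and the variables are assumed to be centred so that no intercept column is needed.

```python
import numpy as np

def gaussian_weights(coords, site, bandwidth):
    """Gaussian kernel weights based on distance from `site` (assumed kernel)."""
    d = np.linalg.norm(coords - site, axis=1)
    return np.exp(-0.5 * (d / bandwidth) ** 2)

def local_condition_number(X, w):
    """Condition number of the kernel-weighted design matrix, columns scaled to unit length."""
    Xw = X * np.sqrt(w)[:, None]
    Xw = Xw / np.linalg.norm(Xw, axis=0)
    s = np.linalg.svd(Xw, compute_uv=False)
    return s[0] / s[-1]

def local_ridge(X, y, w, lam):
    """Solve (X'WX + lam*I) b = X'Wy at one location; lam = 0 gives the unpenalized local fit."""
    XtW = X.T * w
    return np.linalg.solve(XtW @ X + lam * np.eye(X.shape[1]), XtW @ y)

# Synthetic data with two nearly collinear explanatory variables (illustrative only).
rng = np.random.default_rng(0)
n = 200
coords = rng.uniform(size=(n, 2))
x1 = rng.normal(size=n)
X = np.column_stack([x1, x1 + 0.05 * rng.normal(size=n)])
y = X.sum(axis=1) + rng.normal(size=n)

w = gaussian_weights(coords, np.array([0.5, 0.5]), bandwidth=0.2)
cond = local_condition_number(X, w)        # large values flag local collinearity
beta_plain = local_ridge(X, y, w, lam=0.0)
beta_ridge = local_ridge(X, y, w, lam=5.0)  # shrunk toward zero, more stable
```

Repeating the diagnostic at every calibration location would give a map of where the local coefficients are least reliable; how `lam` and the bandwidth should be chosen is not covered by this sketch.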


Environment and Planning A | 2011

A Simulation-Based Study of Geographically Weighted Regression as a Method for Investigating Spatially Varying Relationships

Antonio Páez; Steven Farber; David C. Wheeler

Large variability and correlations among the coefficients obtained from the method of geographically weighted regression (GWR) have been identified in previous research. This is an issue that poses a serious challenge for the utility of the method as a tool to investigate multivariate relationships. The objectives of this paper are to assess: (1) the ability of GWR to discriminate between a spatially constant process and one with spatially varying relationships; and (2) its ability to accurately retrieve spatially varying relationships. Extensive numerical experiments are used to investigate situations where the underlying process is stationary and nonstationary, and to assess the degree to which spurious intercoefficient correlations are introduced. Two different implementations of GWR and cross-validation approaches are assessed. Results suggest that judicious application of GWR can be used to discern whether the underlying process is nonstationary. Furthermore, evidence of spurious correlations indicates that caution must be exercised when drawing conclusions regarding spatial relationships retrieved using this approach, particularly when working with small samples.


Journal of Geographical Systems | 2007

An assessment of coefficient accuracy in linear regression models with spatially varying coefficients

David C. Wheeler; Catherine A. Calder

The realization in the statistical and geographical sciences that a relationship between an explanatory variable and a response variable in a linear regression model is not always constant across a study area has led to the development of regression models that allow for spatially varying coefficients. Two competing models of this type are geographically weighted regression (GWR) and Bayesian regression models with spatially varying coefficient processes (SVCP). In the application of these spatially varying coefficient models, marginal inference on the regression coefficient spatial processes is typically of primary interest. In light of this fact, there is a need to assess the validity of such marginal inferences, since these inferences may be misleading in the presence of explanatory variable collinearity. In this paper, we present the results of a simulation study designed to evaluate the sensitivity of the spatially varying coefficients in the competing models to various levels of collinearity. The simulation study results show that the Bayesian regression model produces more accurate inferences on the regression coefficients than does GWR. In addition, the Bayesian regression model is overall fairly robust in terms of marginal coefficient inference to moderate levels of collinearity, and degrades less substantially than GWR with strong collinearity.


International Journal of Health Geographics | 2007

A comparison of spatial clustering and cluster detection techniques for childhood leukemia incidence in Ohio, 1996 – 2003

David C. Wheeler

Background: Spatial cluster detection is an important tool in cancer surveillance to identify areas of elevated risk and to generate hypotheses about cancer etiology. There are many cluster detection methods used in spatial epidemiology to investigate suspicious groupings of cancer occurrences in regional count data and case-control data, where controls are sampled from the at-risk population. Numerous studies in the literature have focused on childhood leukemia because of its relatively large incidence among children compared with other malignant diseases and substantial public concern over elevated leukemia incidence. The main focus of this paper is an analysis of the spatial distribution of leukemia incidence among children from 0 to 14 years of age in Ohio from 1996–2003 using individual case data from the Ohio Cancer Incidence Surveillance System (OCISS). Specifically, we explore whether there is statistically significant global clustering and if there are statistically significant local clusters of individual leukemia cases in Ohio using numerous published methods of spatial cluster detection, including spatial point process summary methods, a nearest neighbor method, and a local rate scanning method. We use the K function, Cuzick and Edwards method, and the kernel intensity function to test for significant global clustering and the kernel intensity function and Kulldorff's spatial scan statistic in SaTScan to test for significant local clusters.

Results: We found some evidence, although inconclusive, of significant local clusters in childhood leukemia in Ohio, but no significant overall clustering. The findings from the local cluster detection analyses are not consistent for the different cluster detection techniques, where the spatial scan method in SaTScan does not find statistically significant local clusters, while the kernel intensity function method suggests statistically significant clusters in areas of central, southern, and eastern Ohio. The findings are consistent for the different tests of global clustering, where no significant clustering is demonstrated with any of the techniques when all age cases are considered together.

Conclusion: This comparative study for childhood leukemia clustering and clusters in Ohio revealed several research issues in practical spatial cluster detection. Among them, flexibility in cluster shape detection should be an issue for consideration.


Environment and Planning A | 2009

Simultaneous Coefficient Penalization and Model Selection in Geographically Weighted Regression: The Geographically Weighted Lasso

David C. Wheeler

In the field of spatial analysis, the interest of some researchers in modeling relationships between variables locally has led to the development of regression models with spatially varying coefficients. One such model that has been widely applied is geographically weighted regression (GWR). In the application of GWR, marginal inference on the spatial pattern of regression coefficients is often of interest, as is, less typically, prediction and estimation of the response variable. Empirical research and simulation studies have demonstrated that local correlation in explanatory variables can lead to estimated regression coefficients in GWR that are strongly correlated and, hence, problematic for inference on relationships between variables. The author introduces a penalized form of GWR, called the ‘geographically weighted lasso’ (GWL) which adds a constraint on the magnitude of the estimated regression coefficients to limit the effects of explanatory-variable correlation. The GWL also performs local model selection by potentially shrinking some of the estimated regression coefficients to zero in some locations of the study area. Two versions of the GWL are introduced: one designed to improve prediction of the response variable, and one more oriented toward constraining regression coefficients for inference. The results of applying the GWL to simulated and real datasets show that this method stabilizes regression coefficients in the presence of collinearity and produces lower prediction and estimation error of the response variable than does GWR and another constrained version of GWR—geographically weighted ridge regression.
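To make the idea of a locally penalized fit concrete, the sketch below solves a kernel-weighted lasso problem at a single location by coordinate descent. This is an illustrative stand-in rather than the author's algorithm: the abstract does not say how the GWL is computed, and the solver, the function names, and the penalty `lam` are assumptions. The variables are assumed to be centred, so no intercept is fitted.

```python
import numpy as np

def soft_threshold(z, lam):
    """Shrink z toward zero by lam, returning exactly zero when |z| <= lam."""
    return np.sign(z) * max(abs(z) - lam, 0.0)

def local_lasso(X, y, w, lam, n_iter=500):
    """Minimise 0.5 * sum_i w_i (y_i - x_i'b)^2 + lam * sum_j |b_j| by coordinate descent."""
    n, p = X.shape
    b = np.zeros(p)
    col_scale = (w[:, None] * X ** 2).sum(axis=0)  # sum_i w_i * x_ij^2 for each column j
    for _ in range(n_iter):
        for j in range(p):
            # Residual with the contribution of variable j added back in.
            partial_resid = y - X @ b + X[:, j] * b[j]
            rho = np.sum(w * X[:, j] * partial_resid)
            b[j] = soft_threshold(rho, lam) / col_scale[j]
    return b

# Synthetic example: the second variable has no effect on the response.
rng = np.random.default_rng(0)
n = 150
coords = rng.uniform(size=(n, 2))
X = rng.normal(size=(n, 3))
y = X @ np.array([2.0, 0.0, -1.0]) + 0.5 * rng.normal(size=n)

# Kernel weights around one calibration location (Gaussian kernel assumed).
d = np.linalg.norm(coords - np.array([0.5, 0.5]), axis=1)
w = np.exp(-0.5 * (d / 0.3) ** 2)

beta_local = local_lasso(X, y, w, lam=2.0)
```

Because the soft-threshold step can return exactly zero, a coefficient may drop out of the model at one location and remain at another, which is the local model selection behaviour the abstract describes.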


International Encyclopedia of Human Geography | 2010

Geographically Weighted Regression

David C. Wheeler; Antonio Páez

Geographically weighted regression (GWR) was introduced to the geography literature by Brunsdon et al. (1996) to study the potential for relationships in a regression model to vary in geographical space, or what is termed parametric nonstationarity. GWR is based on the non-parametric technique of locally weighted regression developed in statistics for curve-fitting and smoothing applications, where local regression parameters are estimated using subsets of data proximate to a model estimation point in variable space. The innovation with GWR is using a subset of data proximate to the model calibration location in geographical space instead of variable space. While the emphasis in traditional locally weighted regression in statistics has been on curve-fitting, that is, estimating or predicting the response variable, GWR has been presented as a method to conduct inference on spatially varying relationships, in an attempt to extend the original emphasis on prediction to confirmatory analysis (Paez and Wheeler 2009).
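The calibration step described above can be sketched as one weighted least-squares fit per location, with weights that decay with geographic distance. This is a minimal sketch under stated assumptions: the Gaussian kernel, the fixed bandwidth, and the function name `gwr_fit` are illustrative choices, not taken from the entry.

```python
import numpy as np

def gwr_fit(coords, X, y, bandwidth):
    """Return an (n, p) array holding one weighted least-squares fit per location."""
    n, p = X.shape
    betas = np.empty((n, p))
    for i in range(n):
        # Weight each observation by its geographic distance to location i.
        d = np.linalg.norm(coords - coords[i], axis=1)
        w = np.exp(-0.5 * (d / bandwidth) ** 2)
        XtW = X.T * w
        betas[i] = np.linalg.solve(XtW @ X, XtW @ y)
    return betas

# Synthetic example: the slope increases from west to east.
rng = np.random.default_rng(0)
n = 200
coords = rng.uniform(size=(n, 2))
X = np.column_stack([np.ones(n), rng.normal(size=n)])
true_slope = 1.0 + 2.0 * coords[:, 0]
y = 0.5 + true_slope * X[:, 1] + 0.1 * rng.normal(size=n)

betas = gwr_fit(coords, X, y, bandwidth=0.15)  # betas[:, 1] holds the local slopes
```

In practice the bandwidth is usually chosen by a criterion such as cross-validation rather than fixed in advance; that step is omitted here.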


International Journal of Cancer | 2012

Prospective study of ultraviolet radiation exposure and risk of cancer in the United States

Shih-Wen Lin; David C. Wheeler; Yikyung Park; Elizabeth K. Cahoon; Albert R. Hollenbeck; D. Michal Freedman; Christian C. Abnet

Ecologic studies have reported that solar ultraviolet radiation (UVR) exposure is associated with cancer; however, little evidence is available from prospective studies. We aimed to assess the association between an objective measure of ambient UVR exposure and risk of total and site‐specific cancer in a large, regionally diverse cohort [450,934 white, non‐Hispanic subjects (50–71 years) in the prospective National Institutes of Health (NIH)‐AARP Diet and Health Study] after accounting for individual‐level confounding risk factors. Estimated erythemal UVR exposure from satellite Total Ozone Mapping Spectrometer (TOMS) data from NASA was linked to the US Census Bureau 2000 census tract (centroid) of baseline residence for each subject. We used Cox proportional hazards models adjusted for multiple potential confounders to estimate hazard ratios (HRs) and 95% confidence intervals (CIs) for quartiles of UVR exposure. Restricted cubic splines examined nonlinear relationships. Over 9 years of follow‐up, UVR exposure was inversely associated with total cancer risk (N = 75,917; highest versus lowest quartile; HR = 0.97, 95% CI = 0.95–0.99; p‐trend < 0.001). In site‐specific cancer analyses, UVR exposure was associated with increased melanoma risk (highest versus lowest quartile; HR = 1.22, 95% CI = 1.13–1.32; p‐trend < 0.001) and decreased risk of non‐Hodgkin's lymphoma (HR = 0.82, 95% CI = 0.74–0.92) and colon (HR = 0.88, 95% CI = 0.82–0.96), squamous cell lung (HR = 0.86, 95% CI = 0.75–0.98), pleural (HR = 0.57, 95% CI = 0.38–0.84), prostate (HR = 0.91, 95% CI = 0.88–0.95), kidney (HR = 0.83, 95% CI = 0.73–0.94) and bladder (HR = 0.88, 95% CI = 0.81–0.96) cancers (all p‐trend < 0.05). We also found nonlinear associations for some cancer sites, including the thyroid and pancreas. Our results add to mounting evidence for the influential role of UVR exposure on cancer.


Journal of Geographical Systems | 2009

Comparing spatially varying coefficient models: a case study examining violent crime rates and their relationships to alcohol outlets and illegal drug arrests

David C. Wheeler; Lance A. Waller

In this paper, we compare and contrast a Bayesian spatially varying coefficient process (SVCP) model with a geographically weighted regression (GWR) model for the estimation of the potentially spatially varying regression effects of alcohol outlets and illegal drug activity on violent crime in Houston, Texas. In addition, we focus on the inherent coefficient shrinkage properties of the Bayesian SVCP model as a way to address increased coefficient variance that follows from collinearity in GWR models. We outline the advantages of the Bayesian model in terms of reducing inflated coefficient variance, enhanced model flexibility, and more formal measuring of model uncertainty for prediction. We find spatially varying effects for alcohol outlets and drug violations, but the amount of variation depends on the type of model used. For the Bayesian model, this variation is controllable through the amount of prior influence placed on the variance of the coefficients. For example, the spatial pattern of coefficients is similar for the GWR and Bayesian models when a relatively large prior variance is used in the Bayesian model.


Journal of Agricultural Biological and Environmental Statistics | 2015

Characterization of Weighted Quantile Sum Regression for Highly Correlated Data in a Risk Analysis Setting

Caroline K. Carrico; Chris Gennings; David C. Wheeler; Pam Factor-Litvak

In risk evaluation, the effect of mixtures of environmental chemicals on a common adverse outcome is of interest. However, due to the high dimensionality and inherent correlations among chemicals that occur together, the traditional methods (e.g., ordinary or logistic regression) suffer from collinearity and variance inflation, and shrinkage methods have limitations in selecting among correlated components. We propose a weighted quantile sum (WQS) approach to estimating a body burden index, which identifies “bad actors” in a set of highly correlated environmental chemicals. We evaluate and characterize the accuracy of WQS regression in variable selection in extensive simulation studies, using sensitivity and specificity (i.e., ability of the WQS method to select the bad actors correctly and not incorrect ones). We demonstrate the improvement in accuracy this method provides over traditional ordinary regression and shrinkage methods (lasso, adaptive lasso, and elastic net). Results from simulations demonstrate that WQS regression is accurate under some environmentally relevant conditions, but its accuracy decreases for a fixed correlation pattern as the association with a response variable diminishes. Nonzero weights (i.e., weights exceeding a selection threshold parameter) may be used to identify bad actors; however, components within a cluster of highly correlated active components tend to have lower weights, with the sum of their weights representative of the set. Supplementary materials accompanying this paper appear on-line.
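The structure of a weighted quantile sum index can be sketched as follows: each chemical is scored into quantiles, the scores are combined with nonnegative weights that sum to one, and components whose weight exceeds a threshold are flagged. This is only a sketch of the index itself. The weights below are supplied by hand for illustration, whereas in WQS regression they are estimated from the data, a step not shown here; the function names and the threshold value are assumptions.

```python
import numpy as np

def quantile_scores(C, n_quantiles=4):
    """Score each column of C into 0..n_quantiles-1 using its empirical quantiles."""
    probs = np.arange(1, n_quantiles) / n_quantiles
    scores = np.empty(C.shape, dtype=int)
    for j in range(C.shape[1]):
        cuts = np.quantile(C[:, j], probs)
        scores[:, j] = np.digitize(C[:, j], cuts)
    return scores

def wqs_index(scores, weights):
    """Weighted sum of quantile scores; weights must be nonnegative and sum to one."""
    weights = np.asarray(weights, dtype=float)
    if np.any(weights < 0) or not np.isclose(weights.sum(), 1.0):
        raise ValueError("weights must be nonnegative and sum to one")
    return scores @ weights

def flag_components(weights, threshold):
    """Indices of components whose weight exceeds the selection threshold."""
    return np.flatnonzero(np.asarray(weights) > threshold)

# Synthetic concentrations for five chemicals (illustrative only).
rng = np.random.default_rng(0)
C = rng.lognormal(size=(100, 5))
scores = quantile_scores(C)

assumed_weights = np.array([0.5, 0.3, 0.1, 0.1, 0.0])  # hand-picked, not estimated
index = wqs_index(scores, assumed_weights)
flagged = flag_components(assumed_weights, threshold=0.2)
```

The resulting index would then enter a regression model as a single explanatory variable, which is how the approach sidesteps collinearity among the individual chemicals.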


Journal of Agricultural Biological and Environmental Statistics | 2008

Mountains, valleys, and rivers: The transmission of raccoon rabies over a heterogeneous landscape.

David C. Wheeler; Lance A. Waller

Landscape features may serve as either barriers or gateways to the spread of certain infectious diseases, and understanding the way geographic structure impacts disease spread could lead to improved containment strategies. Here, we focus on the spacetime diffusion of a raccoon rabies outbreak across several states in the Eastern United States. While focusing on pattern, we move toward closer links between pattern and process by considering statistical estimation of local pattern features to gain insight on landscape influences on the underlying process. Specifically, we quantify the impact that landscape features, such as mountains and rivers, have on the speed of infectious disease diffusion. This work combines statistical modeling with operations in a geographic information system (GIS) to link observed patterns of disease diffusion with local landscape values. We explore three analytic approaches. First, we use spatial prediction (kriging) to provide a descriptive pattern of the spread of the virus. Second, we use Bayesian areal wombling to detect barriers for infectious disease transmission and examine spatial coincidence with potential features. Finally, we input landscape variables into a hierarchical Bayesian model with spatially varying coefficients to obtain model-based estimates of their local impacts on transmission time in counties.

Collaboration


Top Co-Authors

Lance A. Waller
University of Illinois at Chicago

Joanne S. Colt
National Institutes of Health

Mary H. Ward
National Institutes of Health

Alison Johnson
Albert Einstein College of Medicine

Chris Gennings
Virginia Commonwealth University

Dalsu Baris
National Institutes of Health

Debra T. Silverman
National Institutes of Health

Elizabeth K. Cahoon
National Institutes of Health

Melissa C. Friesen
National Institutes of Health