Publication


Featured research published by Qingzhao Yu.


Journal of Epidemiology and Community Health | 2011

Impact of small group size on neighbourhood influences in multilevel models

Katherine P. Theall; Richard Scribner; Stephanie T. Broyles; Qingzhao Yu; Jigar Chotalia; Neal Simonsen; Matthias Schonlau; Bradley P. Carlin

Background: Given the growing availability of multilevel data from national surveys, researchers interested in contextual effects may find themselves with a small number of individuals per group. Although there is a growing body of literature on sample size in multilevel modelling, few studies have explored the impact of group sizes of less than five.
Methods: In a simulated analysis of real data, the impact of a group size of less than five was examined on both a continuous and a dichotomous outcome in a simple two-level multilevel model. Models with group sizes one to five were compared with models with complete data. Four different linear and logistic models were examined: empty models; models with a group-level covariate; models with an individual-level covariate; and models with an aggregated group-level covariate. The study further evaluated whether the impact of small group size differed depending on the total number of groups.
Results: When the number of groups was large (N=459), neither fixed nor random components were affected by small group size, even when 90% of tracts had only one individual per tract and even when an aggregated group-level covariate was examined. As the number of groups decreased, the SE estimates of both fixed and random effects were inflated. Furthermore, group-level variance estimates were more affected than were fixed components.
Conclusions: Datasets in which there is a small to moderate number of groups, the majority of them of very small size (n<5), may fail to find or even consider a group-level effect when one may exist, and may also be underpowered to detect fixed effects.


Social Science & Medicine | 2009

Misspecification of the effect of race in fixed effects models of health inequalities

Richard Scribner; Katherine P. Theall; Neal Simonsen; Karen Mason; Qingzhao Yu

The purpose of this study is to characterize the different results obtained when analyzing health inequalities data in which individuals are nested within their neighborhoods and a single level model is used to characterize risk rather than a multilevel model. The inability of single level models to characterize between neighborhood variance in risk may affect the level of risk attributed to black race if blacks are differentially distributed in high risk neighborhoods. The research replicates in Los Angeles an approach applied by a different group of researchers in Massachusetts (Subramanian, Chen, Rehkopf, Waterman, & Krieger, 2005). Single level and multilevel models were used to analyze Los Angeles County, California, US all-cause mortality data for the years 1989-1991, modeled as 29,936 cells (deaths and population denominators cross-tabulated by age, gender, and race/ethnicity) nested within 1552 census tracts. Overall blacks had 1.27 times the risk of mortality compared to whites. However, multilevel models demonstrated considerable between census tract variance in mortality for both blacks and whites which was partially explained by neighborhood poverty. Comparing the results of equivalent single level and multilevel models, the mortality odds ratio for blacks compared to the white reference group reversed itself, indicating greater risk for blacks in the single level model and lower risk in the multilevel model. Adding an area based socioeconomic measure (ABSM) to the single level model reduced but did not remove the discrepancy. Predictions of mortality risk for the interaction of race and age group demonstrate that all single level models exaggerated the mortality risk associated with black race. 
We conclude that characterizing health inequalities in mortality for blacks using single level models, which do not account for the cross level interaction created by the greater likelihood of black residence in neighborhoods where the risk of mortality is greater regardless of race, can exaggerate the risk of mortality attributable to the individual level effects of black race.


Statistics in Medicine | 2009

Hierarchical additive modeling of nonlinear association with spatial correlations—An application to relate alcohol outlet density and neighborhood assault rates

Qingzhao Yu; Bin Li; Richard Scribner

Previous studies have suggested a link between alcohol outlets and assaults. In this paper, we explore the effects of alcohol availability on assaults at the census tract level over time. In addition, we use a natural experiment to check whether a sudden loss of alcohol outlets is associated with a deeper decrease in assault violence. Several features of the data raise statistical challenges: (1) the association between covariates (for example, the alcohol outlet density of each census tract) and the assault rates may be complex and therefore cannot be described using a linear model without covariate transformation, (2) the covariates may be highly correlated with each other, (3) there are a number of observations that have missing inputs, and (4) there is spatial association in assault rates at the census tract level. We propose a hierarchical additive model, in which the nonlinear correlations and the complex interaction effects are modeled using multiple additive regression trees, and the residual spatial association in the assault rates that cannot be explained by the model is smoothed using a conditional autoregressive (CAR) method. We develop a two-stage algorithm that connects the nonparametric trees with CAR to look for important covariates associated with the assault rates, while taking into account the spatial association of assault rates in adjacent census tracts. The proposed method is applied to the Los Angeles assault data (1990-1999). To assess the efficiency of the method, the results are compared with those obtained from a hierarchical linear model.


Spatial and Spatio-temporal Epidemiology | 2012

Multilevel spatiotemporal change-point models for evaluating the effect of an alcohol outlet control policy on changes in neighborhood assaultive violence rates.

Yanjun Xu; Qingzhao Yu; Richard Scribner; Katherine P. Theall; Scott Scribner; Neal Simonsen

Many previous studies have suggested a link between alcohol outlets and assaultive violence rates. In 1997 the City of New Orleans adopted a series of policies, e.g., an increased license fee, additional enforcement staff, and expanded powers for the alcohol license board. The policies were specifically enacted to address the proliferation of problem alcohol outlets believed to be the source of a variety of social problems, including assaultive violence. In this research, we evaluate the impact of a city level policy in New Orleans to address problem alcohol outlets and their influence on assaultive violence. The spatial association between rates of assaultive violence at the census tract level (n=170) over a ten year period raises a challenge in statistical analysis. To meet this challenge we developed a hierarchical change-point model that controls for important covariates of assaultive violence and accounts for unexplained spatial and temporal variability. While our model is somewhat complex, its hierarchical Bayesian analysis is accessible via the WinBUGS software program. Keeping other effects fixed, the implementation of the new city level policy was associated with a decrease in the positive association between census tract level rates of assaultive violence and alcohol outlet density. Comparing several candidate change-point models using the DIC criterion, the positive association began decreasing in the year of the policy implementation. The magnitude of the association continued to decrease for roughly two years and then stabilized. We also created maps of the fitted assaultive violence rates in New Orleans, as well as spatial residual maps which, together with Moran's I statistics, suggest that the spatial variation of the data is well accounted for by our model. We conclude that the implementation of the policy is associated with a significant decrease in the positive relationship between assaultive violence and off-sale alcohol outlet density.
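The abstract above uses Moran's I on spatial residuals to check for leftover spatial autocorrelation. As a minimal sketch of that statistic only (not the authors' WinBUGS model), the standard formula can be computed as follows; the function name `morans_i` and the binary adjacency weights in the usage lines are assumptions for illustration.

```python
def morans_i(values, weights):
    """Global Moran's I for `values` given a square spatial weight matrix.

    values  : list of n numbers (e.g. model residuals, one per census tract)
    weights : n x n list of lists; weights[i][j] > 0 when tracts i and j are neighbors
    """
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]

    total_weight = sum(sum(row) for row in weights)
    sum_sq = sum(d * d for d in dev)
    if total_weight == 0 or sum_sq == 0:
        raise ValueError("Moran's I is undefined for zero weights or constant values")

    # Cross-products of deviations, weighted by spatial proximity
    cross = sum(
        weights[i][j] * dev[i] * dev[j]
        for i in range(n)
        for j in range(n)
    )
    return (n / total_weight) * cross / sum_sq


# Hypothetical example: four tracts in a row, each adjacent to its immediate neighbors
chain = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
smooth_trend = morans_i([1, 2, 3, 4], chain)     # positive: similar values cluster
alternating = morans_i([1, -1, 1, -1], chain)    # negative: neighbors differ
```

Values near zero for the residuals are what one would hope to see if a model has absorbed the spatial structure; assessing significance would additionally require a reference distribution, which this sketch does not provide.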


Journal of Applied Statistics | 2011

Weighted bagging: a modification of AdaBoost from the perspective of importance sampling

Qingzhao Yu

We motivate the success of AdaBoost (ADA) in classification problems by appealing to an importance sampling perspective. Based on this insight, we propose the Weighted Bagging (WB) algorithm, a regularization method that naturally extends ADA to solve both classification and regression problems. WB uses a part of the available data to build models, and a separate part to modify the weights of observations. The method is used with categorical and regression trees and is compared with ADA, Boosting, Bagging, Random Forest and Support Vector Machine. We apply these methods to some real data sets and report some results of simulations. These applications and simulations show the effectiveness of WB.


Spatial and Spatio-temporal Epidemiology | 2017

Exploring racial disparity in obesity: A mediation analysis considering geo-coded environmental factors

Qingzhao Yu; Richard Scribner; Claudia Leonardi; Lu Zhang; Chi Park; Liwei Chen; Neal Simonsen

Research shows a consistent racial disparity in obesity between white and black adults in the United States. Accounting for the disparity is a challenge given the variety of the contributing factors, the nature of the association, and the multilevel relationships among the factors. We used the multivariable mediation analysis (MMA) method to explore the racial disparity in obesity, considering not only individual behavior but also geospatially derived environmental risk factors. Results from generalized linear models (GLM) were compared with those from multiple additive regression trees (MART), which allow for hierarchical data structure and the fitting of nonlinear and complex interactive relationships. The results show that both individual and geographically defined factors contributed to the racial disparity in obesity. MART performed better than GLM in that MART explained a larger proportion of the racial disparity in obesity. However, there remained disparities that could not be explained by the factors collected in this study.


The Annals of Applied Statistics | 2011

Bayesian Synthesis: Combining subjective analyses, with an application to ozone data

Qingzhao Yu; Steven N. MacEachern; Mario Peruggia

Bayesian model averaging enables one to combine the disparate predictions of a number of models in a coherent fashion, leading to superior predictive performance. The improvement in performance arises from averaging models that make different predictions. In this work, we tap into perhaps the biggest driver of different predictions (different analysts) in order to gain the full benefits of model averaging. In a standard implementation of our method, several data analysts work independently on portions of a data set, eliciting separate models which are eventually updated and combined through a specific weighting method. We call this modeling procedure Bayesian Synthesis. The methodology helps to alleviate concerns about the sizable gap between the foundational underpinnings of the Bayesian paradigm and the practice of Bayesian statistics. In experimental work we show that human modeling has predictive performance superior to that of many automatic modeling techniques, including AIC, BIC, Smoothing Splines, CART, Bagged CART, Bayes CART, BMA and LARS, and only slightly inferior to that of BART. We also show that Bayesian Synthesis further improves predictive performance. Additionally, we examine the predictive performance of a simple average across analysts, which we dub Convex Synthesis, and find that it also produces an improvement. Compared to competing modeling methods (including single human analysis), the data-splitting approach has these additional benefits: (1) it exhibits superior predictive performance for real data sets; (2) it makes more efficient use of human knowledge; (3) it avoids multiple uses of the data in the Bayesian framework; and (4) it provides better calibrated assessment of predictive accuracy.
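The abstract describes Convex Synthesis as a simple average across analysts. A minimal sketch of that idea, assuming each analyst supplies point predictions for the same test cases, might look like the following; the function name `convex_synthesis` and the example numbers are hypothetical, and the weighting scheme used by Bayesian Synthesis itself is not given in the abstract and is not reproduced here.

```python
def convex_synthesis(analyst_predictions):
    """Equal-weight average of several analysts' predictions.

    analyst_predictions : list of lists; one inner list per analyst,
                          all of the same length (one value per test case).
    """
    if not analyst_predictions:
        raise ValueError("at least one analyst is required")
    n_cases = len(analyst_predictions[0])
    if any(len(p) != n_cases for p in analyst_predictions):
        raise ValueError("all analysts must predict the same test cases")

    n_analysts = len(analyst_predictions)
    # Average case by case across analysts
    return [
        sum(p[k] for p in analyst_predictions) / n_analysts
        for k in range(n_cases)
    ]


# Hypothetical predictions from three analysts for two test cases
combined = convex_synthesis([[40.0, 55.0], [44.0, 50.0], [42.0, 60.0]])
```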


Journal of Applied Statistics | 2011

Spatio-temporal analysis of a plant disease in a non-uniform crop: a Monte Carlo approach

Bin Li; R. S. Sanderlin; Rebecca A. Melanson; Qingzhao Yu

Identification of the type of disease pattern and spread in a field is critical in epidemiological investigations of plant diseases. For example, an aggregation pattern of infected plants suggests that, at the time of observation, the pathogen is spreading from a proximal source. Conversely, a random pattern suggests a lack of spread from a proximal source. Most of the existing methods of spatial pattern analysis work with only one variety of plant at each location and with uniform genetic disease susceptibility across the field. Pecan orchards, used in this study, and other orchard crops are usually composed of different varieties with different levels of susceptibility to disease. A new measure is suggested to characterize the spatio-temporal transmission patterns of disease; a Monte Carlo test procedure is proposed to test whether the transmission of disease is random or aggregated. In addition, we propose a mixed-transmission model, which allows us to quantify the degree of aggregation effect.
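The abstract does not define the proposed measure, so the sketch below substitutes a simple stand-in, the number of neighboring pairs in which both plants are infected, to show how a Monte Carlo test of random versus aggregated pattern can work in general. To respect differing susceptibility, infection labels are shuffled only within each variety; that stratification, the function names, and the toy orchard are all assumptions rather than the authors' procedure.

```python
import random


def infected_neighbor_pairs(infected, neighbors):
    """Count neighboring pairs (i, j), i < j, in which both plants are infected."""
    return sum(
        1
        for i, nbrs in neighbors.items()
        for j in nbrs
        if i < j and infected[i] and infected[j]
    )


def monte_carlo_aggregation_test(infected, variety, neighbors, n_sim=999, seed=0):
    """One-sided Monte Carlo test: is the observed pattern more aggregated than random?

    infected  : dict plant_id -> bool
    variety   : dict plant_id -> variety label
    neighbors : dict plant_id -> list of adjacent plant_ids
    """
    rng = random.Random(seed)
    observed = infected_neighbor_pairs(infected, neighbors)

    # Group plants by variety so that shuffling keeps each variety's infection count
    strata = {}
    for plant, label in variety.items():
        strata.setdefault(label, []).append(plant)

    at_least_as_extreme = 0
    for _ in range(n_sim):
        simulated = {}
        for plants in strata.values():
            labels = [infected[p] for p in plants]
            rng.shuffle(labels)
            simulated.update(zip(plants, labels))
        if infected_neighbor_pairs(simulated, neighbors) >= observed:
            at_least_as_extreme += 1

    # Standard Monte Carlo p-value, counting the observed pattern itself
    p_value = (at_least_as_extreme + 1) / (n_sim + 1)
    return observed, p_value


# Toy orchard: one row of ten trees of a single variety, first five infected
row = list(range(10))
row_neighbors = {i: [j for j in (i - 1, i + 1) if 0 <= j < 10] for i in row}
row_infected = {i: i < 5 for i in row}
row_variety = {i: "A" for i in row}
observed_pairs, p_value = monte_carlo_aggregation_test(row_infected, row_variety, row_neighbors)
```

A small p-value here indicates more infected neighbor pairs than random placement within varieties would typically produce, which is the aggregated case in the abstract's terms.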


American Journal of Preventive Medicine | 2017

Street Connectivity and Obesity Risk: Evidence From Electronic Health Records

Claudia Leonardi; Neal Simonsen; Qingzhao Yu; Chi Park; Richard Scribner

Introduction: This study aimed to determine the feasibility of using electronic health record (EHR) data from a federally qualified health center (FQHC) to assess the association between street connectivity, a measure of walkability for the local environment, and BMI obtained from EHRs.
Methods: The study included patients who visited Daughters of Charity clinics in 2012-2013. A total of 31,297 patients were eligible, of which 28,307 were geocoded. BMI and sociodemographic information were compiled into a de-identified database. The street connectivity measure was intersection density, calculated as the number of three-way or greater intersections per unit area. Multilevel analyses of BMI, measured on 17,946 patients who were aged ≥20 years, not pregnant, had complete sociodemographic information, and a BMI value that was not considered an outlier, were conducted using random intercept models.
Results: Overall, on average, patients were aged 44.1 years, had a BMI of 30.2, and were mainly non-Hispanic black (59.4%). An inverse association between BMI and intersection density was observed in multilevel models controlling for age, gender, race, and marital status. Tests for multiple interactions were conducted and a significant interaction between race and intersection density indicated the decrease in BMI was strongest for non-Hispanic whites (decreased by 2) compared with blacks or Hispanics (decreased by 0.6) (p=0.0121).
Conclusions: EHRs were successfully used to assess the relationship between street connectivity and BMI in a multilevel framework. Increasing street connectivity levels measured as intersection density were inversely associated with directly measured BMI obtained from EHRs, demonstrating the feasibility of the approach.
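The intersection density measure described in the abstract (three-way or greater intersections per unit area) can be sketched in a few lines. The function name, the representation of the street network as a list of leg counts per node, and the example numbers are assumptions; the abstract does not state the areal unit or how the street network was processed.

```python
def intersection_density(node_leg_counts, area, min_legs=3):
    """Number of intersections with at least `min_legs` street legs, per unit area.

    node_leg_counts : iterable of ints, the number of street segments meeting
                      at each network node inside the area of interest
    area            : size of that area, in whatever unit the analysis uses
    """
    if area <= 0:
        raise ValueError("area must be positive")
    qualifying = sum(1 for legs in node_leg_counts if legs >= min_legs)
    return qualifying / area


# Hypothetical neighborhood: dead ends (1 leg) and bends (2 legs) are not counted
density = intersection_density([1, 2, 3, 4, 3], area=2.0)
```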


Statistical Methods in Medical Research | 2017

A Bayesian sequential design using alpha spending function to control type I error.

Han Zhu; Qingzhao Yu

We propose in this article a Bayesian sequential design using alpha spending functions to control the overall type I error in phase III clinical trials. We provide algorithms to calculate critical values, power, and sample sizes for the proposed design. Sensitivity analysis is implemented to check the effects of different prior distributions, and conservative priors are recommended. We compare the power and actual sample sizes of the proposed Bayesian sequential design with different alpha spending functions through simulations. We also compare the power of the proposed method with that of a frequentist sequential design using the same alpha spending function. Simulations show that, at the same sample size, the proposed method provides larger power than the corresponding frequentist sequential design. It also has larger power than the traditional Bayesian sequential design, which sets equal critical values for all interim analyses. When compared with other alpha spending functions, the O’Brien-Fleming alpha spending function has the largest power and is the most conservative, in the sense that, at the same sample size, the null hypothesis is the least likely to be rejected at early stages of clinical trials. Finally, we show that adding a step that stops for futility to the Bayesian sequential design can reduce the overall type I error and the actual sample sizes.
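To illustrate what an alpha spending function does, the sketch below evaluates two widely used Lan-DeMets forms: an O'Brien-Fleming-type function and a Pocock-type function. The abstract does not state which exact forms the authors used, so these formulas, the function names, and the four equally spaced looks are assumptions; the sketch shows only how cumulative type I error is allocated across interim analyses, not the paper's algorithm for Bayesian critical values.

```python
from math import exp, log, sqrt
from statistics import NormalDist

_STD_NORMAL = NormalDist()


def _check_fraction(t):
    if not 0 < t <= 1:
        raise ValueError("information fraction must be in (0, 1]")


def obrien_fleming_spend(t, alpha=0.05):
    """Cumulative alpha spent at information fraction t (O'Brien-Fleming type)."""
    _check_fraction(t)
    z = _STD_NORMAL.inv_cdf(1 - alpha / 2)
    return 2 * (1 - _STD_NORMAL.cdf(z / sqrt(t)))


def pocock_spend(t, alpha=0.05):
    """Cumulative alpha spent at information fraction t (Pocock type)."""
    _check_fraction(t)
    return alpha * log(1 + (exp(1) - 1) * t)


def incremental_alpha(spend, fractions, alpha=0.05):
    """Alpha newly spent at each look, given increasing information fractions."""
    cumulative = [spend(t, alpha) for t in fractions]
    previous = [0.0] + cumulative[:-1]
    return [c - p for c, p in zip(cumulative, previous)]


# Four equally spaced looks (hypothetical schedule)
looks = [0.25, 0.5, 0.75, 1.0]
obf_increments = incremental_alpha(obrien_fleming_spend, looks)
pocock_increments = incremental_alpha(pocock_spend, looks)
```

Under these forms the O'Brien-Fleming-type function spends very little alpha at the first look and most of it at the end, which matches the abstract's description of it as the most conservative at early stages.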

Collaboration


Explore Qingzhao Yu's collaborations.

Top Co-Authors

Richard Scribner, Louisiana State University
Bin Li, Louisiana State University
Neal Simonsen, Louisiana State University
Denise M. Danos, Louisiana State University
Claudia Leonardi, Louisiana State University
Han Zhu, Pharmaceutical Product Development
Mei-Chin Hsieh, Louisiana State University