Yan D. Zhao
University of Texas Southwestern Medical Center
Publication
Featured research published by Yan D. Zhao.
Statistics in Biopharmaceutical Research | 2010
Yan D. Zhao; Alex Dmitrienko; Roy N. Tamura
This article deals with clinical trials with a sensitive subpopulation of patients, that is, a subgroup that is more likely to benefit from the treatment than the overall population. Given a sensitive subgroup defined by a prespecified classifier, for example, a clinical marker or pharmacogenomic marker, the trial’s outcome is declared positive if the treatment effect is established in the overall population or in the subgroup. We provide a summary of key considerations in clinical trials with a sensitive subgroup, including multiplicity and enrichment adjustments as well as optimality considerations in the analysis strategy. The methodology proposed in this article is illustrated using a neuroscience clinical trial and its operating characteristics are assessed via a simulation study.
Model Assisted Statistics and Applications | 2012
Dewi Rahardja; Yan D. Zhao
Misclassified binary data result from using a fallible classifier to classify units into two categories. If an infallible classifier is also available, a random subsample of the misclassified data can be further classified using the infallible classifier. For such data, the existing methods for exact confidence intervals are too conservative, and the existing Bayesian credible interval suffers from computational difficulty. We derive a closed-form Bayesian algorithm which draws a posterior sample of the proportion parameter from the exact marginal posterior distribution. Our simulations show that our Bayesian algorithm is easy to implement and has nominal coverage.
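To make the idea of closed-form posterior sampling concrete, here is a minimal sketch under stated assumptions, not the authors' published algorithm. It assumes the model is reparameterized as pi = P(fallible = 1), lam1 = P(true = 1 | fallible = 1), and lam0 = P(true = 1 | fallible = 0), with independent Beta priors on these three quantities. Under that assumption the likelihood factorizes, each parameter has a Beta posterior, and the true proportion is p = pi * lam1 + (1 - pi) * lam0. The function name, argument names, and the uniform default prior are all hypothetical.

```python
import random


def sample_posterior_p(x_main, n_main, val_counts, n_draws=5000, seed=0,
                       prior=(1.0, 1.0)):
    """Draw posterior samples of the true proportion p.

    x_main, n_main : fallible positives and sample size in the main study.
    val_counts     : dict keyed by (true_class, fallible_class) with the
                     counts from the doubly classified subsample.
    prior          : (a, b) of the Beta prior assumed for pi, lam1 and lam0
                     (an assumption of this sketch, not taken from the paper).
    """
    rng = random.Random(seed)
    a, b = prior
    n11 = val_counts[(1, 1)]  # true 1, fallible 1
    n01 = val_counts[(0, 1)]  # true 0, fallible 1
    n10 = val_counts[(1, 0)]  # true 1, fallible 0
    n00 = val_counts[(0, 0)]  # true 0, fallible 0

    # The fallible classifier is applied to every unit, so both samples
    # inform pi; only the doubly classified subsample informs lam1 and lam0.
    fallible_pos = x_main + n11 + n01
    fallible_neg = (n_main - x_main) + n10 + n00

    draws = []
    for _ in range(n_draws):
        pi = rng.betavariate(a + fallible_pos, b + fallible_neg)
        lam1 = rng.betavariate(a + n11, b + n01)
        lam0 = rng.betavariate(a + n10, b + n00)
        draws.append(pi * lam1 + (1.0 - pi) * lam0)
    return draws


def credible_interval(draws, level=0.95):
    """Equal-tailed credible interval from posterior draws."""
    s = sorted(draws)
    lo = s[int((1.0 - level) / 2.0 * len(s))]
    hi = s[int((1.0 + level) / 2.0 * len(s)) - 1]
    return lo, hi
```

Because every draw is an independent Beta variate, a sampler of this kind needs no initial values and no convergence diagnostics, which is the practical appeal of a closed-form approach.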
Journal of Statistical Computation and Simulation | 2013
Dewi Rahardja; Yan D. Zhao
We consider data with a nominal grouping variable and a binary response variable. The grouping variable is measured without error, but the response variable is measured using a fallible device subject to misclassification. To achieve model identifiability, we use the double-sampling scheme which requires obtaining a subsample of the original data or another independent sample. This sample is then classified by both the fallible device and another infallible device regarding the response variable. We propose two Wald tests for testing the association between the two variables and illustrate the test using traffic data. The Type-I error rate and power of the tests are examined using simulations and a modified Wald test is recommended.
Communications in Statistics - Simulation and Computation | 2010
Dewi Rahardja; Yan D. Zhao; Hao Helen Zhang
We propose a fully Bayesian model with a non-informative prior for analyzing misclassified binary data with a validation substudy. In addition, we derive a closed-form algorithm for drawing all parameters from the posterior distribution and making statistical inference on odds ratios. Our algorithm draws each parameter from a beta distribution, avoids the specification of initial values, and does not have convergence issues. We apply the algorithm to a data set and compare the results with those obtained by other methods. Finally, the performance of our algorithm is assessed using simulation studies.
Model Assisted Statistics and Applications | 2012
Yan D. Zhao; Dewi Rahardja; Alex Dmitrienko
The van Elteren (vE) test, a stratified Wilcoxon-Mann-Whitney test, is a widely used nonparametric method for comparing two treatments while adjusting for stratum effects. Although the vE test produces a p-value for testing the null hypothesis of no treatment effect, additional heuristic methods are needed to determine which treatment is better. Moreover, such heuristic methods may lead to inconclusive decisions in some situations. Furthermore, it is unclear how to quantify the treatment effect size when the vE test is used. In this paper, we define a competing probability (CP) inherently related to the vE test and derive point and interval estimators for CP. The CP serves as an effect size measure and can be used to determine which treatment is better by comparing the CP with 0.5.
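As an illustration of how a stratified competing probability might be computed, the following sketch estimates P(X < Y) + 0.5 P(X = Y) within each stratum and combines the strata with weights m_k n_k / (m_k + n_k + 1). Those weights are an assumption here, chosen because they mirror the usual van Elteren weighting; the paper's exact definition and its interval estimator may differ. The function name and input layout are hypothetical, and larger responses are assumed to be better.

```python
def competing_probability(strata):
    """Weighted estimate of P(control < treatment) + 0.5 * P(tie).

    strata : list of (control_values, treatment_values) pairs,
             one pair per stratum; both lists must be non-empty.
    """
    numerator = 0.0
    denominator = 0.0
    for control, treatment in strata:
        m, n = len(control), len(treatment)
        # Count treatment "wins" over all control/treatment pairs,
        # scoring a tie as half a win.
        wins = sum(
            1.0 if y > x else 0.5 if y == x else 0.0
            for x in control
            for y in treatment
        )
        theta_k = wins / (m * n)       # stratum-specific estimate
        w_k = m * n / (m + n + 1)      # assumed van Elteren-type weight
        numerator += w_k * theta_k
        denominator += w_k
    return numerator / denominator
```

A value above 0.5 favors the treatment arm under the larger-is-better assumption, a value below 0.5 favors the control arm, and 0.5 indicates no tendency either way.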
Journal of Statistical Computation and Simulation | 2012
Dewi Rahardja; Yan D. Zhao
We consider two-sample binary data consisting of two independent studies. One study is the main study where individuals are classified using a fallible classifier prone to error; the other study is a validation study where individuals are classified using both the fallible classifier and a gold standard which does not misclassify individuals. For such data, we propose a Bayesian model for making statistical inference for all model parameters and particularly the risk difference. We derive a closed-form algorithm for sampling from the posterior distribution. We then illustrate our algorithm using a real data example and conduct Monte Carlo simulation studies to show that our algorithm performs very well under various scenarios.
Model Assisted Statistics and Applications | 2011
Dewi Rahardja; Yan D. Zhao; Xian Jin Xie
We consider the problem of constructing confidence intervals (CIs) for the blending coefficient of different liquids, such as the blended underground storage tank (UST) leak data for compliance. For this problem, confidence intervals based on Fieller's method have been proposed. This method utilizes a blending coefficient estimator which is a ratio of two correlated normal random variables. However, this method assumes normally distributed random errors in the UST leak model and therefore may be inappropriate for the UST leak data, which typically have heavy-tailed empirical distributions. In this paper we develop a Bayesian approach assuming non-normal random errors with the Power Exponential Distribution (PED). A real-data example using Cary blended site data is given to illustrate both the Fieller's CIs and the Bayesian credible intervals. Monte Carlo simulations are conducted to compare the coverage probability and average width of CIs for both methods. For data with heavy-tailed distributions, the simulations show that both Fieller's and Bayesian intervals perform adequately in terms of coverage. However, Bayesian intervals perform better in terms of yielding CIs with shorter expected width.
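For readers unfamiliar with Fieller's method, the sketch below shows the standard construction for a ratio of two correlated, approximately normal estimates. It is a generic textbook version, not the UST leak model from the paper. It solves (a - theta * b)^2 = crit^2 * (v11 - 2 * theta * v12 + theta^2 * v22) for theta; the function name, argument names, and the default critical value 1.96 are assumptions made for illustration.

```python
import math


def fieller_interval(a, b, v11, v22, v12, crit=1.96):
    """Fieller confidence interval for the ratio theta = E[a] / E[b].

    a, b     : numerator and denominator estimates.
    v11, v22 : estimated variances of a and b.
    v12      : estimated covariance between a and b.
    crit     : critical value (normal or t quantile) for the chosen level.

    Returns (lower, upper), or None when the denominator is not
    significantly different from zero, in which case the Fieller
    confidence set is not a bounded interval.
    """
    # Coefficients of the quadratic A*theta^2 - 2*B*theta + C = 0.
    A = b * b - crit * crit * v22
    B = a * b - crit * crit * v12
    C = a * a - crit * crit * v11
    disc = B * B - A * C
    if A <= 0 or disc < 0:
        return None
    root = math.sqrt(disc)
    return (B - root) / A, (B + root) / A
```

For example, `fieller_interval(2.0, 4.0, 0.04, 0.04, 0.0)` yields an interval around the point estimate 0.5, while a denominator estimate close to zero relative to its standard error returns `None`.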
Model Assisted Statistics and Applications | 2011
Dewi Rahardja; Yan D. Zhao; Hongmei Zhang
We consider misclassified binary data with a validation substudy. For such data, various methods have been developed for estimating the odds ratio. It is well known that the maximum likelihood estimator (MLE) of the odds ratio is efficient but requires iterative algorithms to compute. In this article, we derive a closed-form formula for the MLE and its asymptotic standard error. We compute the closed-form MLE on a data set that has been analyzed by other methods, and the results are compared.
Model Assisted Statistics and Applications | 2011
Yan D. Zhao; Dewi Rahardja
Structural equation models (SEMs) are widely used in many fields including economics and social science. Typical nonlinear SEMs consist of two parts: a linear measurement model relating observed measurements to underlying latent variables, and a nonlinear structural model describing relationships among the latent variables. For such models, we propose a pseudo likelihood approach based on a hypothetical normal mixture assumption on the latent variables. To obtain pseudo likelihood parameter estimates, a Monte Carlo EM algorithm is developed. Standard errors for the structural parameter estimates are obtained by combining an empirical observed information matrix and a bootstrap estimated covariance matrix. For nonlinear SEMs whose latent variables follow various distributions, we conduct simulations to show that our approach produces unbiased parameter estimates and confidence intervals with nominal coverage.
Statistical Methodology | 2011
Dewi Rahardja; Yan D. Zhao