Yingqi Zhao
University of Wisconsin-Madison
Publications
Featured research published by Yingqi Zhao.
Journal of the American Statistical Association | 2012
Yingqi Zhao; Donglin Zeng; A. John Rush; Michael R. Kosorok
There is increasing interest in discovering individualized treatment rules (ITRs) for patients who have heterogeneous responses to treatment. In particular, one aims to find an optimal ITR that is a deterministic function of patient-specific characteristics maximizing expected clinical outcome. In this article, we first show that estimating such an optimal treatment rule is equivalent to a classification problem where each subject is weighted proportional to his or her clinical outcome. We then propose an outcome weighted learning approach based on the support vector machine framework. We show that the resulting estimator of the treatment rule is consistent. We further obtain a finite sample bound for the difference between the expected outcome using the estimated ITR and that of the optimal treatment rule. The performance of the proposed approach is demonstrated via simulation studies and an analysis of chronic depression data.
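The equivalence described in the abstract above can be illustrated with a small sketch. It is not the authors' support vector machine estimator: it replaces the hinge loss with a brute-force search over one-dimensional threshold rules under 0-1 loss, and it assumes a simulated two-arm trial with treatments coded as -1 and 1, a known randomization probability of 0.5, and positive outcomes. All function names and the data-generating model are hypothetical.

```python
import random

def weighted_misclassification(data, rule, propensity=0.5):
    """Outcome-weighted 0-1 loss: mean of R / pi over subjects with A != rule(X)."""
    total = 0.0
    for x, a, r in data:
        if a != rule(x):
            total += r / propensity
    return total / len(data)

def ipw_value(data, rule, propensity=0.5):
    """Inverse-probability-weighted estimate of the expected outcome under `rule`."""
    total = 0.0
    for x, a, r in data:
        if a == rule(x):
            total += r / propensity
    return total / len(data)

def simulate_trial(n, seed=0):
    """Hypothetical randomized trial: treatment 1 helps when x > 0, treatment -1 when x < 0."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        x = rng.uniform(-1.0, 1.0)
        a = rng.choice([-1, 1])                      # randomized with probability 0.5
        r = 2.0 + a * x + rng.uniform(-0.5, 0.5)     # always positive by construction
        data.append((x, a, r))
    return data

def threshold_rule(c):
    """Rule that recommends treatment 1 when x > c and treatment -1 otherwise."""
    return lambda x: 1 if x > c else -1

def fit_threshold(data, grid):
    """Pick the threshold that minimizes the outcome-weighted misclassification."""
    return min(grid, key=lambda c: weighted_misclassification(data, threshold_rule(c)))

data = simulate_trial(2000)
grid = [i / 10 for i in range(-9, 10)]
best_c = fit_threshold(data, grid)
print("selected threshold:", best_c)
print("estimated value:", ipw_value(data, threshold_rule(best_c)))
```

For any rule, the estimated value and the weighted misclassification sum to the same constant (the mean of R / pi), so minimizing one is the same as maximizing the other. That is the sense in which rule estimation becomes a weighted classification problem.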
Journal of the American Statistical Association | 2015
Yingqi Zhao; Donglin Zeng; Eric B. Laber; Michael R. Kosorok
Dynamic treatment regimes (DTRs) are sequential decision rules for individual patients that can adapt over time to an evolving illness. The goal is to accommodate heterogeneity among patients and find the DTR which will produce the best long-term outcome if implemented. We introduce two new statistical learning methods for estimating the optimal DTR, termed backward outcome weighted learning (BOWL), and simultaneous outcome weighted learning (SOWL). These approaches convert individualized treatment selection into either a sequential or a simultaneous classification problem, and can thus be applied by modifying existing machine learning techniques. The proposed methods are based on directly maximizing over all DTRs a nonparametric estimator of the expected long-term outcome; this is fundamentally different from regression-based methods, for example, Q-learning, which indirectly attempt such maximization and rely heavily on the correctness of postulated regression models. We prove that the resulting rules are consistent, and provide finite sample bounds for the errors using the estimated rules. Simulation results suggest the proposed methods produce superior DTRs compared with Q-learning, especially in small samples. We illustrate the methods using data from a clinical trial for smoking cessation. Supplementary materials for this article are available online.
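As a rough illustration of the kind of nonparametric estimator the abstract above refers to, the expected long-term outcome of a two-stage regime can be written in inverse-probability-weighted form. The notation here is an assumption for illustration and may differ from the article: $H_j$ is the history available at stage $j$, $A_j$ the treatment received, $R_j$ the stage outcome, and $\pi_j$ the known randomization probability.

```latex
V(d_1, d_2) = E\left[
  \frac{(R_1 + R_2)\,\mathbf{1}\{A_1 = d_1(H_1)\}\,\mathbf{1}\{A_2 = d_2(H_2)\}}
       {\pi_1(A_1 \mid H_1)\,\pi_2(A_2 \mid H_2)}
\right],
\qquad
\widehat{V}(d_1, d_2) = \frac{1}{n}\sum_{i=1}^{n}
  \frac{(R_{1i} + R_{2i})\,\mathbf{1}\{A_{1i} = d_1(H_{1i})\}\,\mathbf{1}\{A_{2i} = d_2(H_{2i})\}}
       {\pi_1(A_{1i} \mid H_{1i})\,\pi_2(A_{2i} \mid H_{2i})}
```

Only subjects whose observed treatments agree with the regime at both stages contribute to the sum, which is why maximizing it over $(d_1, d_2)$ can be recast as a weighted classification problem.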
Biometrics | 2013
Bibhas Chakraborty; Eric B. Laber; Yingqi Zhao
A dynamic treatment regime consists of a set of decision rules that dictate how to individualize treatment to patients based on available treatment and covariate history. A common method for estimating an optimal dynamic treatment regime from data is Q-learning, which involves nonsmooth operations of the data. This nonsmoothness causes standard asymptotic approaches for inference, like the bootstrap or Taylor series arguments, to break down if applied without correction. Here, we consider the m-out-of-n bootstrap for constructing confidence intervals for the parameters indexing the optimal dynamic regime. We propose an adaptive choice of m and show that it produces asymptotically correct confidence sets under fixed alternatives. Furthermore, the proposed method has the advantage of being conceptually and computationally much simpler than competing methods possessing this same theoretical property. We provide an extensive simulation study to compare the proposed method with currently available inference procedures. The results suggest that the proposed method delivers nominal coverage while being less conservative than alternatives. The proposed methods are implemented in the qLearn R-package and have been made available on the Comprehensive R-Archive Network (http://cran.r-project.org/). Analysis of the Sequenced Treatment Alternatives to Relieve Depression (STAR*D) study is used as an illustrative example.
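The general idea of an m-out-of-n bootstrap interval can be sketched as follows. This is a minimal illustration under stated assumptions, not the qLearn implementation: the statistic `max(mean, 0)` is a hypothetical stand-in for the nonsmooth maximization in Q-learning, the resample size `m` is fixed by hand rather than chosen adaptively as in the article, and all function names are made up for the example.

```python
import math
import random

def nonsmooth_stat(sample):
    """max(mean, 0): a simple nonsmooth functional of the data (illustrative stand-in)."""
    return max(sum(sample) / len(sample), 0.0)

def m_out_of_n_ci(sample, stat, m, n_boot=2000, alpha=0.05, seed=0):
    """Basic bootstrap interval built from resamples of size m instead of n."""
    rng = random.Random(seed)
    n = len(sample)
    theta_hat = stat(sample)
    roots = []
    for _ in range(n_boot):
        # Draw m observations with replacement from the original sample.
        resample = [rng.choice(sample) for _ in range(m)]
        # Scale the bootstrap deviation by sqrt(m), the resample rate.
        roots.append(math.sqrt(m) * (stat(resample) - theta_hat))
    roots.sort()
    lo_q = roots[int(math.floor((alpha / 2) * (n_boot - 1)))]
    hi_q = roots[int(math.ceil((1 - alpha / 2) * (n_boot - 1)))]
    # Invert the root distribution, rescaling by sqrt(n), the original rate.
    return theta_hat - hi_q / math.sqrt(n), theta_hat - lo_q / math.sqrt(n)

rng = random.Random(1)
sample = [rng.gauss(0.0, 1.0) for _ in range(200)]   # true mean 0: the nonregular case
m = int(len(sample) ** 0.8)                          # assumed fixed choice, smaller than n
lower, upper = m_out_of_n_ci(sample, nonsmooth_stat, m)
print("m =", m, "interval =", (lower, upper))
```

Setting `m` equal to the sample size recovers the standard bootstrap. Taking `m` smaller than `n` is what is intended to restore validity at nonsmooth points, and how much smaller is the tuning question that the adaptive choice in the article addresses.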
Biometrics | 2015
Yaoyao Xu; Menggang Yu; Yingqi Zhao; Quefeng Li; Sijian Wang; Jun Shao
To facilitate comparative treatment selection when there is substantial heterogeneity of treatment effectiveness, it is important to identify subgroups that exhibit differential treatment effects. Existing approaches model outcomes directly and then define subgroups according to interactions between treatment and covariates. Because outcomes are affected by both the covariate-treatment interactions and covariate main effects, directly modeling outcomes can be hard due to model misspecification, especially in the presence of many covariates. Alternatively, one can work directly with differential treatment effect estimation. We propose such a method that approximates a target function whose value directly reflects correct treatment assignment for patients. The function uses patient outcomes as weights rather than modeling targets. Consequently, our method can deal with binary, continuous, time-to-event, and possibly contaminated outcomes in the same fashion. We first focus on identifying only directional estimates from linear rules that characterize important subgroups. We further consider estimation of comparative treatment effects for identified subgroups. We demonstrate the advantages of our method in simulation studies and in analyses of two real data sets.
Clinical Trials | 2014
Bibhas Chakraborty; Eric B. Laber; Yingqi Zhao
Background: A dynamic treatment regime (DTR) comprises a sequence of decision rules, one per stage of intervention, that recommends how to individualize treatment to patients based on evolving treatment and covariate history. These regimes are useful for managing chronic disorders, and fit into the larger paradigm of personalized medicine. The Value of a DTR is the expected outcome when the DTR is used to assign treatments to a population of interest. Purpose: The Value of a data-driven DTR, estimated using data from a Sequential Multiple Assignment Randomized Trial, is both a data-dependent parameter and a non-smooth function of the underlying generative distribution. These features introduce additional variability that is not accounted for by standard methods for conducting statistical inference, for example, the bootstrap or normal approximations, if applied without adjustment. Our purpose is to provide a feasible method for constructing valid confidence intervals (CIs) for this quantity of practical interest. Methods: We propose a conceptually simple and computationally feasible method for constructing valid CIs for the Value of an estimated DTR based on subsampling. The method is self-tuning by virtue of an approach called the double bootstrap. We demonstrate the proposed method using a series of simulated experiments. Results: The proposed method offers considerable improvement in terms of coverage rates of the CIs over the standard bootstrap approach. Limitations: In this article, we have restricted our attention to Q-learning for estimating the optimal DTR. However, other methods can be employed for this purpose; to keep the discussion focused, we have not explored these alternatives. Conclusion: Subsampling-based CIs provide much better performance compared to standard bootstrap for the Value of an estimated DTR.
Clinical Trials | 2014
Yingqi Zhao; Eric B. Laber
Background: Recent advances in medical research suggest that the optimal treatment rules should be adaptive to patients over time. This has led to an increasing interest in studying dynamic treatment regimes: sequences of individualized treatment rules, one per stage of clinical intervention, which map present patient information to a recommended treatment. There has been a recent surge of statistical work for estimating optimal dynamic treatment regimes from randomized and observational studies. The purpose of this article is to review recent methodological progress and applied issues associated with estimating optimal dynamic treatment regimes. Methods: We discuss sequential multiple assignment randomized trials, a clinical trial design used to study treatment sequences. We use a common estimator of an optimal dynamic treatment regime that applies to sequential multiple assignment randomized trials data as a platform to discuss several practical and methodological issues. Results: We provide a limited survey of practical issues associated with modeling sequential multiple assignment randomized trials data. We review some existing estimators of optimal dynamic treatment regimes and discuss practical issues associated with these methods including model building, missing data, statistical inference, and choosing an outcome when only non-responders are re-randomized. We mainly focus on the estimation and inference of dynamic treatment regimes using sequential multiple assignment randomized trials data. Dynamic treatment regimes can also be constructed from observational data, which may be easier to obtain in practice; however, care must be taken to account for potential confounding.
Journal of the American Statistical Association | 2011
Nivedita V. Nadkarni; Yingqi Zhao; Michael R. Kosorok
An inverse regression methodology for assessing predictor performance in the censored data setup is developed along with inference procedures and a computational algorithm. The technique developed here allows for conditioning on the unobserved failure time along with a weighting mechanism that accounts for the censoring. The implementation is nonparametric and computationally fast. This provides an efficient methodological tool that can be used especially in cases where the usual modeling assumptions are not applicable to the data under consideration. It can also be a good diagnostic tool that can be used in the model selection process. We have provided theoretical justification of consistency and asymptotic normality of the methodology. Simulation studies and two data analyses are provided to illustrate the practical utility of the procedure.
Journal of the American Statistical Association | 2016
Stanislav Minsker; Yingqi Zhao; Guang Cheng
Individualized treatment rules (ITRs) tailor treatments according to individual patient characteristics. They can significantly improve patient care and are thus becoming increasingly popular. The data collected during randomized clinical trials are often used to estimate the optimal ITRs. However, these trials are generally expensive to run, and, moreover, they are not designed to efficiently estimate ITRs. In this article, we propose a cost-effective estimation method from an active learning perspective. In particular, our method recruits only the “most informative” patients (in terms of learning the optimal ITRs) from an ongoing clinical trial. Simulation studies and real-data examples show that our active clinical trial method significantly improves on competing methods. We derive risk bounds and show that they support these observed empirical advantages. Supplementary materials for this article are available online.
Biometrics | 2014
Yingqi Zhao; Michael R. Kosorok
Kang, Janes and Huang propose an interesting boosting method to combine biomarkers for treatment selection. The method requires modeling the treatment effects using markers. We discuss an alternative method, outcome weighted learning. This method sidesteps the need for modeling the outcomes, and thus can be more robust to model misspecification.
Biometrics | 2011
Yingqi Zhao; Donglin Zeng; Amy H. Herring; Amy Ising; Anna E. Waller; David B. Richardson; Michael R. Kosorok
A real-time surveillance method is developed with emphasis on rapid and accurate detection of emerging outbreaks. We develop a model with relatively weak assumptions regarding the latent processes generating the observed data, ensuring a robust prediction of the spatiotemporal incidence surface. Estimation occurs via a local linear fitting combined with day-of-week effects, where spatial smoothing is handled by a novel distance metric that adjusts for population density. Detection of emerging outbreaks is carried out via residual analysis. Both daily residuals and AR model-based detrended residuals are used for detecting abnormalities in the data given that either a large daily residual or an increasing temporal trend in the residuals signals a potential outbreak, with the threshold for statistical significance determined using a resampling approach.
