Huazhen Lin
Southwestern University of Finance and Economics
Publication
Featured research published by Huazhen Lin.
Annals of Statistics | 2006
Jianqing Fan; Huazhen Lin; Yong Zhou
This paper considers a proportional hazards model, which allows one to examine the extent to which covariates interact nonlinearly with an exposure variable, for analysis of lifetime data. A local partial-likelihood technique is proposed to estimate nonlinear interactions. Asymptotic normality of the proposed estimator is established. The baseline hazard function, the bias and the variance of the local likelihood estimator are consistently estimated. In addition, a one-step local partial-likelihood estimator is presented to facilitate the computation of the proposed procedure and is demonstrated to be as efficient as the fully iterated local partial-likelihood estimator. Furthermore, a penalized local likelihood estimator is proposed to select important risk variables in the model. Numerical examples are used to illustrate the effectiveness of the proposed procedures.
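For orientation, local partial-likelihood methods build on the standard Cox partial log-likelihood. The sketch below is a minimal pure-Python version of that standard (global) quantity, not the local, kernel-weighted estimator proposed in the paper. The function name, the toy data, and the assumption of no tied event times are illustrative only.

```python
import math

def cox_partial_loglik(times, events, X, beta):
    """Standard Cox partial log-likelihood, assuming no tied event times."""
    # Linear predictor x_i' beta for every subject.
    eta = [sum(a * b for a, b in zip(x, beta)) for x in X]
    loglik = 0.0
    for i, (t_i, d_i) in enumerate(zip(times, events)):
        if not d_i:
            continue  # censored subjects enter only through the risk sets
        # Risk set: subjects still under observation at time t_i.
        risk = sum(math.exp(eta[j]) for j, t_j in enumerate(times) if t_j >= t_i)
        loglik += eta[i] - math.log(risk)
    return loglik

# Hypothetical toy data: three subjects, one covariate, the second is censored.
times = [1.0, 2.0, 3.0]
events = [1, 0, 1]
X = [[0.5], [1.0], [-0.3]]
print(cox_partial_loglik(times, events, X, beta=[0.2]))
```

A local version would, roughly speaking, weight each term by a kernel centred at a chosen value of the exposure variable; that step is deliberately left out here because its exact form is not given in the abstract.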
Computational Statistics & Data Analysis | 2013
Huazhen Lin; Heng Peng
The maximum rank correlation (MRC) approach is the most common method used in the literature to estimate the regression coefficients in the semiparametric linear transformation regression model. However, the objective function $G_n(s)$ in the MRC approach is not continuous. The optimization of $G_n(s)$ requires an extensive search for which the computational cost grows in the order of $n^d$, where $d$ is the dimension of $X$. Given the lack of smoothing, issues related to variable selection, the variance estimate and other inferences by MRC are not well developed in the model. In this paper, we combine the concept underlying the penalized method, rank correlation and smoothing technique and propose a nonconcave penalized smoothed rank correlation method to select variables and estimate parameters for the semiparametric linear transformation model. The proposed estimator is computationally simple, $n^{1/2}$-consistent and asymptotically normal. A sandwich formula is proposed to estimate the variances of the proposed estimates. We also illustrate the usefulness of the methodology with real data from a body fat prediction study.
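To make the continuity issue concrete, the sketch below contrasts the usual pairwise-indicator form of the MRC objective with a smoothed variant. It is a minimal illustration under stated assumptions: the logistic smoother, the bandwidth `h`, and all function names are choices made here and are not taken from the paper, and the nonconcave penalty is omitted.

```python
import math

def dot(x, beta):
    return sum(a * b for a, b in zip(x, beta))

def sigmoid(u):
    # Numerically stable logistic function.
    if u >= 0:
        return 1.0 / (1.0 + math.exp(-u))
    e = math.exp(u)
    return e / (1.0 + e)

def mrc_objective(X, y, beta):
    """Fraction of ordered pairs with y_i > y_j and x_i'beta > x_j'beta."""
    n = len(y)
    idx = [dot(x, beta) for x in X]
    total = 0
    for i in range(n):
        for j in range(n):
            if i != j and y[i] > y[j] and idx[i] > idx[j]:
                total += 1
    return total / (n * (n - 1))

def smoothed_mrc_objective(X, y, beta, h=0.1):
    """Same sum, with the index indicator replaced by a logistic smoother
    (an assumed choice of smoother; h is a hypothetical bandwidth)."""
    n = len(y)
    idx = [dot(x, beta) for x in X]
    total = 0.0
    for i in range(n):
        for j in range(n):
            if i != j and y[i] > y[j]:
                total += sigmoid((idx[i] - idx[j]) / h)
    return total / (n * (n - 1))

# Hypothetical toy data with one covariate.
X = [[1.0], [2.0], [3.0]]
y = [1.0, 2.0, 3.0]
print(mrc_objective(X, y, [1.0]), smoothed_mrc_objective(X, y, [1.0]))
```

The indicator version is a step function of `beta`, so gradient-based optimizers cannot be applied to it directly; the smoothed version is differentiable in `beta`, which is what makes penalized estimation and sandwich-type variance formulas tractable.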
Statistics in Medicine | 2008
Xiao Hua Zhou; Huazhen Lin
In this paper, we propose a new semi-parametric maximum likelihood (ML) estimate of a receiver operating characteristic (ROC) curve that satisfies the property of invariance of the ROC curve and is easy to compute. We show that our new estimator is $\sqrt{n}$-consistent and has an asymptotically normal distribution. Our extensive simulation studies show that the proposed method is efficient and robust. Finally, we illustrate the application of the proposed estimator in a real data set.
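The invariance property referred to here is, as it is usually understood, that an ROC curve does not change when the test results are passed through a strictly increasing transformation. The sketch below checks this for the plain empirical ROC curve, not for the semi-parametric ML estimator of the paper; the function name and the toy measurements are hypothetical.

```python
import math

def empirical_roc(nondiseased, diseased):
    """Empirical ROC points (FPR, TPR), calling a result positive if >= threshold."""
    thresholds = sorted(set(nondiseased) | set(diseased), reverse=True)
    points = [(0.0, 0.0)]
    for c in thresholds:
        fpr = sum(x >= c for x in nondiseased) / len(nondiseased)
        tpr = sum(x >= c for x in diseased) / len(diseased)
        points.append((fpr, tpr))
    return points

# Hypothetical test results for two groups.
nondiseased = [1.0, 2.0, 3.0]
diseased = [2.5, 3.5, 4.0]

raw = empirical_roc(nondiseased, diseased)
# Apply a strictly increasing transformation (log) to every measurement.
logged = empirical_roc([math.log(x) for x in nondiseased],
                       [math.log(x) for x in diseased])
print(raw == logged)
```

Because only the ordering of the measurements enters the calculation, the two lists of points coincide. An estimator that depends on the measurement scale would not in general share this behaviour.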
Statistics in Medicine | 2009
Huazhen Lin; Danping Liu; Xiao Hua Zhou
The missing data problem is common in longitudinal or hierarchical structure studies. In this paper, we propose a correlated random-effects model to fit normal longitudinal or cluster data when the missingness mechanism is nonignorable. Computational challenges arise in the model fitting due to intractable numerical integrations. We obtain the estimates of the parameters based on an accurate approximation of the log likelihood, which has higher-order accuracy but with less computational burden than the existing approximation. We apply the proposed method to a real data set arising from an autism study.
Bioinformatics | 2015
Kevin He; Yanming Li; J. Zhu; Hongliang Liu; Jeffrey E. Lee; Christopher I. Amos; Terry Hyslop; Jiashun Jin; Huazhen Lin; Qinyi Wei; Yi Li
MOTIVATION: Technological advances that allow routine identification of high-dimensional risk factors have led to high demand for statistical techniques that enable full utilization of these rich sources of information for genetics studies. Variable selection for censored outcome data as well as control of false discoveries (i.e. inclusion of irrelevant variables) in the presence of high-dimensional predictors present serious challenges. This article develops a computationally feasible method based on boosting and stability selection. Specifically, we modified the component-wise gradient boosting to improve the computational feasibility and introduced random permutation in stability selection for controlling false discoveries. RESULTS: We have proposed a high-dimensional variable selection method by incorporating stability selection to control false discovery. Comparisons between the proposed method and the commonly used univariate and Lasso approaches for variable selection reveal that the proposed method yields fewer false discoveries. The proposed method is applied to study the associations of 2339 common single-nucleotide polymorphisms (SNPs) with overall survival among cutaneous melanoma (CM) patients. The results have confirmed that BRCA2 pathway SNPs are likely to be associated with overall survival, as reported by previous literature. Moreover, we have identified several new Fanconi anemia (FA) pathway SNPs that are likely to modulate survival of CM patients. AVAILABILITY AND IMPLEMENTATION: The related source code and documents are freely available at https://sites.google.com/site/bestumich/issues.
Statistica Sinica | 2012
Huazhen Lin; Xiao Hua Zhou; Gang Li
In this article, we study a direct receiver operating characteristic (ROC) curve regression model with completely unknown link and baseline functions. A semiparametric procedure is proposed to estimate both the parametric and non-parametric components of the model. The resulting parameter estimates and ROC curve estimates are shown to be consistent and asymptotically normal with an $n^{-1/2}$ convergence rate. With arbitrary link and baseline functions, our model is more robust than existing direct ROC regression models that require either complete or partially complete specification of the link and baseline functions. Moreover, the robustness of our new method is gained at little cost to efficiency, as evidenced by the parametric convergence rate of our estimators and by the simulation study. An illustrative example is given using a hearing test data set.
Statistics in Medicine | 2008
Huazhen Lin; Paul S. F. Yip; Richard M. Huggins
This paper proposes a double-nonparametric procedure to estimate the population size for reporting delay data without the specification of the incidence function and the delay distribution function. Asymptotic results of the proposed estimator are given. Simulation studies show that the proposed procedure works better than the existing estimating procedures. The method has been applied to a suicide reporting system in Hong Kong to improve monitoring and surveillance.
Computational Statistics & Data Analysis | 2013
Huazhen Lin; Yi Li; Ming Tan
In practice, when both survival and quantitative outcomes are of interest, we encounter outcomes of mixed type: a censored outcome and a quantitative outcome. Joint modeling of the survival and quantitative outcomes rather than analyzing the outcomes separately has become a method of choice for analyzing mixed outcome data because of improved efficiency. However, the joint modeling provides two separate indexes for measuring the covariate (e.g., treatment) effect, making its interpretation difficult when the covariate inconsistently affects the quantitative and survival outcomes. By assigning a single rank to each outcome to represent the disease severity, this paper provides a unitary effect summary of the covariates on mixed outcome data while accounting for censoring. The method is applied to an analysis of the AIDS Clinical Trials Group protocol 175 (ACTG 175) data.
Statistics in Medicine | 2018
Ye He; Huazhen Lin; Dongsheng Tu
In this paper, we introduce a single-index threshold Cox proportional hazard model to select and combine biomarkers to identify patients who may be sensitive to a specific treatment. A penalized smoothed partial likelihood is proposed to estimate the parameters in the model. A simple, efficient, and unified algorithm is presented to maximize this likelihood function. The estimators based on this likelihood function are shown to be consistent and asymptotically normal. Under mild conditions, the proposed estimators also achieve the oracle property. The proposed approach is evaluated through simulation analyses and application to the analysis of data from two clinical trials, one involving patients with locally advanced or metastatic pancreatic cancer and one involving patients with resectable lung cancer.
Computational Statistics & Data Analysis | 2018
Kevin He; Jian Kang; Hyokyoung Grace Hong; J. Zhu; Yanming Li; Huazhen Lin; Han Xu; Yi Li
Modern bio-technologies have produced a vast amount of high-throughput data with the number of predictors far greater than the sample size. In order to identify more novel biomarkers and understand biological mechanisms, it is vital to detect signals weakly associated with outcomes among ultrahigh-dimensional predictors. However, existing screening methods, which typically ignore correlation information, are likely to miss weak signals. By incorporating the inter-feature dependence, a covariance-insured screening approach is proposed to identify predictors that are jointly informative but marginally weakly associated with outcomes. The validity of the method is examined via extensive simulations and a real data study for selecting potential genetic factors related to the onset of multiple myeloma.
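The claim that marginal screening can miss jointly informative predictors can be illustrated with a small simulation. The sketch below is not the covariance-insured screening procedure itself; it only shows, on hypothetical simulated data, a predictor whose marginal correlation with the outcome is near zero while its partial correlation given a correlated predictor is strong. The data-generating model, sample size, seed, and function names are all assumptions made for illustration.

```python
import math
import random

def pearson(a, b):
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((u - ma) * (v - mb) for u, v in zip(a, b))
    va = sum((u - ma) ** 2 for u in a)
    vb = sum((v - mb) ** 2 for v in b)
    return cov / math.sqrt(va * vb)

def partial_corr(y, x, z):
    """Correlation of y and x after adjusting both for z (first-order formula)."""
    r_yx, r_yz, r_xz = pearson(y, x), pearson(y, z), pearson(x, z)
    return (r_yx - r_yz * r_xz) / math.sqrt((1 - r_yz ** 2) * (1 - r_xz ** 2))

def simulate(n=2000, rho=0.8, seed=0):
    rng = random.Random(seed)
    x1 = [rng.gauss(0, 1) for _ in range(n)]
    # x2 has unit variance and correlation rho with x1.
    x2 = [rho * a + math.sqrt(1 - rho ** 2) * rng.gauss(0, 1) for a in x1]
    # The coefficient -rho makes cov(y, x2) equal to zero in the population.
    y = [a - rho * b + 0.1 * rng.gauss(0, 1) for a, b in zip(x1, x2)]
    return x1, x2, y

x1, x2, y = simulate()
print("marginal corr(y, x2):", pearson(y, x2))
print("partial corr(y, x2 | x1):", partial_corr(y, x2, x1))
```

A screening rule that ranks predictors by marginal correlation alone would tend to discard `x2` here, even though it carries signal once `x1` is accounted for; using inter-feature dependence is one way to recover such predictors.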