Yichao Wu
North Carolina State University
Publications
Featured research published by Yichao Wu.
Journal of the American Statistical Association | 2007
Yichao Wu; Yufeng Liu
The support vector machine (SVM) has been widely applied for classification problems in both machine learning and statistics. Despite its popularity, however, SVM has some drawbacks in certain situations. In particular, the SVM classifier can be very sensitive to outliers in the training sample. Moreover, the number of support vectors (SVs) can be very large in many applications. To circumvent these drawbacks, we propose the robust truncated hinge loss SVM (RSVM), which uses a truncated hinge loss. The RSVM is shown to be more robust to outliers and to deliver more accurate classifiers using a smaller set of SVs than the standard SVM. Our theoretical results show that the RSVM is Fisher-consistent, even when there is no dominating class, a scenario that is particularly challenging for multicategory classification. Similar results are obtained for a class of margin-based classifiers.
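The truncation idea admits a very short sketch. Capping the hinge loss H(u) = max(0, 1 - u) at the value H(s) for some margin threshold s (s = -1 below is a hypothetical choice, not necessarily the paper's) bounds the influence any single outlier can exert:

```python
import numpy as np

def hinge(u):
    """Standard hinge loss H(u) = max(0, 1 - u)."""
    return np.maximum(0.0, 1.0 - u)

def truncated_hinge(u, s=-1.0):
    """Truncated hinge loss T_s(u) = min(H(u), H(s)).

    Losses are capped at H(s) = 1 - s, so a badly misclassified point
    (a large negative margin u) cannot dominate the fit.
    """
    return np.minimum(hinge(u), 1.0 - s)

margins = np.array([2.0, 0.5, -0.5, -5.0])   # -5.0 mimics an outlier
losses_hinge = hinge(margins)                # outlier contributes 6.0
losses_trunc = truncated_hinge(margins)      # outlier capped at 2.0
```

Under the standard hinge loss the outlying margin of -5 contributes a loss of 6, while the truncated loss caps it at 2; bounding each point's loss is the mechanism behind the robustness and the smaller support-vector set described above.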
Journal of the American Statistical Association | 2012
Lan Wang; Yichao Wu; Runze Li
Ultra-high dimensional data often display heterogeneity due to either heteroscedastic variance or other forms of non-location-scale covariate effects. To accommodate heterogeneity, we advocate a more general interpretation of sparsity, which assumes that only a small number of covariates influence the conditional distribution of the response variable, given all candidate covariates; however, the sets of relevant covariates may differ when we consider different segments of the conditional distribution. In this framework, we investigate the methodology and theory of nonconvex, penalized quantile regression in ultra-high dimension. The proposed approach has two distinctive features: (1) It enables us to explore the entire conditional distribution of the response variable, given the ultra-high-dimensional covariates, and provides a more realistic picture of the sparsity pattern; (2) it requires substantially weaker conditions compared with alternative methods in the literature; thus, it greatly alleviates the difficulty of model checking in the ultra-high dimension. In the theoretical development, it is challenging to deal with both the nonsmooth loss function and the nonconvex penalty function in ultra-high-dimensional parameter space. We introduce a novel, sufficient optimality condition that relies on a convex differencing representation of the penalized loss function and the subdifferential calculus. Exploring this optimality condition enables us to establish the oracle property for sparse quantile regression in the ultra-high dimension under relaxed conditions. The proposed method greatly enhances existing tools for ultra-high-dimensional data analysis. Monte Carlo simulations demonstrate the usefulness of the proposed procedure. The real data example we analyzed demonstrates that the new approach reveals substantially more information as compared with alternative methods. This article has online supplementary material.
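A minimal sketch of the two ingredients, assuming the SCAD penalty of Fan and Li as the nonconvex penalty (the article's framework covers a class of such penalties, so this is an illustrative choice):

```python
import numpy as np

def check_loss(u, tau):
    """Quantile (check) loss rho_tau(u) = u * (tau - 1{u < 0})."""
    u = np.asarray(u, dtype=float)
    return u * (tau - (u < 0))

def scad_penalty(beta, lam, a=3.7):
    """SCAD penalty, elementwise: linear near zero, then a quadratic
    transition, then constant -- large coefficients are not shrunk."""
    b = np.abs(np.asarray(beta, dtype=float))
    return np.where(b <= lam, lam * b,
           np.where(b <= a * lam,
                    (2 * a * lam * b - b**2 - lam**2) / (2 * (a - 1)),
                    lam**2 * (a + 1) / 2))

def penalized_objective(beta, X, y, tau, lam):
    """Mean check loss of the quantile residuals plus the SCAD penalty."""
    res = y - X @ beta
    return check_loss(res, tau).mean() + scad_penalty(beta, lam).sum()
```

Because the penalty is constant beyond a*lam, large coefficients incur no extra shrinkage, which is what makes an oracle property attainable; minimizing this nonsmooth, nonconvex objective is exactly the difficulty the convex-differencing analysis above addresses.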
PLOS ONE | 2009
Carla M. P. Ribeiro; Harry L. Hurd; Yichao Wu; Mary E. B. Martino; Lisa Jones; Brian Brighton; Richard C. Boucher; Wanda K. O'Neal
Prolonged macrolide antibiotic therapy at low doses improves clinical outcome in patients affected with diffuse panbronchiolitis and cystic fibrosis. Consensus is building that the therapeutic effects are due to anti-inflammatory, rather than anti-microbial activities, but the mode of action is likely complex. To gain insights into how the macrolide azithromycin (AZT) modulates inflammatory responses in airways, well-differentiated primary cultures of human airway epithelia were exposed to AZT alone, an inflammatory stimulus consisting of soluble factors from cystic fibrosis airways, or AZT followed by the inflammatory stimulus. RNA microarray analyses were conducted to identify global and specific gene expression changes. Analysis of gene expression changes revealed that the AZT treatment alone altered the gene profile of the cells, primarily by significantly increasing the expression of lipid/cholesterol genes and decreasing the expression of cell cycle/mitosis genes. The increase in cholesterol biosynthetic genes was confirmed by increased filipin staining, an index of free cholesterol, after AZT treatment. AZT also affected genes with inflammatory annotations, but the effect was variable (both up- and down-regulation) and gene specific. AZT pretreatment prevented the up-regulation of some genes, such as MUC5AC and MMP9, triggered by the inflammatory stimulus, but the up-regulation of other inflammatory genes, e.g., cytokines and chemokines, such as interleukin-8, was not affected. On the other hand, HLA genes were increased by AZT. Notably, secreted IL-8 protein levels did not reflect mRNA levels, and were, in fact, higher after AZT pretreatment in cultures exposed to the inflammatory stimulus, suggesting that AZT can affect inflammatory pathways other than by altering gene expression.
These findings suggest that the specific effects of AZT on inflamed and non-inflamed airway epithelia are likely relevant to its clinical activity, and their apparent complexity may help explain the diverse immunomodulatory roles of macrolides.
Journal of the American Statistical Association | 2008
Jianqing Fan; Yichao Wu
Estimation of longitudinal data covariance structure poses significant challenges because the data usually are collected at irregular time points. A viable semiparametric model for covariance matrices has been proposed that allows one to estimate the variance function nonparametrically and to estimate the correlation function parametrically by aggregating information from irregular and sparse data points within each subject. But the asymptotic properties of the quasi-maximum likelihood estimator (QMLE) of parameters in the covariance model are largely unknown. We address this problem in the context of more general models for the conditional mean function, which may be parametric, nonparametric, or semiparametric. We also consider the possibility of a rough mean regression function and introduce the difference-based method to reduce biases in the context of varying-coefficient partially linear mean regression models. This provides a more robust estimator of the covariance function under a wider range of situations. Under some technical conditions, consistency and asymptotic normality are obtained for the QMLE of the parameters in the correlation function. Simulation studies and a real data example are used to illustrate the proposed approach.
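For intuition, the semiparametric decomposition can be sketched as Sigma = D R(theta) D: a variance function (estimated nonparametrically in the article) on the diagonal of D, and a parametric correlation R evaluated at the subject's own irregular observation times. The AR(1)-type continuous-time family below, and the linear variance function, are illustrative assumptions, not necessarily the article's choices:

```python
import numpy as np

def cov_matrix(times, sigma_fn, rho):
    """Sigma = D R(theta) D for one subject observed at irregular `times`.

    sigma_fn : variance function (nonparametric in spirit), evaluated at t
    rho      : parameter of an AR(1)-type correlation R_ij = rho**|t_i - t_j|
    """
    t = np.asarray(times, dtype=float)
    D = np.diag(sigma_fn(t))
    R = rho ** np.abs(t[:, None] - t[None, :])
    return D @ R @ D

# hypothetical variance function and irregular observation times
Sigma = cov_matrix([0.1, 0.4, 1.3], lambda t: 1.0 + 0.5 * t, 0.8)
```

Because the correlation is a function of the time gap, information from sparse, irregularly spaced points can be pooled across subjects through the few parameters of R, which is the aggregation idea described above.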
Journal of Computational and Graphical Statistics | 2007
Yufeng Liu; Yichao Wu
Variable selection is an important aspect of high-dimensional statistical modeling, particularly in regression and classification. In the regularization framework, various penalty functions are used to perform variable selection by putting relatively large penalties on small coefficients. The L1 penalty is a popular choice because of its convexity, but it produces biased estimates for the large coefficients. The L0 penalty is attractive for variable selection because it directly penalizes the number of nonzero coefficients. However, the optimization involved is discontinuous and nonconvex, and therefore it is very challenging to implement. Moreover, its solution may not be stable. In this article, we propose a new penalty that combines the L0 and L1 penalties. We implement this new penalty by developing a global optimization algorithm using mixed integer programming (MIP). We compare this combined penalty with several other penalties via simulated examples as well as real applications. The results show that the new penalty outperforms both the L0 and L1 penalties in terms of variable selection while maintaining good prediction accuracy.
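The combined penalty itself is simple to state; the hard part, which the article handles with mixed integer programming, is optimizing under it. A sketch of the penalty evaluation (lam0 and lam1 are hypothetical tuning parameters):

```python
import numpy as np

def combined_penalty(beta, lam0, lam1, tol=1e-8):
    """lam0 * ||beta||_0 + lam1 * ||beta||_1, the combined penalty idea.

    The L0 term charges a flat price per nonzero coefficient (direct
    variable selection); the L1 term adds convex shrinkage on top.
    """
    beta = np.asarray(beta, dtype=float)
    l0 = np.sum(np.abs(beta) > tol)      # count of nonzero coefficients
    l1 = np.sum(np.abs(beta))
    return lam0 * l0 + lam1 * l1
```

The L1 component keeps some continuous shrinkage in play, tempering the instability of pure L0 solutions noted above, while the L0 component removes the L1 penalty's bias incentive to over-shrink large coefficients.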
Electronic Journal of Statistics | 2008
Hao Helen Zhang; Yufeng Liu; Yichao Wu; Ji Zhu
The Support Vector Machine (SVM) is a popular classification paradigm in machine learning and has achieved great success in real applications. However, the standard SVM cannot select variables automatically and therefore its solution typically utilizes all the input variables without discrimination. This makes it difficult to identify important predictor variables, which is often one of the primary goals in data analysis. In this paper, we propose two novel types of regularization in the context of the multicategory SVM (MSVM) for simultaneous classification and variable selection. The MSVM generally requires estimation of multiple discriminating functions and applies the argmax rule for prediction. For each individual variable, we propose to characterize its importance by the supnorm of its coefficient vector associated with different functions, and then minimize the MSVM hinge loss function subject to a penalty on the sum of supnorms. To further improve the supnorm penalty, we propose the adaptive regularization, which allows different weights imposed on different variables according to their relative importance. Both types of regularization automate variable selection in the process of building classifiers, and lead to sparse multicategory classifiers with enhanced interpretability and improved accuracy, especially for high-dimensional, low-sample-size data. One big advantage of the supnorm penalty is its easy implementation via standard linear programming. Several simulated examples and one real gene data analysis demonstrate the outstanding performance of the adaptive supnorm penalty in various data settings.
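The penalty can be sketched in a few lines: collect variable j's coefficients across the K discriminating functions in row j of a matrix W, take each row's sup-norm, and sum (optionally with adaptive per-variable weights):

```python
import numpy as np

def supnorm_penalty(W, weights=None):
    """Sum over variables of the sup-norm of each row of W.

    W : (p, K) coefficient matrix; row j collects variable j's
        coefficients across the K class-discriminating functions.
    weights : optional adaptive weights, one per variable.
    """
    s = np.max(np.abs(W), axis=1)        # sup-norm per variable
    if weights is not None:
        s = np.asarray(weights, dtype=float) * s
    return float(np.sum(s))

W = np.array([[1.0, -2.0],    # variable 1: active in both functions
              [0.0,  0.0],    # variable 2: fully removed, zero penalty
              [0.5,  0.1]])
penalty = supnorm_penalty(W)
```

A variable contributes zero penalty only when its entire row is zero, so selection acts on whole variables rather than on individual class-specific coefficients; that is what produces sparse multicategory classifiers, and the piecewise-linear sup-norm is what keeps the optimization a linear program.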
arXiv: Machine Learning | 2010
Jianqing Fan; Yang Feng; Yichao Wu
Variable selection in high-dimensional space has challenged many contemporary statistical problems from many frontiers of scientific disciplines. Recent technological advances have made it possible to collect a huge amount of covariate information, such as microarray, proteomic, and SNP data via bioimaging technology, while observing survival information on patients in clinical studies. Thus, the same challenge applies in survival analysis in order to understand the association between genomic information and clinical information about the survival time. In this work, we extend the sure screening procedure (6) to Cox's proportional hazards model, with an iterative version available. Numerical simulation studies have shown encouraging performance of the proposed method in comparison with other techniques such as the LASSO. This demonstrates the utility and versatility of the iterative sure independence screening scheme.
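One simple marginal utility for Cox-model screening is the partial-likelihood score statistic at beta = 0, computed covariate by covariate; the sketch below ranks covariates by its absolute value. This illustrates the marginal-ranking idea of sure independence screening, not the paper's exact statistic, and it handles tied event times crudely:

```python
import numpy as np

def marginal_cox_score(time, status, x):
    """Cox partial-likelihood score at beta = 0 for one covariate.

    U(0) = sum over observed events of (x_i - mean of x over the risk
    set at t_i).  |U(0)| serves as a simple marginal utility.
    """
    order = np.argsort(time)
    time, status, x = time[order], status[order], x[order]
    score = 0.0
    for i in range(len(time)):
        if status[i] == 1:
            risk = x[i:]          # risk set: subjects still under observation
            score += x[i] - risk.mean()
    return score

def sis_rank(time, status, X):
    """Rank covariates by |marginal score| on standardized columns;
    screening keeps the top-ranked ones for subsequent modeling."""
    Xs = (X - X.mean(axis=0)) / X.std(axis=0)
    stats = np.array([abs(marginal_cox_score(time, status, Xs[:, j]))
                      for j in range(X.shape[1])])
    return np.argsort(-stats)
```

The iterative version described above would alternate such screening with model fitting on the retained covariates, recomputing marginal utilities conditional on the current model.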
Journal of the American Statistical Association | 2011
Yufeng Liu; Hao Helen Zhang; Yichao Wu
Margin-based classifiers have been popular in both machine learning and statistics for classification problems. Among numerous classifiers, some are hard classifiers while some are soft ones. Soft classifiers explicitly estimate the class conditional probabilities and then perform classification based on estimated probabilities. In contrast, hard classifiers directly target the classification decision boundary without producing the probability estimation. These two types of classifiers are based on different philosophies and each has its own merits. In this article, we propose a novel family of large-margin classifiers, namely large-margin unified machines (LUMs), which covers a broad range of margin-based classifiers including both hard and soft ones. By offering a natural bridge from soft to hard classification, the LUM provides a unified algorithm to fit various classifiers and hence a convenient platform to compare hard and soft classification. Both theoretical consistency and numerical performance of LUMs are explored. Our numerical study sheds some light on the choice between hard and soft classifiers in various classification problems.
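The hard/soft distinction can be made concrete with two classical endpoints. (The actual LUM loss family is not reproduced here; the hinge and deviance losses below merely illustrate the two ends of the spectrum that the LUM family bridges.)

```python
import numpy as np

def hinge_loss(u):
    """Hinge loss: a 'hard' classifier's loss. Its population minimizer
    is the Bayes rule sign(p - 1/2), so no probability estimate is
    produced -- only the decision boundary is targeted."""
    return np.maximum(0.0, 1.0 - u)

def deviance_loss(u):
    """Logistic deviance: a 'soft' classifier's loss. Its population
    minimizer is log(p / (1 - p)), so class probabilities are
    recoverable from the fitted function."""
    return np.log1p(np.exp(-u))

def prob_from_logit(f):
    """Invert the deviance minimizer to recover p from a fitted f(x)."""
    return 1.0 / (1.0 + np.exp(-f))
```

A unified family such as the LUM lets one move continuously between these two behaviors within a single fitting algorithm, which is what enables the hard-versus-soft comparisons described above.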
Annals of Statistics | 2013
Jinyuan Chang; Cheng Yong Tang; Yichao Wu
We study a marginal empirical likelihood approach in scenarios when the number of variables grows exponentially with the sample size. The marginal empirical likelihood ratios as functions of the parameters of interest are systematically examined, and we find that the marginal empirical likelihood ratio evaluated at zero can be used to differentiate whether an explanatory variable is contributing to a response variable or not. Based on this finding, we propose a unified feature screening procedure for linear models and generalized linear models. Different from most existing feature screening approaches that rely on the magnitudes of some marginal estimators to identify true signals, the proposed screening approach is capable of further incorporating the level of uncertainties of such estimators. Such a merit inherits the self-studentization property of the empirical likelihood approach, and extends the insights of existing feature screening methods. Moreover, we show that our screening approach imposes less restrictive distributional assumptions, and can be conveniently adapted to a broad range of scenarios, such as models specified using general moment conditions. Our theoretical results and extensive numerical examples from simulations and data analysis demonstrate the merits of the marginal empirical likelihood approach.
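The screening statistic can be sketched for a single marginal estimating function. For covariate j in a linear model one might take z_i = x_ij * y_i (with centered covariates, an illustrative choice of estimating function) and compute the empirical likelihood ratio for E[z] = 0; a large value flags variable j as active. The damped Newton solver below is a minimal illustration:

```python
import numpy as np

def el_ratio_at_zero(z, iters=50):
    """-2 log empirical likelihood ratio for testing E[z] = 0.

    Solves sum z_i / (1 + lam * z_i) = 0 for lam by damped Newton,
    then returns 2 * sum log(1 + lam * z_i).  Requires 0 to lie
    strictly inside the range of z.
    """
    z = np.asarray(z, dtype=float)
    if z.min() >= 0 or z.max() <= 0:
        return np.inf                    # 0 outside the convex hull of z
    lam = 0.0
    for _ in range(iters):
        w = 1.0 + lam * z
        f = np.sum(z / w)
        fp = -np.sum(z ** 2 / w ** 2)
        step = f / fp
        new = lam - step
        # damp the step so every implied weight 1 + lam*z_i stays positive
        while np.any(1.0 + new * z <= 0):
            step /= 2.0
            new = lam - step
        lam = new
    return 2.0 * np.sum(np.log1p(lam * z))
```

Because the EL ratio studentizes internally, covariates can be compared on a common scale without explicit variance estimation, which is the "self-studentization" merit noted in the abstract.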
Journal of Nonparametric Statistics | 2011
Yufeng Liu; Yichao Wu
Quantile regression (QR) is a very useful statistical tool for learning the relationship between the response variable and covariates. For many applications, one often needs to estimate multiple conditional quantile functions of the response variable given covariates. Although one can estimate multiple quantiles separately, it is of great interest to estimate them simultaneously. One advantage of simultaneous estimation is that multiple quantiles can share strength among them to gain better estimation accuracy than individually estimated quantile functions. Another important advantage of joint estimation is the feasibility of incorporating simultaneous non-crossing constraints of QR functions. In this paper, we propose a new kernel-based multiple QR estimation technique, namely simultaneous non-crossing quantile regression (SNQR). We use kernel representations for QR functions and apply constraints on the kernel coefficients to avoid crossing. Both unregularised and regularised SNQR techniques are considered. Asymptotic properties such as asymptotic normality of linear SNQR and oracle properties of the sparse linear SNQR are developed. Our numerical results demonstrate the competitive performance of our SNQR over the original individual QR estimation.
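Two small utilities make the abstract's notions concrete: the check (pinball) loss that each conditional quantile minimizes, and a diagnostic counting crossing violations between two estimated quantile curves, the defect that SNQR's constraints eliminate by construction:

```python
import numpy as np

def pinball(u, tau):
    """Check (pinball) loss; the tau-th sample quantile minimizes its sum."""
    u = np.asarray(u, dtype=float)
    return np.where(u >= 0, tau * u, (tau - 1.0) * u)

def crossings(q_low, q_high):
    """Number of evaluation points where the lower-level quantile
    estimate exceeds the higher-level one (0 under SNQR's constraints)."""
    return int(np.sum(np.asarray(q_low) > np.asarray(q_high)))

y = np.array([1.0, 2.0, 3.0, 4.0])
# at tau = 0.5 the sample median (any value in [2, 3]) beats other candidates
loss_at_median = pinball(y - 2.5, 0.5).sum()
loss_elsewhere = pinball(y - 1.0, 0.5).sum()
```

In the kernel formulation described above, such non-crossing requirements become constraints on the kernel coefficients at the design points, so the jointly estimated quantile functions can share strength without producing logically inconsistent (crossing) estimates.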