Publications


Featured research published by Weixin Yao.


Journal of the American Statistical Association | 2009

Bayesian Mixture Labeling by Highest Posterior Density

Weixin Yao; Bruce G. Lindsay

A fundamental problem for Bayesian mixture model analysis is label switching, which occurs as a result of the nonidentifiability of the mixture components under symmetric priors. We propose two labeling methods to solve this problem. The first method, denoted by PM(ALG), is based on the posterior modes and an ascending algorithm generically denoted ALG. We use each Markov chain Monte Carlo (MCMC) sample as the starting point of the ascending algorithm and label the sample according to the posterior mode to which it converges, on the natural assumption that samples converging to the same mode should share the same labels. The PM(ALG) labeling method has computational advantages over other popular labeling methods, and it automatically matches the “ideal” labels in the highest posterior density credible regions. The second method labels the Gibbs samples by maximizing the normal likelihood of the labeled samples. Using a Monte Carlo simulation study and a real dataset, we demonstrate the success of the new methods in dealing with the label switching problem.
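As an illustration of the idea behind PM(ALG), the sketch below relabels two-component MCMC draws by the permutation that brings each draw closest to a reference posterior mode. It is a simplification of the paper's method: instead of running an ascent algorithm from each draw, it assumes every draw belongs to the basin of a single known mode and matches labels by Euclidean distance. The function name and toy data are hypothetical.

```python
import numpy as np
from itertools import permutations

def relabel_by_mode(samples, mode):
    """For each MCMC draw (rows = component parameters), apply the
    permutation of component labels that brings the draw closest,
    in Euclidean distance, to a reference posterior mode."""
    k = mode.shape[0]
    perms = list(permutations(range(k)))
    out = np.empty_like(samples)
    for i, s in enumerate(samples):
        best = min(perms, key=lambda p: np.sum((s[list(p)] - mode) ** 2))
        out[i] = s[list(best)]
    return out

# Two-component toy example: draws whose labels were switched get flipped back.
rng = np.random.default_rng(0)
mode = np.array([[-2.0], [2.0]])                       # reference posterior mode
draws = mode[np.newaxis] + 0.1 * rng.standard_normal((100, 2, 1))
switched = draws[:, ::-1, :].copy()                    # simulate label switching
fixed = relabel_by_mode(switched, mode)
```

After relabeling, component 1 consistently refers to the mode near -2 and component 2 to the mode near 2, so ergodic averages of component-specific quantities become meaningful again.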


Computational Statistics & Data Analysis | 2012

Robust fitting of mixture regression models

Xiuqin Bai; Weixin Yao; John E. Boyer

Existing methods for fitting mixture regression models assume a normal distribution for the error and estimate the regression parameters by maximum likelihood (MLE). In this article, we demonstrate that the MLE, like the least squares estimate, is sensitive to outliers and heavy-tailed error distributions. We propose a robust estimation procedure and an EM-type algorithm for fitting mixture regression models. Using a Monte Carlo simulation study, we demonstrate that the proposed estimator is robust and works much better than the MLE when there are outliers or the error distribution has heavy tails, while performing comparably to the MLE when there are no outliers and the error is normal. A real data application illustrates the success of the proposed robust estimation procedure.
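For context, a minimal EM for a two-component mixture of linear regressions with normal errors (the baseline the article shows is outlier-sensitive) might look like the sketch below; a robust variant would replace the weighted least-squares M-step with a robust criterion. The function name and toy data are illustrative, not from the paper.

```python
import numpy as np

def em_mixreg(X, y, k=2, iters=200, seed=0):
    """EM for a k-component mixture of linear regressions with normal
    errors.  E-step: posterior component probabilities.  M-step:
    weighted least squares plus weighted scale and proportion updates."""
    rng = np.random.default_rng(seed)
    n, p = X.shape
    beta = rng.standard_normal((k, p))
    pi = np.full(k, 1.0 / k)
    sigma = np.full(k, y.std())
    for _ in range(iters):
        resid = y[:, None] - X @ beta.T                          # (n, k)
        dens = np.maximum(pi * np.exp(-0.5 * (resid / sigma) ** 2) / sigma,
                          1e-300)
        w = dens / dens.sum(axis=1, keepdims=True)               # E-step
        for j in range(k):                                       # M-step
            W = w[:, j] + 1e-12                                  # numerical guard
            beta[j] = np.linalg.solve(X.T @ (W[:, None] * X), X.T @ (W * y))
            sigma[j] = np.sqrt((W * (y - X @ beta[j]) ** 2).sum() / W.sum())
        pi = w.mean(axis=0)
    return beta, sigma, pi

# Toy data: two parallel regression lines separated by a wide gap.
rng = np.random.default_rng(1)
x = rng.uniform(-1, 1, 400)
X = np.column_stack([np.ones(400), x])
z = rng.random(400) < 0.5
y = np.where(z, 3 + x, -3 + x) + 0.1 * rng.standard_normal(400)
beta, sigma, pi = em_mixreg(X, y)
```

Because each M-step is a weighted least-squares fit, a single gross outlier can drag a component's line; the article's robust procedure addresses exactly this weakness.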


Computational Statistics & Data Analysis | 2014

Robust mixture regression model fitting by Laplace distribution

Weixing Song; Weixin Yao; Yanru Xing

A robust estimation procedure for mixture linear regression models is proposed by assuming that the error terms follow a Laplace distribution. Because the Laplace distribution can be written as a scale mixture of normal distributions with a latent scale variable, the procedure can be implemented by an EM algorithm that incorporates two types of missing information: the mixture class membership and the latent scale variable. Finite sample performance of the proposed algorithm is evaluated by simulations. The proposed method is compared with other procedures, and a sensitivity study is also conducted on a real data set.
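The scale-mixture identity behind this EM construction can be checked numerically: if W ~ Exp(1) and Z ~ N(0, 1) are independent, then b·sqrt(2W)·Z follows a Laplace(0, b) distribution, with variance 2b², mean absolute deviation b, and excess kurtosis 3. A quick Monte Carlo check (the constants are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
b = 1.5                                   # Laplace scale parameter
n = 200_000
w = rng.exponential(1.0, n)               # latent scale variable W ~ Exp(1)
x = b * np.sqrt(2 * w) * rng.standard_normal(n)

var = x.var()                             # should be close to 2 * b**2
mad = np.abs(x).mean()                    # should be close to b
kurt = (x ** 4).mean() / var ** 2 - 3     # excess kurtosis, close to 3
```

Treating W as missing data is what turns the Laplace likelihood into iteratively reweighted normal fits inside the EM algorithm.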


Journal of Nonparametric Statistics | 2012

Local Modal Regression

Weixin Yao; Bruce G. Lindsay; Runze Li

A local modal estimation procedure is proposed for the regression function in a nonparametric regression model. A distinguishing characteristic of the proposed procedure is that it introduces an additional tuning parameter that is automatically selected from the observed data in order to achieve both robustness and efficiency of the resulting estimate. We demonstrate both theoretically and empirically that the resulting estimator is more efficient than the ordinary local polynomial regression (LPR) estimator in the presence of outliers or heavy-tailed error distributions (such as the t-distribution). Furthermore, we show that the proposed procedure is asymptotically as efficient as the LPR estimator when there are no outliers and the error distribution is Gaussian. We propose an expectation–maximisation-type algorithm for the proposed estimation procedure. A Monte Carlo simulation study examines the finite sample performance of the proposed method and confirms the theoretical findings. The methodology is further illustrated via the analysis of a real data example.
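A local-constant sketch of the EM-type iteration conveys the mechanism: the E-step weight combines a kernel in x with a normal kernel in the residual y − m, and the M-step is a weighted mean. The article works with local polynomials and data-driven bandwidths; the bandwidths, function name, and data below are illustrative.

```python
import numpy as np

def local_modal(x, y, x0, h1=0.3, h2=0.5, iters=100):
    """Local-constant modal regression at x0.  Each iteration
    reweights points by a kernel in x times a normal kernel in the
    residual (E-step), then takes the weighted mean (M-step)."""
    kx = np.exp(-0.5 * ((x - x0) / h1) ** 2)       # kernel weights in x
    m = np.median(y[np.abs(x - x0) < h1])          # robust starting value
    for _ in range(iters):
        w = kx * np.exp(-0.5 * ((y - m) / h2) ** 2)
        m = (w * y).sum() / w.sum()
    return m

# Toy data: a line y = 2x with 10% gross outliers shifted upward.
rng = np.random.default_rng(0)
x = rng.uniform(0, 1, 300)
y = 2 * x + 0.1 * rng.standard_normal(300)
y[:30] += 5.0                                      # gross outliers
m_modal = local_modal(x, y, x0=0.5)                # near the true value 1.0
m_mean = y[np.abs(x - 0.5) < 0.3].mean()           # plain local average
```

The residual kernel gives the outliers essentially zero weight, so the modal estimate stays near the bulk of the data while the local average is pulled upward.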


Computational Statistics & Data Analysis | 2014

Robust mixture regression using the t-distribution

Weixin Yao; Yan Wei; Chun Yu

The traditional estimation of mixture regression models assumes normally distributed component errors and is therefore sensitive to outliers and heavy-tailed errors. A robust mixture regression model based on the t-distribution is proposed by extending the mixture of t-distributions to the regression setting. This model, however, is still not robust to high-leverage outliers; to overcome this, a modified version is also proposed that fits the t-based mixture regression after adaptively trimming high-leverage points. Furthermore, the degrees of freedom of the t-distribution are chosen adaptively using the profile likelihood, which gives the proposed robust estimate high efficiency.
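The robustness mechanism of the t-distribution is easiest to see in a one-component location-scale sketch: the EM weight (ν + 1)/(ν + r²) downweights observations with large standardized residuals. This is not the paper's mixture procedure, just the core device it builds on; the degrees of freedom are fixed here rather than chosen by profile likelihood, and the function name and data are illustrative.

```python
import numpy as np

def t_location(y, nu=3.0, iters=100):
    """EM for the location and scale of a t(nu) model.  The E-step
    weight (nu + 1) / (nu + r_i^2) shrinks the influence of points
    with large standardized residuals r_i."""
    mu = np.median(y)
    s = np.median(np.abs(y - mu))                  # robust starting scale
    for _ in range(iters):
        r2 = ((y - mu) / s) ** 2
        w = (nu + 1.0) / (nu + r2)                 # E-step weights
        mu = (w * y).sum() / w.sum()               # weighted location update
        s = np.sqrt((w * (y - mu) ** 2).mean())    # weighted scale update
    return mu, s

# 200 clean points around 0 plus 20 gross outliers at 10.
rng = np.random.default_rng(0)
y = np.concatenate([rng.standard_normal(200), np.full(20, 10.0)])
mu_t, _ = t_location(y)                            # stays near 0
mu_mean = y.mean()                                 # pulled toward the outliers
```

In the paper's mixture setting the same weights appear inside each component of the EM, which is why the t-based fit resists outlying responses (though not high-leverage predictors, hence the trimming step).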


Statistics and Computing | 2012

Model based labeling for mixture models

Weixin Yao

Label switching is one of the fundamental problems in Bayesian mixture model analysis. Due to the permutation invariance of the mixture posterior, the posterior of an m-component mixture model can be viewed as a mixture distribution with m! symmetric components, and the object of labeling is to recover one of these components. To do the labeling, we propose to first fit a symmetric m!-component mixture model to the Markov chain Monte Carlo (MCMC) samples and then choose the label for each sample by maximizing the corresponding classification probabilities, i.e., the probabilities of all possible labels for that sample. Both parametric and semiparametric approaches are proposed for fitting the symmetric mixture model to the posterior. Compared with existing labeling methods, the proposed method approximates the posterior directly and provides the labeling probabilities for all possible labels, and thus has a model-based interpretation and theoretical support. In addition, we introduce a situation in which the “ideally” labeled samples are available and can therefore be used to compare different labeling methods. We demonstrate the success of the new method in dealing with the label switching problem using two examples.


Journal of the American Statistical Association | 2012

Mixture of Regression Models With Varying Mixing Proportions: A Semiparametric Approach

Mian Huang; Weixin Yao

In this article, we study a class of semiparametric mixtures of regression models in which the regression functions are linear in the predictors but the mixing proportions are smooth functions of a covariate. We propose a one-step backfitting estimation procedure that achieves the optimal convergence rates for both the regression parameters and the nonparametric mixing-proportion functions. We derive the asymptotic bias and variance of the one-step estimate and further establish its asymptotic normality. A modified expectation-maximization-type (EM-type) estimation procedure is investigated, and we show that the modified EM algorithms preserve the asymptotic ascent property. Numerical simulations examine the finite sample performance of the estimation procedures. The proposed methodology is further illustrated via the analysis of a real dataset.


Computational Statistics & Data Analysis | 2013

Robust variable selection through MAVE

Weixin Yao; Qin Wang

Dimension reduction and variable selection play important roles in high-dimensional data analysis. Sparse MAVE, a model-free variable selection method, combines shrinkage estimation (the Lasso) with an effective dimension-reduction method, MAVE (minimum average variance estimation). However, it is not robust to outliers in the dependent variable because of its least-squares criterion. A robust variable selection method based on sparse MAVE is developed, together with an efficient estimation algorithm to enhance its practical applicability. In addition, a robust cross-validation procedure is proposed for selecting the structural dimension. The effectiveness of the new approach is verified through simulation studies and a real data analysis.


Communications in Statistics-theory and Methods | 2012

Bayesian Mixture Labeling and Clustering

Weixin Yao

Label switching is one of the fundamental issues in Bayesian mixture modeling. It occurs because of the nonidentifiability of the components under symmetric priors. Without solving the label switching, the ergodic averages of component-specific quantities are identical across components and thus useless for inference about individual components, such as posterior means, predictive component densities, and marginal classification probabilities. The author establishes the equivalence between labeling and clustering and proposes two simple clustering criteria to solve the label switching. The first method can be considered an extension of K-means clustering. The second finds the labels by minimizing the volume of the labeled samples and is invariant to scale transformations of the parameters. Using a simulation example and applications to two real data sets, the author demonstrates the success of these new methods in dealing with the label switching problem.


Journal of Multivariate Analysis | 2012

An adaptive estimation of MAVE

Qin Wang; Weixin Yao

Minimum average variance estimation (MAVE; Xia et al., 2002) is an effective dimension reduction method. It requires no strong probabilistic assumptions on the predictors and can consistently estimate the central mean subspace, and it is applicable to a wide range of models, including time series. However, the least-squares criterion used in MAVE loses efficiency when the error is not normally distributed. In this article, we propose an adaptive MAVE that adapts to different error distributions. We show that the proposed estimate has the same convergence rate as the original MAVE, and we propose an EM algorithm to implement it. Using both simulation studies and a real data analysis, we demonstrate the superior finite sample performance of the proposed approach over the existing least-squares-based MAVE when the error distribution is non-normal, and comparable performance when the error is normal.

Collaboration


Weixin Yao's frequent co-authors and their affiliations.

Sijia Xiang, Zhejiang University of Finance and Economics
Mian Huang, Shanghai University of Finance and Economics
Longhai Li, University of Saskatchewan
Bruce G. Lindsay, Pennsylvania State University
Weixing Song, Kansas State University
Kun Chen, University of Connecticut
Qin Wang, Virginia Commonwealth University
Runze Li, Pennsylvania State University
Chun Yu, Jiangxi University of Finance and Economics