Publication


Featured research published by Yazhou Ren.


Neurocomputing | 2012

Local and global structure preserving based feature selection

Yazhou Ren; Guoji Zhang; Guoxian Yu; Xuan Li

Feature selection is of great importance in data mining tasks, especially for exploring high-dimensional data. Laplacian Score, a recently proposed feature selection method, makes use of the local manifold structure of samples to select features and achieves good performance. However, it ignores the global structure of samples, and the features it selects can be highly redundant. To address these issues, we propose a feature selection method based on local and global structure preserving, LGFS in short. LGFS first uses two graphs, a nearest neighborhood graph and a farthest neighborhood graph, to describe the underlying local and global structure of samples, respectively. It then defines a criterion that prefers features with good ability to preserve both local and global structure. To remove redundancy among the selected features, Extended LGFS (E-LGFS) is introduced, which uses normalized mutual information to measure the dependency between a pair of features. We conduct extensive experiments on two artificial data sets, six UCI data sets, and two publicly available face databases to evaluate LGFS and E-LGFS. The experimental results show that our methods achieve higher accuracies than other unsupervised methods under comparison.
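
For intuition, the sketch below scores features using a nearest-neighborhood graph and a farthest-neighborhood graph, in the spirit of LGFS; the concrete criterion and its normalization here are assumptions, not the paper's exact formulation.

```python
# Hypothetical sketch of a local/global structure-preserving feature score
# (inspired by LGFS; not the authors' exact criterion).
import numpy as np
from sklearn.metrics import pairwise_distances

def local_global_scores(X, k=5):
    """Score each feature: small variation across nearest neighbors (local
    structure preserved) and large variation across farthest neighbors
    (global structure preserved) yield a high score."""
    n, d = X.shape
    order = np.argsort(pairwise_distances(X), axis=1)
    nn_idx = order[:, 1:k + 1]      # k nearest neighbors (skip self at column 0)
    fn_idx = order[:, -k:]          # k farthest neighbors
    scores = np.empty(d)
    for f in range(d):
        x = X[:, f]
        local = np.mean((x[:, None] - x[nn_idx]) ** 2) + 1e-12
        global_ = np.mean((x[:, None] - x[fn_idx]) ** 2)
        scores[f] = global_ / local
    return scores

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.normal(size=(200, 10))
    print(np.argsort(-local_global_scores(X))[:3])  # indices of top-3 features
```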


International Conference on Data Mining | 2013

Weighted-Object Ensemble Clustering

Yazhou Ren; Carlotta Domeniconi; Guoji Zhang; Guoxian Yu

Ensemble clustering, also known as consensus clustering, aims to generate a stable and robust clustering through the consolidation of multiple base clusterings. In recent years many ensemble clustering methods have been proposed, most of which treat each clustering and each object as equally important. Some approaches make use of weights associated with clusters, or with clusterings, when assembling the different base clusterings. Boosting algorithms developed for classification have also led to the idea of considering weighted objects during the clustering process. However, not much effort has been put towards incorporating weighted objects into the consensus process. To fill this gap, in this paper we propose an approach called Weighted-Object Ensemble Clustering (WOEC). We first estimate how difficult it is to cluster an object by constructing the co-association matrix that summarizes the base clustering results, and we then embed the corresponding information as weights associated with objects. We propose three different consensus techniques to leverage the weighted objects. All three reduce the ensemble clustering problem to a graph partitioning one. We present extensive experimental results which demonstrate that our WOEC approach outperforms state-of-the-art consensus clustering methods and is robust to parameter settings.
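
As a rough illustration of the first step, the sketch below builds the co-association matrix from a set of base clusterings and turns per-object ambiguity into weights; the concrete weighting formula is an assumption, and the paper's three consensus techniques are not shown here.

```python
# Hedged sketch: co-association matrix and per-object weights (WOEC-style).
import numpy as np

def co_association(base_labels):
    """base_labels: (m, n) integer array, m base clusterings of n objects.
    Returns the fraction of base clusterings that put each pair together."""
    m, n = base_labels.shape
    C = np.zeros((n, n))
    for labels in base_labels:
        C += (labels[:, None] == labels[None, :])
    return C / m

def object_weights(C):
    """Objects whose pairwise co-association values hover around 0.5 are
    ambiguous (hard to cluster); give them smaller weights."""
    ambiguity = (4 * C * (1 - C)).mean(axis=1)   # peaks when C == 0.5
    return 1.0 - ambiguity

if __name__ == "__main__":
    base = np.array([[0, 0, 1, 1], [0, 0, 1, 0], [1, 1, 0, 0]])
    C = co_association(base)
    print(object_weights(C))
```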


Knowledge and Information Systems | 2017

Weighted-object ensemble clustering: methods and analysis

Yazhou Ren; Carlotta Domeniconi; Guoji Zhang; Guoxian Yu

Ensemble clustering has attracted increasing attention in recent years. Its goal is to combine multiple base clusterings into a single consensus clustering of increased quality. Most of the existing ensemble clustering methods treat each base clustering and each object as equally important, while some approaches make use of weights associated with clusters or with clusterings when assembling the different base clusterings. Boosting algorithms developed for classification have led to the idea of considering weighted objects during the clustering process. However, not much effort has been put toward incorporating weighted objects into the consensus process. To fill this gap, in this paper, we propose a framework called Weighted-Object Ensemble Clustering (WOEC). We first estimate how difficult it is to cluster an object by constructing the co-association matrix that summarizes the base clustering results, and we then embed the corresponding information as weights associated with objects. We propose three different consensus techniques to leverage the weighted objects. All three reduce the ensemble clustering problem to a graph partitioning one. We experimentally demonstrate the gain in performance that our WOEC methodology achieves with respect to state-of-the-art ensemble clustering methods, as well as its stability and robustness.
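
Continuing the previous sketch, one simple way to reduce the weighted consensus step to graph partitioning is to scale the co-association matrix by the object weights and hand it to a spectral partitioner; this is only an illustrative stand-in, not one of the three consensus techniques analysed in the paper.

```python
# Illustrative consensus step: partition a weight-scaled co-association graph.
import numpy as np
from sklearn.cluster import SpectralClustering

def weighted_consensus(C, w, n_clusters):
    """C: (n, n) co-association matrix; w: (n,) per-object weights in [0, 1]."""
    A = C * np.outer(w, w)                 # down-weight edges touching hard objects
    np.fill_diagonal(A, 0.0)
    model = SpectralClustering(n_clusters=n_clusters, affinity="precomputed",
                               random_state=0)
    return model.fit_predict(A)            # consensus labels from graph partitioning
```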


Information Processing and Management | 2017

A balanced modularity maximization link prediction model in social networks

Jie-Hua Wu; Guoji Zhang; Yazhou Ren

Link prediction has become an important research topic due to the rapid growth of social networks. Community-based link prediction methods have been proposed to incorporate community information in order to achieve accurate prediction. However, the performance of such methods is sensitive to the selection of the community detection algorithm, and they also fail to capture the correlation between link formation and community evolution. In this paper we introduce a balanced Modularity-Maximization Link Prediction (MMLP) model to address this issue. The idea of MMLP is to integrate the formation of two types of links into a partitioned network generative model. We propose a probabilistic algorithm to emphasize the role of innerLinks, which correspondingly maximizes the network modularity. Then, a trade-off technique is designed to maintain the network in a stable state of equilibrium. We also present an effective feature aggregation method by exploring two variations of network features. Our proposed method can overcome the limitations of several community-based methods, and extensive experimental results on both synthetic and real-world benchmark data demonstrate its effectiveness and robustness.
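
The sketch below captures only the basic intuition that intra-community candidate links (innerLinks) should be favored: it detects communities by modularity maximization and boosts common-neighbor scores for within-community pairs. The generative model and the trade-off technique of MMLP are not reproduced; the boost factor is a made-up parameter.

```python
# Rough, hypothetical illustration of community-aware link scoring
# (inspired by MMLP; not the paper's generative model).
import networkx as nx

def community_aware_scores(G, boost=2.0):
    comms = nx.algorithms.community.greedy_modularity_communities(G)
    comm_of = {v: i for i, c in enumerate(comms) for v in c}
    scores = {}
    for u, v in nx.non_edges(G):
        cn = len(list(nx.common_neighbors(G, u, v)))   # baseline similarity
        if comm_of[u] == comm_of[v]:                   # candidate innerLink
            cn *= boost                                # favor intra-community links
        scores[(u, v)] = cn
    return scores

if __name__ == "__main__":
    G = nx.karate_club_graph()
    top = sorted(community_aware_scores(G).items(), key=lambda kv: -kv[1])[:5]
    print(top)
```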


Neurocomputing | 2018

Robust multi-view data clustering with multi-view capped-norm K-means

Shudong Huang; Yazhou Ren; Zenglin Xu

Real-world data sets are often comprised of multiple representations or views, which provide different and complementary aspects of information. Multi-view clustering is an important approach to analyzing multi-view data in an unsupervised way. Previous studies have shown that better clustering accuracy can be achieved using integrated information from all the views rather than just relying on each view individually. That is, the hidden patterns in data can be better explored by discovering the common latent structure shared by multiple views. However, traditional multi-view clustering methods are usually sensitive to noise and outliers, which greatly impair the clustering performance in practical problems. Furthermore, existing multi-view clustering methods, e.g. graph-based methods, have high computational complexity due to the kernel/affinity matrix construction or the eigendecomposition. To address these problems, we propose a novel robust multi-view clustering method to integrate heterogeneous representations of data. To make our method robust to noise and outliers, especially extreme data outliers, we utilize the capped-norm loss as the objective. The proposed method has low complexity, on the same level as the classic K-means algorithm, which is a major advantage for unsupervised learning. We derive a new efficient optimization algorithm to solve the multi-view clustering problem. Finally, extensive experiments on benchmark data sets show that our proposed method consistently outperforms the state-of-the-art clustering methods.
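
As a toy illustration of the capped-norm idea, here is a single-view K-means variant in which points whose distance to their centroid exceeds a cap are treated as outliers and excluded from the centroid update; the multi-view formulation and the optimization algorithm in the paper are considerably more involved.

```python
# Simplified single-view capped-norm K-means sketch (multi-view version omitted).
import numpy as np

def capped_kmeans(X, k, cap, n_iter=50, seed=0):
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), k, replace=False)]
    for _ in range(n_iter):
        d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = d.argmin(axis=1)
        dmin = d[np.arange(len(X)), labels]
        inlier = dmin <= cap                       # capped loss: outliers give no gradient
        for j in range(k):
            mask = (labels == j) & inlier
            if mask.any():
                centers[j] = X[mask].mean(axis=0)  # update centroid from inliers only
    return labels, centers

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    X = np.vstack([rng.normal(0, 1, (50, 2)), rng.normal(6, 1, (50, 2)),
                   rng.uniform(-20, 20, (5, 2))])  # last 5 rows act as outliers
    labels, centers = capped_kmeans(X, k=2, cap=4.0)
    print(centers)
```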


Journal of Information Processing Systems | 2017

Weighted Local Naive Bayes Link Prediction

Jie-Hua Wu; Guoji Zhang; Yazhou Ren; Xia-Yan Zhang; Qiao Yang

Weighted network link prediction is a challenging problem in complex network analysis. Unsupervised methods based on local structure are widely used to handle the predictive task. However, the results are still far from satisfactory, as most existing studies neglect two important points: common neighbors exert different degrees of influence on potential links, and the weighted values associated with links in the local structure also differ. In this paper, we adapt an effective link prediction model, the local naive Bayes model, to the weighted scenario to address this issue. Correspondingly, we propose a weighted local naive Bayes (WLNB) probabilistic link prediction framework. The main contribution is that a weighted clustering coefficient is incorporated, allowing our model to infer the weighted contribution in the prediction stage. In addition, WLNB can be applied to several classic similarity metrics. We evaluate WLNB on different kinds of real-world weighted datasets. Experimental results show that our proposed approach performs better (in terms of AUC and precision) than several alternative methods for link prediction in weighted complex networks.
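
To make the idea concrete, the hedged sketch below scores a candidate link by summing, over common neighbors, the incident edge weights scaled by an odds-like "role" term built from each neighbor's weighted clustering coefficient; the exact role function and combination used in WLNB are assumptions here.

```python
# Hedged sketch of weighted local-naive-Bayes-style link scoring.
# The role function based on the weighted clustering coefficient is an
# assumption standing in for the paper's exact formulation.
import math
import networkx as nx

def wlnb_style_score(G, u, v, eps=1e-6):
    """Sum over common neighbors z of the edge weights into z, scaled by the
    log-odds of z's weighted clustering coefficient (its 'role')."""
    wcc = nx.clustering(G, weight="weight")        # weighted clustering coefficients
    score = 0.0
    for z in nx.common_neighbors(G, u, v):
        w_uz = G[u][z].get("weight", 1.0)
        w_vz = G[v][z].get("weight", 1.0)
        role = math.log((wcc[z] + eps) / (1 - wcc[z] + eps))
        score += (w_uz + w_vz) * role
    return score
```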


European Conference on Machine Learning | 2014

Boosted mean shift clustering

Yazhou Ren; Uday Kamath; Carlotta Domeniconi; Guoji Zhang

Mean shift is a nonparametric clustering technique that does not require the number of clusters as input and can find clusters of arbitrary shapes. While appealing, the performance of the mean shift algorithm is sensitive to the selection of the bandwidth, and it can fail to capture the correct clustering structure when multiple modes exist in one cluster. DBSCAN is an efficient density-based clustering algorithm, but it is also sensitive to its parameters and typically merges overlapping clusters. In this paper we propose Boosted Mean Shift Clustering (BMSC) to address these issues. BMSC partitions the data across a grid and applies mean shift locally on the cells of the grid, each providing a number of intermediate modes (iModes). A mode-boosting technique is proposed to select points in denser regions iteratively, and DBSCAN is utilized to partition the obtained iModes iteratively. Our proposed BMSC can overcome the limitations of mean shift and DBSCAN, while preserving their desirable properties. Complexity analysis shows its potential to deal with large-scale data, and extensive experimental results on both synthetic and real benchmark data demonstrate its effectiveness and robustness to parameter settings.
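
The following sketch mirrors only the high-level pipeline: split the data over a grid, run mean shift per cell to obtain intermediate modes (iModes), then cluster the iModes with DBSCAN. The iterative mode-boosting step is omitted, and the equal-width grid binning is a simplifying assumption.

```python
# Simplified BMSC-style pipeline sketch (mode boosting omitted).
import numpy as np
from sklearn.cluster import MeanShift, DBSCAN

def bmsc_sketch(X, grid=3, bandwidth=None, eps=0.5):
    # 1. Partition the data across a grid (equal-width bins on each axis).
    lo, hi = X.min(axis=0), X.max(axis=0)
    cell_ids = np.floor((X - lo) / (hi - lo + 1e-12) * grid).clip(0, grid - 1)
    keys = [tuple(row) for row in cell_ids.astype(int)]

    # 2. Run mean shift locally in each sufficiently populated cell; collect iModes.
    imodes = []
    for key in set(keys):
        mask = np.array([k == key for k in keys])
        if mask.sum() >= 5:
            ms = MeanShift(bandwidth=bandwidth).fit(X[mask])
            imodes.append(ms.cluster_centers_)
    imodes = np.vstack(imodes)

    # 3. Partition the iModes with DBSCAN; their labels define the final clusters.
    return imodes, DBSCAN(eps=eps, min_samples=2).fit_predict(imodes)
```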


International Conference on Machine Learning and Cybernetics | 2011

Random subspace based semi-supervised feature selection

Yazhou Ren; Guoji Zhang; Guoxian Yu

Feature selection is important in data mining, especially in mining high-dimensional data. In this paper, a random subspace based semi-supervised feature selection (RSSSFS) method with pairwise constraints is proposed. First, several graphs are constructed from different random subspaces of the samples; RSSSFS then combines these graphs into a mixture graph on which it performs feature selection. The RSSSFS score reflects both the locality preserving power and the pairwise constraints. We compare RSSSFS with the Laplacian Score and Constraint Score algorithms. Experimental results on several UCI data sets demonstrate its effectiveness.
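
A bare-bones illustration of the mixture-graph step appears below: build a kNN affinity graph on each random feature subspace and average them. The semi-supervised score that combines locality preservation with pairwise constraints is not shown, and the subspace sizes are arbitrary choices.

```python
# Minimal sketch of building a mixture graph from random feature subspaces
# (the RSSSFS scoring step with pairwise constraints is omitted).
import numpy as np
from sklearn.neighbors import kneighbors_graph

def mixture_graph(X, n_subspaces=10, subspace_dim=5, k=5, seed=0):
    rng = np.random.default_rng(seed)
    n, d = X.shape
    W = np.zeros((n, n))
    for _ in range(n_subspaces):
        feats = rng.choice(d, size=min(subspace_dim, d), replace=False)
        A = kneighbors_graph(X[:, feats], n_neighbors=k, mode="connectivity")
        W += A.toarray()                 # accumulate subspace-specific kNN graphs
    W /= n_subspaces
    return np.maximum(W, W.T)            # symmetrized mixture graph
```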


International Joint Conference on Artificial Intelligence | 2017

Robust softmax regression for multi-class classification with self-paced learning

Yazhou Ren; Peng Zhao; Yongpan Sheng; Dezhong Yao; Zenglin Xu

Softmax regression, a generalization of logistic regression (LR) to the setting of multi-class classification, has been widely used in many machine learning applications. However, the performance of softmax regression is extremely sensitive to the presence of noisy data and outliers. To address this issue, we propose a robust softmax regression (RoSR) model for multi-class classification that originates from the self-paced learning (SPL) paradigm. Concretely, RoSR, equipped with a soft weighting scheme, is able to evaluate the importance of each data instance. Data instances then participate in the classification problem according to their weights. In this way, the influence of noisy data and outliers (which typically receive small weights) can be significantly reduced. However, standard SPL may suffer from the imbalanced class influence problem, where some classes may have little influence in the training process if their instances are insensitive to the loss. To alleviate this problem, we design two novel soft weighting schemes that assign weights and select instances locally for each class. Experimental results demonstrate the effectiveness of the proposed methods.
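
To make the class-local weighting concrete, here is a small numpy sketch of a soft self-paced weighting applied separately within each class, so that no class is starved of training instances; the specific linear weighting and quantile-based threshold are assumptions, not the paper's two schemes.

```python
# Hypothetical sketch of class-local soft self-paced weights.
import numpy as np

def classwise_soft_weights(losses, labels, quantile=0.7):
    """Linear soft SPL weights w_i = max(0, 1 - loss_i / lambda_c), where the
    age parameter lambda_c is set per class from that class's loss quantile."""
    weights = np.zeros_like(losses, dtype=float)
    for c in np.unique(labels):
        idx = labels == c
        lam_c = np.quantile(losses[idx], quantile)   # class-local threshold
        weights[idx] = np.clip(1.0 - losses[idx] / (lam_c + 1e-12), 0.0, 1.0)
    return weights

if __name__ == "__main__":
    losses = np.array([0.1, 0.5, 2.0, 0.2, 3.0, 0.3])
    labels = np.array([0, 0, 0, 1, 1, 1])
    print(classwise_soft_weights(losses, labels))
```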


International Conference on Neural Information Processing | 2017

Semi-supervised Multi-label Linear Discriminant Analysis

Yanming Yu; Guoxian Yu; Xia Chen; Yazhou Ren

Multi-label dimensionality reduction methods often require sufficient labeled samples and ignore the abundant unlabeled ones. To leverage abundant unlabeled samples and scarce labeled ones, we introduce a method called Semi-supervised Multi-label Linear Discriminant Analysis (SMLDA). SMLDA measures the dependence between pairwise samples in the original space and in the projected subspace to utilize unlabeled samples. It then optimizes the target projective matrix by minimizing the distance among within-class samples while maximizing the distance among between-class samples and the dependence term. Extensive empirical studies on multi-label datasets show that SMLDA outperforms other related methods across various evaluation metrics, and that the dependence term is an effective alternative to the widely used smoothness term.
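
One standard way to measure dependence between the original space and a projected subspace is the empirical HSIC with linear kernels, sketched below; whether SMLDA uses exactly this estimator is an assumption, and the code only illustrates the kind of dependence term being maximized.

```python
# Hedged sketch: empirical HSIC between the original data and its projection,
# the kind of dependence term a method like SMLDA can maximize (linear kernels).
import numpy as np

def hsic(X, Y):
    """Empirical HSIC with linear kernels: trace(K H L H) / (n - 1)^2."""
    n = X.shape[0]
    H = np.eye(n) - np.ones((n, n)) / n        # centering matrix
    K, L = X @ X.T, Y @ Y.T
    return np.trace(K @ H @ L @ H) / (n - 1) ** 2

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.normal(size=(50, 8))
    P = rng.normal(size=(8, 3))                # a candidate projection matrix
    print(hsic(X, X @ P))                      # dependence between X and its projection
```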

Collaboration


Dive into Yazhou Ren's collaborations.

Top Co-Authors

Guoji Zhang, South China University of Technology
Zenglin Xu, University of Electronic Science and Technology of China
Jie-Hua Wu, South China University of Technology
Dezhong Yao, University of Electronic Science and Technology of China
Jun Wang, Southwest University
Peng Zhao, University of Electronic Science and Technology of China
Xia-Yan Zhang, South China University of Technology
Shudong Huang, University of Electronic Science and Technology of China