Publications


Featured research published by Zenglin Xu.


IEEE Transactions on Neural Networks | 2010

Discriminative Semi-Supervised Feature Selection Via Manifold Regularization

Zenglin Xu; Irwin King; Michael R. Lyu; Rong Jin

Feature selection has attracted a huge amount of interest in both research and application communities of data mining. We consider the problem of semi-supervised feature selection, where we are given a small amount of labeled examples and a large amount of unlabeled examples. Since a small number of labeled samples are usually insufficient for identifying the relevant features, the critical problem arising from semi-supervised feature selection is how to take advantage of the information underneath the unlabeled data. To address this problem, we propose a novel discriminative semi-supervised feature selection method based on the idea of manifold regularization. The proposed approach selects features through maximizing the classification margin between different classes and simultaneously exploiting the geometry of the probability distribution that generates both labeled and unlabeled data. In comparison with previous semi-supervised feature selection algorithms, our proposed semi-supervised feature selection method is an embedded feature selection method and is able to find more discriminative features. We formulate the proposed feature selection method into a convex-concave optimization problem, where the saddle point corresponds to the optimal solution. To find the optimal solution, the level method, a fairly recent optimization method, is employed. We also present a theoretic proof of the convergence rate for the application of the level method to our problem. Empirical evaluation on several benchmark data sets demonstrates the effectiveness of the proposed semi-supervised feature selection method.
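As a rough illustration of the idea only (not the paper's convex-concave formulation or its level-method solver), the sketch below scores each feature by combining a supervised separation term on the labeled data with a manifold-smoothness term from a graph Laplacian built over labeled and unlabeled data. All function names and parameters here are hypothetical.

```python
# A minimal sketch, assuming binary labels in {+1, -1}; not the paper's method.
import numpy as np

def laplacian(X, k=5):
    """Unnormalized graph Laplacian of a kNN similarity graph over all samples."""
    n = X.shape[0]
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    W = np.zeros((n, n))
    for i in range(n):
        nbrs = np.argsort(d2[i])[1:k + 1]          # skip the point itself
        W[i, nbrs] = np.exp(-d2[i, nbrs] / d2.mean())
    W = np.maximum(W, W.T)                          # symmetrize
    return np.diag(W.sum(1)) - W

def semi_supervised_feature_scores(X_lab, y_lab, X_unlab, alpha=0.5):
    """Higher score = more discriminative and smoother along the data manifold."""
    X_all = np.vstack([X_lab, X_unlab])
    L = laplacian(X_all)
    scores = np.zeros(X_all.shape[1])
    for j in range(X_all.shape[1]):
        f_lab, f_all = X_lab[:, j], X_all[:, j]
        # supervised term: between-class separation of this single feature
        mu_pos, mu_neg = f_lab[y_lab == 1].mean(), f_lab[y_lab == -1].mean()
        margin = (mu_pos - mu_neg) ** 2 / (f_lab.var() + 1e-12)
        # unsupervised term: smoothness of the feature over the kNN graph
        smooth = f_all @ L @ f_all / (f_all @ f_all + 1e-12)
        scores[j] = margin - alpha * smooth
    return scores
```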


IEEE Transactions on Neural Networks | 2011

Efficient Sparse Generalized Multiple Kernel Learning

Haiqin Yang; Zenglin Xu; Jieping Ye; Irwin King; Michael R. Lyu

Kernel methods have been successfully applied in various applications. To succeed in these applications, it is crucial to learn a good kernel representation, whose objective is to reveal the data similarity precisely. In this paper, we address the problem of multiple kernel learning (MKL), searching for the optimal kernel combination weights through maximizing a generalized performance measure. Most MKL methods employ the L1-norm simplex constraint on the kernel combination weights, which therefore involves a sparse but non-smooth solution for the kernel weights. Despite their efficiency, they tend to discard informative complementary or orthogonal base kernels and yield degenerated generalization performance. Alternatively, imposing the L2-norm constraint on the kernel weights keeps all the information in the base kernels. This leads to non-sparse solutions and brings the risk of being sensitive to noise and incorporating redundant information. To tackle these problems, we propose a generalized MKL (GMKL) model by introducing an elastic-net-type constraint on the kernel weights. More specifically, it is an MKL model with a constraint on a linear combination of the L1-norm and the squared L2-norm on the kernel weights to seek the optimal kernel combination weights. Therefore, previous MKL problems based on the L1-norm or the L2-norm constraints can be regarded as special cases. Furthermore, our GMKL enjoys the favorable sparsity property of the solution and also facilitates the grouping effect. Moreover, the optimization of our GMKL is a convex optimization problem, where a local solution is the global optimum. We further derive a level method to efficiently solve the optimization problem. A series of experiments on both synthetic and real-world datasets have been conducted to show the effectiveness and efficiency of our GMKL.
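A toy sketch of elastic-net-constrained kernel combination, under assumptions that differ from the paper: weights are learned by maximizing kernel-target alignment with a simple penalty rather than the paper's generalized performance measure and level method. The parameter theta below (hypothetical) interpolates between the sparse L1 and non-sparse squared-L2 regimes.

```python
# A minimal sketch, not the paper's GMKL solver.
import numpy as np
from scipy.optimize import minimize

def combine(kernels, w):
    """Weighted sum of base kernel matrices."""
    return sum(wi * K for wi, K in zip(w, kernels))

def learn_weights(kernels, y, lam=1.0, theta=0.5):
    """theta=1 ~ pure L1 (sparse), theta=0 ~ pure squared L2 (non-sparse)."""
    Y = np.outer(y, y)                                   # ideal target kernel
    def objective(w):
        K = combine(kernels, w)
        align = np.sum(K * Y) / (np.linalg.norm(K) * np.linalg.norm(Y) + 1e-12)
        penalty = theta * np.abs(w).sum() + (1 - theta) * np.dot(w, w)
        return -align + lam * penalty                    # maximize alignment
    w0 = np.full(len(kernels), 1.0 / len(kernels))
    bounds = [(0, None)] * len(kernels)                  # non-negative weights
    return minimize(objective, w0, bounds=bounds, method="L-BFGS-B").x
```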


IEEE Transactions on Pattern Analysis and Machine Intelligence | 2015

Bayesian Nonparametric Models for Multiway Data Analysis

Zenglin Xu; Feng Yan; Yuan Qi

Tensor decomposition is a powerful computational tool for multiway data analysis. Many popular tensor decomposition approaches, such as the Tucker decomposition and CANDECOMP/PARAFAC (CP), amount to multi-linear factorization. They are insufficient to model (i) complex interactions between data entities, (ii) various data types (e.g., missing data and binary data), and (iii) noisy observations and outliers. To address these issues, we propose tensor-variate latent nonparametric Bayesian models for multiway data analysis. We name these models InfTucker. These new models essentially conduct Tucker decomposition in an infinite feature space. Unlike classical tensor decomposition models, our new approaches handle both continuous and binary data in a probabilistic framework. Unlike previous Bayesian models on matrices and tensors, our models are based on latent Gaussian or t processes with nonlinear covariance functions. Moreover, on network data, our models reduce to nonparametric stochastic blockmodels and can be used to discover latent groups and predict missing interactions. To learn the models efficiently from data, we develop a variational inference technique and explore properties of the Kronecker product for computational efficiency. Compared with a classical variational implementation, this technique reduces both time and space complexities by several orders of magnitude. On real multiway and network data, our new models achieved significantly higher prediction accuracy than state-of-the-art tensor decomposition methods and blockmodels.
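For context, here is a minimal finite, non-Bayesian Tucker decomposition via HOSVD, i.e., the classical multi-linear factorization that InfTucker generalizes; the Bayesian nonparametric machinery of the paper is not shown.

```python
# A baseline sketch: Tucker decomposition via higher-order SVD (HOSVD).
import numpy as np

def unfold(T, mode):
    return np.moveaxis(T, mode, 0).reshape(T.shape[mode], -1)

def hosvd(T, ranks):
    """Return core tensor G and factors U so that T ~ G x_1 U1 x_2 U2 x_3 U3."""
    factors = []
    for mode, r in enumerate(ranks):
        U, _, _ = np.linalg.svd(unfold(T, mode), full_matrices=False)
        factors.append(U[:, :r])
    G = T
    for mode, U in enumerate(factors):
        # mode-n product with U^T shrinks each mode to its target rank
        G = np.moveaxis(np.tensordot(U.T, np.moveaxis(G, mode, 0), axes=1), 0, mode)
    return G, factors

# usage: reconstruct and check the approximation error
T = np.random.rand(6, 5, 4)
G, Us = hosvd(T, ranks=(3, 3, 3))
R = G
for mode, U in enumerate(Us):
    R = np.moveaxis(np.tensordot(U, np.moveaxis(R, mode, 0), axes=1), 0, mode)
print("relative error:", np.linalg.norm(T - R) / np.linalg.norm(T))
```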


european conference on computer vision | 2008

An Effective Approach to 3D Deformable Surface Tracking

Jianke Zhu; Steven C. H. Hoi; Zenglin Xu; Michael R. Lyu

The key challenge with 3D deformable surface tracking arises from the difficulty in estimating a large number of 3D shape parameters from noisy observations. A recent state-of-the-art approach attacks this problem by formulating it as a Second Order Cone Programming (SOCP) feasibility problem. The main drawback of this solution is the high computational cost. In this paper, we first reformulate the problem into an unconstrained quadratic optimization problem. Instead of handling a large set of complicated SOCP constraints, our new formulation can be solved very efficiently by resolving a set of sparse linear equations. Based on the new framework, a robust iterative method is employed to handle large outliers. We have conducted an extensive set of experiments to evaluate the performance on both synthetic and real-world testbeds, from which the promising results show that the proposed algorithm not only achieves better tracking accuracy, but also executes significantly faster than the previous solution.
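A generic sketch of the "sparse linear equations plus robust reweighting" idea: the paper's actual objective couples mesh vertices with image correspondences, which is not modeled here; this is just iteratively reweighted least squares on an arbitrary sparse system.

```python
# A generic sketch, assuming A is a scipy sparse matrix and b a dense vector.
import numpy as np
import scipy.sparse as sp
from scipy.sparse.linalg import spsolve

def irls(A, b, iters=10, c=1.0):
    """Minimize a Huber-like robust loss on ||Ax - b|| via sparse normal equations."""
    x = spsolve((A.T @ A).tocsc(), A.T @ b)          # ordinary least-squares start
    for _ in range(iters):
        r = A @ x - b
        w = 1.0 / np.maximum(np.abs(r) / c, 1.0)     # down-weight large residuals
        W = sp.diags(w)
        x = spsolve((A.T @ W @ A).tocsc(), A.T @ (w * b))
    return x
```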


international world wide web conferences | 2007

Web page classification with heterogeneous data fusion

Zenglin Xu; Irwin King; Michael R. Lyu

Web pages are more than text: they contain much contextual and structural information, e.g., the title, the meta data, the anchor text, etc., each of which can be seen as a data source or a representation. Due to the different dimensionality and different representing forms of these heterogeneous data sources, simply putting them together would not greatly enhance the classification performance. We observe that, via a kernel function, different dimensions and types of data sources can be represented in a common format of kernel matrix, which can be seen as a generalized similarity measure between a pair of web pages. In this sense, a kernel learning approach is employed to fuse these heterogeneous data sources. The experimental results on a collection from the ODP database validate the advantages of the proposed method over traditional methods based on any single data source and the uniformly weighted combination of them.
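A simple sketch of the fusion idea: build one kernel matrix per heterogeneous source (title, body, anchor text), combine them into a single kernel, and train an SVM on the combined kernel. The example pages below are made up, and uniform weights are used for simplicity; the paper instead learns the combination with a kernel-learning approach.

```python
# A minimal sketch with hypothetical toy data; not the paper's learned fusion.
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.svm import SVC

def source_kernel(texts):
    X = TfidfVectorizer().fit_transform(texts)
    return (X @ X.T).toarray()                      # linear kernel on TF-IDF

def fused_kernel(sources, weights=None):
    kernels = [source_kernel(s) for s in sources]
    weights = weights or [1.0 / len(kernels)] * len(kernels)
    return sum(w * K for w, K in zip(weights, kernels))

# usage with tiny made-up pages (titles, bodies, anchor texts are hypothetical)
titles  = ["python tutorial", "football scores", "numpy guide", "match report"]
bodies  = ["learn to code in python", "latest league results",
           "arrays in numpy", "goals and highlights"]
anchors = ["coding", "sports", "coding", "sports"]
y = np.array([0, 1, 0, 1])

K = fused_kernel([titles, bodies, anchors])
clf = SVC(kernel="precomputed").fit(K, y)
print(clf.predict(K))
```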


Neural Networks | 2009

A novel kernel-based maximum a posteriori classification method

Zenglin Xu; Kaizhu Huang; Jianke Zhu; Irwin King; Michael R. Lyu

Kernel methods have been widely used in pattern recognition. Many kernel classifiers, such as Support Vector Machines (SVM), assume that data can be separated by a hyperplane in the kernel-induced feature space. These methods do not consider the data distribution and cannot easily output probabilities or confidences for classification. This paper proposes a novel Kernel-based Maximum A Posteriori (KMAP) classification method, which makes a Gaussian distribution assumption instead of a linear separability assumption in the feature space. Robust methods are further proposed to estimate the probability densities, and the kernel trick is utilized to calculate our model. The model is theoretically and empirically important in the sense that: (1) it presents a more generalized classification model than other kernel-based algorithms, e.g., Kernel Fisher Discriminant Analysis (KFDA); (2) it can output probability or confidence for classification, therefore providing potential for reasoning under uncertainty; and (3) multi-way classification is as straightforward as binary classification in this model, because only probability calculation is involved and no one-against-one or one-against-others voting is needed. Moreover, we conduct an extensive experimental comparison with state-of-the-art classification methods, such as SVM and KFDA, on eight UCI benchmark data sets and three face data sets. The results demonstrate that KMAP achieves very promising performance against other models.
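An illustrative simplification, not the paper's exact kernel-trick derivation: embed the data with kernel PCA, fit one Gaussian per class in the embedded space, and classify by maximum a posteriori probability. The class name SimpleKMAP and its parameters are hypothetical; this only conveys the "Gaussian assumption in the kernel-induced feature space" idea.

```python
# A sketch assuming enough samples per class to estimate a covariance matrix.
import numpy as np
from sklearn.decomposition import KernelPCA
from scipy.stats import multivariate_normal

class SimpleKMAP:
    def fit(self, X, y, n_components=5, gamma=1.0):
        self.kpca = KernelPCA(n_components=n_components, kernel="rbf", gamma=gamma)
        Z = self.kpca.fit_transform(X)
        self.classes_, counts = np.unique(y, return_counts=True)
        self.priors_ = counts / len(y)
        self.gaussians_ = [
            multivariate_normal(Z[y == c].mean(0),
                                np.cov(Z[y == c].T) + 1e-6 * np.eye(Z.shape[1]))
            for c in self.classes_
        ]
        return self

    def predict_proba(self, X):
        Z = self.kpca.transform(X)
        lik = np.column_stack([g.pdf(Z) for g in self.gaussians_]) * self.priors_
        return lik / lik.sum(1, keepdims=True)        # posterior per class

    def predict(self, X):
        return self.classes_[self.predict_proba(X).argmax(1)]
```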


international conference on supercomputing | 2016

BLASX: A High Performance Level-3 BLAS Library for Heterogeneous Multi-GPU Computing

Linnan Wang; Wei Wu; Zenglin Xu; Jianxiong Xiao; Yi Yang

Basic Linear Algebra Subprograms (BLAS) are a set of low-level linear algebra kernels widely adopted by applications in deep learning and scientific computing. The massive and economical computing power brought forth by emerging GPU architectures drives interest in implementations of compute-intensive level-3 BLAS on multi-GPU systems. In this paper, we investigate existing multi-GPU level-3 BLAS implementations and show that 1) issues such as improper load balancing, inefficient communication, and insufficient GPU stream-level concurrency and data caching impede current implementations from fully harnessing heterogeneous computing resources, and 2) inter-GPU Peer-to-Peer (P2P) communication remains unexplored. We then present BLASX: a highly optimized multi-GPU level-3 BLAS library. We adopt the algorithms-by-tiles approach, treating a matrix tile as the basic data unit and operations on tiles as basic tasks. Tasks are guided by a dynamic asynchronous runtime that is cache and locality aware. The communication cost under BLASX becomes trivial as it perfectly overlaps communication and computation across multiple streams during asynchronous task progression. It also takes the current tile cache scheme one step further by proposing an innovative two-level hierarchical tile cache that takes advantage of inter-GPU P2P communication. As a result, linear speedup is observable with BLASX under multi-GPU configurations, and extensive benchmarks demonstrate that BLASX consistently outperforms leading industrial and academic implementations such as cuBLAS-XT, SuperMatrix, and MAGMA.
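A host-side sketch of the algorithms-by-tiles decomposition, with no GPUs, streams, or tile caches: C = A @ B is split into independent tile-level updates, which is the unit of work a runtime like BLASX would schedule across devices.

```python
# A sketch of tiled GEMM; the scheduling and caching of BLASX are not modeled.
import numpy as np

def tiled_gemm(A, B, tile=256):
    m, k = A.shape
    k2, n = B.shape
    assert k == k2
    C = np.zeros((m, n))
    # each (i, j, p) triple below is one schedulable task on a tile of C
    for i in range(0, m, tile):
        for j in range(0, n, tile):
            for p in range(0, k, tile):
                C[i:i+tile, j:j+tile] += A[i:i+tile, p:p+tile] @ B[p:p+tile, j:j+tile]
    return C

A, B = np.random.rand(512, 384), np.random.rand(384, 640)
assert np.allclose(tiled_gemm(A, B, tile=128), A @ B)
```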


conference on information and knowledge management | 2008

Semi-supervised text categorization by active search

Zenglin Xu; Rong Jin; Kaizhu Huang; Michael R. Lyu; Irwin King

In automated text categorization, given a small number of labeled documents, it is very challenging, if not impossible, to build a reliable classifier that achieves high classification accuracy. To address this problem, a novel web-assisted text categorization framework is proposed in this paper. Important keywords are first automatically identified from the available labeled documents to form queries. Search engines are then utilized to retrieve from the Web a multitude of relevant documents, which are then exploited by a semi-supervised framework. To the best of our knowledge, this work is the first study of this kind. An extensive experimental study shows the encouraging results of the proposed text categorization framework: using Google as the web search engine, the proposed framework is able to reduce the classification error by 30% compared with the state-of-the-art supervised text categorization method.
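A sketch of only the query-construction step: the highest TF-IDF keywords of each class's labeled documents form a web query. The retrieval call to a search engine and the downstream semi-supervised training are not shown; the function name and toy documents below are hypothetical.

```python
# A minimal sketch of keyword-based query construction; not the full framework.
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer

def build_queries(docs, labels, top_k=3):
    vec = TfidfVectorizer(stop_words="english")
    X = vec.fit_transform(docs)
    vocab = np.array(vec.get_feature_names_out())
    queries = {}
    for c in set(labels):
        mask = np.array([l == c for l in labels])
        scores = np.asarray(X[mask].mean(axis=0)).ravel()   # mean TF-IDF per term
        queries[c] = " ".join(vocab[np.argsort(scores)[::-1][:top_k]])
    return queries

docs = ["stocks fell sharply on wall street",
        "the striker scored twice in the final"]
labels = ["finance", "sports"]
print(build_queries(docs, labels))
# the returned query strings would then be sent to a search engine to retrieve
# additional unlabeled documents for the semi-supervised learner
```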


international symposium on neural networks | 2009

Supervised Self-taught Learning: Actively transferring knowledge from unlabeled data

Kaizhu Huang; Zenglin Xu; Irwin King; Michael R. Lyu; Colin Campbell



Data Mining and Knowledge Discovery | 2018

Robust graph regularized nonnegative matrix factorization for clustering

Shudong Huang; Hongjun Wang; Tao Li; Tianrui Li; Zenglin Xu


Collaboration


Dive into Zenglin Xu's collaborations.

Top Co-Authors

Michael R. Lyu
The Chinese University of Hong Kong

Irwin King
The Chinese University of Hong Kong

Yazhou Ren
University of Electronic Science and Technology of China

Bin Liu
University of Electronic Science and Technology of China

Kaizhu Huang
Xi'an Jiaotong-Liverpool University