Publication


Featured research published by Jianhua Xu.


International Symposium on Neural Networks | 2001

Kernel MSE algorithm: a unified framework for KFD, LS-SVM and KRR

Jianhua Xu; Xuegong Zhang; Yanda Li

We generalize the conventional minimum squared error (MSE) method into a new nonlinear learning machine by using the kernel idea and adding different regularization terms. We name it the kernel minimum squared error (KMSE) algorithm; it can handle linear and nonlinear classification and regression problems. With proper choices of output coding schemes and regularization terms, we prove that KMSE is identical to the kernel Fisher discriminant (KFD) up to an unimportant scale factor, and is directly equivalent to the least squares version of the support vector machine (LS-SVM). For continuous real output values, KMSE is kernel ridge regression (KRR) with a bias. KMSE therefore acts as a general framework that includes KFD, LS-SVM and KRR as particular cases. In addition, we simplify the formula for estimating the projection direction of KFD. Numerical experiments on artificial and real-world data sets demonstrate that KMSE is a class of powerful kernel learning machines.
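The regression case described above (KRR with a bias) can be sketched in a few lines: build a kernel Gram matrix, append a constant column for the bias, and solve a regularized least-squares system for the dual coefficients. The RBF kernel and the `gamma`/`lam` values below are illustrative choices, not the paper's exact setup.

```python
import numpy as np

def _rbf(A, B, gamma):
    # K[i, j] = exp(-gamma * ||a_i - b_j||^2)
    sa = np.sum(A**2, axis=1)[:, None]
    sb = np.sum(B**2, axis=1)[None, :]
    return np.exp(-gamma * (sa + sb - 2 * A @ B.T))

def kmse_fit(X, y, gamma=1.0, lam=1e-3):
    """KMSE-style fit: ridge-regularized least squares in kernel space,
    with an explicit bias term appended to the Gram matrix."""
    n = len(X)
    K = _rbf(X, X, gamma)
    A = np.hstack([K, np.ones((n, 1))])           # [K | 1] for the bias
    w = np.linalg.solve(A.T @ A + lam * np.eye(n + 1), A.T @ y)
    return w[:n], w[n]                            # dual coefficients, bias

def kmse_predict(alpha, b, X_train, gamma, X_new):
    return _rbf(X_new, X_train, gamma) @ alpha + b
```

For classification, encode the targets as +/-1 and take the sign of the prediction; with real-valued targets the same code behaves as kernel ridge regression with a bias.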


Pattern Recognition | 2013

Fast multi-label core vector machine

Jianhua Xu

The existing multi-label support vector machine (Rank-SVM) has an extremely high computational complexity due to the large number of variables in its quadratic program. Even when the Frank-Wolfe (FW) method is applied, a large-scale linear program still has to be solved at every iteration. It is therefore highly desirable to design and implement a new, efficient SVM-type multi-label algorithm. The binary core vector machine (CVM), a variant of the traditional SVM, is formulated as a quadratic program with a unit simplex constraint, for which each linear program in FW has an analytical solution. In this paper, we combine Rank-SVM with CVM to construct a novel SVM-type multi-label classifier (Rank-CVM) that takes the same optimization form as the binary CVM. At each FW iteration there exist an analytical solution and step size, along with several useful recursive formulae for the proxy solution, gradient vector, and objective function value, all of which greatly reduce computational cost. An experimental study on nine benchmark data sets shows that, while Rank-CVM performs statistically as well as its rival Rank-SVM on five performance measures, our method runs on average about 13 times faster and has fewer support vectors than Rank-SVM in the training phase in a C/C++ environment.
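The key property exploited above is that, over a unit simplex, the Frank-Wolfe linear subproblem has a closed-form solution: the vertex whose gradient coordinate is smallest. A minimal sketch for a generic quadratic objective (the actual Rank-CVM matrix and recursions are more involved):

```python
import numpy as np

def frank_wolfe_simplex(Q, c, iters=200):
    """Minimize f(x) = 0.5 x^T Q x - c^T x over the unit simplex
    {x : x >= 0, sum(x) = 1}, Q symmetric positive semidefinite."""
    n = len(c)
    x = np.ones(n) / n                       # start at the simplex centre
    for _ in range(iters):
        grad = Q @ x - c
        j = int(np.argmin(grad))             # analytical LP solution: vertex e_j
        d = -x.copy()
        d[j] += 1.0                          # direction towards that vertex
        denom = d @ Q @ d
        # exact line search for a quadratic objective, clipped to [0, 1]
        step = 1.0 if denom <= 0 else min(1.0, max(0.0, -(grad @ d) / denom))
        x += step * d
    return x
```

Every iterate stays feasible by construction, since each update is a convex combination of the current point and a simplex vertex.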


Neurocomputing | 2011

An extended one-versus-rest support vector machine for multi-label classification

Jianhua Xu

The hybrid strategy, which generalizes a specific single-label algorithm while applying one or two data decomposition tricks implicitly or explicitly, has become an effective and efficient tool for designing and implementing multi-label classification algorithms. In this paper, we extend the traditional binary support vector machine by introducing an approximate ranking loss as its empirical loss term, building a novel support vector machine for multi-label classification. This results in a quadratic programming problem with different upper bounds on the variables to characterize the label correlations of individual instances. Further, our optimization problem can be solved by combining the one-versus-rest data decomposition trick with a modified binary support vector machine, which dramatically reduces computational cost. An experimental study on ten multi-label data sets illustrates that our method is a powerful candidate for multi-label classification, compared with four state-of-the-art multi-label classification approaches.


BMC Bioinformatics | 2014

Prediction of piRNAs using transposon interaction and a support vector machine

Kai Wang; Chun Liang; Jinding Liu; Huamei Xiao; Shuiqing Huang; Jianhua Xu; Fei Li

Background: Piwi-interacting RNAs (piRNAs) are a class of small non-coding RNAs, primarily expressed in germ cells, that can silence transposons at the post-transcriptional level. Accurate prediction of piRNAs remains a significant challenge.

Results: We developed a program for piRNA annotation (Piano) using piRNA-transposon interaction information. We downloaded 13,848 Drosophila piRNAs and 261,500 Drosophila transposons. The piRNAs were aligned to transposons with a maximum of three mismatches, and piRNA-transposon interactions were then predicted by RNAplex. Triplet elements combining structure and sequence information were extracted from piRNA-transposon matching/pairing duplexes. A support vector machine (SVM) was trained on these triplet elements to classify real and pseudo piRNAs, achieving 95.3 ± 0.33% accuracy and 96.0 ± 0.5% sensitivity. The SVM classifier also correctly predicts human, mouse and rat piRNAs, with an overall accuracy of 90.6%. We used Piano to predict piRNAs for the rice stem borer, Chilo suppressalis, an important rice insect pest that causes huge yield losses; 82,639 piRNAs were predicted in C. suppressalis.

Conclusions: Piano demonstrates excellent piRNA prediction performance by using both structure and sequence features of transposon-piRNA interactions. Piano is freely available to the academic community at http://ento.njau.edu.cn/Piano.html.
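The "triplet element" idea can be illustrated with a toy feature extractor: combine the paired/unpaired status of three adjacent positions (from a dot-bracket structure string) with the middle nucleotide, giving 32 possible combinations. This is only a sketch of the general triplet-element encoding; Piano's real features are derived from RNAplex piRNA-transposon duplexes, not from a single hairpin structure.

```python
from itertools import product

def triplet_features(seq, struct):
    """Normalized 32-dimensional triplet-element vector.
    `struct` is a dot-bracket string; '(' and ')' are both collapsed
    to 'x' (paired), '.' means unpaired."""
    keys = ["".join(p) + n for p in product(".x", repeat=3) for n in "ACGU"]
    counts = dict.fromkeys(keys, 0)
    s = struct.replace("(", "x").replace(")", "x")
    for i in range(1, len(seq) - 1):
        key = s[i - 1:i + 2] + seq[i]     # 3-position pattern + middle base
        if key in counts:                  # skip non-ACGU characters
            counts[key] += 1
    total = sum(counts.values()) or 1
    return [counts[k] / total for k in keys]
```

Vectors produced this way can be fed to any off-the-shelf SVM implementation as fixed-length inputs.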


Expert Systems With Applications | 2012

An efficient multi-label support vector machine with a zero label

Jianhua Xu

The existing multi-label support vector machine (Rank-SVM) has an extremely high computational complexity and lacks an intrinsic zero point for determining relevant labels. In this paper, we propose a novel support vector machine for multi-label classification by both simplifying Rank-SVM and adding a zero label, resulting in a quadratic programming problem in which each class has an independent equality constraint. When the Frank-Wolfe method is used to solve our quadratic programming problem iteratively, the entire linear programming problem at each step divides into a series of sub-problems, which dramatically reduces computational cost. For the well-known Yeast data set, our training procedure runs about 12 times faster than Rank-SVM does in a C++ environment. Experiments on five benchmark data sets show that our method is a powerful candidate for multi-label classification, compared with five state-of-the-art multi-label classification techniques.


Pattern Recognition | 2014

Multi-label core vector machine with a zero label

Jianhua Xu

The multi-label core vector machine (Rank-CVM) is an efficient and effective algorithm for multi-label classification, but two aspects can still be improved: training and testing costs can be reduced further, and relevant labels can be detected more effectively. In this paper, we extend Rank-CVM by adding a zero label to construct a variant, Rank-CVMz, which is formulated as the same quadratic programming form as Rank-CVM, with a unit simplex constraint and non-negativity constraints, and is then solved efficiently by the Frank-Wolfe method. Attractively, Rank-CVMz has fewer variables to solve for than Rank-CVM, which speeds up the training procedure dramatically. Further, the relevant labels are effectively detected by the zero label. Experimental results on 12 benchmark data sets demonstrate that our method achieves competitive performance compared with six existing multi-label algorithms on six indicative instance-based measures. Moreover, on average, Rank-CVMz runs 83 times faster and has slightly fewer support vectors than the original Rank-CVM.


International Conference on Wavelet Analysis and Pattern Recognition | 2007

A multi-label classification algorithm based on triple class support vector machine

Shu-Peng Wan; Jianhua Xu

The multi-label classification problem is a special learning task in which the classes are not mutually exclusive and each sample may belong to several classes simultaneously. A novel multi-label classification algorithm based on both the one-versus-one decomposition method and a triple-class support vector machine (SVM) is presented in this paper. The one-versus-one decomposition technique pairwise divides a multi-label classification problem into many binary-class ones, in which some samples may be associated with two labels at the same time. The triple-class SVM is a generalization of the traditional binary-class SVM, where samples with double labels are treated as a mixed class located between the positive and negative classes. Experimental results on the benchmark datasets Yeast and Scene demonstrate that our proposed algorithm is comparable with existing methods such as Rank-SVM, binary SVM and ML-kNN on several evaluation criteria for multi-label learning algorithms.


Genomics, Proteomics & Bioinformatics | 2008

Identification of MicroRNA Precursors with Support Vector Machine and String Kernel

Jianhua Xu; Fei Li

MicroRNAs (miRNAs) are a family of short (21–23 nt) regulatory non-coding RNAs processed from long (70–110 nt) miRNA precursors (pre-miRNAs). Distinguishing true from false precursors plays an important role in the computational identification of miRNAs. Various numerical features have been extracted from precursor sequences and their secondary structures to suit particular classification methods; however, these features may lose some useful discriminative information hidden in the sequences and structures. In this study, pre-miRNA sequences and their secondary structures are used directly to construct an exponential kernel based on the weighted Levenshtein distance between two sequences. This string kernel is then combined with a support vector machine (SVM) to detect true and false pre-miRNAs. Based on 331 training samples of true and false human pre-miRNAs, two key SVM parameters are selected by 5-fold cross-validation and grid search, and five realizations with different 5-fold partitions are executed. Among 16 independent test sets covering 3 human, 8 animal, 2 plant, 1 virus, and 2 artificially false human pre-miRNA collections, our method statistically outperforms the previous SVM-based technique on 11 sets, including 3 human, 7 animal, and 1 false human pre-miRNA set. In particular, pre-miRNAs with multiple loops, which were usually excluded in previous work, are correctly identified in this study with an accuracy of 92.66%.
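The kernel construction above can be sketched directly: compute an edit distance between two strings, then map it through an exponential. For simplicity the sketch uses unit costs, whereas the paper weights the edit operations (the `1`s in the recurrence would become per-operation weights); `sigma` is an illustrative bandwidth.

```python
import math

def levenshtein(s, t):
    """Unit-cost edit distance via the standard dynamic program."""
    prev = list(range(len(t) + 1))
    for i, cs in enumerate(s, 1):
        cur = [i]
        for j, ct in enumerate(t, 1):
            cur.append(min(prev[j] + 1,              # deletion
                           cur[j - 1] + 1,           # insertion
                           prev[j - 1] + (cs != ct)))  # substitution
        prev = cur
    return prev[-1]

def string_kernel(seqs, sigma=10.0):
    """Exponential kernel matrix K[i][j] = exp(-d(s_i, s_j) / sigma)."""
    return [[math.exp(-levenshtein(a, b) / sigma) for b in seqs]
            for a in seqs]
```

A matrix built this way can be passed to an SVM that accepts precomputed kernels; note that distance-based exponential kernels are not guaranteed to be positive semidefinite, which is a standard caveat with this family of string kernels.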


Granular Computing | 2010

Constructing a Fast Algorithm for Multi-label Classification with Support Vector Data Description

Jianhua Xu

For multi-label classification, problem transformation algorithms have received much attention due to their good performance and low computational complexity, but how to speed up the training and test procedures remains a challenging issue. In this paper, a one-by-one data decomposition trick is adopted to divide a k-label problem into k sub-problems, where each sub-problem consists only of the instances with a specific class. We train each sub-classifier using support vector data description, which learns the smallest hypersphere capturing the majority of the training instances of each class, and integrate the k sub-classifiers into an entire multi-label classification algorithm using both pseudo posterior probabilities and linear ridge regression. Our new method has the lowest time complexity among existing problem transformation support vector machines for multi-label classification. Experimental results on the Yeast dataset illustrate that our algorithm works better than several state-of-the-art ones.
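The hypersphere idea can be illustrated with a deliberately crude stand-in: centre the description at the mean of the training points in RBF feature space and set the radius from a quantile of the training distances. Real SVDD instead finds the *smallest* enclosing hypersphere by solving a quadratic program, so everything below (the mean centre, `gamma`, `quantile`) is an assumption for illustration only.

```python
import numpy as np

def _rbf(A, B, gamma):
    sa = np.sum(A**2, axis=1)[:, None]
    sb = np.sum(B**2, axis=1)[None, :]
    return np.exp(-gamma * (sa + sb - 2 * A @ B.T))

def sphere_fit(X, gamma=0.5, quantile=0.95):
    """Mean-centred hypersphere in RBF feature space (SVDD stand-in).
    Squared distance to the feature-space mean:
      d^2(x) = K(x,x) - (2/n) sum_i K(x,x_i) + (1/n^2) sum_ij K(x_i,x_j),
    and K(x,x) = 1 for the RBF kernel."""
    K = _rbf(X, X, gamma)
    const = K.mean()                       # (1/n^2) sum_ij K(x_i, x_j)
    d2 = 1.0 - 2.0 * K.mean(axis=1) + const
    r2 = np.quantile(d2, quantile)         # squared radius
    return X, gamma, const, r2

def sphere_score(model, X_new):
    X, gamma, const, r2 = model
    d2 = 1.0 - 2.0 * _rbf(X_new, X, gamma).mean(axis=1) + const
    return r2 - d2                          # > 0: inside the description
```

In the paper's scheme, one such description is trained per class and the k scores are combined via pseudo posterior probabilities and linear ridge regression; the sketch shows only the single-class description step.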


Knowledge-Based Systems | 2016

A multi-label feature extraction algorithm via maximizing feature variance and feature-label dependence simultaneously

Jianhua Xu; Jiali Liu; Jing Yin; Chengyu Sun

Highlights: We derive a least-squares formulation for the MDDMp technique. A novel multi-label feature extraction algorithm is proposed. Our algorithm maximizes both feature variance and feature-label dependence. Experiments show that our algorithm is a competitive candidate.

Dimensionality reduction, divided into feature extraction and feature selection, is an important pre-processing procedure for multi-label classification that mitigates the possible effects of the curse of dimensionality. Principal component analysis (PCA) and multi-label dimensionality reduction via dependence maximization (MDDM) represent two mainstream feature extraction techniques for the unsupervised and supervised paradigms. They produce many small, and a few large, positive eigenvalues respectively, which can deteriorate classification performance through an improper number of projection directions. It has been proved that PCA, originally proposed to maximize feature variance, is associated with a least-squares formulation. In this paper, we prove that MDDM with orthonormal projection directions, which originally maximizes the Hilbert-Schmidt independence criterion (HSIC), also falls into the least-squares framework. We then propose a novel multi-label feature extraction method that integrates the two least-squares formulae through a linear combination, maximizing feature variance and feature-label dependence simultaneously and thus yielding a proper number of positive eigenvalues. Experimental results on eight data sets show that our proposed method achieves better performance, compared with seven other state-of-the-art multi-label feature extraction algorithms.
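The linear-combination idea can be sketched as one symmetric eigenproblem: a weighted sum of the feature covariance (the PCA/variance term) and an HSIC-style feature-label dependence matrix. The mixing weight `beta` and the linear label kernel L = Y Yᵀ below are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

def combined_projection(X, Y, n_components=2, beta=0.5):
    """Top eigenvectors of  beta * X^T H X + (1 - beta) * X^T H L H X,
    where H is the centring matrix and L = Y Y^T a linear label kernel.
    X: (n, d) features; Y: (n, q) binary label matrix."""
    n = X.shape[0]
    H = np.eye(n) - np.ones((n, n)) / n          # centring matrix
    cov = X.T @ H @ X                             # variance term (PCA)
    dep = X.T @ H @ (Y @ Y.T) @ H @ X             # dependence term (HSIC-style)
    M = beta * cov + (1.0 - beta) * dep
    vals, vecs = np.linalg.eigh((M + M.T) / 2)    # ascending eigenvalues
    return vecs[:, ::-1][:, :n_components]        # top n_components directions
```

New instances are projected as `X_new @ W`, exactly as with PCA or MDDM alone; sweeping `beta` trades variance against label dependence.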

Collaboration


Dive into Jianhua Xu's collaborations.

Top Co-Authors

Fei Li (Zhejiang University)
Jing Yin (Nanjing Normal University)
Lei Cao (Nanjing Normal University)
Chengyu Sun (Nanjing Normal University)
Hua Xu (Nanjing Normal University)
Huamei Xiao (Nanjing Agricultural University)
Jiali Liu (Nanjing Normal University)
Jiapeng Luo (Nanjing Normal University)