Network


Latest external collaborations at the country level.

Hotspot


Dive into the research topics where Daqi Gao is active.

Publication


Featured research published by Daqi Gao.


Pattern Recognition | 2011

A novel multi-view learning developed from single-view patterns

Zhe Wang; Songcan Chen; Daqi Gao

Existing multi-view learning (MVL) studies how to process patterns with multiple information sources. In terms of generalization, MVL has been proven to hold a significant advantage over the usual single-view learning (SVL). However, in most real-world cases we only have single-source patterns, to which existing MVL cannot be directly applied. This paper aims to develop a new MVL technique for single-source patterns. To this end, we first reshape the original vector representation of a single-source pattern into multiple matrix representations. In doing so, we can change the original architecture of a given base classifier into different sub-classifiers, each of which classifies the patterns represented as matrices. Here each sub-classifier is taken as one view of the original base classifier, so a set of sub-classifiers with different views comes into being. Then, a joint rather than separate learning process for the multi-view sub-classifiers is developed. In practice, the base classifier adopts the vector-pattern-oriented Ho-Kashyap classifier with regularization learning (MHKS) as its paradigm, though the approach is not limited to MHKS. The proposed joint multi-view learning is thus named MultiV-MHKS. Finally, the feasibility and effectiveness of MultiV-MHKS are demonstrated by experimental results on benchmark data sets. More importantly, we demonstrate that the proposed multi-view approach generally has a tighter generalization risk bound than its single-view counterpart in terms of Rademacher complexity analysis.
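
The central step, reshaping one vector pattern into several matrix "views", is easy to illustrate. Below is a minimal Python sketch, not the paper's exact formulation, of the reshaping and of the bilinear decision value u'Xv + b that a matrixized classifier such as MHKS assigns in each view; the rank-one weight pair (u, v) and all names are illustrative assumptions.

```python
import numpy as np

def matrix_views(x, shapes):
    """Reshape a d-dimensional vector pattern into several matrix
    representations, one per 'view'; each (r, c) must satisfy r * c == d."""
    return [x.reshape(r, c) for r, c in shapes]

def matrixized_score(X, u, v, b):
    """Decision value of a matrix-pattern classifier: u' X v + b.
    The bilinear form replaces the usual w' x + b of a vector classifier."""
    return float(u @ X @ v + b)

# A 12-dimensional pattern viewed as 3x4, 4x3 and 2x6 matrices.
x = np.arange(12.0)
for X in matrix_views(x, [(3, 4), (4, 3), (2, 6)]):
    u = np.ones(X.shape[0])  # left weight vector of this view's sub-classifier
    v = np.ones(X.shape[1])  # right weight vector
    print(X.shape, matrixized_score(X, u, v, b=0.0))
```

Joint training would couple the (u, v) pairs of all views in one objective; the sketch only shows how each view scores a pattern.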


Knowledge-Based Systems | 2014

Multi-view learning with Universum

Zhe Wang; Yujin Zhu; Wenwen Liu; Zhihua Chen; Daqi Gao

Traditional Multi-view Learning (MVL) studies how to process patterns with multiple information sources. In practice, MVL has been proven to hold a significant advantage over Single-view Learning (SVL). But in most real-world cases there are only single-source patterns to deal with, and existing MVL cannot be directly applied. To solve this problem, our previous work developed an alternative MVL technique for single-source patterns by reshaping their original vector representation into multiple matrix representations, which effectively improves classification performance. This paper aims to generalize that MVL by taking advantage of Universum examples, which belong to neither class of the classification problem. The proposed generalization not only inherits the advantage of the previous MVL, but also gains prior domain knowledge of the whole data distribution. To our knowledge, it introduces the Universum technique into MVL for the first time. In the implementation, our previous MVL method MultiV-MHKS is selected as the learning paradigm and combined with the Universum technique, forming a more flexible MVL called UMultiV-MHKS for short. The subsequent experiments validate that the proposed UMultiV-MHKS effectively improves classification performance over both the original MultiV-MHKS and some other state-of-the-art algorithms. Finally, it is demonstrated that UMultiV-MHKS achieves a tighter generalization risk bound in terms of Rademacher complexity.
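
Universum examples can be generated in several ways; a common recipe, assumed here for illustration and not necessarily the paper's choice, is to average random pairs drawn from the two classes, producing "in-between" examples that belong to neither class:

```python
import numpy as np

def in_between_universum(X_pos, X_neg, n_samples, seed=None):
    """Create Universum examples by averaging random positive/negative
    pairs; the results lie between the classes and belong to neither,
    carrying prior knowledge of the data distribution."""
    rng = np.random.default_rng(seed)
    i = rng.integers(0, len(X_pos), n_samples)
    j = rng.integers(0, len(X_neg), n_samples)
    return 0.5 * (X_pos[i] + X_neg[j])

rng = np.random.default_rng(0)
X_pos = rng.normal(loc=+2.0, size=(50, 5))
X_neg = rng.normal(loc=-2.0, size=(50, 5))
U = in_between_universum(X_pos, X_neg, n_samples=30, seed=1)
print(U.shape)  # (30, 5) unlabeled examples for the Universum loss term
```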


Knowledge-Based Systems | 2017

Entropy-based fuzzy support vector machine for imbalanced datasets

Qi Fan; Zhe Wang; Dongdong Li; Daqi Gao; Hongyuan Zha

An imbalanced problem occurs when the size of the positive class is much smaller than that of the negative one, where the positive class usually represents the main interest of the classification task. Although the conventional Support Vector Machine (SVM) delivers relatively robust classification performance on imbalanced datasets, it treats all samples with equal importance, which biases the decision surface toward the negative class. To overcome this inherent drawback, the Fuzzy SVM (FSVM) applies fuzzy memberships to training samples so that different samples contribute differently to the classifier. However, how to evaluate an appropriate fuzzy membership is the main issue in FSVM. In this paper, we propose a novel fuzzy membership evaluation which determines the fuzzy membership from the class certainty of samples: samples with higher class certainty are assigned larger fuzzy memberships. As entropy is utilized to measure class certainty, the evaluation is named entropy-based fuzzy membership evaluation, and the resulting classifier is the Entropy-based FSVM (EFSVM). EFSVM pays more attention to samples with higher class certainty, i.e. it enhances their importance, while guaranteeing the importance of the positive class by assigning positive samples relatively large fuzzy memberships. The contributions of this work are: (1) a novel entropy-based fuzzy membership evaluation method that enhances the importance of high-certainty samples, and (2) a guarantee of the importance of the positive samples, resulting in a more flexible decision surface. Experiments on imbalanced datasets validate that EFSVM outperforms the compared algorithms.
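
A simplified sketch of the entropy-based membership idea: estimate each sample's class certainty from the label mix of its k nearest neighbors, convert that mix to an entropy, and map low entropy (high certainty) to a large membership, while pinning all positive samples at full membership. The parameters k and beta and the linear entropy-to-membership map are illustrative assumptions; the paper's exact scheme may differ.

```python
import numpy as np
from sklearn.neighbors import NearestNeighbors
from sklearn.svm import SVC

def entropy_fuzzy_membership(X, y, k=7, beta=0.05):
    """Assign each sample a fuzzy membership from its local class certainty:
    low label entropy among the k nearest neighbours -> large membership.
    Positive samples (y == +1) keep full membership to protect the minority."""
    nn = NearestNeighbors(n_neighbors=k + 1).fit(X)
    _, idx = nn.kneighbors(X)
    idx = idx[:, 1:]                           # drop each sample itself
    p_pos = (y[idx] == 1).mean(axis=1)         # local positive-class ratio
    p = np.clip(np.stack([p_pos, 1 - p_pos], axis=1), 1e-12, 1.0)
    H = -(p * np.log(p)).sum(axis=1)           # entropy = class uncertainty
    m = 1.0 - beta * H / np.log(2)             # certain samples -> membership near 1
    m[y == 1] = 1.0
    return m

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(2, 1, (20, 2)), rng.normal(-2, 1, (200, 2))])
y = np.array([1] * 20 + [-1] * 200)
w = entropy_fuzzy_membership(X, y)
clf = SVC(kernel="rbf").fit(X, y, sample_weight=w)  # memberships as sample weights
```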


Knowledge-Based Systems | 2015

Gravitational fixed radius nearest neighbor for imbalanced problem

Yujin Zhu; Zhe Wang; Daqi Gao

Highlights: (1) we introduce a gravitational scenario into the fixed radius nearest neighbor rule; (2) the proposed GFRNN addresses the imbalanced classification problem; (3) GFRNN needs no manual parameter setting or coordination; (4) comparative experiments on 40 datasets validate its effectiveness and efficiency. This paper proposes a novel learning model that introduces the calculation of pairwise gravitation between selected patterns into the classical fixed radius nearest neighbor method, in order to overcome the drawback of the original nearest neighbor rule when dealing with imbalanced data. The traditional k nearest neighbor rule is considered to lose power on imbalanced datasets because the final decision may be dominated by patterns from the negative class in spite of the distance measurements. Unlike existing modified nearest neighbor models, the proposed method, named GFRNN, has a simple structure and is therefore easy to apply. Moreover, none of GFRNN's parameters need initialization or tuning during the learning procedure. In practice, GFRNN first selects candidate patterns from the training set under the fixed radius nearest neighbor rule, then introduces a metric based on a modified law of gravitation from the physical world to measure the interaction between the query pattern and each candidate. Finally, GFRNN makes its decision from the sum of the corresponding gravitational forces exerted on the query pattern by the candidates. Experimental comparison against nine typical methods validates both the effectiveness and the efficiency of GFRNN on forty imbalanced datasets. In conclusion, the contribution of this paper is a new, simple nearest neighbor architecture that handles imbalanced classification effectively without any manual parameter coordination, further expanding the family of nearest neighbor based rules.
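
The decision rule is compact enough to sketch: collect the training samples within a fixed radius of the query, let each attract the query with a gravitation-style force mass/distance², and give minority-class samples larger "mass" so the majority class cannot dominate by count alone. The inverse-class-frequency mass and the 1-NN fallback are illustrative assumptions, not necessarily the paper's exact definitions.

```python
import numpy as np

def gfrnn_predict(X_train, y_train, x_query, radius):
    """Gravitational fixed-radius nearest-neighbour vote, sketched:
    candidates inside `radius` pull the query with force mass / d^2,
    where a sample's mass is the inverse frequency of its class."""
    d = np.linalg.norm(X_train - x_query, axis=1)
    inside = d <= radius
    if not inside.any():                       # no candidate: fall back to 1-NN
        return y_train[np.argmin(d)]
    classes, counts = np.unique(y_train, return_counts=True)
    mass = {c: len(y_train) / n for c, n in zip(classes, counts)}
    force = dict.fromkeys(classes, 0.0)
    for yi, di in zip(y_train[inside], d[inside]):
        force[yi] += mass[yi] / max(di, 1e-12) ** 2
    return max(force, key=force.get)           # class with the strongest pull

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(1, 1, (10, 2)), rng.normal(-1, 1, (90, 2))])
y = np.array([1] * 10 + [-1] * 90)
print(gfrnn_predict(X, y, x_query=np.zeros(2), radius=1.5))
```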


Knowledge-Based Systems | 2014

Multi-kernel classification machine with reduced complexity

Zhe Wang; Changming Zhu; Zengxin Niu; Daqi Gao; Xiang Feng

Multiple Kernel Learning (MKL) has been demonstrated to improve classification performance effectively, but it incurs high complexity in some large-scale cases. In this paper, we aim to reduce both the time and space complexities of MKL, and thus propose an efficient multi-kernel classification machine based on the Nyström approximation. Firstly, we generate different kernel matrices K_p for the given data. Secondly, we apply the Nyström approximation technique to each K_p to obtain its corresponding approximation matrix K̃_p. Thirdly, we fuse the multiple generated K̃_p into the final ensemble matrix G̃ with a heuristic rule. Finally, we select the Kernelized Modification of Ho-Kashyap algorithm with Squared approximation of the misclassification errors (KMHKS) as the incorporated paradigm and apply G̃ to KMHKS. In doing so, we propose a multi-kernel classification machine with reduced complexity named Nyström approximation matrix with Multiple KMHKSs (NMKMHKS). The experimental results validate both the effectiveness and efficiency of the proposed NMKMHKS. The contributions of NMKMHKS are: (1) compared with existing MKL, NMKMHKS reduces the computational complexity of finding the solution from O(Mn³) to O(Mnm²), where M is the number of kernels, n is the number of training samples, and m is the number of columns selected from each K_p; meanwhile, NMKMHKS reduces the space complexity of storing the kernel matrices from O(Mn²) to O(n²); (2) compared with the original KMHKS, NMKMHKS improves classification performance while keeping a comparable space complexity; (3) NMKMHKS achieves better recognition when the multiple K_p used are strongly correlated; and (4) NMKMHKS has a tighter generalization risk bound in terms of Rademacher complexity analysis.
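
The Nyström step itself is short: sample m columns of the n×n kernel matrix to form C (n×m) and its m×m block W, then approximate K ≈ C W⁺ Cᵀ, costing O(nm²) instead of O(n³). A minimal sketch, with the RBF kernel and uniform column sampling as illustrative choices:

```python
import numpy as np

def rbf_kernel(A, B, gamma):
    """Pairwise RBF kernel values between the rows of A and B."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

def nystrom(X, m, gamma, seed=None):
    """Nystrom approximation of the full kernel matrix: sample m columns,
    then K ~ C @ pinv(W) @ C.T with C = K[:, idx] and W = K[idx][:, idx]."""
    rng = np.random.default_rng(seed)
    idx = rng.choice(len(X), size=m, replace=False)
    C = rbf_kernel(X, X[idx], gamma)           # n x m column block
    W = C[idx]                                 # m x m intersection block
    return C @ np.linalg.pinv(W) @ C.T

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 10))
K = rbf_kernel(X, X, gamma=0.1)
K_tilde = nystrom(X, m=40, gamma=0.1, seed=0)
print(np.linalg.norm(K - K_tilde) / np.linalg.norm(K))  # relative error
# With M kernels, each K_p gets its own approximation K̃_p, and the results
# are fused into one ensemble matrix G̃, e.g. by averaging.
```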


Pattern Recognition | 2015

Improved multi-kernel classification machine with Nyström approximation technique

Changming Zhu; Daqi Gao

The Kernelized Modification of Ho-Kashyap algorithm with Squared approximation of the misclassification errors (KMHKS) is an effective algorithm for nonlinearly separable classification problems, but it adopts only one kernel function. To address this, a multi-kernel classification machine with reduced complexity, Nyström approximation matrix with Multiple KMHKSs (NMKMHKS), was developed. However, NMKMHKS has to initialize many parameters and does not handle noise well. To this end, we propose an improved multi-kernel classification machine with the Nyström approximation technique (INMKMHKS), based on a new way of generating kernel functions and a new Nyström approximation technique. The contributions of INMKMHKS are: (1) avoiding the problem of setting too many parameters; (2) keeping space and computational complexities comparable to NMKMHKS; (3) having a tighter generalization risk bound in terms of Rademacher complexity analysis; (4) achieving better recognition than NMKMHKS on average; and (5) possessing the ability to deal with noise and practical images.


Pattern Recognition | 2013

Three-fold structured classifier design based on matrix pattern

Zhe Wang; Changming Zhu; Daqi Gao; Songcan Chen

The traditional vectorized classifier is supposed to incorporate class structural information but ignores the individual structure of each single pattern. In contrast, the matrixized classifier considers both the class and the individual structures, and thus achieves superior performance to the vectorized classifier. In this paper, we explore a middle granularity between the class and the individual, named the cluster, and introduce the cluster structure, i.e. the structure within each class, into matrixized classifier design. Doing so simultaneously utilizes the class, cluster, and individual structures, proceeding from the global level down to the point level. The proposed classifier design therefore owns three-fold structural information and can improve classification performance. In practice, we adopt the Modification of Ho-Kashyap algorithm with Squared approximation of the misclassification errors (MHKS) as the learning paradigm and develop a Three-fold Structured MHKS named TSMHKS. The advantage of the three-fold structural learning framework is that it considers the different degrees of closeness between samples so as to improve performance. The experimental results demonstrate the feasibility and effectiveness of TSMHKS. Furthermore, we discuss the theoretical and experimental generalization bounds of the proposed algorithm.
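
The three granularities are straightforward to extract from data. The sketch below computes class means (global level) and within-class cluster means (middle level), leaving the samples themselves as the point level; k-means is used as an illustrative clustering choice, not necessarily the paper's.

```python
import numpy as np
from sklearn.cluster import KMeans

def three_fold_structure(X, y, n_clusters=3):
    """Extract the three granularities of three-fold structured learning:
    class means (global), within-class cluster means (middle, via k-means
    here), and the individual samples themselves (point level)."""
    class_means, cluster_means = {}, {}
    for c in np.unique(y):
        Xc = X[y == c]
        class_means[c] = Xc.mean(axis=0)
        km = KMeans(n_clusters=min(n_clusters, len(Xc)), n_init=10).fit(Xc)
        cluster_means[c] = km.cluster_centers_
    return class_means, cluster_means          # the point level is X itself

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 1, (60, 4)), rng.normal(3, 1, (60, 4))])
y = np.array([0] * 60 + [1] * 60)
class_m, cluster_m = three_fold_structure(X, y)
print(cluster_m[0].shape)  # (3, 4): three cluster centers inside class 0
```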


Knowledge-Based Systems | 2013

Random projection ensemble learning with multiple empirical kernels

Zhe Wang; Wenbo Jie; Songcan Chen; Daqi Gao

In this paper we propose an effective and efficient random projection ensemble classifier with multiple empirical kernels. For the proposed classifier, we first randomly select a subset of the whole training set and use it to construct multiple kernel matrices with different kernels. Then, through the eigendecomposition of each kernel matrix, we explicitly map each sample into a feature space and feed the transformed samples into our previous multiple kernel learning framework. Finally, we repeat the above random selection multiple times and develop a voting ensemble classifier, named RPEMEKL. The contributions of the proposed RPEMEKL are: (1) efficiently reducing the computational cost of the eigendecomposition of the kernel matrix due to its smaller size; (2) effectively increasing classification performance due to the diversity generated by the different random selections of subsets; (3) providing an alternative multiple kernel learning method from the Empirical Kernel Mapping (EKM) viewpoint, in contrast to traditional Implicit Kernel Mapping (IKM) learning.
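
The empirical kernel mapping at the heart of the method is brief to write down: eigendecompose the kernel matrix of the random subset, K = PΛPᵀ, and map any sample x to Λ^(-1/2)Pᵀk(x), where k(x) holds its kernel values against the subset. A minimal sketch, with the RBF kernel and subset size as illustrative assumptions:

```python
import numpy as np

def rbf(A, B, gamma=0.5):
    """Pairwise RBF kernel values between the rows of A and B."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

def empirical_kernel_map(K_sub):
    """From an r x r kernel matrix K = P L P', build the projection
    P L^{-1/2}; right-multiplying cross-kernel values by it yields the
    explicit (empirical) feature vectors."""
    w, P = np.linalg.eigh(K_sub)
    keep = w > 1e-10                           # drop near-zero eigenvalues
    return P[:, keep] / np.sqrt(w[keep])       # r x r' projection matrix

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 8))
sub = rng.choice(500, size=60, replace=False)  # one random subset per round
M = empirical_kernel_map(rbf(X[sub], X[sub]))
Phi = rbf(X, X[sub]) @ M                       # explicit features for all samples
print(Phi.shape)                               # inner products of Phi reproduce K
```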


Knowledge-Based Systems | 2016

Multiple empirical kernel learning with locality preserving constraint

Qi Fan; Daqi Gao; Zhe Wang

Multiple Kernel Learning (MKL) is flexible in dealing with problems involving multiple and heterogeneous data sources. However, the required inner-product form restricts its application, since kernelizing algorithms that do not satisfy the inner-product formulation is quite difficult. To overcome this problem, Multiple Empirical Kernel Learning (MEKL) explicitly maps input samples to feature spaces in which the mapped feature vectors are explicitly presented. Most existing MEKL methods optimize their learning framework by minimizing the empirical risk, the regularization risk, and a loss term across multiple feature spaces. Since little attention is paid to preserving the local structure among training samples, the learned classifier may lack the locality-similarity-preserving property, which can result in unfavorable performance. Inspired by Locality Preserving Projection (LPP), which seeks the optimal projection by preserving the local property of input samples, we introduce a locality preserving constraint into the learning framework and propose a novel Multiple Empirical Kernel Learning with Locality Preserving Constraint (MEKL-LPC). MEKL-LPC has a lower generalization error bound than both the Modification of Ho-Kashyap algorithm with Squared approximation of the misclassification error (MHKS) and Multi-Kernel MHKS (MultiK-MHKS) in terms of Rademacher complexity. Experiments on several real-world datasets demonstrate that MEKL-LPC outperforms the compared algorithms. The contributions of this work are: (i) integrating a locality preserving constraint into MEKL for the first time, and (ii) proposing an algorithm with a lower generalization error bound, i.e. MEKL-LPC.
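
The locality preserving constraint itself is a standard LPP-style penalty: build a k-NN graph over the training samples, weight its edges with a heat kernel, and penalize neighboring samples whose classifier outputs differ, which reduces to a quadratic form fᵀLf with the graph Laplacian L. The graph construction details below are illustrative assumptions:

```python
import numpy as np
from sklearn.neighbors import kneighbors_graph

def locality_penalty(X, outputs, k=5, sigma=1.0):
    """LPP-style locality term: sum_ij S_ij (f_i - f_j)^2 = 2 f' L f, with
    heat-kernel weights S on a k-NN graph and Laplacian L = D - S. Adding
    it to an objective penalizes neighbours with very different outputs."""
    A = kneighbors_graph(X, k, mode="distance").toarray()
    S = np.where(A > 0, np.exp(-A ** 2 / sigma), 0.0)
    S = np.maximum(S, S.T)                     # symmetrize the neighbour graph
    L = np.diag(S.sum(axis=1)) - S             # graph Laplacian
    return outputs @ L @ outputs

rng = np.random.default_rng(0)
X = rng.normal(size=(80, 6))
f = rng.normal(size=80)                        # classifier outputs on X
print(locality_penalty(X, f))                  # term added to the MEKL objective
```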


Engineering Applications of Artificial Intelligence | 2016

One-sided Dynamic Undersampling No-Propagation Neural Networks for imbalance problem

Qi Fan; Zhe Wang; Daqi Gao

An imbalanced problem occurs when the size of one class, i.e. the minority class, is much smaller than that of the other classes, i.e. the majority classes. Conventional data-level methods are employed as preprocessing approaches that balance the dataset before classifier learning. Since the balanced data remain unchanged during the learning process, a pre-deleted sample is never used to train the classifier, which may result in information loss. To solve this problem, this work presents a One-sided Dynamic Undersampling (ODU) technique that keeps all samples in the training process and dynamically determines whether a majority sample should be used for classifier learning. Thus, ODU can dynamically undersample the majority class to balance the dataset. To validate the effectiveness of ODU, we integrate it into No-Propagation neural networks and propose the ODU No-Propagation Neural Network (ODUNPNN). ODUNPNN takes all training samples into consideration and dynamically undersamples the majority class after each iteration, i.e. it integrates the undersampling approach into the classifier learning process. Experimental results on both synthetic and real-world imbalanced datasets demonstrate that ODUNPNN outperforms the NPNN-based algorithms and achieves performance comparable to LASVM-AL, EasyEnsemble, and DyS on real-world imbalanced datasets. The contributions of this paper are: (1) ODUNPNN integrates the undersampling approach into the classifier learning process; (2) ODUNPNN dynamically balances the training data in each iteration; (3) the ODU technique can be integrated into other classification learning machines.
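
The dynamic undersampling step can be sketched independently of the network: in each epoch, keep every minority sample and re-draw an equally sized majority subset, biased toward the majority samples the current model scores as hard, so no sample is permanently discarded. The softmax-style sampling probabilities below are an illustrative assumption, not necessarily the paper's rule:

```python
import numpy as np

def dynamic_undersample(y, scores, rng):
    """One epoch of one-sided dynamic undersampling, sketched: keep all
    minority samples (y == +1) and re-draw an equally sized majority
    subset, favouring majority samples with high decision scores (the
    ones the current model misclassifies)."""
    minority = np.flatnonzero(y == 1)
    majority = np.flatnonzero(y == -1)
    hardness = scores[majority]                # large score = hard majority sample
    p = np.exp(hardness - hardness.max())      # softmax-style sampling weights
    p /= p.sum()
    picked = rng.choice(majority, size=len(minority), replace=False, p=p)
    return np.concatenate([minority, picked])

# Model-agnostic usage inside a training loop:
#   for epoch in range(n_epochs):
#       idx = dynamic_undersample(y, model.decision_function(X), rng)
#       model.partial_fit(X[idx], y[idx], classes=[-1, 1])
rng = np.random.default_rng(0)
y = np.array([1] * 10 + [-1] * 100)
scores = rng.normal(size=110)
print(dynamic_undersample(y, scores, rng))
```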

Collaboration


Dive into Daqi Gao's collaborations.

Top Co-Authors

Zhe Wang (East China University of Science and Technology)
Yujin Zhu (East China University of Science and Technology)
Qi Fan (East China University of Science and Technology)
Changming Zhu (East China University of Science and Technology)
Songcan Chen (Nanjing University of Aeronautics and Astronautics)
Dongdong Li (East China University of Science and Technology)
Wenbo Jie (East China University of Science and Technology)
Jin Xu (East China University of Science and Technology)
Mingzhe Lu (East China University of Science and Technology)
Zengxin Niu (East China University of Science and Technology)