Shifei Ding
China University of Mining and Technology
Publications
Featured research published by Shifei Ding.
International Journal of Machine Learning and Cybernetics | 2017
Shifei Ding; Nan Zhang; Jian Zhang; Xinzheng Xu; Zhongzhi Shi
Extreme learning machine (ELM) is not only an effective classifier but also a useful clustering tool. Unsupervised extreme learning machine (US-ELM) gives favorable performance compared with state-of-the-art clustering algorithms. The extreme learning machine as an autoencoder (ELM-AE) can obtain principal components that represent the original samples. The proposed unsupervised extreme learning machine based on embedded features of ELM-AE (US-EF-ELM) applies ELM-AE to US-ELM: it regards the embedded features of ELM-AE as the outputs of the US-ELM hidden layer, and uses US-ELM to obtain the embedding matrix. US-EF-ELM can handle multi-cluster clustering, and its learning capability and computational efficiency are the same as those of US-ELM. In experiments on UCI data sets, we compared the US-EF-ELM k-means algorithm with the k-means algorithm, spectral clustering, and the US-ELM k-means algorithm in terms of accuracy and efficiency.
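A minimal sketch of the ELM-AE embedding step followed by k-means (the US-ELM spectral embedding is omitted; the hidden size, regularization strength, and cluster count below are illustrative assumptions, not the paper's settings):

```python
import numpy as np
from sklearn.cluster import KMeans

def elm_ae_embed(X, n_hidden=32, reg=1e-3, seed=0):
    """Minimal ELM autoencoder: a random hidden layer, then the
    regularized least-squares output weights that reconstruct X.
    The embedded features are X projected through those weights."""
    rng = np.random.default_rng(seed)
    W = rng.standard_normal((X.shape[1], n_hidden))
    b = rng.standard_normal(n_hidden)
    H = np.tanh(X @ W + b)                       # random hidden activations
    # beta maps hidden activations back to the inputs (auto-encoding)
    beta = np.linalg.solve(H.T @ H + reg * np.eye(n_hidden), H.T @ X)
    return X @ beta.T                            # embedded features

X = np.random.default_rng(1).standard_normal((200, 10))
Z = elm_ae_embed(X)
labels = KMeans(n_clusters=3, n_init=10).fit_predict(Z)
```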
Journal of Zhejiang University Science C | 2013
Hua-juan Huang; Shifei Ding; Zhongzhi Shi
The training algorithm of the classical twin support vector regression (TSVR) reduces to solving a pair of quadratic programming problems (QPPs) with inequality constraints in the dual space. However, this solution is limited by time and memory constraints when dealing with large datasets. In this paper, we present a least squares version of TSVR in the primal space, termed primal least squares TSVR (PLSTSVR). By introducing the least squares method, the inequality constraints of TSVR are transformed into equality constraints. Furthermore, we attempt to solve the two QPPs with equality constraints directly in the primal space instead of the dual space; thus, we need only solve two systems of linear equations instead of two QPPs. Experimental results on artificial and benchmark datasets show that PLSTSVR has accuracy comparable to TSVR but requires considerably less computational time. We further investigate its validity in predicting the opening prices of stocks.
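A rough sketch of the two-linear-systems idea in the linear case (the variable names, exact regularization, and epsilon-shifts below are assumptions of this illustration; the paper's objective and kernel handling differ in detail):

```python
import numpy as np

def plstsvr_fit(A, Y, eps1=0.1, eps2=0.1, C1=1.0, C2=1.0):
    """Each of the two TSVR problems reduces, in its least-squares
    form, to one regularized linear system over [w; b]."""
    G = np.hstack([A, np.ones((A.shape[0], 1))])        # augmented with bias
    M = G.T @ G
    I = np.eye(M.shape[0])
    u1 = np.linalg.solve(M + I / C1, G.T @ (Y - eps1))  # down-bound function f1
    u2 = np.linalg.solve(M + I / C2, G.T @ (Y + eps2))  # up-bound function f2
    return u1, u2

def plstsvr_predict(A, u1, u2):
    G = np.hstack([A, np.ones((A.shape[0], 1))])
    return 0.5 * (G @ u1 + G @ u2)   # mean of the two bounding functions
```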
Journal of Zhejiang University Science C | 2012
Xin-zheng Xu; Shifei Ding; Zhongzhi Shi; Hong Zhu
A novel method based on rough sets (RS) and the affinity propagation (AP) clustering algorithm is developed to optimize a radial basis function neural network (RBFNN). First, attribute reduction (AR) based on RS theory, as a preprocessor for the RBFNN, is presented to eliminate noise and redundant attributes of datasets while determining the number of neurons in the input layer. Second, an AP clustering algorithm is proposed to search for the centers and their widths without a priori knowledge of the number of clusters. These parameters are transferred to the RBF units as the centers and widths of the RBF function. Then the weights connecting the hidden layer and the output layer are evaluated and adjusted using the least squares method (LSM) according to the output of the RBF units and the desired output. Experimental results show that the proposed method gives an RBFNN more powerful generalization capability than conventional methods.
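A minimal sketch of the AP-then-LSM pipeline (the rough-set attribute reduction step is omitted; the width heuristic below is an assumption, not the paper's formula):

```python
import numpy as np
from sklearn.cluster import AffinityPropagation

def rbf_design(X, centers, widths):
    """Gaussian RBF activations of every sample at every unit."""
    D2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(-1)
    return np.exp(-D2 / (2.0 * widths ** 2))

def fit_rbfnn(X, Y):
    # AP chooses the exemplars (RBF centers) without a preset cluster count
    ap = AffinityPropagation(random_state=0).fit(X)
    centers = ap.cluster_centers_
    # width of each unit: mean distance of its cluster members to the center
    widths = np.array([
        np.linalg.norm(X[ap.labels_ == k] - c, axis=1).mean() + 1e-8
        for k, c in enumerate(centers)
    ])
    Phi = rbf_design(X, centers, widths)
    W, *_ = np.linalg.lstsq(Phi, Y, rcond=None)   # least squares method (LSM)
    return centers, widths, W
```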
Neurocomputing | 2018
Nan Zhang; Shifei Ding; Jian Zhang; Yu Xue
The Restricted Boltzmann Machine (RBM) has aroused wide interest in machine learning during the past decade. This review reports recent developments in theoretical research on, and applications of, the RBM. We first give an overview of the general RBM from a theoretical perspective, including stochastic approximation methods, stochastic gradient methods, and methods for preventing overfitting. The review then focuses on RBM variants, which further improve the learning ability of the RBM in general or specific applications. The RBM has recently been extended to representation learning, document modeling, multi-label learning, weakly supervised learning, and many other tasks. The RBM and its variants provide powerful tools for representing dependencies in data, and they can be used as basic building blocks to create deep networks. Apart from the Deep Belief Network (DBN) and the Deep Boltzmann Machine (DBM), the RBM can also be combined with the Convolutional Neural Network (CNN) to create deep networks. This review provides a comprehensive view of these advances in the RBM together with its future perspectives.
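For orientation, a minimal contrastive-divergence (CD-1) update for a binary RBM, one instance of the stochastic approximation methods the review surveys (array sizes and the learning rate are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)

def sample(p):
    """Draw binary states from elementwise Bernoulli probabilities."""
    return (rng.random(p.shape) < p).astype(float)

def cd1_step(v0, W, b, c, lr=0.05):
    """One CD-1 update for a binary RBM: a single Gibbs step
    approximates the model expectation. v0 is a batch of visible
    vectors; W couples visibles to hiddens, b and c are biases."""
    sigmoid = lambda x: 1.0 / (1.0 + np.exp(-x))
    ph0 = sigmoid(v0 @ W + c)          # P(h = 1 | v0)
    h0 = sample(ph0)
    pv1 = sigmoid(h0 @ W.T + b)        # reconstruction P(v = 1 | h0)
    v1 = sample(pv1)
    ph1 = sigmoid(v1 @ W + c)
    W += lr * (v0.T @ ph0 - v1.T @ ph1) / len(v0)   # positive - negative phase
    b += lr * (v0 - v1).mean(axis=0)
    c += lr * (ph0 - ph1).mean(axis=0)
```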
International Journal of Machine Learning and Cybernetics | 2018
Mingjing Du; Shifei Ding; Yu Xue
The density peaks (DP) clustering approach, a novel density-based clustering algorithm, detects clusters of arbitrary shape. However, it uses a crisp neighborhood relation to calculate the local density, so it cannot capture the different neighborhood membership degrees of points at different distances from a core point. The proposed FN-DP (fuzzy neighborhood density peaks) clustering algorithm instead uses the fuzzy neighborhood relation of the FJP (fuzzy joint points) algorithm to define the local density. The proposed algorithm combines the speed of the DP clustering algorithm with the robustness of the FJP algorithm. The experimental results illustrate the superior performance of our algorithm compared with the DP clustering approach.
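A minimal sketch of a fuzzy-neighborhood local density (the linear membership ramp below is an assumed stand-in; FN-DP takes its membership function from the FJP algorithm):

```python
import numpy as np
from scipy.spatial.distance import cdist

def fuzzy_local_density(X, d_c):
    """Local density with a fuzzy neighborhood: each point contributes
    a membership degree that decays with distance, instead of the
    crisp 0/1 count the original DP method uses inside the cutoff d_c."""
    D = cdist(X, X)
    mu = np.clip(1.0 - D / d_c, 0.0, 1.0)   # linear fuzzy membership (assumed form)
    np.fill_diagonal(mu, 0.0)               # a point does not count itself
    return mu.sum(axis=1)
```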
International Journal of Machine Learning and Cybernetics | 2018
Xiao Xu; Shifei Ding; Mingjing Du; Yu Xue
To deal with data sets of complex structure, the density peaks clustering (DPC) algorithm was proposed in 2014. DPC uses the density and the delta-distance to find the cluster centers; it detects outliers efficiently and finds clusters of arbitrary shape. Unfortunately, the distances between all pairs of data points must be calculated in the first step, which limits the running speed of the DPC algorithm on large datasets. To address this issue, this paper introduces a novel grid-based approach, called density peaks clustering based on grid (DPCG), which overcomes this efficiency problem. When calculating the local density, a grid is introduced to reduce the computation time of the DPC algorithm. DPCG requires neither calculating all pairwise distances nor many input parameters, and it inherits all the merits of the DPC algorithm. Experimental results on UCI data sets and artificial data show that the DPCG algorithm is flexible and effective.
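A minimal sketch of the grid idea for the 2-D case (the choice of cell size equal to the cutoff d_c and the 3x3 neighborhood search are assumptions of this illustration):

```python
import numpy as np

def grid_local_density(X, d_c):
    """Grid-accelerated local density: bucket 2-D points into cells of
    side d_c, then count neighbors only within the 3x3 block of
    adjacent cells instead of over all pairwise distances."""
    cells = {}
    idx = np.floor(X / d_c).astype(int)
    for i, cell in enumerate(map(tuple, idx)):
        cells.setdefault(cell, []).append(i)
    rho = np.zeros(len(X))
    for i, (cx, cy) in enumerate(map(tuple, idx)):
        for dx in (-1, 0, 1):
            for dy in (-1, 0, 1):
                for j in cells.get((cx + dx, cy + dy), []):
                    # any neighbor within d_c must lie in an adjacent cell
                    if j != i and np.linalg.norm(X[i] - X[j]) < d_c:
                        rho[i] += 1
    return rho
```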
International Journal of Machine Learning and Cybernetics | 2018
Mingjing Du; Shifei Ding; Xiao Xu; Yu Xue
The density peaks clustering (DPC) algorithm is a novel density-based clustering algorithm that requires neither an iterative process nor many parameters. However, it cannot effectively group data with arbitrary shapes or multi-manifold structures. To address this drawback, we propose a new density peaks clustering algorithm, density peaks clustering using geodesic distances (DPC-GD), which introduces geodesic distances into the original DPC method. Experiments on synthetic data sets reveal the power of the proposed algorithm. On image data sets, we compared our algorithm with classical methods (kernel k-means and spectral clustering) and the original algorithm in terms of accuracy and NMI. Experimental results show that our algorithm is feasible and effective.
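A minimal sketch of how geodesic distances can be approximated before running DPC, using the standard k-NN-graph shortest-path construction (the neighborhood size k is an illustrative assumption):

```python
from sklearn.neighbors import kneighbors_graph
from scipy.sparse.csgraph import shortest_path

def geodesic_distances(X, k=10):
    """Approximate geodesic distances as shortest paths over a
    symmetrized k-NN graph (the Isomap construction); these can then
    replace Euclidean distances in the DPC density/delta computations."""
    G = kneighbors_graph(X, n_neighbors=k, mode='distance')
    G = G.maximum(G.T)                 # symmetrize the neighborhood graph
    return shortest_path(G, method='D', directed=False)
```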
Neural Computing and Applications | 2018
Lingheng Meng; Shifei Ding; Nan Zhang; Jian Zhang
Learning results depend on the representation of data, so how to represent data efficiently has been a research hot spot in machine learning and artificial intelligence. As deep learning research has advanced, how to train deep networks to represent high-dimensional data efficiently has also become a research frontier. In order to represent data more efficiently through deep networks, we propose a novel stacked denoising sparse autoencoder in this paper. First, we construct a denoising sparse autoencoder by introducing both a corrupting operation and a sparsity constraint into the traditional autoencoder. Then, we build stacked denoising sparse autoencoders, which have multiple hidden layers, by stacking denoising sparse autoencoders layer-wise. Experiments are designed to explore the influence of the corrupting operation and the sparsity constraint on different datasets, using networks with various depths and numbers of hidden units. The comparative experiments reveal that the test accuracy of the stacked denoising sparse autoencoder is much higher than that of other stacked models, regardless of the dataset and the number of layers. We also find that the deeper the network is, the fewer activated neurons each layer has. More importantly, we find that strengthening the sparsity constraint is to some extent equivalent to increasing the corruption level.
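A minimal sketch of one denoising sparse autoencoder layer's loss (tied weights, masking noise, and a KL sparsity penalty are assumptions of this illustration; the paper's exact architecture may differ):

```python
import numpy as np

def dsae_loss(X, W, b_enc, b_dec, corrupt=0.3, rho=0.05, beta=3.0, seed=0):
    """Forward pass and loss of one denoising sparse autoencoder layer:
    masking noise implements the corrupting operation, and a KL term
    pushes mean hidden activations toward the sparsity target rho.
    Weights are tied (the decoder uses W.T) purely for brevity."""
    rng = np.random.default_rng(seed)
    sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))
    X_tilde = X * (rng.random(X.shape) > corrupt)      # corrupting operation
    H = sigmoid(X_tilde @ W + b_enc)                   # hidden code
    X_hat = sigmoid(H @ W.T + b_dec)                   # reconstruct the clean X
    rho_hat = np.clip(H.mean(axis=0), 1e-6, 1 - 1e-6)  # mean activation per unit
    kl = np.sum(rho * np.log(rho / rho_hat)
                + (1 - rho) * np.log((1 - rho) / (1 - rho_hat)))
    return np.mean((X - X_hat) ** 2) + beta * kl       # denoising + sparsity terms
```

Stacking then proceeds greedily: train one such layer, feed its codes on clean inputs to the next layer, and repeat.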
Neurocomputing | 2018
Jian Zhang; Shifei Ding; Nan Zhang
This review reports recent developments in deep learning algorithms based on Restricted Boltzmann Machines (RBMs) and Conditional Random Fields (CRFs). First, we give an overview of general RBMs and CRFs, which are powerful methods for representing dependencies in input data and can be treated as basic building blocks of deep neural nets. Second, this review introduces RBM variants and the related deep learning models. Apart from Deep Belief Networks (DBNs) and Deep Boltzmann Machines (DBMs), RBMs can be combined with Convolutional Neural Nets (CNNs), which perform well in image recognition and image reconstruction. Third, this review discusses CRFs and their applications in image annotation and scene recognition. Finally, this review describes the developments and open problems in neural nets and reports some experiments.
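As a complement to the RBM sketch above, a minimal Viterbi decoder for a linear-chain CRF, the inference step behind the image annotation applications discussed here (the score matrices are assumed to come from some learned feature function):

```python
import numpy as np

def crf_viterbi(emissions, transitions):
    """Most likely label sequence of a linear-chain CRF via Viterbi.
    emissions[t, y] and transitions[y_prev, y] are unnormalized
    log-potentials (scores) over T positions and Y labels."""
    T, Y = emissions.shape
    score = emissions[0].copy()
    back = np.zeros((T, Y), dtype=int)
    for t in range(1, T):
        cand = score[:, None] + transitions + emissions[t]  # (prev, cur) scores
        back[t] = cand.argmax(axis=0)                       # best predecessor
        score = cand.max(axis=0)
    path = [int(score.argmax())]
    for t in range(T - 1, 0, -1):                           # trace back
        path.append(int(back[t][path[-1]]))
    return path[::-1]
```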
Knowledge and Information Systems | 2018
Mingjing Du; Shifei Ding; Yu Xue; Zhongzhi Shi
The density peaks (DP) clustering approach is a novel density-based clustering algorithm. On the basis of the prior assumption of consistency for semi-supervised learning problems, we further make consistency assumptions for density-based clustering. The first is the assumption of local consistency, which means nearby points are likely to have similar local density; the second is the assumption of global consistency, which means points in the same high-density area (i.e., the same structure, the same cluster) are likely to have the same label. According to the first assumption, we provide a new option for the local density based on its sensitivity. In addition, we redefine