Guanjin Wang

Hong Kong Polytechnic University

Publication

Featured research published by Guanjin Wang.

Computers in Biology and Medicine | 2015

Prediction of mortality after radical cystectomy for bladder cancer by machine learning techniques

Guanjin Wang; Kin-Man Lam; Zhaohong Deng; Kup-Sze Choi

Bladder cancer is a common genitourinary malignancy. For muscle-invasive bladder cancer, surgical removal of the bladder, i.e. radical cystectomy, is in general the definitive treatment which, unfortunately, carries significant morbidity and mortality. Accurate prediction of mortality after radical cystectomy is therefore needed. Statistical methods have conventionally been used for this purpose, despite the complex interactions of high-dimensional medical data. Machine learning has emerged as a promising technique for handling high-dimensional data, with increasing application in clinical decision support, e.g. cancer prediction and prognosis. Its ability to reveal hidden nonlinear interactions and interpretable rules between dependent and independent variables is favorable for constructing models with effective generalization performance. In this paper, seven machine learning methods are used to predict the 5-year mortality of radical cystectomy: back-propagation neural network (BPN), radial basis function network (RBFN), extreme learning machine (ELM), regularized ELM (RELM), support vector machine (SVM), naive Bayes (NB) classifier and k-nearest neighbour (KNN). They are evaluated on a clinicopathological dataset of 117 patients from the urology unit of a hospital in Hong Kong. The experimental results indicate that RELM achieved the highest average prediction accuracy of 0.8 at a fast learning speed. The research findings demonstrate the potential of applying machine learning techniques to support clinical decision making.

IEEE Transactions on Fuzzy Systems | 2017

Recognition of Epileptic EEG Signals Using a Novel Multiview TSK Fuzzy System

Yizhang Jiang; Zhaohong Deng; Fu-Lai Chung; Guanjin Wang; Pengjiang Qian; Kup-Sze Choi; Shitong Wang

Recognition of epileptic electroencephalogram (EEG) signals using machine learning techniques is becoming popular. In general, the construction of an intelligent epileptic EEG recognition system involves two steps. First, an appropriate feature extraction method is applied to obtain representative features from the original raw EEG signals. Second, an effective intelligent model is trained based on the extracted features. However, there are two major challenges in the process: 1) it is nontrivial to determine the appropriate feature extraction method to be used; 2) although many classical machine learning methods have been used for epileptic EEG recognition, most of them are “black box” approaches and more interpretable methods are desirable. To address these two challenges, a new epileptic EEG recognition method based on a multiview learning framework and fuzzy system modeling is proposed. First, multiview EEG data are generated by employing different feature extraction methods to obtain the features from different views of the signals. Second, the classical Takagi–Sugeno–Kang fuzzy system (TSK-FS) is introduced as an easy-to-interpret recognition model to develop a multiview TSK-FS method, called MV-TSK-FS, to identify epileptic EEG signals. For the proposed MV-TSK-FS, the importance of each view, i.e., the importance of each feature extraction method, can be evaluated according to the weighting of each view, and consequently the final decision can be made based on the weighted outputs of different views. Experimental results indicate that the MV-TSK-FS is a promising method when compared with the state-of-the-art algorithms.
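The final step described in the abstract above, a decision based on the weighted outputs of different views, can be sketched as follows. This is only an illustration of weighted fusion: the function name `fuse_views`, the scores and the weights are hypothetical, and the way MV-TSK-FS actually learns its view weights is not reproduced here.

```python
import numpy as np

def fuse_views(view_outputs, view_weights):
    """Combine per-view decision scores with a normalised weighted sum."""
    outputs = np.asarray(view_outputs, dtype=float)  # shape: (n_views, n_samples)
    weights = np.asarray(view_weights, dtype=float)  # shape: (n_views,)
    weights = weights / weights.sum()                # make the weights sum to 1
    return weights @ outputs                         # shape: (n_samples,)

# Hypothetical scores from three feature-extraction views for two EEG segments.
scores = [[0.9, -0.2],
          [0.4, -0.6],
          [-0.1, -0.8]]
weights = [0.5, 0.3, 0.2]          # assumed view importances

fused = fuse_views(scores, weights)
labels = np.sign(fused)            # +1 / -1 decision per segment
```

A view with a larger weight pulls the fused score further towards its own output, which is what makes the weights readable as the importance of each feature extraction method.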

Neurocomputing | 2017

Detection of epilepsy with Electroencephalogram using rule-based classifiers

Guanjin Wang; Zhaohong Deng; Kup-Sze Choi

Epilepsy is a common neurological disorder, characterized by recurrent seizures. Electroencephalogram (EEG), a useful measure for analysing the brain's electrical activity, has been widely used for the detection of epileptic seizures. Most existing classification techniques are primarily aimed at increasing detection accuracy, while the interpretability of the methods has received relatively little attention. In this work, we concentrate on the epileptic classification of EEG signals with interpretability. We propose an epilepsy detection framework, followed by a comparative study under this framework to evaluate the accuracy and interpretability of four rule-based classifiers, namely, the decision tree algorithm C4.5, the random forest algorithm (RF), the support vector machine (SVM)-based decision tree algorithm (SVM+C4.5), and the SVM-based RF algorithm (SVM+RF), in two-group, three-group, and, most challenging of all, five-group classifications of EEG signals. The experimental results showed that RF outperformed the other three rule-based classifiers, achieving average accuracies of 0.9896, 0.9600, and 0.8260 for the two-group, three-group, and five-group seizure classifications respectively, and exhibiting higher interpretability.

IEEE Transactions on Systems, Man, and Cybernetics | 2017

Deep Additive Least Squares Support Vector Machines for Classification With Model Transfer

Guanjin Wang; Guangquan Zhang; Kup-Sze Choi; Jie Lu

The additive kernel least squares support vector machine (AK-LS-SVM) has been widely used in classification tasks due to its inherent advantages. For example, additive kernels work extremely well for some specific tasks, such as computer vision classification, medical research, and some specialized scenarios. Moreover, the analytical solution of AK-LS-SVM allows leave-one-out cross-validation error estimates to be formulated in closed form for parameter tuning, which drastically reduces the computational cost and guarantees the generalization performance, especially on small and medium datasets. However, AK-LS-SVM still faces two main challenges: 1) improving the classification performance of AK-LS-SVM and 2) saving time when performing a grid search for model selection. Inspired by the stacked generalization principle and the transfer learning mechanism, a layer-by-layer combination of AK-LS-SVM classifiers embedded with transfer learning is proposed in this paper. This new classifier, called the deep transfer additive kernel least squares support vector machine (DTA-LS-SVM), overcomes these two challenges. Also, considering that imbalanced datasets are involved in many real-world scenarios, especially in medical data analysis, the deep-transfer element is extended to compensate for this imbalance, leading to the development of another new classifier, iDTA-LS-SVM. In the hierarchical structure of both DTA-LS-SVM and iDTA-LS-SVM, each layer has an AK-LS-SVM and the predictions from the previous layer act as an additional input feature for the current layer. Importantly, transfer learning is also embedded to guarantee generalization consistency between adjacent layers. Moreover, both iDTA-LS-SVM and DTA-LS-SVM can ensure the minimal leave-one-out error by using the proposed fast leave-one-out cross-validation strategy on the training set in each layer. We compared the proposed classifiers DTA-LS-SVM and iDTA-LS-SVM with the traditional LS-SVM and SVM using additive kernels on seven public UCI datasets and one real-world dataset. The experimental results show that both DTA-LS-SVM and iDTA-LS-SVM exhibit better generalization performance and faster learning speed.
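The closed-form leave-one-out estimate mentioned in the abstract above can be illustrated with a minimal sketch. It assumes the standard LS-SVM linear system and the known identity that the leave-one-out residual of sample i equals alpha_i divided by the i-th diagonal entry of the inverse system matrix. The histogram intersection kernel stands in for an additive kernel, and the function names, toy data and gamma value are illustrative assumptions rather than the authors' implementation; the layered transfer structure of DTA-LS-SVM is not shown.

```python
import numpy as np

def intersection_kernel(A, B):
    # Additive kernel: sum over features of min(a_d, b_d), for non-negative data.
    return np.minimum(A[:, None, :], B[None, :, :]).sum(axis=2)

def fit_ls_svm(K, y, gamma):
    # Solve [[K + I/gamma, 1], [1^T, 0]] [alpha; b] = [y; 0].
    n = len(y)
    C = np.zeros((n + 1, n + 1))
    C[:n, :n] = K + np.eye(n) / gamma
    C[:n, n] = 1.0
    C[n, :n] = 1.0
    C_inv = np.linalg.inv(C)
    sol = C_inv @ np.append(y, 0.0)
    return sol[:n], sol[n], C_inv

def fast_loo_residuals(K, y, gamma):
    # One fit only: residual_i = alpha_i / (C^{-1})_{ii}.
    alpha, _, C_inv = fit_ls_svm(K, y, gamma)
    return alpha / np.diag(C_inv)[:len(y)]

def brute_force_loo_residuals(K, y, gamma):
    # Reference implementation: refit n times, each time leaving one sample out.
    n = len(y)
    res = np.empty(n)
    for i in range(n):
        keep = np.arange(n) != i
        alpha, b, _ = fit_ls_svm(K[np.ix_(keep, keep)], y[keep], gamma)
        res[i] = y[i] - (K[i, keep] @ alpha + b)
    return res

# Toy data (assumed): 20 samples, 4 non-negative features, labels in {-1, +1}.
rng = np.random.default_rng(42)
X = rng.random((20, 4))
y = np.where(X[:, 0] > 0.5, 1.0, -1.0)
K = intersection_kernel(X, X)

residuals = fast_loo_residuals(K, y, gamma=10.0)
# A sample is misclassified under leave-one-out when y_i * (y_i - residual_i) <= 0.
loo_error = np.mean(y * (y - residuals) <= 0)
```

Because the shortcut needs a single matrix inversion per candidate gamma instead of n refits, a grid search over gamma becomes much cheaper, which is the cost saving the abstract refers to.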

IEEE Transactions on Neural Systems and Rehabilitation Engineering | 2017

Seizure Classification From EEG Signals Using Transfer Learning, Semi-Supervised Learning and TSK Fuzzy System

Yizhang Jiang; Dongrui Wu; Zhaohong Deng; Pengjiang Qian; Jun Wang; Guanjin Wang; Fu-Lai Chung; Kup-Sze Choi; Shitong Wang

Recognition of epileptic seizures from offline EEG signals is very important in the clinical diagnosis of epilepsy. Compared with manual labeling of EEG signals by doctors, machine learning approaches can be faster and more consistent. However, the classification accuracy is usually not satisfactory for two main reasons: the distributions of the data used for training and testing may be different, and the amount of training data may not be enough. In addition, most machine learning approaches generate black-box models that are difficult to interpret. In this paper, we integrate transductive transfer learning, semi-supervised learning and a TSK fuzzy system to tackle these three problems. More specifically, we use transfer learning to reduce the discrepancy in data distribution between the training and testing data, employ semi-supervised learning to use the unlabeled testing data to remedy the shortage of training data, and adopt a TSK fuzzy system to increase model interpretability. Two learning algorithms are proposed to train the system. Our experimental results show that the proposed approaches can achieve better performance than many state-of-the-art seizure classification algorithms.

International Conference on Intelligent Computing | 2015

Detection of Epileptic Seizures in EEG Signals with Rule-Based Interpretation by Random Forest Approach

Guanjin Wang; Zhaohong Deng; Kup-Sze Choi

Epilepsy is a common neurological disorder characterized by recurrent seizures. Although many classification methods have been applied to classify EEG signals for the detection of epilepsy, little attention has been paid to accurate epileptic seizure detection methods with comprehensible and transparent interpretation. This study develops a detection framework and presents a comparative study that applies four rule-based classifiers, i.e., the decision tree algorithm C4.5, the random forest algorithm (RF), the support vector machine (SVM)-based decision tree algorithm (SVM + C4.5) and the SVM-based RF algorithm (SVM + RF), to two-group and three-group classification and the most challenging five-group classification of epileptic seizures in EEG signals. The experimental results show that, in addition to high interpretability, RF is competitive in two-group and three-group classification, with average accuracies of 0.9896 and 0.9600. More importantly, it stands out in five-group classification, with the highest average accuracy of 0.8260 among the four rule-based classifiers.

IEEE Journal of Biomedical and Health Informatics | 2018

Tackling Missing Data in Community Health Studies Using Additive LS-SVM Classifier

Guanjin Wang; Zhaohong Deng; Kup-Sze Choi

Missing data is a common issue in community health and epidemiological studies. Direct removal of samples with missing data can lead to reduced sample size and information bias, which deteriorates the significance of the results. While data imputation methods are available to deal with missing data, they are limited in performance and could introduce noise into the dataset. Instead of data imputation, a novel method based on the additive least squares support vector machine (LS-SVM) is proposed in this paper for predictive modeling when the input features of the model contain missing data. The method also simultaneously determines the influence of the features with missing values on the classification accuracy using the fast leave-one-out cross-validation strategy. The performance of the method is evaluated by applying it to predict the quality of life (QOL) of elderly people using health data collected in the community. The dataset involves demographics, socioeconomic status, health history, and the outcomes of health assessments of 444 community-dwelling elderly people, with 5% to 60% of data missing in some of the input features. The QOL is measured using a standard questionnaire of the World Health Organization. Results show that the proposed method outperforms four conventional methods for handling missing data (case deletion, feature deletion, mean imputation, and K-nearest neighbor imputation), with the average QOL prediction accuracy reaching 0.7418. It is potentially a promising technique for tackling missing data in community health research and other applications.

Data Science and Knowledge Engineering for Sensing Decision Support | 2018

Computer aided diagnostic tool for prostate cancer with rule extraction from Support Vector Machines

Guanjin Wang; Jie Lu; Jeremy Yuen-Chun Teoh; Kup-Sze Choi

Prostate cancer is a common malignancy among men, necessitating accurate and timely diagnosis at an early stage. With the advent of Artificial Intelligence (AI) technologies in the health field, support vector machines (SVMs), one of the most well-known machine learning methods, have been widely applied to prostate cancer detection. They generalize well but offer no interpretability of the learned patterns, which makes it difficult for health professionals to understand the inner workings of the predictive model. In this paper, we aim to build a computer-aided diagnostic tool for prostate cancer using SVMs with rule extraction enabled. Experimental results on a real-world prostate cancer dataset collected in a Hong Kong hospital show that the proposed model not only generates rules but also achieves better prediction results than a decision tree, showing potential to assist physicians with clinical decision support in future.

International Symposium on Neural Networks | 2017

An output-based knowledge transfer approach and its application in bladder cancer prediction

Guanjin Wang; Guangquan Zhang; Kup-Sze Choi; Kin-Man Lam; Jie Lu

Many medical applications face a situation in which the data at hand cannot fully fit an existing predictive model or online tool, since these models or tools use only the most common predictors and do not consider the other valuable features collected in the current scenario. At the same time, the training data in the current scenario is not sufficient to learn a predictive model effectively. To overcome these problems and construct an efficient classifier for such real situations in medical fields, we present an approach based on the least squares support vector machine (LS-SVM), which uses a transfer learning framework to make maximum use of the data and guarantee enhanced generalization capability. The proposed approach is capable of effectively learning a target domain with limited samples by relying on the probabilistic outputs of a model previously learned in the source domain using a heterogeneous method. Moreover, it autonomously and quickly decides how much output knowledge to transfer from the source domain to the target domain using a fast leave-one-out cross-validation strategy. The approach is applied to a real-world clinical dataset to predict the 5-year mortality of bladder cancer patients after radical cystectomy. The experimental results indicate that the proposed method consistently achieves better performance than traditional machine learning methods, showing its potential when data are insufficient.

Neural Computing and Applications | 2016

Linear combination of densities and its direct estimation framework with applications

Min Xu; Guanjin Wang; Fu-Lai Chung; Shitong Wang

In this paper, typical learning tasks, including data condensation, binary classification, identification of the independence between random variables and conditional density estimation, are described from a unified perspective of a linear combination of densities. Accordingly, a direct estimation framework based on a linear combination of Gaussian components (i.e., Gaussian basis functions) under the integrated square error criterion is proposed to solve these learning tasks. The proposed direct estimation framework has three advantages. Firstly, unlike most existing state-of-the-art methods, which must estimate each component density in the linear combination and then combine the estimates linearly, it directly estimates the linear combination of densities as a whole, and its approximation accuracy is at least comparable to, or even better than, that of existing density estimation methods. Secondly, the time complexity of the proposed direct estimation framework is O(l^3), where l is the number of Gaussian components in the framework. These components are generally viewed as the Gaussian distributions of the clusters in a dataset, so l is generally much smaller than the size of the dataset, which makes the framework very suitable for large datasets. Thirdly, the proposed framework can be used to develop alternative approaches to classification, data condensation, identification of the independence between random variables, conditional density estimation and similarity identification between multiple source domains and a target domain. Our preliminary experimental results on several typical applications indicate the power of the proposed direct estimation framework.
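One way to see where a cost cubic in l can come from is the generic integrated square error fit of a Gaussian mixture-style expansion, sketched below. The notation (alpha, H, h, isotropic components with a shared width sigma, and an unconstrained minimiser) is an assumption made for illustration and is not taken from the paper, whose exact formulation may differ.

```latex
\hat{f}(x) = \sum_{j=1}^{l} \alpha_j\,\phi_j(x), \qquad
\phi_j(x) = \mathcal{N}\!\left(x;\, c_j,\, \sigma^2 I\right)

\mathrm{ISE}(\alpha) = \int \bigl(\hat{f}(x) - f(x)\bigr)^2\,dx
  = \alpha^\top H \alpha \;-\; 2\,\alpha^\top h \;+\; \int f(x)^2\,dx

H_{ij} = \int \phi_i(x)\,\phi_j(x)\,dx
       = \mathcal{N}\!\left(c_i;\, c_j,\, 2\sigma^2 I\right), \qquad
h_j = \int \phi_j(x)\,f(x)\,dx

\nabla_\alpha \mathrm{ISE} = 2H\alpha - 2h = 0
\;\;\Longrightarrow\;\;
\alpha^\star = H^{-1} h
```

Here f is the target linear combination of densities, so each h_j is a weighted sum of expectations of phi_j and can be approximated by sample averages without estimating the individual densities. Solving the dense l-by-l system for alpha takes O(l^3) operations, independent of the dataset size once h has been accumulated.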
