Dayou Liu | Researchain

Publication

Featured research published by Dayou Liu.

Expert Systems With Applications | 2011

A support vector machine classifier with rough set-based feature selection for breast cancer diagnosis

Huiling Chen; Bo Yang; Jie Liu; Dayou Liu

Breast cancer is becoming a leading cause of death among women worldwide, and it is confirmed that early detection and accurate diagnosis of this disease can ensure long survival of patients. Expert systems and machine learning techniques are gaining popularity in this field because of their effective classification and high diagnostic capability. In this paper, a rough set (RS) based support vector machine classifier (RS_SVM) is proposed for breast cancer diagnosis. In the proposed method, the RS reduction algorithm is employed as a feature selection tool to remove redundant features and further improve the diagnostic accuracy of the SVM. The effectiveness of RS_SVM is examined on the Wisconsin Breast Cancer Dataset (WBCD) using classification accuracy, sensitivity, specificity, the confusion matrix and receiver operating characteristic (ROC) curves. Experimental results demonstrate that the proposed RS_SVM not only achieves very high classification accuracy but also detects a combination of five informative features, which can give physicians an important clue for breast cancer diagnosis.
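As an illustration of the evaluation measures named in this abstract, the sketch below derives accuracy, sensitivity and specificity from the four cells of a binary confusion matrix. It is not the authors' code; the function name `diagnostic_metrics` and the toy labels are assumptions made for this example.

```python
def diagnostic_metrics(y_true, y_pred, positive=1):
    """Accuracy, sensitivity and specificity for a binary classifier.

    `positive` marks the label treated as the disease class (an assumption
    of this sketch; the abstract does not fix a label encoding).
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    return {
        "confusion_matrix": {"tp": tp, "fn": fn, "fp": fp, "tn": tn},
        "accuracy": (tp + tn) / len(pairs),
        # Sensitivity: fraction of true positives that were detected.
        "sensitivity": tp / (tp + fn) if tp + fn else float("nan"),
        # Specificity: fraction of true negatives that were correctly rejected.
        "specificity": tn / (tn + fp) if tn + fp else float("nan"),
    }


# Toy labels, invented for illustration only.
metrics = diagnostic_metrics([1, 1, 1, 0, 0, 0, 0, 1], [1, 1, 0, 0, 0, 1, 0, 1])
print(metrics)
```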

Expert Systems With Applications | 2011

A new hybrid method based on local fisher discriminant analysis and support vector machines for hepatitis disease diagnosis

Huiling Chen; Dayou Liu; Bo Yang; Jie Liu; Gang Wang

In this paper, a novel hybrid method named LFDA_SVM, which integrates a new feature extraction method and a classification algorithm, is introduced for diagnosing hepatitis disease. The two integrated methods are local Fisher discriminant analysis (LFDA) and the support vector machine (SVM), respectively. In the proposed LFDA_SVM, LFDA is employed as a feature extraction tool for dimensionality reduction in order to further improve the diagnostic accuracy of the standard SVM algorithm. The effectiveness of LFDA_SVM has been rigorously evaluated on the hepatitis dataset, a benchmark dataset from the UCI Machine Learning Repository, in terms of classification accuracy, sensitivity and specificity. In addition, the proposed LFDA_SVM has been compared with three existing methods, namely the SVM based on principal component analysis (PCA_SVM), the SVM based on Fisher discriminant analysis (FDA_SVM) and the standard SVM, in terms of classification accuracy. Experimental results demonstrate that LFDA_SVM greatly outperforms the other three methods. The best classification accuracy (96.77%) obtained by LFDA_SVM is much higher than that of the compared methods. The proposed LFDA_SVM might therefore serve as a new candidate among powerful methods for diagnosing hepatitis.

International Conference on Data Mining | 2003

The rough set approach to association rule mining

J. W. Guan; David A. Bell; Dayou Liu

In transaction processing, an association is said to exist between two sets of items when a transaction containing one set is likely to also contain the other. In information retrieval, an association between two sets of keywords occurs when they co-occur in a document. Similarly, in data mining, an association occurs when one attribute set occurs together with another. As the number of such associations may be large, maximal association rules are sought, e.g., Feldman et al. (1997, 1998). Rough set theory is a successful tool for data mining. By using this theory, rules similar to maximal associations can be found. However, we show that the rough set approach to discovering knowledge is much simpler than the maximal association method.
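The notion of an association between two item sets can be made concrete with the standard support and confidence measures. The sketch below illustrates only that basic notion, not the maximal association or rough set methods discussed in the paper; the function names and the toy transactions are assumptions made for this example.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = frozenset(itemset)
    hits = sum(1 for t in transactions if itemset <= set(t))
    return hits / len(transactions)


def confidence(transactions, antecedent, consequent):
    """Estimated probability that a transaction containing `antecedent`
    also contains `consequent`."""
    s_antecedent = support(transactions, antecedent)
    if s_antecedent == 0:
        return 0.0
    both = set(antecedent) | set(consequent)
    return support(transactions, both) / s_antecedent


# Toy transactions, invented for illustration only.
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]
print(support(transactions, {"bread", "milk"}))
print(confidence(transactions, {"bread"}, {"milk"}))
```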

Journal of Medical Systems | 2012

A Computer Aided Diagnosis System for Thyroid Disease Using Extreme Learning Machine

Lina Li; Jihong Ouyang; Huiling Chen; Dayou Liu

In this paper, we present an effective and efficient computer aided diagnosis (CAD) system based on principal component analysis (PCA) and extreme learning machine (ELM) to assist the task of thyroid disease diagnosis. The CAD system comprises three stages. Focusing on dimension reduction, the first stage applies PCA to construct the most discriminative new feature set. The system then switches to the second stage, whose target is model construction: an ELM classifier is trained to obtain an optimal predictive model whose parameters are optimized. As is well known, the number of hidden neurons plays an important role in the performance of ELM, so we propose an experimental method to search for the optimal value. Finally, the obtained optimal ELM model performs the thyroid disease diagnosis tasks using the most discriminative new feature set and the optimal parameters. The effectiveness of the resultant CAD system (PCA-ELM) has been rigorously evaluated on a thyroid disease dataset taken from the UCI Machine Learning Repository. We compare it with other related methods in terms of classification accuracy. Experimental results demonstrate that PCA-ELM outperforms the other methods reported so far under 10-fold cross-validation, with a mean accuracy of 97.73% and a maximum accuracy of 98.1%. In addition, PCA-ELM runs much faster than a support vector machine (SVM) based CAD system. Consequently, the proposed PCA-ELM can be considered a new powerful tool for diagnosing thyroid disease with excellent performance and less time.

Journal of Medical Systems | 2012

Design of an Enhanced Fuzzy k-nearest Neighbor Classifier Based Computer Aided Diagnostic System for Thyroid Disease

Dayou Liu; Huiling Chen; Bo Yang; Xin-En Lv; Lina Li; Jie Liu

In this paper, we present an enhanced fuzzy k-nearest neighbor (FKNN) classifier based computer aided diagnostic (CAD) system for thyroid disease. The neighborhood size k and the fuzzy strength parameter m in the FKNN classifier are adaptively specified by the particle swarm optimization (PSO) approach. Adaptive control parameters, including time-varying acceleration coefficients (TVAC) and time-varying inertia weight (TVIW), are employed to efficiently control the local and global search ability of the PSO algorithm. In addition, we have validated the effectiveness of principal component analysis (PCA) in constructing a more discriminative subspace for classification. The effectiveness of the resultant CAD system, termed PCA-PSO-FKNN, has been rigorously evaluated on the thyroid disease dataset, which is commonly used among researchers who apply machine learning methods to thyroid disease diagnosis. Compared with the existing methods in previous studies, the proposed system achieves the highest classification accuracy reported so far via 10-fold cross-validation (CV) analysis, with a mean accuracy of 98.82% and a maximum accuracy of 99.09%. The proposed CAD system might therefore serve as a new candidate among powerful tools for diagnosing thyroid disease.
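To show how the two parameters k and m enter a fuzzy k-nearest neighbor classifier, the sketch below uses the commonly cited inverse-distance membership formula with crisp training labels. It is an assumption that this matches the variant used in the paper; the function name `fknn_memberships` and the toy data are invented for the example, and the PSO tuning of k and m is not shown.

```python
import math
from collections import defaultdict


def fknn_memberships(train, query, k=3, m=2.0):
    """Class memberships of `query` from its k nearest training points.

    train: list of (feature_tuple, label) pairs with crisp labels.
    m:     fuzzy strength parameter (> 1); larger m flattens the weights.
    """
    if m <= 1:
        raise ValueError("m must be greater than 1")
    # Keep the k nearest neighbours by Euclidean distance.
    neighbours = sorted(
        ((math.dist(x, query), label) for x, label in train),
        key=lambda pair: pair[0],
    )[:k]
    exponent = 2.0 / (m - 1.0)
    eps = 1e-12  # guards against division by zero for exact matches
    weights = [(1.0 / max(d, eps) ** exponent, label) for d, label in neighbours]
    total = sum(w for w, _ in weights)
    memberships = defaultdict(float)
    for w, label in weights:
        memberships[label] += w / total
    return dict(memberships)


# Toy one-dimensional data, invented for illustration only.
train = [((0.0, 0.0), "a"), ((1.0, 0.0), "a"), ((4.0, 0.0), "b"), ((5.0, 0.0), "b")]
print(fknn_memberships(train, (2.0, 0.0), k=3, m=2.0))
```

The predicted class is the label with the largest membership; the memberships themselves give a graded confidence that a plain k-NN vote does not.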

Asia-Pacific Web Conference | 2007

Process Mining: Extending α-Algorithm to Mine Duplicate Tasks in Process Logs

Jiafei Li; Dayou Liu; Bo Yang

Process mining is a new technology that can distill workflow models from a set of real executions. However, current research in process mining still faces many challenges. One of them is the problem of duplicate tasks, which refers to the situation in which the same task can appear multiple times in one workflow model. The “α-algorithm” has been proved to mine sound Structured Workflow nets without task duplication. In this paper, based on the “α-algorithm”, a new algorithm (the “α*-algorithm”) is presented to deal with duplicate tasks and has been implemented in a research prototype. The “α*-algorithm” is evaluated experimentally in eight scenarios to show its validity.
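For orientation, the sketch below computes the ordering relations that the classic α-algorithm derives from an event log as its first step: directly-follows, causality and parallelism. It does not implement the α*-algorithm or any handling of duplicate tasks; the function name `ordering_relations` and the toy log are assumptions made for this example.

```python
def ordering_relations(log):
    """Return (follows, causal, parallel) relations over task pairs.

    log: iterable of traces, each trace a sequence of task names.
    """
    # a > b: b directly follows a in at least one trace.
    follows = {(a, b) for trace in log for a, b in zip(trace, trace[1:])}
    # a -> b: a > b holds but b > a does not.
    causal = {(a, b) for (a, b) in follows if (b, a) not in follows}
    # a || b: both a > b and b > a hold.
    parallel = {(a, b) for (a, b) in follows if (b, a) in follows}
    return follows, causal, parallel


# Toy log with two traces, invented for illustration only.
log = [("A", "B", "C", "D"), ("A", "C", "B", "D")]
follows, causal, parallel = ordering_relations(log)
print(sorted(causal))
print(sorted(parallel))
```

When the same task name stands for two different places in the model, these relations merge them, which is the difficulty the abstract describes.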

Web Age Information Management | 2004

Spatio-temporal Database with Multi-granularities

Sheng-sheng Wang; Dayou Liu

Spatio-temporal information processing has grown quickly in recent years. Although uncertainty and multiple granularities are common features of spatio-temporal data, these problems have not yet been well solved. A new spatio-temporal granularity representation method that supports multiple granularities and uncertain spatio-temporal objects is put forward. This method is then applied to a spatio-temporal database, ASTD. By supporting multiple granularities and approximate regions, ASTD can process data at multiple levels together, perform queries at varying precision and handle uncertainty. Compared with similar systems, ASTD is more powerful in terms of multiple granularities, uncertainty, object types and granularity operations.

Knowledge Discovery and Data Mining | 2011

An adaptive fuzzy k-nearest neighbor method based on parallel particle swarm optimization for bankruptcy prediction

Huiling Chen; Dayou Liu; Bo Yang; Jie Liu; Gang Wang; Su-Jing Wang

This study proposes an efficient non-parametric classifier for bankruptcy prediction using an adaptive fuzzy k-nearest neighbor (FKNN) method, where the number of nearest neighbors k and the fuzzy strength parameter m are adaptively specified by the particle swarm optimization (PSO) approach. In addition to performing parameter optimization for FKNN, PSO is also used to choose the most discriminative subset of features for prediction. Time-varying acceleration coefficients (TVAC) and time-varying inertia weight (TVIW) are employed to efficiently control the local and global search ability of PSO. Moreover, both the continuous and binary PSO are implemented in parallel on a multi-core platform. The resultant bankruptcy prediction model, named PTVPSO-FKNN, is compared with three classification methods on a real-world case. The obtained results clearly confirm the superiority of the developed model over the other three methods in terms of classification accuracy, Type I error, Type II error and AUC (area under the receiver operating characteristic (ROC) curve). It is also observed that PTVPSO-FKNN is a powerful feature selection tool, which has identified a subset of the most discriminative features. Additionally, the proposed model gains a great deal of efficiency in terms of CPU time owing to the parallel implementation.
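The role of TVIW and TVAC in PSO can be sketched with the linear schedules that are commonly used for them. The schedule shapes and the default start and end values below are assumptions of this example, not figures taken from the paper, and all function names are hypothetical.

```python
import random


def inertia_weight(t, t_max, w_start=0.9, w_end=0.4):
    """TVIW: inertia decreases linearly, shifting from global to local search."""
    return w_start - (w_start - w_end) * t / t_max


def acceleration_coefficients(t, t_max, c1_start=2.5, c1_end=0.5,
                              c2_start=0.5, c2_end=2.5):
    """TVAC: the cognitive term c1 shrinks while the social term c2 grows."""
    frac = t / t_max
    c1 = c1_start + (c1_end - c1_start) * frac
    c2 = c2_start + (c2_end - c2_start) * frac
    return c1, c2


def update_velocity(v, x, pbest, gbest, t, t_max, rng=random):
    """One continuous-PSO velocity update for a single particle."""
    w = inertia_weight(t, t_max)
    c1, c2 = acceleration_coefficients(t, t_max)
    return [
        w * vi + c1 * rng.random() * (pi - xi) + c2 * rng.random() * (gi - xi)
        for vi, xi, pi, gi in zip(v, x, pbest, gbest)
    ]


# Schedules at the start, middle and end of a 100-iteration run.
for t in (0, 50, 100):
    print(t, inertia_weight(t, 100), acceleration_coefficients(t, 100))
```

Early iterations favor each particle's own best position and large steps; late iterations favor the swarm's best position and small steps.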

Mathematical and Computer Modelling | 2013

Mathematical modeling for active and dynamic diagnosis of crop diseases based on Bayesian networks and incremental learning

Yungang Zhu; Dayou Liu; Guifen Chen; Haiyang Jia; Helong Yu

To achieve rapid and precise diagnosis of crop diseases, an active and dynamic diagnosis method is needed, and such a method is proposed in this paper. The method adopts Bayesian networks to represent the relationships among symptoms and crop diseases. It differs from existing diagnosis methods in two main ways. First, it does not use all the symptoms in the diagnosis, but purposively selects the subset of symptoms that are most relevant to the diagnosis; this active symptom selection is based on the concept of a Markov blanket in a Bayesian network. Second, a specific incremental learning algorithm for Bayesian networks is proposed so that the diagnosis model updates dynamically over time and adapts to temporal changes in the environment. Furthermore, the diagnosis results can be calculated without inference in Bayesian networks, so the method has low time complexity. Theoretical analysis and experimental results demonstrate that the proposed method can significantly enhance the performance of crop disease diagnosis.
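The Markov blanket on which the symptom selection rests is the set consisting of a node's parents, its children and the other parents of those children. The sketch below computes it for a network given as a parent map. The function name and the toy network with its variable names are hypothetical and do not come from the paper.

```python
def markov_blanket(parents, node):
    """Markov blanket of `node` in a DAG.

    parents: dict mapping each node to the set of its parent nodes.
    """
    children = {child for child, ps in parents.items() if node in ps}
    # Co-parents: other parents of the node's children.
    co_parents = {p for child in children for p in parents[child]}
    blanket = set(parents.get(node, ())) | children | co_parents
    blanket.discard(node)
    return blanket


# Toy network, invented for illustration only.
parents = {
    "humidity": set(),
    "pest": set(),
    "disease": {"humidity"},
    "leaf_spots": {"disease", "pest"},
    "wilting": {"disease"},
    "yield_loss": {"wilting"},
}
print(sorted(markov_blanket(parents, "disease")))
```

Given its Markov blanket, a node is conditionally independent of every other variable in the network, which is why variables outside the blanket (here `yield_loss`) add no information about the node.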

Mathematical and Computer Modelling | 2010

A neural network ensemble method for precision fertilization modeling

Helong Yu; Dayou Liu; Guifen Chen; Baocheng Wan; Sheng-Sheng Wang; Bo Yang

There exists a nonlinear relationship between fertilizer input and soil nutrient level. To calculate the fertilization rate more precisely, a novel neural network ensemble method has been proposed, in which the K-means clustering method is used to select optimal networks individually and a Lagrange multiplier is used to combine these selected networks. On the basis of the above neural network ensemble method, a fertilization model is constructed. In this model, the soil nutrient level and the fertilization rate are taken as neural network inputs and the yield is taken as the output. This model transforms the calculation of the fertilization rate into solving a programming problem, and can be used to calculate the fertilization rate with maximum yield and maximum profit as well as to forecast the yield. Furthermore, this fertilization model has been tested on fertilizer effect data. The results show that the value forecast using the neural network ensemble is more accurate than that obtained with individual neural networks. The fertilization model constructed in this paper not only can precisely simulate the nonlinear relationship between yield and soil nutrient level, but also can adequately make use of the existing fertilizer effect data.
