Chunguang Zhou
Ministry of Education
Publication
Featured research published by Chunguang Zhou.
Expert Systems With Applications | 2009
Xiujuan Xu; Chunguang Zhou; Zhe Wang
Credit scoring is very important in business, especially in banking. The goal is to classify a person as a good or a bad credit risk by evaluating his or her credit. We systematically propose three link analysis algorithms, based on support vector machine preprocessing, to estimate an applicant's credit and so decide whether a bank should grant the applicant a loan. The proposed algorithms have two major phases, called the input weighted adjustor and classification by support vector machine-based models. In the first phase, we capture the link relations among applicants through link analysis and integrate them, via the applicants' information, into the input vector of the next phase. In the second phase, an algorithm based on the general support vector machine model is proposed. A real-world credit dataset is used to evaluate the performance of the proposed algorithms by 10-fold cross-validation. The results show that the genetic link analysis ranking methods achieve higher classification accuracy.
Archive | 2006
W.G. Zhou; Chunguang Zhou; G.X. Liu; H.Y. Lv; Yanchun Liang
Microarray technologies have made it straightforward to monitor simultaneously the expression patterns of thousands of genes. An important task is therefore to cluster gene expression data to identify groups of genes with similar patterns and hence similar functions. In this paper, an improved quantum-inspired evolutionary algorithm (IQEA) is first proposed for minimum sum-of-squares clustering. We have suggested a new representation form and added an additional mutation operation in IQEA. Experimental results show that IQEA appears to be much more robust in finding optimum or best-known solutions and to be superior to conventional k-means and self-organizing maps clustering algorithms, even with a small population.
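The minimum sum-of-squares clustering objective mentioned above can be illustrated with a short sketch. It only evaluates the objective for a given assignment of points to clusters; it does not reproduce IQEA, and the function name `sum_of_squares` and the toy data are assumptions made for illustration.

```python
def sum_of_squares(points, labels, k):
    """Within-cluster sum of squared Euclidean distances to the centroids.

    points: list of equal-length numeric tuples
    labels: cluster index (0..k-1) for each point
    """
    dim = len(points[0])
    sums = [[0.0] * dim for _ in range(k)]
    counts = [0] * k
    # Accumulate per-cluster coordinate sums to obtain the centroids.
    for point, cluster in zip(points, labels):
        counts[cluster] += 1
        for d in range(dim):
            sums[cluster][d] += point[d]
    centroids = [
        [s / counts[c] for s in sums[c]] if counts[c] else None
        for c in range(k)
    ]
    # Sum the squared distance of every point to its own centroid.
    total = 0.0
    for point, cluster in zip(points, labels):
        centroid = centroids[cluster]
        total += sum((point[d] - centroid[d]) ** 2 for d in range(dim))
    return total


# Hypothetical toy data: two well-separated groups of two points each.
points = [(0, 0), (0, 2), (10, 0), (10, 2)]
good = sum_of_squares(points, [0, 0, 1, 1], k=2)
bad = sum_of_squares(points, [0, 1, 0, 1], k=2)
```

Any search method for this problem, evolutionary or otherwise, would look for the assignment that minimizes this value; here the assignment that groups nearby points scores lower than the one that mixes them.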
Pacific Rim International Conference on Artificial Intelligence | 2004
Chunguo Wu; Wei Xiang; Yanchun Liang; Heow Pueh Lee; Chunguang Zhou
An operation template is proposed in this paper for describing the mapping between operations and a subset of the natural numbers. With such an operation template, a job shop scheduling problem (JSSP) can be transformed into a traveling salesman problem (TSP), so the integer-coding genetic algorithm (GA) for the TSP can be easily applied and modified. A decoding strategy, called virtual job shop, is proposed to evaluate the fitness of each individual in the GA population. The integration of the operation template and the virtual job shop makes it possible to apply the existing integer-coding GA to an extension of the classical job shop scheduling problem.
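The idea of numbering operations so that a schedule becomes an integer permutation can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's operation template or its virtual job shop decoder: the names `build_template` and `decode_makespan` are hypothetical, and the decoding rule used here (each integer triggers the next pending operation of the job it belongs to) is a common repair convention swapped in for illustration.

```python
def build_template(jobs):
    """Map each integer code 0..N-1 to a (job, step) pair."""
    template = []
    for job_index, operations in enumerate(jobs):
        for step in range(len(operations)):
            template.append((job_index, step))
    return template


def decode_makespan(permutation, jobs, template):
    """Decode an integer permutation into a schedule and return its makespan.

    jobs[j] is an ordered list of (machine, duration) operations.
    Assumed rule: each code schedules the next pending operation of its job,
    so every permutation yields a precedence-feasible schedule.
    """
    next_step = [0] * len(jobs)
    job_ready = [0] * len(jobs)   # time at which each job is free again
    machine_ready = {}            # time at which each machine is free again
    for code in permutation:
        job_index, _ = template[code]
        machine, duration = jobs[job_index][next_step[job_index]]
        start = max(job_ready[job_index], machine_ready.get(machine, 0))
        end = start + duration
        job_ready[job_index] = end
        machine_ready[machine] = end
        next_step[job_index] += 1
    return max(job_ready)


# Hypothetical 2-job, 2-machine instance: (machine, duration) per operation.
jobs = [[(0, 3), (1, 2)], [(1, 2), (0, 4)]]
template = build_template(jobs)
interleaved = decode_makespan([0, 2, 1, 3], jobs, template)
sequential = decode_makespan([0, 1, 2, 3], jobs, template)
```

With a decoder of this kind, the makespan can serve as the fitness of a permutation, so standard permutation operators from TSP-style genetic algorithms can be reused unchanged.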
Pattern Recognition Letters | 2009
Xiujuan Xu; Yu Liu; Zhe Wang; Chunguang Zhou; Yanchun Liang
Catalog segmentation is an important data mining problem in business from the microeconomic point of view. In catalog segmentation, an enterprise tries to develop k catalogs of r products each, which are sent to the corresponding customers in order to maximize the overall number of catalog products purchased. In this paper, a novel model called the catalog segmentation problem with double constraints (DCCSP) is presented. In this model, the interest constraint is minimized and the profit constraint is maximized, so that the profit from products purchased by customers who find at least t products of interest in the catalogs they receive is maximized. The complexity of the DCCSP is analyzed, and a DCCS algorithm to solve the optimization problem is proposed. The experimental results show that the proposed algorithm is efficient and can be used to solve the DCCSP effectively.
Conference on Industrial Electronics and Applications | 2006
Xiaoyu Chang; Wengang Zhou; Chunguang Zhou; Yanchun Liang
Identification of transcription factor binding sites (TFBS) in the upstream regions of genes remains a highly important and unsolved problem, particularly in higher eukaryotic genomes. In this paper, we propose a new approach to predict TFBS. This approach uses a position weight matrix (PWM) to represent binding sites and a genetic algorithm (GA) to search for the best matrix. A new coding method, called multiple-variable coding, is proposed for the GA. We apply it to two transcription factors, rebl and mgl. The results show that this approach can find most of the known sites, which indicates that the method is very effective.
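How a position weight matrix represents and scores candidate binding sites can be sketched briefly. This is a generic log-odds PWM under stated assumptions, not the paper's GA or its multiple-variable coding: the function names, the pseudocount of 1, the uniform background frequency of 0.25, and the example sequences are all hypothetical.

```python
import math


def build_pwm(sites, pseudocount=1.0, background=0.25):
    """Build a log-odds PWM from aligned, equal-length binding sites."""
    width = len(sites[0])
    pwm = []
    for position in range(width):
        column = {}
        for base in "ACGT":
            count = sum(1 for site in sites if site[position] == base)
            frequency = (count + pseudocount) / (len(sites) + 4 * pseudocount)
            column[base] = math.log2(frequency / background)
        pwm.append(column)
    return pwm


def best_window(pwm, sequence):
    """Return (start, score) of the highest-scoring window in the sequence."""
    width = len(pwm)
    best = None
    for start in range(len(sequence) - width + 1):
        score = sum(pwm[i][sequence[start + i]] for i in range(width))
        if best is None or score > best[1]:
            best = (start, score)
    return best


# Hypothetical aligned sites and an upstream sequence containing the motif.
sites = ["TTAC", "TTAC", "TTAC", "TTAG"]
pwm = build_pwm(sites)
start, score = best_window(pwm, "GGGTTACGGG")
```

In a GA-based search, a score of this kind summed over the upstream sequences could serve as the fitness of a candidate matrix, although the fitness function actually used in the paper is not given in the abstract.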
British National Conference on Databases | 2006
Ying Wang; Jianbin Gong; Zhe Wang; Chunguang Zhou
Ontology mapping is one of the important problems in the development of the Semantic Web. Establishing such mappings has been the focus of a variety of research originating from diverse communities. In this paper, we propose a Composite Approach for Ontology Mapping (ACAOM) for semi-automatic ontology mapping based on the combination of the name and instance methods. We conclude that the combination is a promising method.
Fuzzy Systems and Knowledge Discovery | 2006
Xiao-Yu Chang; Chunguang Zhou; Zhe Wang; Yan-Wen Li; Ping Hu
A novel algorithm FFSPAN (Fast Frequent Sequential Pattern mining algorithm) is proposed in this paper. FFSPAN mines all the frequent sequential patterns in large datasets, and solves the problem of searching frequent sequences in a sequence database by searching frequent items or frequent itemsets. Moreover, the databases that FFSPAN scans keep shrinking quickly, which makes the algorithm more efficient when the sequential patterns are longer. Experiments on standard test data show that FFSPAN is very effective.
Granular Computing | 2006
Dongbin Zhou; Lifeng Jia; Zhe Wang; Xiujuan Xu; Chunguang Zhou
The characteristics of data streams make it difficult for clustering algorithms to satisfy the requirements on efficiency and effectiveness. This paper proposes a data stream clustering algorithm with a dual-tier structure that employs the agent method. In the on-line process, a set of agents working simultaneously collect similar data points into sub-clusters by applying a heuristic strategy. In the off-line process, summary information from the on-line component is further analyzed to obtain the final clusters. The algorithm also supports time-window queries on streams. The empirical evidence shows that this method can obtain high-quality clusters with low time complexity.
analysis over an arbitrary period of the stream etc. As for stream clustering, a common method is dividing the streaming data into chunks, so that algorithms for static sets can be used on each sub-set separately (2). In recent years, stream algorithms have developed into a two-phase structure (3), (4). Usually, a dual framework includes two parts: the on-line component and the off-line component. The former is responsible for the fast but rough processing of streaming data and for saving the summary information to meet the one-pass restriction, while the latter takes advantage of that information to conduct high-level analysis. At present, stream algorithms still face some problems, for example: sensitivity to the initial data points; poor cluster quality due to the loss of global information caused by dividing the stream; and high time complexity.
A novel dual-tier clustering algorithm for data streams, AGCluStream, is proposed in this paper. The on-line algorithm uses agents to make similar points denser in local areas, and records the temporary distribution of data according to the pyramidal time frame (3). The off-line algorithm uses these records to conduct time-window analysis and higher-level clustering analysis. AGCluStream does not divide the stream, and it adopts an incomplete-partition strategy to maintain the global information more effectively.
Fuzzy Systems and Knowledge Discovery | 2005
Lifeng Jia; Chunguang Zhou; Zhe Wang; Xiujuan Xu
We propose a new algorithm, SuffixMiner, which eliminates the need for multiple passes through the data when finding all frequent itemsets in data streams, takes full advantage of the special property of suffix-trees to avoid generating candidate itemsets and traversing each suffix-tree during itemset growth, and uses a new itemset growth method to mine all frequent itemsets in data streams. Experimental results show that the SuffixMiner algorithm not only scales well when mining frequent itemsets over data streams, but also outperforms the Apriori and FP-Growth algorithms.
Advanced Data Mining and Applications | 2005
Lifeng Jia; Zhe Wang; Chunguang Zhou; Xiujuan Xu
We propose a novel approach for mining recent frequent itemsets. The approach makes three key contributions. First, it is a single-scan algorithm that uses the special property of suffix-trees to guarantee that all frequent itemsets are mined. During the itemset growth phase, it is unnecessary to traverse the suffix-trees, which are the data structure for storing the summary information of the data. Second, our algorithm adopts a novel method for itemset growth that includes two special kinds of itemset growth operations to avoid generating any candidate itemsets. Third, we devise a new regressive strategy inspired by the natural decay of radioactive elements, and apply it in the algorithm to distinguish the influence of the latest transactions from that of obsolete transactions. We conduct detailed experiments to evaluate the algorithm. The results confirm that the new method scales well and delivers better quality and efficiency.
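The decay-based weighting of old versus recent transactions can be illustrated with a small sketch. The abstract does not give the decay function, so the half-life form used here, the function name `decayed_item_counts`, and the restriction to single items are assumptions made for illustration; the suffix-tree structure and itemset growth operations are not reproduced.

```python
from collections import defaultdict


def decayed_item_counts(transactions, half_life):
    """Count items with weights that halve every `half_life` transactions.

    transactions: list of item lists, ordered oldest to newest.
    The newest transaction has weight 1; older ones decay exponentially,
    in analogy with radioactive decay (assumed form, not the paper's).
    """
    counts = defaultdict(float)
    latest = len(transactions) - 1
    for index, items in enumerate(transactions):
        age = latest - index
        weight = 0.5 ** (age / half_life)
        for item in set(items):  # count each item once per transaction
            counts[item] += weight
    return dict(counts)


# Hypothetical stream: "a" appears twice early on, "b" once most recently.
counts = decayed_item_counts([["a"], ["a"], ["b"]], half_life=1)
```

In this toy stream the single recent occurrence of "b" outweighs the two older occurrences of "a", which is the kind of effect a recency-aware support threshold relies on.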