Will N. Browne
Victoria University of Wellington
Publications
Featured research published by Will N. Browne.
IEEE Transactions on Systems, Man, and Cybernetics | 2013
Bing Xue; Mengjie Zhang; Will N. Browne
Classification problems often have a large number of features in the data sets, but not all of them are useful for classification. Irrelevant and redundant features may even reduce the performance. Feature selection aims to choose a small number of relevant features to achieve similar or even better classification performance than using all features. It has two main conflicting objectives of maximizing the classification performance and minimizing the number of features. However, most existing feature selection algorithms treat the task as a single objective problem. This paper presents the first study on multi-objective particle swarm optimization (PSO) for feature selection. The task is to generate a Pareto front of nondominated solutions (feature subsets). We investigate two PSO-based multi-objective feature selection algorithms. The first algorithm introduces the idea of nondominated sorting into PSO to address feature selection problems. The second algorithm applies the ideas of crowding, mutation, and dominance to PSO to search for the Pareto front solutions. The two multi-objective algorithms are compared with two conventional feature selection methods, a single objective feature selection method, a two-stage feature selection algorithm, and three well-known evolutionary multi-objective algorithms on 12 benchmark data sets. The experimental results show that the two PSO-based multi-objective algorithms can automatically evolve a set of nondominated solutions. The first algorithm outperforms the two conventional methods, the single objective method, and the two-stage algorithm. It achieves comparable results with the existing three well-known multi-objective algorithms in most cases. The second algorithm achieves better results than the first algorithm and all other methods mentioned previously.
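A minimal sketch (not the paper's implementation) of the Pareto-dominance relation underlying both algorithms, assuming each candidate feature subset is scored by its classification error and its feature count:

```python
# Minimal sketch of Pareto dominance for feature selection, where each
# candidate solution is scored by (classification error, number of features).
# Function names and example values are illustrative, not from the paper.

def dominates(a, b):
    """True if solution a dominates b: no worse in both objectives, better in at least one."""
    no_worse = a[0] <= b[0] and a[1] <= b[1]
    strictly_better = a[0] < b[0] or a[1] < b[1]
    return no_worse and strictly_better

def nondominated_front(solutions):
    """Return the solutions not dominated by any other (the Pareto front)."""
    return [s for s in solutions
            if not any(dominates(o, s) for o in solutions if o is not s)]

# Example: (error rate, number of selected features) for four feature subsets.
candidates = [(0.12, 10), (0.10, 15), (0.15, 5), (0.10, 20)]
print(nondominated_front(candidates))  # -> [(0.12, 10), (0.10, 15), (0.15, 5)]
```

Nondominated sorting (the first algorithm) repeatedly peels off such fronts, while crowding, mutation and dominance (the second algorithm) are used to maintain diversity along them.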
IEEE Transactions on Evolutionary Computation | 2016
Bing Xue; Mengjie Zhang; Will N. Browne; Xin Yao
Feature selection is an important task in data mining and machine learning to reduce the dimensionality of the data and increase the performance of an algorithm, such as a classification algorithm. However, feature selection is a challenging task due mainly to the large search space. A variety of methods have been applied to solve feature selection problems, where evolutionary computation (EC) techniques have recently gained much attention and shown some success. However, there are no comprehensive guidelines on the strengths and weaknesses of alternative approaches. This leads to a disjointed and fragmented field with ultimately lost opportunities for improving performance and successful applications. This paper presents a comprehensive survey of the state-of-the-art work on EC for feature selection, which identifies the contributions of these different algorithms. In addition, current issues and challenges are also discussed to identify promising areas for future research.
Applied Soft Computing | 2014
Bing Xue; Mengjie Zhang; Will N. Browne
In classification, feature selection is an important data pre-processing technique, but it is a difficult problem due mainly to the large search space. Particle swarm optimisation (PSO) is an efficient evolutionary computation technique. However, the traditional personal best and global best updating mechanism in PSO limits its performance for feature selection, and the potential of PSO for feature selection has not been fully investigated. This paper proposes three new initialisation strategies and three new personal best and global best updating mechanisms in PSO to develop novel feature selection approaches with the goals of maximising the classification performance, minimising the number of features and reducing the computational time. The proposed initialisation strategies and updating mechanisms are compared with the traditional initialisation and the traditional updating mechanism. Meanwhile, the most promising initialisation strategy and updating mechanism are combined to form a new approach (PSO(4-2)) to address feature selection problems, and it is compared with two traditional feature selection methods and two PSO-based methods. Experiments on twenty benchmark datasets show that PSO with the new initialisation strategies and/or the new updating mechanisms can automatically evolve a feature subset with a smaller number of features and higher classification performance than using all features. PSO(4-2) outperforms the two traditional methods and the two PSO-based algorithms in terms of the computational time, the number of features and the classification performance. The superior performance of this algorithm is due mainly to the proposed initialisation strategy, which takes advantage of both forward selection and backward selection to decrease the number of features and the computational time, and to the new updating mechanism, which overcomes the limitations of traditional updating mechanisms by taking the number of features into account.
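An illustrative sketch of the two ideas described above: initialising particles with deliberately small or large feature subsets (in the spirit of forward and backward selection), and a personal-best update that prefers fewer features when accuracy does not degrade. The names, ratios and tie-breaking rule are assumptions, not the paper's exact design:

```python
import random

def init_particle(n_features, strategy="small"):
    """Return a binary feature mask; 'small' selects few features (forward-selection
    style), anything else selects most features (backward-selection style)."""
    ratio = 0.1 if strategy == "small" else 0.9   # assumed ratios for illustration
    return [1 if random.random() < ratio else 0 for _ in range(n_features)]

def update_pbest(pbest, candidate, error_of):
    """Replace pbest if the candidate has lower error, or equal error with fewer features."""
    cand_err, best_err = error_of(candidate), error_of(pbest)
    if cand_err < best_err:
        return candidate
    if cand_err == best_err and sum(candidate) < sum(pbest):
        return candidate
    return pbest
```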
IEEE Transactions on Evolutionary Computation | 2014
Muhammad Iqbal; Will N. Browne; Mengjie Zhang
Evolutionary computation techniques have had limited capabilities in solving large-scale problems due to the large search space demanding large memory and much longer training times. In the work presented here, a genetic programming-like rich encoding scheme has been constructed to identify building blocks of knowledge in a learning classifier system. The fitter building blocks from the learning system trained against smaller problems have been utilized in a higher complexity problem in the domain to achieve scalable learning. The proposed system has been examined and evaluated on four different Boolean problem domains: 1) multiplexer, 2) majority-on, 3) carry, and 4) even-parity problems. The major contribution of this paper is to successfully extract useful building blocks from smaller problems and reuse them to learn more complex large-scale problems in the domain, e.g., the 135-bit multiplexer problem, where the number of possible instances is 2^135 ≈ 4 × 10^40, is solved by reusing the extracted knowledge from the learned lower-level solutions in the domain. Autonomous scaling is, for the first time, shown to be possible in learning classifier systems. It improves effectiveness and reduces the number of training instances required in large problems, but requires more time due to its sequential build-up of knowledge.
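For reference, the multiplexer benchmark mentioned above can be sketched as follows; this defines the target function only (k = 7 address bits plus 2^7 = 128 data bits gives the 135-bit problem), not the learning classifier system itself:

```python
# Sketch of the k-address-bit multiplexer benchmark: the first k bits form an
# address that selects one of the remaining 2**k data bits as the output.

def multiplexer(bits, k):
    """bits: sequence of 0s and 1s of length k + 2**k; returns the addressed data bit."""
    address = int("".join(str(int(b)) for b in bits[:k]), 2)
    return int(bits[k + address])

# 6-bit multiplexer example (k = 2): address '10' selects data bit index 2.
print(multiplexer("101001", 2))  # data bits are '1001'; index 2 -> 0
```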
Connection Science | 2012
Bing Xue; Liam Cervante; Lin Shang; Will N. Browne; Mengjie Zhang
Feature selection has the two main objectives of minimising the classification error rate and the number of features. Based on binary particle swarm optimisation (BPSO), we develop two novel multi-objective feature selection frameworks for classification, which are multi-objective binary PSO using the idea of non-dominated sorting (NSBPSO) and multi-objective binary PSO using the ideas of crowding, mutation and dominance (CMDBPSO). Four multi-objective feature selection methods are then developed by applying mutual information and entropy as two different filter evaluation criteria in each of the proposed frameworks. The proposed algorithms are examined and compared with a single objective method on eight benchmark data sets. Experimental results show that the proposed multi-objective algorithms can evolve a set of solutions that use a smaller number of features and achieve better classification performance than using all features. In most cases, NSBPSO achieves better results than the single objective algorithm and CMDBPSO outperforms all other methods mentioned above. This work represents the first study on multi-objective BPSO for filter-based feature selection.
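A small sketch of a mutual-information filter criterion of the kind used here; the exact measures in the paper may differ, so treat this as illustrative only:

```python
import math
from collections import Counter

# Illustrative filter criterion: mutual information I(X; Y) between a discrete
# feature column X and the class labels Y. A filter approach favours feature
# subsets with high relevance to the class (and, in the entropy-based variant,
# low redundancy among the selected features).

def mutual_information(xs, ys):
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    mi = 0.0
    for (x, y), c in pxy.items():
        p_xy = c / n
        mi += p_xy * math.log2(p_xy / ((px[x] / n) * (py[y] / n)))
    return mi

feature = [0, 0, 1, 1, 0, 1]
labels  = [0, 0, 1, 1, 0, 0]
print(round(mutual_information(feature, labels), 3))
```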
Genetic and Evolutionary Computation Conference | 2012
Bing Xue; Mengjie Zhang; Will N. Browne
Feature selection (FS) is an important data preprocessing technique, which has two goals of minimising the classification error and minimising the number of features selected. Based on particle swarm optimisation (PSO), this paper proposes two multi-objective algorithms for selecting the Pareto front of non-dominated solutions (feature subsets) for classification. The first algorithm introduces the idea of the non-dominated sorting-based multi-objective genetic algorithm II into PSO for FS. In the second algorithm, multi-objective PSO uses the ideas of crowding, mutation and dominance to search for the Pareto front solutions. The two algorithms are compared with two single objective FS methods and a conventional FS method on nine datasets. Experimental results show that both proposed algorithms can automatically evolve a smaller number of features and achieve better classification performance than using all features and feature subsets obtained from the two single objective methods and the conventional method. Both the continuous and the binary versions of PSO are investigated in the two proposed algorithms, and the results show that the continuous version generally achieves better performance than the binary version. The second new algorithm outperforms the first algorithm in both continuous and binary versions.
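A brief sketch of how the continuous and binary PSO representations mentioned above can each be mapped to a feature subset; the threshold and the sigmoid-based sampling are standard choices assumed here for illustration, not the paper's exact settings:

```python
import math
import random

def subset_from_continuous(position, threshold=0.6):
    """Continuous PSO: select feature d when its position value exceeds a threshold."""
    return [1 if x > threshold else 0 for x in position]

def subset_from_binary(velocity):
    """Binary PSO: each bit is 1 with probability sigmoid(velocity)."""
    return [1 if random.random() < 1.0 / (1.0 + math.exp(-v)) else 0 for v in velocity]

print(subset_from_continuous([0.8, 0.3, 0.7, 0.1]))  # -> [1, 0, 1, 0]
print(subset_from_binary([2.0, -2.0, 0.0, 4.0]))     # stochastic selection
```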
International Journal of Computational Intelligence and Applications | 2014
Bing Xue; Liam Cervante; Lin Shang; Will N. Browne; Mengjie Zhang
Feature selection is a multi-objective problem, where the two main objectives are to maximize the classification accuracy and minimize the number of features. However, most of the existing algorithms are single-objective, wrapper approaches. In this work, we investigate the use of binary particle swarm optimization (BPSO) and probabilistic rough set (PRS) theory for multi-objective feature selection. We use PRS to propose a new measure for the number of features, based on which a new filter-based single objective algorithm (PSOPRSE) is developed. Then a new filter-based multi-objective algorithm (MORSE) is proposed, which aims to maximize a measure of the classification performance and minimize the new measure for the number of features. MORSE is examined and compared with PSOPRSE, two existing PSO-based single objective algorithms, two traditional methods, and the only existing BPSO and PRS-based multi-objective algorithm (MORSN). Experiments have been conducted on six commonly used discrete datasets with a relatively small number of features and six continuous datasets with a large number of features. The classification performance of the selected feature subsets is evaluated by three classification algorithms (decision trees, Naive Bayes, and k-nearest neighbors). The results show that the proposed algorithms can automatically select a smaller number of features and achieve similar or better classification performance than using all features. PSOPRSE achieves better performance than the other two PSO-based single objective algorithms and the two traditional methods. MORSN and MORSE outperform all these five single objective algorithms in terms of both the classification performance and the number of features. MORSE achieves better classification performance than MORSN. These filter algorithms are general across the three different classification algorithms.
Soft Computing | 2013
Muhammad Iqbal; Will N. Browne; Mengjie Zhang
The main goal of this research direction is to extract building blocks of knowledge from a problem domain. Once extracted successfully, these building blocks are to be used in learning more complex problems of the domain, in an effort to produce a scalable learning classifier system (LCS). However, whilst current LCS (and other evolutionary computation techniques) discover good rules, they also create sub-optimum rules. Therefore, it is difficult to separate good building blocks of information from others without extensive post-processing. In order to provide richness in the LCS alphabet, code fragments similar to tree expressions in genetic programming are adopted. The accuracy-based XCS concept is used as it aims to produce maximally general and accurate classifiers, albeit the rule base requires condensation (compaction) to remove spurious classifiers. Serendipitously, this work on scalability of LCS produces compact rule sets that can be easily converted to the optimum population. The main contribution of this work is the ability to clearly separate the optimum rules from others, for the first time in LCS, without the need for expensive post-processing. This paper identifies that consistency of action in rich alphabets guides LCS to optimum rule sets.
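An illustrative sketch of a code fragment as a small GP-like boolean tree evaluated against a binary input; the nested-tuple representation below is an assumption for illustration rather than the paper's exact encoding:

```python
def evaluate(fragment, bits):
    """Recursively evaluate a nested-tuple boolean expression over input bits."""
    if isinstance(fragment, int):          # terminal: index into the input
        return bool(bits[fragment])
    op, *args = fragment
    if op == "NOT":
        return not evaluate(args[0], bits)
    vals = [evaluate(a, bits) for a in args]
    return all(vals) if op == "AND" else any(vals)   # AND / OR

# Fragment equivalent to (x0 OR x1) AND NOT x2
fragment = ("AND", ("OR", 0, 1), ("NOT", 2))
print(evaluate(fragment, [1, 0, 0]))  # -> True
```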
Congress on Evolutionary Computation | 2012
Bing Xue; Mengjie Zhang; Will N. Browne
Feature selection is an important data preprocessing technique in classification problems. This paper proposes two new fitness functions in binary particle swarm optimisation (BPSO) for feature selection to choose a small number of features and achieve high classification accuracy. In the first fitness function, the relative importance of classification performance and the number of features is balanced by using a linearly increasing weight in the evolutionary process. The second is a two-stage fitness function, where classification performance is optimised in the first stage and the number of features is taken into account in the second stage. K-nearest neighbour (KNN) is employed to evaluate the classification performance in the experiments on ten datasets. Experimental results show that by using either of the two proposed fitness functions in the training process, in almost all cases, BPSO can select a smaller number of features and achieve higher classification accuracy on the test sets than using overall classification performance as the fitness function. They outperform two conventional feature selection methods in almost all cases. In most cases, BPSO with the second fitness function can achieve better performance than with the first fitness function in terms of classification accuracy and the number of features.
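A hedged sketch of the two kinds of fitness function described above, written as minimisation; the weighting schedule and the stage-switch point are illustrative assumptions rather than the paper's exact formulation:

```python
def weighted_fitness(error, n_selected, n_total, iteration, max_iter):
    """Fitness 1: the weight on the feature-count term increases linearly over the run."""
    w = iteration / max_iter                  # grows from 0 towards 1 (assumed schedule)
    return (1 - w) * error + w * (n_selected / n_total)

def two_stage_fitness(error, n_selected, n_total, iteration, max_iter, alpha=0.9):
    """Fitness 2: optimise error only in stage one, then also penalise feature count."""
    if iteration < max_iter // 2:             # stage 1 (assumed switch point)
        return error
    return alpha * error + (1 - alpha) * (n_selected / n_total)  # stage 2
```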
Genetic and Evolutionary Computation Conference | 2013
Muhammad Iqbal; Will N. Browne; Mengjie Zhang
Evolutionary computation techniques have had limited capabilities in solving large-scale problems, due to the large search space demanding large memory and much longer training time. Recently, work has begun on autonomously reusing learnt building blocks of knowledge to scale from low-dimensional problems to large-scale ones. An XCS-based classifier system has been shown to be scalable, through the addition of tree-like code fragments, to a limit beyond standard learning classifier systems. Self-modifying Cartesian genetic programming (SMCGP) can provide general solutions to a number of problems, but the obtained solutions for large-scale problems are not easily interpretable. A limitation of both techniques is the lack of a cyclic representation, which is inherent in finite state machines. Hence, this work introduces a state-machine-based encoding scheme into scalable XCS, for the first time, in an attempt to develop a general scalable classifier system producing easily interpretable classifier rules. The proposed system has been tested on four different Boolean problem domains, i.e. even-parity, majority-on, carry, and multiplexer problems. The proposed approach outperformed standard XCS in three of the four problem domains. In addition, the evolved machines provide general solutions to the even-parity and carry problems that are easily interpretable as compared with the solutions obtained using SMCGP.
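To illustrate why a cyclic, state-machine representation suits problems such as even-parity, a two-state machine that streams the input bits decides parity for any input length; this sketch shows the representation's expressiveness, not the paper's evolved machines:

```python
# transition[state][bit] -> next state; state 0 means an even number of 1s seen so far
TRANSITION = {0: {0: 0, 1: 1},
              1: {0: 1, 1: 0}}

def even_parity(bits):
    """Return 1 if the number of 1s in bits is even, else 0."""
    state = 0
    for b in bits:
        state = TRANSITION[state][b]
    return 1 if state == 0 else 0

print(even_parity([1, 0, 1, 1]))  # three 1s -> odd  -> 0
print(even_parity([1, 1, 0, 0]))  # two 1s   -> even -> 1
```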