Nasim Adnan
Charles Sturt University
Publication
Featured research published by Nasim Adnan.
Knowledge-Based Systems | 2016
Nasim Adnan; Zahidul Islam
A decision forest is an ensemble of decision trees, built to discover more patterns (i.e. logic rules) and to predict/classify class values more accurately than a single decision tree. Existing decision forest algorithms typically build huge numbers of decision trees, incurring large memory and computational overhead, in order to achieve high accuracy. However, many of the trees do not contribute to the ensemble accuracy of the forest. Ensemble pruning algorithms therefore aim to discard those trees, generating a subforest that achieves higher (or comparable) ensemble accuracy than the original forest. The objectives are twofold: select as few trees as possible, and keep the ensemble accuracy of the subforest as high as possible. An optimal subforest can be found by exhaustive search; however, this is impractical for any standard-sized forest, as the number of candidate subforests grows exponentially with the number of trees. To avoid the computational burden of an exhaustive search, many greedy and genetic algorithm-based subforest selection techniques have been proposed in the literature. In this paper, we propose a subforest selection technique that achieves both small size and high accuracy. We use a genetic algorithm, carefully selecting high-quality individual trees for its initial population in order to improve the final output of the algorithm. Experiments are conducted on 20 data sets from the UCI Machine Learning Repository to compare the proposed technique with several existing state-of-the-art techniques. The results indicate that the proposed technique selects effective subforests that are significantly smaller than the original forests while achieving better (or comparable) accuracy.
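The sketch below illustrates the kind of genetic-algorithm subforest selection the abstract describes, seeding the initial population with individually accurate trees. It is a minimal illustration, not the authors' implementation: the population size, mutation rate, crossover scheme, and seeding rule are all assumptions.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)

def ensemble_accuracy(mask, trees, X_val, y_val):
    """Majority-vote accuracy of the trees a boolean mask selects
    (assumes integer class labels)."""
    chosen = [t for t, keep in zip(trees, mask) if keep]
    if not chosen:
        return 0.0
    votes = np.stack([t.predict(X_val) for t in chosen]).astype(int)
    majority = np.apply_along_axis(lambda v: np.bincount(v).argmax(), 0, votes)
    return float((majority == y_val).mean())

def seeded_population(trees, X_val, y_val, pop_size):
    """Initial population biased toward individually accurate trees."""
    ranked = np.argsort([t.score(X_val, y_val) for t in trees])[::-1]
    population = []
    for i in range(pop_size):
        mask = rng.random(len(trees)) < 0.5        # random subforest
        mask[ranked[: (i % 5) + 1]] = True         # force in the best trees
        population.append(mask)
    return population

def select_subforest(trees, X_val, y_val, pop_size=20, generations=30):
    population = seeded_population(trees, X_val, y_val, pop_size)
    for _ in range(generations):
        scores = [ensemble_accuracy(m, trees, X_val, y_val) for m in population]
        elite = [population[i] for i in np.argsort(scores)[::-1][: pop_size // 2]]
        children = []
        while len(children) < pop_size - len(elite):
            a, b = rng.choice(len(elite), size=2, replace=False)
            cut = rng.integers(1, len(trees))              # one-point crossover
            child = np.concatenate([elite[a][:cut], elite[b][cut:]])
            flip = rng.random(len(trees)) < 0.02           # light mutation
            children.append(np.where(flip, ~child, child))
        population = elite + children
    best = max(population, key=lambda m: ensemble_accuracy(m, trees, X_val, y_val))
    return [t for t, keep in zip(trees, best) if keep]

# Usage: subforest = select_subforest(
#     RandomForestClassifier(n_estimators=100, random_state=0)
#         .fit(X_train, y_train).estimators_, X_val, y_val)
```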
Advanced Data Mining and Applications | 2014
Nasim Adnan
Random Forest is one of the most popular decision forest building algorithms, using decision trees as the base classifiers. The splitting attribute at each node of a Random Forest decision tree is generally determined from a randomly selected attribute subset, of predefined size, of the original attribute set. In this paper, we propose a new technique that randomly determines the size of the attribute subset within a dynamically determined range, based on the size of the current data segment relative to the bootstrap sample, at each node-splitting event. We present elaborate experimental results involving five widely used data sets from the UCI Machine Learning Repository. The experimental results indicate the effectiveness of the proposed technique in the context of Random Forest.
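A short sketch of how such a dynamic subspace size might be computed at each node split, under one reading of the abstract; the range endpoints (the classic log2 default as the lower bound, a ratio-scaled count as the upper) are assumptions, not the paper's exact formula.

```python
import math
import random

def subspace_size(n_node, n_bootstrap, n_attributes):
    """Pick how many attributes to test at this node, drawn at random
    from a range that depends on the node's share of the bootstrap sample."""
    ratio = n_node / n_bootstrap                    # relative segment size
    low = max(1, int(math.log2(n_attributes)) + 1)  # classic RF default
    high = max(low, round(ratio * n_attributes))    # wider near the root
    return random.randint(low, high)

# Hypothetical use inside a node-splitting routine:
# k = subspace_size(len(node_rows), len(bootstrap_rows), n_attributes)
# candidates = random.sample(range(n_attributes), k)
```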
Expert Systems with Applications | 2017
Nasim Adnan; Zahidul Islam
Highlights:
- Forest PA assigns weights only to the attributes appearing in the latest tree.
- Weights are obtained randomly from dynamically determined weight ranges.
- Weights are incremented if the attributes do not appear in the subsequent tree(s).
- Forest PA is applied to 20 well-known data sets.
- The experimental results indicate the effectiveness of Forest PA.

In this paper, we propose a new decision forest algorithm that builds a set of highly accurate decision trees by exploiting the strength of all non-class attributes available in a data set, unlike some existing algorithms that use only a subset of the non-class attributes. At the same time, to promote strong diversity, the proposed algorithm imposes penalties (disadvantageous weights) on the attributes that participated in the latest tree when generating subsequent trees. Further weight-related concerns are also taken into account so that the generated trees remain individually accurate while retaining strong diversity. To show the worthiness of the proposed algorithm, we carry out experiments on 20 well-known data sets publicly available from the UCI Machine Learning Repository. The experimental results indicate that the proposed algorithm generates more accurate and more balanced decision forests than other prominent decision forest algorithms. Accordingly, the proposed algorithm is expected to be very effective in the domain of expert and intelligent systems.
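A rough sketch of the penalty-weight bookkeeping the highlights describe: attributes used in the latest tree receive random disadvantageous weights, and weights of unused attributes are incremented back toward full merit. The penalty range, the increment, and the attribute names are hypothetical; the paper derives its weight ranges dynamically.

```python
import random

def update_weights(weights, used_attrs, penalty_range=(0.1, 0.5), increment=0.1):
    """Penalize attributes used in the latest tree; let unused ones recover."""
    for attr in weights:
        if attr in used_attrs:
            weights[attr] = random.uniform(*penalty_range)       # penalty weight
        else:
            weights[attr] = min(1.0, weights[attr] + increment)  # gradual recovery
    return weights

# In the forest loop, a tree builder would multiply each attribute's split
# merit (e.g. gain ratio) by weights[attr] before choosing the splitting
# attribute, then report which attributes the new tree actually used:
weights = {attr: 1.0 for attr in ("age", "income", "height")}
for used_in_latest_tree in ({"age"}, {"income"}, {"age", "height"}):
    weights = update_weights(weights, used_in_latest_tree)
    print(weights)
```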
Advanced Data Mining and Applications | 2016
Nasim Adnan; Zahidul Islam
Random Forest draws much interest from the research community because of its simplicity and excellent performance. The splitting attribute at each node of a Random Forest decision tree is determined from a randomly selected subset, of predefined size, of the entire attribute set. The size of this subset is one of the most debated aspects of Random Forest and has attracted many contributions. However, little attention has been given to improving Random Forest specifically for records that are hard to classify. In this paper, we propose a novel technique that detects hard-to-classify records and increases the weights of those records in the training data set. We then build a Random Forest from the weighted training data set. The experimental results presented in this paper indicate that the ensemble accuracy of Random Forest can be improved when it is applied to weighted training data sets that place more emphasis on hard-to-classify records.
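A minimal sketch of the record-weighting idea, assuming "hard to classify" is detected via held-out misclassification by a probe forest; the detection rule and the boost factor are assumptions, and `sample_weight` here is standard scikit-learn machinery, not the authors' code.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_predict

def weighted_random_forest(X, y, boost=2.0, n_trees=100, seed=0):
    probe = RandomForestClassifier(n_estimators=n_trees, random_state=seed)
    y_hat = cross_val_predict(probe, X, y, cv=5)  # held-out predictions
    hard = y_hat != y                             # misclassified = hard
    weights = np.where(hard, boost, 1.0)          # emphasize hard records
    forest = RandomForestClassifier(n_estimators=n_trees, random_state=seed)
    forest.fit(X, y, sample_weight=weights)       # weighted training set
    return forest
```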
Advanced Data Mining and Applications | 2017
Nasim Adnan; Zahidul Islam
Due to its simplicity and good performance, Random Forest attracts much interest from the research community. The splitting attribute at each node of a Random Forest decision tree is determined from a predefined number of randomly selected attributes (a subset of the entire attribute set). The size of this attribute subset (subspace) is one of the most important factors influencing Random Forest. In this paper, we propose a new technique that dynamically determines the size of the subspace based on the size of the current data segment relative to the entire data set. To assess the effects of the proposed technique, we conduct experiments involving five widely used data sets from the UCI Machine Learning Repository. The experimental results indicate the capability of the proposed technique to improve the ensemble accuracy of Random Forest.
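For contrast with the 2014 variant above, a sketch where the subspace size is tied to the node's share of the entire data set rather than the bootstrap sample; the interpolation between the classic default and the full attribute set is an illustrative assumption.

```python
import math

def dynamic_subspace_size(n_node, n_total, n_attributes):
    """Subspace size grows with the node's share of the entire data set."""
    ratio = n_node / n_total                 # segment vs. whole data set
    base = int(math.log2(n_attributes)) + 1  # classic RF default
    # interpolate between the default and the full attribute set
    size = base + round(ratio * (n_attributes - base))
    return max(1, min(n_attributes, size))
```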
European Symposium on Artificial Neural Networks | 2015
Nasim Adnan; Zahidul Islam
Australasian Data Mining Conference | 2015
Nasim Adnan; Zahidul Islam
International Conference on Artificial Intelligence | 2014
Nasim Adnan; Zahidul Islam
Computer and Information Technology | 2014
Nasim Adnan; Zahidul Islam