Mrithyumjaya Rao Kuppa
Vaagdevi College of Engineering
Publications
Featured research published by Mrithyumjaya Rao Kuppa.
international conference on information technology | 2010
Ali Mirza Mahmood; Mrithyumjaya Rao Kuppa
In this paper, we propose a new pruning method that combines pre-pruning and post-pruning, targeting both classification accuracy and tree size. Based on this method, we induce a decision tree. The experimental results are computed using 18 benchmark datasets from the UCI Machine Learning Repository. Compared to benchmark algorithms, the results indicate that our new tree pruning method considerably reduces tree size and generally increases accuracy. We have also conducted a case study on a heart disease dataset using our improved algorithm. This study suggests that the type of heart defect (Thal) is the most important predictor for confirming the presence of heart disease, with the number of major vessels colored by fluoroscopy (MV) and the type of chest pain (Chest) serving as further biomarkers of heart disease.
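The paper's exact pruning rules are not given in the abstract, but the general idea of combining pre-pruning (stopping growth early) with post-pruning (trimming the grown tree) can be sketched with scikit-learn's built-in controls. The parameter values below are illustrative choices, not the authors' settings:

```python
# Sketch: pre-pruning via growth limits plus post-pruning via
# cost-complexity (ccp_alpha) in scikit-learn. Illustrative only;
# the paper's combined method is its own algorithm.
from sklearn.datasets import load_breast_cancer
from sklearn.tree import DecisionTreeClassifier

X, y = load_breast_cancer(return_X_y=True)

# Unrestricted tree for comparison.
full = DecisionTreeClassifier(random_state=0).fit(X, y)

pruned = DecisionTreeClassifier(
    min_samples_leaf=5,   # pre-pruning: each leaf needs >= 5 samples
    ccp_alpha=0.005,      # post-pruning: cost-complexity penalty
    random_state=0,
).fit(X, y)

print(full.tree_.node_count, pruned.tree_.node_count)
```

The pruned tree is markedly smaller, which mirrors the tree-size reduction the abstract reports.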
Progress in Artificial Intelligence | 2013
Satuluri Naganjaneyulu; Mrithyumjaya Rao Kuppa
In data mining and knowledge discovery, hidden and valuable knowledge is extracted from data sources. Traditional knowledge discovery algorithms are bottlenecked by the wide range of available data sources. Class imbalance is one problem that arises when a data source provides unequal classes, i.e., examples of one class in a training dataset vastly outnumber examples of the other class(es). Researchers have rigorously studied several techniques to alleviate this problem, including resampling algorithms and feature selection approaches. In this paper, we present a new hybrid framework and two algorithms, dubbed Class Imbalance Learning using Intelligent Under Sampling in Tree and Neural Network versions (CILIUS-T, CILIUS-NN), for learning from skewed training data. These algorithms provide a simpler and faster alternative by using C4.5 and a neural network as base algorithms. We conduct experiments on ten UCI datasets from various application domains, comparing against five algorithms on five evaluation metrics. Experimental results show that our method achieves higher area under the ROC curve, F-measure, precision, TP rate, and TN rate than many existing class imbalance learning methods.
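The abstract does not spell out the "intelligent" undersampling procedure. As a minimal baseline sketch, plain random undersampling of the majority class (pure NumPy, our own illustration rather than CILIUS's selection rule) looks like this:

```python
# Baseline sketch: random undersampling to the minority-class size.
# CILIUS uses a smarter selection; this shows only the balancing step.
import numpy as np

def undersample(X, y, random_state=0):
    """Randomly keep n_min examples of every class, where n_min is the
    size of the smallest class, so the result is balanced."""
    rng = np.random.default_rng(random_state)
    classes, counts = np.unique(y, return_counts=True)
    n_min = counts.min()
    keep = np.concatenate([
        rng.choice(np.flatnonzero(y == c), size=n_min, replace=False)
        for c in classes
    ])
    return X[keep], y[keep]

X = np.arange(20).reshape(10, 2)
y = np.array([0] * 8 + [1] * 2)   # 8 majority vs. 2 minority examples
Xb, yb = undersample(X, y)
print(np.bincount(yb))             # balanced: [2 2]
```

A balanced training set like `(Xb, yb)` is then handed to the base learner (C4.5 or a neural network in the paper's setup).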
swarm evolutionary and memetic computing | 2011
Ali Mirza Mahmood; Mohammad Imran; Naganjaneyulu Satuluri; Mrithyumjaya Rao Kuppa; Vemulakonda Rajesh
The results of data mining tasks are usually improved by reducing the dimensionality of the data. This improvement, however, is harder to achieve when the data size is moderate or huge. Although numerous algorithms for accuracy improvement have been proposed, all assume that inducing a compact and highly generalized model is difficult. To address this issue, we introduce the Randomized Gini Index (RGI), a novel heuristic function for dimensionality reduction that is particularly applicable to large-scale databases. Apart from removing irrelevant attributes, our algorithm is capable of minimizing the level of noise in the data to a great extent, which is a very attractive feature for data mining problems. We extensively evaluate its performance through experiments on both artificial and real-world datasets. The outcome of the study shows the suitability and viability of our approach for knowledge discovery in moderate and large datasets.
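The abstract does not define RGI precisely. One plausible reading, sketched below under our own assumptions, is a Gini-gain split search restricted to a random subset of attributes, which keeps the cost low on wide datasets:

```python
# Hypothetical sketch of a "randomized Gini" split search: evaluate
# Gini gain only over a random subset of k attributes. Not the paper's
# exact heuristic, just one plausible shape for it.
import numpy as np

def gini(y):
    """Gini impurity of a label vector."""
    _, counts = np.unique(y, return_counts=True)
    p = counts / counts.sum()
    return 1.0 - float(np.sum(p ** 2))

def best_split_on_random_subset(X, y, k, random_state=0):
    """Best (attribute, threshold) split by Gini gain, searching only
    a random subset of k attribute columns."""
    rng = np.random.default_rng(random_state)
    attrs = rng.choice(X.shape[1], size=k, replace=False)
    best_attr, best_thr, best_gain = None, None, -np.inf
    parent, n = gini(y), len(y)
    for a in attrs:
        values = np.unique(X[:, a])
        for t in values[:-1]:               # thresholds between observed values
            left, right = y[X[:, a] <= t], y[X[:, a] > t]
            child = (len(left) * gini(left) + len(right) * gini(right)) / n
            if parent - child > best_gain:
                best_attr, best_thr, best_gain = a, t, parent - child
    return best_attr, best_thr, best_gain

# Toy example: attribute 1 perfectly separates the two classes.
X = np.array([[5, 0, 1],
              [3, 0, 2],
              [7, 1, 1],
              [1, 1, 2]])
y = np.array([0, 0, 1, 1])
attr, thr, gain = best_split_on_random_subset(X, y, k=3)
print(attr, thr, gain)   # attribute 1 yields the full 0.5 Gini gain
```

With `k` much smaller than the number of attributes, each split costs a fraction of a full search, which is the computational argument the abstract makes for large databases.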
Trendz in Information Sciences & Computing (TISC2010) | 2010
Ali Mirza Mahmood; Mrithyumjaya Rao Kuppa
In this paper, we present a practical algorithm to deal with the data-specific classification problem that arises when datasets have different properties. We propose to integrate error rate, missing values, and expert judgment as factors determining data-specific pruning, forming Expert Knowledge Based Pruning (EKBP). We conduct an extensive experimental study on 40 openly available real-world datasets from the UCI repository. In all these experiments, the proposed approach shows a considerable reduction in tree size and achieves equal or better accuracy compared to several benchmark decision tree methods. We have also conducted a case study on a heart disease dataset using our improved algorithm. This study suggests that the type of heart defect (Thal) is the most important predictor for confirming the presence of heart disease, with the number of major vessels colored by fluoroscopy (MV) and the type of chest pain (Chest) serving as further biomarkers of heart disease.
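The abstract names the three EKBP factors but not how they combine. Purely as a hypothetical illustration (the function, weights, and range below are our own, not EKBP's formula), one could fold error rate, missing-value fraction, and an expert adjustment into a single C4.5-style pruning confidence factor:

```python
# Hypothetical sketch only: all constants and the blending scheme are
# illustrative assumptions, not the paper's EKBP integration.
import numpy as np

def ekbp_confidence(error_rate, missing_frac, expert_adjust=0.0):
    """Start from C4.5's default confidence factor (0.25), tighten it
    for noisier datasets (higher error rate or more missing values),
    then let an expert nudge the result; clip to a sane range."""
    cf = 0.25 * (1.0 - 0.5 * error_rate - 0.5 * missing_frac) + expert_adjust
    return float(np.clip(cf, 0.01, 0.5))

print(ekbp_confidence(0.0, 0.0))    # clean data: default 0.25
print(ekbp_confidence(0.4, 0.2))    # noisier data: prune harder (lower cf)
```

A lower confidence factor drives more aggressive pruning, which matches the paper's goal of dataset-specific rather than uniform pruning.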
international conference on emerging trends in electrical and computer technology | 2011
Ali Mirza Mahmood; Mrithyumjaya Rao Kuppa; V Sai Phani Chandu
Decision tree induction is among the most powerful and commonly used architectures for extracting classification knowledge from datasets of labeled instances. However, learning decision trees from large datasets with irrelevant attributes is quite different from learning from small and moderately sized datasets. In this paper, we propose a simple yet effective composite splitting criterion that couples a random sampling approach with the gain ratio. Our random sampling method depends on a small random subset of attributes, and it is computationally cheap to act on such a set in reasonable time. The superiority of the composite splitting criterion persists when it is used on high-dimensional datasets with irrelevant attributes. The empirical and theoretical perspectives are validated using 40 UCI datasets. The experimental results indicate that the proposed heuristic function can produce much simpler trees with almost unaffected or improved accuracy.
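The two halves of the composite criterion can be sketched separately: the standard C4.5 gain ratio, and a random attribute-subset restriction around it. How the paper actually combines them is not specified in the abstract, so the `composite_criterion` wrapper below is our own assumption:

```python
# Sketch: gain ratio (standard C4.5 measure) evaluated only on a random
# subset of attributes. The combination rule is our assumption.
import numpy as np

def entropy(y):
    """Shannon entropy (bits) of a label vector."""
    _, counts = np.unique(y, return_counts=True)
    p = counts / counts.sum()
    return float(-np.sum(p * np.log2(p)))

def gain_ratio(x, y):
    """C4.5 gain ratio of a discrete attribute x with respect to y."""
    info_gain, split_info = entropy(y), 0.0
    for v in np.unique(x):
        mask = x == v
        w = mask.mean()
        info_gain -= w * entropy(y[mask])
        split_info -= w * np.log2(w)
    return info_gain / split_info if split_info > 0 else 0.0

def composite_criterion(X, y, k, random_state=0):
    """Pick the attribute with the highest gain ratio among a random
    subset of k attributes (cheap on high-dimensional data)."""
    rng = np.random.default_rng(random_state)
    attrs = rng.choice(X.shape[1], size=k, replace=False)
    return max(attrs, key=lambda a: gain_ratio(X[:, a], y))

# Toy example: attribute 0 is perfectly predictive, attribute 1 is noisy.
X = np.array([[0, 1],
              [0, 1],
              [1, 1],
              [1, 0]])
y = np.array([0, 0, 1, 1])
print(composite_criterion(X, y, k=2))   # attribute 0 wins
```

Restricting the gain-ratio search to `k` attributes per node is what makes the approach cheap on datasets with many irrelevant columns.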
international conference on human-computer interaction | 2010
Ali Mirza Mahmood; Mrithyumjaya Rao Kuppa
The ever-growing presence of data has led to a large number of proposed algorithms for classification, especially decision trees, over the last few years. However, learning decision trees from large datasets with irrelevant attributes is quite different from learning from small and moderately sized datasets. In practice, working only with small and moderately sized datasets is rare. Unfortunately, the most popular heuristic function, gain ratio, has a serious disadvantage when dealing with large and irrelevant datasets. To tackle these issues, we design a new composite splitting criterion with a random sampling approach. Our random sampling method depends on a small random subset of attributes, and it is computationally cheap to act on such a set in reasonable time. The empirical and theoretical properties are validated using 40 UCI datasets. The experimental results support the efficacy of the proposed method in terms of tree size and accuracy.
international conference on human-computer interaction | 2010
Ali Mirza Mahmood; Mrithyumjaya Rao Kuppa
Many traditional pruning methods assume that all datasets are equally probable and equally important, and thus apply equal pruning to all of them. However, in real-world classification problems, datasets are not all equal; consequently, an equal pruning rate tends to generate inefficient, large decision trees. We therefore propose a practical algorithm to deal with the data-specific classification problem that arises when datasets have different properties. In this paper, we first compute data-specific pruning values for each dataset. Then, we use expert knowledge to find an inexact pruning value. Finally, we integrate those values into a well-established pruning technique to form Expert Knowledge Based Pruning (EKBP). We empirically validate the analysis on 40 publicly available UCI datasets against four existing techniques. Both the analytical and experimental results show that our proposed method reduces tree size while retaining equal or better accuracy.
Trendz in Information Sciences & Computing (TISC2010) | 2010
Ali Mirza Mahmood; Mrithyumjaya Rao Kuppa
In this work we investigate several issues in order to improve the performance of decision trees. First, we introduce a new composite splitting criterion aimed at improving classification accuracy. Second, we derive a new pruning technique using expert knowledge, which is able to significantly reduce the size of the tree without degrading classification accuracy. Finally, we combine our new splitting criterion and pruning technique into a new decision tree model, Classification Using Randomization and Expert knowledge (CURE). Experiments on 40 UCI datasets against four existing algorithms show the empirical effectiveness of the devised approach.
Archive | 2012
Naganjaneyulu Satuluri; Mrithyumjaya Rao Kuppa
Archive | 2014
Satuluri Naganjaneyulu; Mrithyumjaya Rao Kuppa; Ali Mirza