Divakar Singh
Barkatullah University
Publication
Featured research published by Divakar Singh.
International Journal of Computer Applications | 2013
Anuradha Patra; Divakar Singh
Text classification is gaining importance because of the availability of a large number of electronic documents from a variety of sources. Text categorization is the task of assigning predefined categories to documents. It is a method of finding interesting regularities in large textual data, where interesting means non-trivial, hidden, previously unknown and potentially useful. The goal of text mining is to enable users to extract information from textual resources; it combines operations such as retrieval, classification, clustering, data mining, natural language preprocessing and machine learning techniques to classify different patterns. In text classification, term weighting methods assign appropriate weights to the given terms to improve classification performance. This paper surveys text classification, the process of text classification, different term weighting methods, and comparisons between different classification algorithms.
International Journal of Computer Applications | 2013
Anuradha Patra; Divakar Singh
With the rapid growth of online information, there is a growing need for tools that help in finding, filtering and managing high-dimensional data. Text classification is a supervised learning task whose goal is to classify documents into predefined categories. The phases involved in text classification are collecting the data set, preprocessing, stemming, implementing the classifier and measuring performance. There are several learning methods for text classification, such as naive Bayes, k-nearest neighbor, decision tree, SVM and BPNN. The back propagation algorithm is applied to multilayer feed-forward networks consisting of processing elements with continuous differentiable activation functions. A network trained with the back propagation learning algorithm is called a BPNN. This paper demonstrates the results of text classification using BPNN with relevance factor (rf) as the term weighting method.
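The relevance factor named in the abstract above can be illustrated with a short sketch. It assumes the form of rf commonly given in the term weighting literature, rf = log2(2 + a / max(1, c)), where a and c count the documents containing the term in the positive and negative categories respectively. Whether the paper uses exactly this form is an assumption, and the function names and toy documents are hypothetical.

```python
import math
from collections import Counter

def relevance_factor(term, docs, labels, positive):
    """rf = log2(2 + a / max(1, c)), an assumed (commonly cited) definition.

    a: number of positive-category documents containing the term
    c: number of other-category documents containing the term
    """
    a = sum(1 for d, y in zip(docs, labels) if y == positive and term in d)
    c = sum(1 for d, y in zip(docs, labels) if y != positive and term in d)
    return math.log2(2 + a / max(1, c))

def tf_rf(doc, docs, labels, positive):
    """Weight each term of one tokenized document by term frequency times rf."""
    tf = Counter(doc)
    return {t: n * relevance_factor(t, docs, labels, positive)
            for t, n in tf.items()}

# Hypothetical tokenized corpus with two categories.
docs = [["goal", "match", "team"], ["goal", "score"],
        ["market", "stock"], ["stock", "team"]]
labels = ["sport", "sport", "finance", "finance"]

weights = tf_rf(docs[0], docs, labels, "sport")
print(weights)
```

A term concentrated in the positive category ("goal") receives a higher weight than one spread across both categories ("team"). The resulting vectors could then be fed to any classifier, including a back propagation network.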
International Conference on Communication Systems and Network Technologies | 2014
Vikram Garg; Anju Singh; Divakar Singh
Significant developments in data collection and data storage technologies have allowed transactional data to grow in the data warehouses of companies and public sector organizations. As the data grows day by day, some mechanism is needed to analyze such large volumes of data. Data mining is a way of extracting hidden predictive information from those data warehouses without revealing their sensitive information. Privacy preserving data mining (PPDM) is a recent research area that deals with the problem of hiding sensitive information while analyzing data. Association rule hiding is one of the PPDM techniques; it hides association rules generated by association rule generation algorithms. In this paper we provide a comparative theoretical analysis of algorithms that have been developed for association rule hiding.
Wireless and Optical Communications Networks | 2013
Surendra Kumar Chadokar; Divakar Singh; Anju Singh
Association rule mining is a technique for generating frequent item sets so that analysis based on these sets can be used in different application areas, such as the analysis of network traffic. Although generating frequent sets with the Apriori algorithm takes less computational time and yields fewer frequent sets, the technique implemented here takes less computational time by comparison, and also generates fewer sets and fewer rules for network traffic. These frequent sets are used to analyze traffic in the network so that spam and other unwanted issues can be detected easily.
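For readers unfamiliar with the Apriori baseline mentioned above, the following is a minimal sketch of level-wise frequent item set generation. It is not the technique the paper implements. The traffic records, item names and the 0.5 support threshold are hypothetical.

```python
from itertools import combinations

def apriori(transactions, min_support):
    """Return {itemset: support} for every itemset meeting min_support."""
    transactions = [frozenset(t) for t in transactions]
    n = len(transactions)

    def support(itemset):
        # Fraction of transactions that contain every item of the itemset.
        return sum(1 for t in transactions if itemset <= t) / n

    current = {frozenset([i]) for t in transactions for i in t}
    frequent = {}
    k = 1
    while current:
        # Keep only the candidates of size k that are frequent.
        level = {}
        for candidate in current:
            s = support(candidate)
            if s >= min_support:
                level[candidate] = s
        frequent.update(level)
        k += 1
        # Join step: unions of two frequent (k-1)-itemsets that have size k.
        candidates = {a | b for a in level for b in level if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must be frequent.
        current = {c for c in candidates
                   if all(frozenset(s) in level
                          for s in combinations(c, k - 1))}
    return frequent

# Hypothetical network traffic records, one set of attributes per flow.
flows = [{"tcp", "port80", "small"}, {"tcp", "port80", "large"},
         {"udp", "port53", "small"}, {"tcp", "port80", "small"}]

frequent = apriori(flows, min_support=0.5)
for itemset, s in sorted(frequent.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(itemset), s)
```

Each pass rescans the whole database to count candidates, which is the cost that improved techniques typically try to reduce.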
International Journal of Computer Applications | 2014
Vikram Garg; Anju Singh; Divakar Singh
In recent years, data mining has emerged as a very popular tool for extracting hidden knowledge from large collections of data. One of the major challenges of data mining is to find the hidden knowledge in the data without revealing sensitive information. Many strategies have been proposed to hide information containing sensitive data. Privacy preserving data mining (PPDM) is an answer to this challenge. Association rule hiding is one of the PPDM techniques; it protects the sensitive association rules generated by association rule mining (ARM). In this paper, a data distortion technique is used to hide the sensitive information. The proposed approach uses the concept of Representative Rules (RR) to prune the number of association rules. The proposed algorithm hides more rules while making fewer database scans.
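The idea of hiding a rule by data distortion can be sketched as follows. The sketch lowers the confidence of one sensitive rule by deleting a consequent item from supporting transactions until the rule falls below a confidence threshold. It does not use Representative Rules and is not the paper's algorithm. The function names, the toy database and the 0.6 threshold are hypothetical.

```python
def support(itemset, transactions):
    """Fraction of transactions containing every item of the itemset."""
    itemset = set(itemset)
    return sum(1 for t in transactions if itemset <= t) / len(transactions)

def confidence(lhs, rhs, transactions):
    """confidence(lhs -> rhs) = support(lhs U rhs) / support(lhs).

    Assumes lhs occurs in at least one transaction.
    """
    return support(set(lhs) | set(rhs), transactions) / support(lhs, transactions)

def hide_rule(lhs, rhs, transactions, min_conf):
    """Return a distorted copy of the database in which lhs -> rhs is hidden.

    Assumes lhs and rhs are disjoint, so deleting a consequent item never
    changes support(lhs).
    """
    db = [set(t) for t in transactions]   # work on a copy
    lhs, rhs = set(lhs), set(rhs)
    victim = sorted(rhs)[0]               # consequent item to delete
    for t in db:
        if confidence(lhs, rhs, db) < min_conf:
            break                         # the rule is no longer reported
        if lhs | rhs <= t:
            t.discard(victim)             # distort one supporting transaction
    return db

# Hypothetical market-basket database; the sensitive rule is bread -> milk.
db = [{"bread", "milk"}, {"bread", "milk", "eggs"}, {"bread"}, {"milk", "eggs"}]
before = confidence({"bread"}, {"milk"}, db)
sanitized = hide_rule({"bread"}, {"milk"}, db, min_conf=0.6)
after = confidence({"bread"}, {"milk"}, sanitized)
print(before, after)
```

This naive version recomputes confidence with a full scan after every modification. Reducing such scans is the kind of cost the abstract refers to.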
International Journal of Computer Applications | 2013
Preeti Yadav; Divakar Singh
The paper proposes a parallel SVM for detecting intrusions in computer networks. Building a successful Intrusion Detection System (IDS) is a complex problem because of non-linearity and because quantitative or qualitative traffic streams contain irrelevant and unnecessary features. How to choose effective key features for an IDS is a very important topic in information security. Because the training data set may be very large, with a large number of parameters, it is difficult to handle with a single SVM. A parallel LMM concept is therefore proposed in this paper for distributing data files to n different sets on n different devices, which reduces the computational complexity, computational power and memory needed on each machine. The proposed method is a simple but very reliable parallel SVM that can be used for large and unbalanced data files; it also provides the flexibility to adapt to the size of the data file and to the processor and memory available on the various units. The proposed method is simulated using MATLAB, and the results show its superiority.
International Journal of Computer Applications | 2015
Jagrati Malviya; Anju Singh; Divakar Singh
There are many data mining tasks, such as association rule mining, clustering, classification, regression and others. Among these tasks, association rule mining is the most prominent. Association rule mining is one of the most popular approaches to finding frequent item sets in a given transactional dataset. Frequent pattern mining is one of the most important tasks for discovering useful, meaningful patterns in large collections of data. The FP Growth algorithm is currently one of the fastest approaches to frequent item set mining. This paper proposes an efficient, improved FP Tree algorithm that uses a projection method to reduce database scans and save execution time. The advantage of the PFP Tree is that it takes less memory and time in association mining. Experimental results show that the improved PFP Tree algorithm performs faster than the FP Growth tree algorithm and the partition projection algorithm. It is more efficient and scalable for large volumes of data. The effectiveness of the method has been demonstrated on a sample supermarket database.
International Journal of Computer Applications | 2015
Neha Patel; Divakar Singh
Data mining is one of the most important steps of the knowledge discovery in databases process and is considered a significant subfield of knowledge management. A classification of the data mining methods would greatly simplify the understanding of the whole space of available methods. Decision tree learning algorithms have been successfully used in expert systems for capturing knowledge. Most decision tree classifiers are designed to classify data with categorical or Boolean class labels. To the best of our knowledge, no previous research has considered the induction of decision trees from data with data dissimilarities. This work proposes a novel classification algorithm for learning decision tree classifiers from data using dissimilarities, with less complexity and less time needed to construct the decision tree.
2015 2nd International Conference on Electronics and Communication Systems (ICECS) | 2015
Pooja Sharma; Divakar Singh; Anju Singh
Classification is a widely used technique in the data mining domain, where scalability and efficiency are the immediate problems of classification algorithms for large databases. Nowadays a large amount of data is generated that needs to be analysed, and patterns have to be extracted from it to gain knowledge. Classification is a supervised machine learning task that builds a model from labelled training data. The model is used for determining the class. There are many types of classification algorithms, such as tree-based algorithms (C4.5 decision tree, J48 decision tree etc.), naive Bayes and many more. These classification algorithms have their own pros and cons, depending on many factors such as the characteristics of the data. Classification performance can be measured on the testing data using several metrics, such as accuracy, precision, classification error and kappa. We used a random dataset in the RapidMiner tool for the classification. Stratified sampling is used with different classifiers such as J48, C4.5 and naive Bayes. We analysed the results of the classifiers with and without the randomly generated dataset.
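The metrics named in the abstract above can be computed from true and predicted labels as in the sketch below. It uses the standard definitions of accuracy, classification error, precision for one positive class, and Cohen's kappa. The labels are hypothetical and are not the paper's data.

```python
from collections import Counter

def classification_metrics(y_true, y_pred, positive):
    """Accuracy, classification error, precision and Cohen's kappa."""
    n = len(y_true)
    accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / n

    # Precision: share of predicted positives that are truly positive.
    tp = sum(t == positive and p == positive for t, p in zip(y_true, y_pred))
    fp = sum(t != positive and p == positive for t, p in zip(y_true, y_pred))
    precision = tp / (tp + fp) if tp + fp else 0.0

    # Kappa compares observed agreement with the agreement expected by
    # chance, derived from the marginal label frequencies.
    true_counts, pred_counts = Counter(y_true), Counter(y_pred)
    expected = sum(true_counts[c] * pred_counts[c] for c in true_counts) / n ** 2
    kappa = (accuracy - expected) / (1 - expected) if expected != 1 else 0.0

    return {"accuracy": accuracy, "error": 1 - accuracy,
            "precision": precision, "kappa": kappa}

# Hypothetical test-set labels and predictions.
y_true = ["yes", "yes", "yes", "yes", "no", "no", "no", "no"]
y_pred = ["yes", "yes", "yes", "no", "no", "no", "no", "yes"]

metrics = classification_metrics(y_true, y_pred, positive="yes")
print(metrics)
```

On balanced classes like these, an accuracy of 0.75 corresponds to a kappa of 0.5, because half of the agreement would be expected by chance.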
International Journal of Computer Applications | 2014
Jyotsna Bansal; Divakar Singh; Anju Singh
For many diseases, classification is important so that the infected set can be found efficiently. In this paper, three datasets named Leukemia, Lung Cancer and Prostate from the UCI machine learning repository are considered, and an efficient association-based ant colony optimization (ACO) is applied to improve classification accuracy. In our approach the user can select the dataset. The dataset is refined according to its attributes, which yields the final dataset used in the subsequent steps. The maximum threshold is determined by finding the support value. The support values are fetched, and according to its support value each set is categorized as relevant or irrelevant. In our case the threshold is 0.5. If a set crosses the maximum threshold it qualifies for the final set; otherwise it is discarded. The ACO mechanism is then applied to the final dataset to find the classification accuracy. Our results show the effectiveness of our approach. General Terms: Leukemia, lung cancer, prostate