Nagamma Patil
National Institute of Technology, Karnataka
Publications
Featured research published by Nagamma Patil.
International Conference on Industrial and Information Systems | 2014
R C Anirudha; Remya Kannan; Nagamma Patil
Data mining concepts have been extensively used for disease prediction in the medical field. Many Hybrid Prediction Models (HPM) have been proposed and implemented in this area; however, there is always a need for greater accuracy and efficiency. Existing methods take all features into account when building the classifier model, which reduces accuracy and increases overall processing time. This paper proposes a Genetic Algorithm based Wrapper feature selection Hybrid Prediction Model (GWHPM). The model first uses the k-means clustering technique to remove outliers from the dataset. An optimal set of features is then obtained using Genetic Algorithm based wrapper feature selection. Finally, this feature subset is used to build classifier models such as Decision Tree, Naive Bayes, k-Nearest Neighbor, and Support Vector Machine. A comparative study of GWHPM is carried out, and the proposed model is observed to perform better than existing methods.
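A minimal sketch of the pipeline described above, under several assumptions: the k-means step keeps the points closest to their cluster centres, and the GA wrapper uses truncation selection, one-point crossover, and bit-flip mutation with a decision-tree fitness function. Population size, thresholds, and parameter values are illustrative, not the paper's settings.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

def remove_outliers(X, y, n_clusters=3, keep_fraction=0.95):
    """Drop the points farthest from their k-means cluster centre (assumed criterion)."""
    km = KMeans(n_clusters=n_clusters, n_init=10, random_state=0).fit(X)
    dist = np.linalg.norm(X - km.cluster_centers_[km.labels_], axis=1)
    keep = dist <= np.quantile(dist, keep_fraction)
    return X[keep], y[keep]

def fitness(mask, X, y):
    """Wrapper fitness: cross-validated accuracy of a classifier on the selected features."""
    if not mask.any():
        return 0.0
    return cross_val_score(DecisionTreeClassifier(random_state=0), X[:, mask], y, cv=5).mean()

def ga_feature_selection(X, y, pop_size=20, generations=30, mut_rate=0.1, seed=0):
    rng = np.random.default_rng(seed)
    n = X.shape[1]
    pop = rng.random((pop_size, n)) < 0.5                          # random boolean feature masks
    for _ in range(generations):
        scores = np.array([fitness(ind, X, y) for ind in pop])
        parents = pop[np.argsort(scores)[::-1][: pop_size // 2]]   # truncation selection
        children = []
        while len(children) < pop_size - len(parents):
            a, b = parents[rng.integers(len(parents), size=2)]
            cut = rng.integers(1, n)
            child = np.concatenate([a[:cut], b[cut:]])             # one-point crossover
            child ^= rng.random(n) < mut_rate                      # bit-flip mutation
            children.append(child)
        pop = np.vstack([parents] + children)
    scores = np.array([fitness(ind, X, y) for ind in pop])
    return pop[scores.argmax()]

# Hypothetical usage with any numeric dataset:
# X_clean, y_clean = remove_outliers(X, y)
# best_mask = ga_feature_selection(X_clean, y_clean)
# final_model = DecisionTreeClassifier().fit(X_clean[:, best_mask], y_clean)
```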
International Conference on Advances in Computing and Communication Engineering | 2015
Naganna Chetty; Kunwar Singh Vaisla; Nagamma Patil
Data mining is the process of extracting useful information from large amounts of data and has great scope in the field of medicine. This article works with the PIMA and Liver Disorders datasets. Many researchers have proposed the use of the k-nearest neighbor (KNN) algorithm for diabetes prediction, and some have proposed a different approach that uses K-means clustering for preprocessing followed by KNN for classification. These approaches have yielded limited classification accuracy. In our work we propose and develop two methods to improve classification accuracy: the first applies the fuzzy c-means clustering algorithm followed by a KNN classifier, and the second applies fuzzy c-means clustering followed by a fuzzy KNN classifier. We obtain better results than the existing methods on the given datasets, with the second approach producing better results than the first. Classification is evaluated using ten-fold cross-validation.
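The sketch below implements plain fuzzy c-means in NumPy and uses it as a preprocessing step before KNN. The filtering rule shown, dropping training points whose hard cluster assignment disagrees with the cluster's majority class, is one plausible reading of the preprocessing; the paper's exact rule and parameters may differ.

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

def fuzzy_c_means(X, c, m=2.0, max_iter=100, tol=1e-5, seed=0):
    """Plain NumPy fuzzy c-means; returns cluster centres and the membership matrix."""
    rng = np.random.default_rng(seed)
    U = rng.random((len(X), c))
    U /= U.sum(axis=1, keepdims=True)
    for _ in range(max_iter):
        Um = U ** m
        centres = (Um.T @ X) / Um.sum(axis=0)[:, None]
        dist = np.linalg.norm(X[:, None, :] - centres[None, :, :], axis=2) + 1e-12
        new_U = 1.0 / ((dist[:, :, None] / dist[:, None, :]) ** (2.0 / (m - 1))).sum(axis=2)
        if np.abs(new_U - U).max() < tol:
            return centres, new_U
        U = new_U
    return centres, U

def fcm_then_knn(X_train, y_train, X_test, c=2, k=5):
    """y_train is assumed to be integer class labels starting from 0."""
    _, U = fuzzy_c_means(X_train, c)
    hard = U.argmax(axis=1)
    keep = np.zeros(len(X_train), dtype=bool)
    for j in range(c):
        members = hard == j
        if members.any():
            majority = np.bincount(y_train[members]).argmax()
            keep |= members & (y_train == majority)   # drop "misclustered" training samples
    knn = KNeighborsClassifier(n_neighbors=k).fit(X_train[keep], y_train[keep])
    return knn.predict(X_test)
```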
IEEE International Advance Computing Conference | 2015
Utkarsh Gupta; Nagamma Patil
Recommender systems are becoming an integral part of today's e-commerce applications; because they have a direct impact on the sales of many products, they play an important role in e-commerce. Collaborative filtering is one of the oldest techniques used in recommender systems, and much work has gone into improving its two components, user-based and item-based filtering. The basic requirements of today's recommender systems are accuracy and speed. In this work, an efficient recommender system technique based on hierarchical clustering is proposed. User- or item-specific information is grouped into a set of clusters using the Chameleon hierarchical clustering algorithm, and a voting scheme is then used to predict the rating of a particular item. To evaluate the performance of the Chameleon-based recommender system, it is compared with an existing technique based on the K-means clustering algorithm. The results demonstrate that the Chameleon-based recommender system produces lower error than the K-means-based one.
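Chameleon is not available in common Python libraries, so the sketch below substitutes agglomerative clustering purely to illustrate the cluster-then-vote prediction step: a user's rating for an item is taken as the majority vote among the other members of that user's cluster. The rating-matrix layout and fallback rule are assumptions for illustration.

```python
import numpy as np
from sklearn.cluster import AgglomerativeClustering

def predict_rating(ratings, user, item, labels):
    """ratings: (n_users, n_items) array where 0 means 'not rated'."""
    peers = np.where(labels == labels[user])[0]
    votes = ratings[peers, item]
    votes = votes[votes > 0]                          # only peers who rated the item
    if votes.size == 0:
        rated = ratings[user][ratings[user] > 0]
        return rated.mean() if rated.size else 0.0    # fall back to the user's mean rating
    vals, counts = np.unique(votes, return_counts=True)
    return vals[counts.argmax()]                      # majority vote among peers

# Hypothetical usage (the clustering call is only a stand-in for Chameleon):
# labels = AgglomerativeClustering(n_clusters=20).fit_predict(ratings)
# print(predict_rating(ratings, user=7, item=42, labels=labels))
```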
International Conference on Advances in Computing and Communication Engineering | 2015
Bhuvan M. Shashidhara; Siddharth Jain; Vinay D Rao; Nagamma Patil; G.S. Raghavendra
Big data is an emerging field in which datasets of various sizes are analyzed for potential applications. In parallel, many frameworks have been introduced into which these datasets can be fed for machine learning. Although some experiments have compared different machine learning algorithms on different data, these comparisons have not been carried out across different platforms. Our research compares two selected machine learning algorithms on datasets of different sizes deployed on platforms such as Weka, Scikit-Learn, and Apache Spark, evaluating them on training time, accuracy, and root mean squared error. This comparison helps decide which platform is best suited for applying computationally expensive machine learning algorithms to data of a particular size. Experiments suggest that Scikit-Learn is optimal for data that fits in memory, while for very large data Apache Spark is optimal, as it performs parallel computations by distributing the data over a cluster. Hence this study concludes that the Spark platform, with its growing support for parallel implementations of machine learning algorithms, could be optimal for analyzing big data.
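A minimal harness for the kind of per-platform measurement described: time the training call and report accuracy and RMSE. The choice of RandomForest is a placeholder; the Weka and Spark legs of such a comparison would wrap their own training calls in the same timer.

```python
import time
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, mean_squared_error
from sklearn.model_selection import train_test_split

def benchmark(X, y):
    """Return training time, accuracy, and RMSE for one algorithm on one dataset."""
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)
    clf = RandomForestClassifier(n_estimators=100, random_state=0)
    start = time.perf_counter()
    clf.fit(X_tr, y_tr)
    train_time = time.perf_counter() - start
    pred = clf.predict(X_te)
    return {
        "train_time_s": train_time,
        "accuracy": accuracy_score(y_te, pred),
        "rmse": float(np.sqrt(mean_squared_error(y_te, pred))),
    }
```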
Archive | 2018
Rohit John Joseph; Prateek Narendra; Jashan Shetty; Nagamma Patil
Social media has grown rapidly in the past several years. Twitter in particular has seen a significant rise in its user audience because of its short, compact Tweet format (140 characters). As more users come on board, it provides a large market for companies to advertise and find prospective customers by classifying users into different market categories. Traditional classification methods use TF–IDF and bag-of-words features, which inevitably have high dimensionality. In this paper we propose a method that improves classification by using semantic information to reduce the dimensionality of the feature vectors, and we validate it by feeding the resulting features into multiple learning algorithms and evaluating the results.
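The sketch below contrasts a high-dimensional TF–IDF bag-of-words baseline with a reduced-dimensional variant. Latent semantic analysis (TruncatedSVD) stands in for the paper's semantic reduction, which is not specified here; the classifier and component count are illustrative.

```python
from sklearn.decomposition import TruncatedSVD
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline

def baseline_accuracy(tweets, labels):
    """High-dimensional TF-IDF bag-of-words baseline."""
    model = make_pipeline(TfidfVectorizer(), LogisticRegression(max_iter=1000))
    return cross_val_score(model, tweets, labels, cv=5).mean()

def reduced_accuracy(tweets, labels, n_components=100):
    """Lower-dimensional variant; LSA here stands in for the semantic reduction."""
    model = make_pipeline(TfidfVectorizer(),
                          TruncatedSVD(n_components=n_components, random_state=0),
                          LogisticRegression(max_iter=1000))
    return cross_val_score(model, tweets, labels, cv=5).mean()

# tweets: list of tweet strings; labels: array of market-category ids (hypothetical inputs)
# print(baseline_accuracy(tweets, labels), reduced_accuracy(tweets, labels))
```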
Archive | 2018
Gaurav Gahlot; Nagamma Patil
The retail market has grown at an enormous rate, extending its reach across nations. B2C companies put forward lucrative offers and schemes to attract customers in the hope of raising business profits, but often without a principled basis for choosing them. Knowledge discovery through data mining can be harnessed to realize these profit benefits. This article proposes a novel way of determining which items to put on sale by forming logical clubs of items, thereby extending the Apriori algorithm. The work proposes the High-Utility Mining for Itemsets of Size two (HUM-IS2) algorithm, which operates on the transactional logs of superstores. Pruning strategies are introduced to avoid forming unnecessary clubs. The effectiveness of the algorithm is demonstrated through experiments on various datasets.
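A sketch of mining high-utility itemsets of size two from transaction logs, in the spirit of HUM-IS2. Each transaction maps an item to its purchased quantity and a profit table maps an item to its unit profit; the pruning step drops items whose transaction-weighted utility (TWU) falls below the threshold, a standard upper-bound prune. The paper's exact club-formation and pruning rules may differ.

```python
from collections import defaultdict
from itertools import combinations

def high_utility_pairs(transactions, profit, min_utility):
    """transactions: list of dicts item -> quantity; profit: dict item -> unit profit."""
    # Transaction-weighted utility (TWU) of every item, used as an upper-bound prune.
    twu = defaultdict(float)
    for t in transactions:
        tu = sum(q * profit[i] for i, q in t.items())
        for i in t:
            twu[i] += tu
    promising = {i for i, u in twu.items() if u >= min_utility}

    # Exact utility of every remaining pair (a "club" of size two).
    pair_utility = defaultdict(float)
    for t in transactions:
        items = sorted(i for i in t if i in promising)
        for a, b in combinations(items, 2):
            pair_utility[(a, b)] += t[a] * profit[a] + t[b] * profit[b]
    return {p: u for p, u in pair_utility.items() if u >= min_utility}

# Hypothetical usage:
# logs = [{"milk": 2, "bread": 1}, {"milk": 1, "butter": 3, "bread": 2}]
# profit = {"milk": 5.0, "bread": 2.0, "butter": 8.0}
# print(high_utility_pairs(logs, profit, min_utility=20.0))
```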
Archive | 2018
Sudeep Sureshan; Anusha Penumacha; Siddharth Jain; Manjunath K Vanahalli; Nagamma Patil
Mining colossal patterns is a budding field with many applications, especially in bioinformatics and genetics. Gene sequences contain inherent information, and mining colossal patterns in such sequences can aid their study and improve prediction accuracy. The increase in average transaction length reduces the efficiency and effectiveness of existing closed frequent pattern mining algorithms: traditional algorithms spend most of their running time mining huge numbers of small and mid-sized patterns that do not contain valuable information. Recent research has therefore focused on mining large-cardinality patterns, called colossal patterns, which do possess valuable information. A novel parallel algorithm is proposed to extract closed colossal frequent patterns from high-dimensional datasets. The algorithm is implemented on the Hadoop framework to exploit its inherent distributed parallelism using the MapReduce programming model. Experimental results highlight that the proposed parallel algorithm on the Hadoop framework performs efficiently in terms of execution time compared to existing algorithms.
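The per-task core of row-enumeration mining, shown locally as a sketch: the closed pattern for a set of rows is the intersection of those rows' items, and its support is the number of rows containing all of its items. The paper distributes this enumeration over Hadoop/MapReduce; this sketch only illustrates the closure and support computation each task would perform, not the distributed algorithm itself.

```python
from itertools import combinations

def closed_pattern(dataset, rowset):
    """dataset: list of frozensets of items (one per row/sample)."""
    closure = frozenset.intersection(*(dataset[r] for r in rowset))
    support = sum(1 for row in dataset if closure <= row)
    return closure, support

def colossal_candidates(dataset, min_support, min_cardinality, rowset_size=2):
    """Enumerate small rowsets bottom-up and keep large closed patterns."""
    found = {}
    for rowset in combinations(range(len(dataset)), rowset_size):
        closure, support = closed_pattern(dataset, rowset)
        if support >= min_support and len(closure) >= min_cardinality:
            found[closure] = support
    return found

# Hypothetical usage on a tiny dataset:
# rows = [frozenset("abcdef"), frozenset("abcdeg"), frozenset("abxyz")]
# print(colossal_candidates(rows, min_support=2, min_cardinality=4))
```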
Archive | 2018
K. S. Sandeep; Nagamma Patil
Blogs are textual web documents published by bloggers to share their experiences or opinions about particular topics, and they are frequently retrieved by readers in need of such information. Existing techniques for text mining and web document mining can be applied to blogs to ease blog retrieval, but they consider only the content of the blogs or the tags associated with them when mining topics. This paper proposes a Multidimensional Approach to Blog Mining, which defines a method for combining Blog Content and Blog Tags to obtain Blog Patterns. These Blog Patterns represent a blog better than Blog Content Patterns or Blog Tag Patterns alone, and they can either be used for Blog Clustering or used by Blog Retrieval Engines to compare against user queries. The proposed approach has been implemented and evaluated on real-world blog data.
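A sketch of the combination idea: vectorize blog content and blog tags separately, stack the two feature spaces, and cluster the result. The TF–IDF representation, the tag weighting, and the use of K-means for clustering are illustrative assumptions, not the paper's method.

```python
from scipy.sparse import hstack
from sklearn.cluster import KMeans
from sklearn.feature_extraction.text import TfidfVectorizer

def blog_patterns(contents, tags, n_clusters=10, tag_weight=2.0):
    """contents: list of blog texts; tags: list of space-joined tag strings per blog."""
    content_features = TfidfVectorizer(stop_words="english").fit_transform(contents)
    tag_features = TfidfVectorizer().fit_transform(tags)
    combined = hstack([content_features, tag_weight * tag_features]).tocsr()  # content + tag dims
    labels = KMeans(n_clusters=n_clusters, n_init=10, random_state=0).fit_predict(combined)
    return combined, labels
```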
International Conference on Computer and Communication Technology | 2017
Akanksha Kumari; Ashish Kumar Singh; Nagamma Patil
Multimedia web services today contain a huge volume of geo-tagged photos, through which the users who upload them share their travel experiences. Geo-tagged photos embed crucial information such as location, time, tags, and weather. Existing travel recommendation methods do not take user preferences and weather into account at the same time. In this paper, a travel recommendation system is proposed for tourists in Mumbai based on their preferences, weather, and live events. Preferences are obtained from users' prior travel history, and recommendations are suggested accordingly. The dataset is collected via the Flickr API, and the technique is evaluated for Mumbai, an Indian metropolitan city. The effectiveness of the proposed method is evident from the experimental results, which show an average improvement of 15% in accuracy over existing methods.
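One plausible shape for the scoring step described above, shown as a sketch: rank candidate places by how well their dominant tags match the user's travel-history profile and whether they suit the current weather. The scoring formula, weights, and weather flag are assumptions for illustration, not the paper's formulation.

```python
def score_place(place_tags, user_profile, weather_ok, alpha=0.8):
    """place_tags / user_profile: dicts of tag -> weight; weather_ok is 0 or 1."""
    overlap = sum(min(w, user_profile.get(t, 0.0)) for t, w in place_tags.items())
    return alpha * overlap + (1 - alpha) * weather_ok

def recommend(candidates, user_profile, weather_ok, top_k=5):
    """candidates: dict place -> tag dict; weather_ok: dict place -> 0/1 for current weather."""
    ranked = sorted(candidates,
                    key=lambda p: score_place(candidates[p], user_profile, weather_ok[p]),
                    reverse=True)
    return ranked[:top_k]

# Hypothetical usage:
# user_profile = {"beach": 0.6, "heritage": 0.3, "food": 0.1}   # from prior travel history
# candidates = {"Juhu Beach": {"beach": 0.9}, "Gateway of India": {"heritage": 0.8}}
# print(recommend(candidates, user_profile, weather_ok={"Juhu Beach": 0, "Gateway of India": 1}))
```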
IEEE Uttar Pradesh Section International Conference on Electrical, Computer and Electronics Engineering | 2016
Manjunath K Vanahalli; Nagamma Patil
Bioinformatics has contributed a distinct form of dataset: the high-dimensional dataset, characterized by a large number of features and a small number of samples. Traditional algorithms spend most of their running time mining large numbers of small and mid-sized itemsets that do not contain valuable, significant information. Recent research has focused on mining large-cardinality itemsets, called colossal itemsets, which are significant for many applications, especially in bioinformatics. Existing frequent colossal itemset mining algorithms fail to discover the complete set of significant frequent colossal itemsets, and the colossal itemsets they mine carry erroneous support information, which affects association analysis. Mining significant frequent colossal itemsets with accurate support information helps attain highly accurate association analysis. The proposed work presents a novel pre-processing technique and a bottom-up row-enumeration algorithm to mine significant frequent colossal itemsets with accurate support information. The pre-processing technique efficiently uses a minimum support threshold and a minimum cardinality threshold to prune irrelevant samples and features. Experimental results demonstrate that the proposed algorithm achieves higher accuracy than existing algorithms, and a performance study indicates the efficiency of the pre-processing technique.
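A sketch of the kind of pruning such a pre-processing step could perform: items appearing in fewer rows than the minimum support cannot belong to any frequent itemset, and rows left with fewer items than the minimum cardinality cannot support a colossal itemset, so both are removed until a fixed point is reached. This is one plausible reading of how the two thresholds are used, not the paper's exact procedure.

```python
from collections import Counter

def prune_dataset(rows, min_support, min_cardinality):
    """rows: list of sets of items (one set per sample)."""
    rows = [set(r) for r in rows]
    changed = True
    while changed:
        counts = Counter(item for row in rows for item in row)
        frequent = {i for i, c in counts.items() if c >= min_support}
        new_rows = [row & frequent for row in rows]                      # drop infrequent items
        new_rows = [row for row in new_rows if len(row) >= min_cardinality]  # drop short rows
        changed = new_rows != rows          # repeat until neither rule removes anything
        rows = new_rows
    return rows

# Hypothetical usage:
# samples = [{"g1", "g2", "g3", "g4"}, {"g1", "g2", "g3"}, {"g1", "g9"}]
# print(prune_dataset(samples, min_support=2, min_cardinality=3))
```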