Santosh K. Vishwakarma

Publication

Featured research published by Santosh K. Vishwakarma.

international conference on communication systems and network technologies | 2014

An Efficient Approach for Inverted Index Pruning Based on Document Relevance

Santosh K. Vishwakarma; Kamaljit I. Lakhtaria; Divya Bhatnagar; Akhilesh K. Sharma

Information Retrieval deals with retrieving documents from a large collection that match the information need of a user. Efficient retrieval depends on proper storage of the inverted index, and many techniques have been proposed for reducing its size. Static index pruning is one such technique. This paper investigates a static index pruning approach that prunes entire documents from the index based on their importance and their relevance to the top-k results. Documents are eliminated on the basis of their individual scores. Experiments were conducted on the FIRE text collection. The results show that, for specific collections, the proposed model gives better precision values when retrieving the top 30 or more documents.
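
The document-level pruning idea in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how the document score is computed, so `doc_scores` below holds made-up values, and `build_index`, `prune_documents`, and `keep_fraction` are hypothetical names chosen for illustration.

```python
from collections import defaultdict


def build_index(docs):
    """Map each term to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in text.lower().split():
            index[term].add(doc_id)
    return dict(index)


def prune_documents(docs, doc_scores, keep_fraction):
    """Keep the highest-scoring fraction of documents and index only those.

    Whole documents are dropped, so every posting of a pruned document
    disappears from the resulting inverted index.
    """
    keep_n = max(1, round(len(docs) * keep_fraction))
    ranked = sorted(docs, key=lambda d: doc_scores[d], reverse=True)
    kept = set(ranked[:keep_n])
    return build_index({d: docs[d] for d in kept})


# Toy collection; the scores are invented purely for illustration.
docs = {
    "d1": "index pruning reduces index size",
    "d2": "weather report for monday",
    "d3": "static pruning of inverted index",
}
doc_scores = {"d1": 0.9, "d2": 0.2, "d3": 0.6}

pruned = prune_documents(docs, doc_scores, keep_fraction=2 / 3)
print(sorted(pruned["index"]))
```

Here the lowest-scoring document (`d2`) is removed before indexing, so none of its terms appear in the pruned index.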

international conference on contemporary computing | 2014

An Analytical approach based on self organized maps (SOM) in Indian classical music raga clustering

Akhilesh K. Sharma; Kamaljit I. Lakhtaria; Avinash Panwar; Santosh K. Vishwakarma

This paper focuses on the use of self-organized maps (SOM) for raga recognition and on the digital signal processing strategy behind raga recognition and clustering. It describes the feature extraction mechanisms that produce the SOM input, and we devise an algorithm that clusters ragas based on their pitch class profiles (PCP) and detected onsets. The strategy is promising: it yields better clusters of raga patterns, and raga segments are clearly distinguishable from those of other ragas when compared on their key attributes. Indian classical music has a long history, and identifying music from raga onsets helps music professionals and home users identify and detect raga types without an expert nearby. The approach is therefore promising for novice users and practitioners as well as home users, and early learners should also find it interesting and supportive with frequent use. Finally, we outline possibilities for future enhancement.

international conference on computational intelligence and communication networks | 2015

Analysis of TF-IDF Model and its Variant for Document Retrieval

Apra Mishra; Santosh K. Vishwakarma

An Information Retrieval System is a system capable of storing, retrieving, and maintaining information. In this context, information can be composed of text (including numeric and date data), images, audio, video, and other multimedia objects. The TF-IDF weight is a statistical measure used to evaluate how important a word is to a document in a collection or corpus. There exist various models for weighting terms of corpus documents and query terms. This work analyzes and evaluates the retrieval effectiveness of the vector space model on the new FIRE 2011 data set. The experiments were performed with TF-IDF and its variants. The open-source search engine Terrier 3.5 was used for all experiments and evaluation. Our results show that the TF-IDF model gives the highest precision values with the new corpus dataset.
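
As a rough illustration of the TF-IDF weight mentioned in this abstract, the sketch below uses the textbook form, raw term frequency multiplied by log(N / df). This is an assumption made for illustration and is not necessarily the variant evaluated in the paper or implemented in Terrier; the function name `tf_idf` and the toy documents are hypothetical.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return {doc_id: {term: weight}} with weight = tf * log(N / df)."""
    n_docs = len(docs)
    tokenized = {d: text.lower().split() for d, text in docs.items()}

    # Document frequency: number of documents in which each term occurs.
    df = Counter()
    for tokens in tokenized.values():
        df.update(set(tokens))

    weights = {}
    for d, tokens in tokenized.items():
        tf = Counter(tokens)  # raw term frequency within this document
        weights[d] = {t: tf[t] * math.log(n_docs / df[t]) for t in tf}
    return weights


toy_docs = {"a": "apple banana apple", "b": "banana cherry"}
toy_weights = tf_idf(toy_docs)
print(toy_weights["a"])
```

In this toy collection "banana" occurs in every document, so its weight is zero, while "apple" is both frequent in document `a` and rare in the collection, so it receives the largest weight.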

international conference on computational intelligence and communication networks | 2015

Analysis of Soil Behaviour and Prediction of Crop Yield Using Data Mining Approach

Monali Paul; Santosh K. Vishwakarma; Ashok Verma

Yield prediction is very popular among farmers these days because it helps them select the right crops for sowing, which makes predicting crop yield an interesting challenge. Earlier, yield prediction relied on a farmer's experience with a particular field and crop. This work presents a system that uses data mining techniques to predict the category of the analyzed soil datasets. The predicted category indicates the crop yield. The problem of predicting crop yield is formalized as a classification problem, using the Naive Bayes and K-Nearest Neighbor methods.
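
To make the classification framing concrete, here is a minimal K-Nearest Neighbor sketch in plain Python. The abstract does not list the soil attributes or the yield categories, so the two numeric features and the "high"/"low" labels below are invented toy values, and `knn_predict` is a hypothetical helper rather than the system described in the paper.

```python
import math
from collections import Counter


def knn_predict(train, query, k=3):
    """Predict a label by majority vote among the k nearest training rows."""
    nearest = sorted(train, key=lambda row: math.dist(row[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Invented toy data: (feature_1, feature_2) -> yield category.
train = [
    ((6.5, 0.8), "high"),
    ((6.8, 0.9), "high"),
    ((7.0, 0.7), "high"),
    ((5.0, 0.2), "low"),
    ((5.2, 0.3), "low"),
    ((4.8, 0.1), "low"),
]

print(knn_predict(train, (6.6, 0.85)))
print(knn_predict(train, (5.1, 0.2)))
```

In practice the features would usually be scaled to a common range first, since Euclidean distance is otherwise dominated by the attribute with the largest numeric spread.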

international conference on computational intelligence and communication networks | 2015

Sentiment Analysis of English Tweets Using Rapid Miner

Pragya Tripathi; Santosh K. Vishwakarma; Ajay Lala

Social networking sites are now a major channel of communication for internet users, which makes them an important source for understanding people's emotions. In this paper, we use data mining classification techniques to perform sentiment analysis on the views people share on Twitter. We collect a dataset of tweets written in natural language and apply text mining techniques (tokenization, stemming, etc.) to convert them into a useful form, then use it to build a sentiment classifier that predicts happy, sad, or neutral sentiment for a given tweet. The Rapid Miner tool is used to build the classifier and to apply it to the test dataset. We use two different classifiers and compare their results to find which one performs better.

International Journal of Computer Applications | 2014

Ad-hoc Retrieval on FIRE Data Set with TF-IDF and Probabilistic Models

Chandra Shekhar Jangid; Santosh K. Vishwakarma; Kamaljit I. Lakhtaria

Information Retrieval is the task of finding documents of an unstructured nature that satisfy a user’s information needs. There exist various models for weighting terms of corpus documents and query terms. This work analyzes and evaluates the retrieval effectiveness of various IR models on the new FIRE 2011 data set. The experiments were performed with tf-idf and its variants along with probabilistic models. The open-source search engine Terrier 3.5 was used for all experiments and evaluation. Our results show that the tf-idf model gives the highest precision values with the news corpus dataset. General Terms: Information Retrieval, IR Models, Weighting Schemes

Archive | 2019

Mining CMS Log Data for Students’ Feedback Analysis

Ashok Verma; Sumangla Rathore; Santosh K. Vishwakarma; Shubham Goswami

In the current educational system, data storage and retrieval have become important issues. Many universities hold large databases that require proper mining to generate patterns and knowledge. Several learning platforms, such as Moodle, have been implemented to meet the needs of educators, administrators, and learners. These platforms are great assets for educators, but the large volume of data they collect still needs to be mined to uncover interesting patterns and facts that support decision-making for the benefit of students. This research paper examines various text classification algorithms for analyzing students’ problems. The useful patterns extracted from the database can help the concerned authorities and institute management make better-informed decisions when resolving those problems. The results of our experiments are useful for classifying students’ problems and for detecting other interesting patterns in the Moodle CMS data.

International Journal of Engineering Research and | 2017

A Review on: Designing a Recommender System Using Sequential Logs

Anupama Patel; Santosh K. Vishwakarma

Recommender systems have changed the way people search for information on the internet. The internet is overloaded with information, and searching for something specific can be very time consuming. In today's fast-growing world, no one wants to wait; everyone wants to work fast. This has increased competition, so it is necessary to save time when searching for information. Recommender systems address this problem by providing recommendations to users. This paper discusses the types of recommender system (collaborative filtering, content-based filtering, and hybrid filtering) along with the challenges they face. It also discusses the sequential information about user behavior that is stored on web servers. Only the idea is discussed in this paper. Keywords: Recommender system, collaborative filtering, content based filtering, hybrid filtering, sequential information, web

international conference on computational intelligence and communication networks | 2015

Efficient & Accurate Scheduling Algorithm for Cloudera Hadoop

Swati Yadav; Santosh K. Vishwakarma; Ashok Verma

The term big data was coined to capture the meaning of this emerging trend. In addition to its sheer volume, big data also exhibits other distinctive characteristics compared with traditional data. For instance, big data is typically unstructured and requires more analysis. This development calls for new system architectures for data acquisition, transmission, storage, and large-scale processing mechanisms. Recent technological advancements have led to a deluge of data from distinct domains (e.g., health care and scientific sensors, user-generated data, Internet and financial companies, and supply chain systems). The accumulation of data over the past twenty years has grown to large volumes. Apache Hadoop has introduced an economical and feasible tool for distributed computing on such big data, for filtering and extracting massive volumes of information. MapReduce is a widely used parallel computing framework for large-scale processing. The two major performance metrics in MapReduce are job execution time and cluster throughput. MapReduce uses FIFO job scheduling by default, and different scheduling algorithms are being introduced in the proprietary domain. This work introduces a metric-based scheduling algorithm to enhance the efficiency and utilization of server resources.

international conference on information and communication technology | 2014

On Using Chi Square Based Term Scoring for Static Index Pruning

Santosh K. Vishwakarma; Divya Bhatnagar; Kamaljit I. Lakhtaria; Akhilesh K. Sharma

In this study, a novel technique for static index pruning based on document relevance with chi-square scoring is presented. The terms present in a document are scored using the chi-square statistical method, which takes into account each term's occurrences and its expected frequency in the document. The expected frequency is estimated from the term's entropy and the dispersion value of the document. The document score is computed by averaging the chi-square based scores of its terms. The performance of the proposed algorithm is tested on the FIRE 2010 dataset. Experimental results show that the proposed approach increases precision at pruning levels of 60 and above for top-30 document retrieval.
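
The scoring scheme in this abstract can be sketched as follows, under a loudly stated assumption: the paper estimates the expected frequency from term entropy and document dispersion, but the abstract gives no formula, so this sketch swaps in a simpler stand-in (the term's collection-wide rate multiplied by the document length). The function name `chi_square_doc_scores` and the toy documents are hypothetical.

```python
from collections import Counter


def chi_square_doc_scores(docs):
    """Score each document as the mean chi-square score of its terms.

    Per-term score: (observed - expected)^2 / expected.
    ASSUMPTION: expected = collection rate of the term * document length,
    a stand-in for the entropy/dispersion estimate used in the paper.
    """
    tokenized = {d: text.lower().split() for d, text in docs.items()}
    collection = Counter(t for tokens in tokenized.values() for t in tokens)
    total = sum(collection.values())

    scores = {}
    for d, tokens in tokenized.items():
        observed = Counter(tokens)
        term_scores = []
        for term, o in observed.items():
            e = collection[term] / total * len(tokens)  # stand-in expectation
            term_scores.append((o - e) ** 2 / e)
        # Average the per-term scores; an empty document scores zero.
        scores[d] = sum(term_scores) / len(term_scores) if term_scores else 0.0
    return scores


chi_docs = {"a": "x x y", "b": "y z z"}
chi_scores = chi_square_doc_scores(chi_docs)
print(chi_scores)
```

Documents whose terms occur about as often as the collection-wide rate predicts score close to zero, and under a document-level pruning scheme such low-scoring documents would be the first candidates for removal.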
