Dilip Kumar Sharma
GLA University
Publications
Featured research published by Dilip Kumar Sharma.
International Journal of Information Technology and Web Engineering | 2011
Dilip Kumar Sharma; A. K. Sharma
A traditional crawler picks up a URL, retrieves the corresponding page, extracts the links it contains, and adds them to the queue. A deep Web crawler, after adding links to the queue, also checks the page for forms; if forms are present, it processes them to retrieve the required information. Various techniques have been proposed for crawling deep Web information, but much of it remains undiscovered. In this paper, the authors analyze and compare important deep Web crawling techniques to identify their relative advantages and limitations. To minimize the limitations of existing deep Web crawlers, a novel architecture is proposed based on QIIIEP specifications (Sharma & Sharma, 2009). The proposed architecture is cost-effective and supports both privatized search and general search over deep Web data hidden behind HTML forms.
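The control flow described above, dequeue a URL, extract links, then check the fetched page for forms, can be sketched roughly as follows. This is a minimal illustration assuming the third-party requests and beautifulsoup4 libraries; fill_and_submit_form is a hypothetical placeholder, not the paper's form-processing component.

```python
from collections import deque
from urllib.parse import urljoin

import requests
from bs4 import BeautifulSoup

def crawl(seed_url, max_pages=50):
    """Sketch of a deep-Web-aware crawl loop: queue links, then probe forms."""
    queue, seen = deque([seed_url]), {seed_url}
    while queue and len(seen) <= max_pages:
        url = queue.popleft()
        try:
            html = requests.get(url, timeout=10).text
        except requests.RequestException:
            continue
        soup = BeautifulSoup(html, "html.parser")
        # Traditional-crawler step: enqueue outgoing links.
        for a in soup.find_all("a", href=True):
            link = urljoin(url, a["href"])
            if link not in seen:
                seen.add(link)
                queue.append(link)
        # Deep-Web step: if the page carries HTML forms, process them.
        for form in soup.find_all("form"):
            fill_and_submit_form(url, form)  # hypothetical hook

def fill_and_submit_form(page_url, form):
    """Placeholder: a real crawler would generate queries and submit the form."""
    print("form found on", page_url, "action:", form.get("action"))
```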
International Journal of Information Technology and Web Engineering | 2010
Dilip Kumar Sharma; A. K. Sharma
Web crawlers specialize in downloading, analyzing, and indexing content from the surface Web, which consists of interlinked HTML pages. They have limitations when the data lies behind a query interface, where the response depends on the querying party's context and requires a dialogue to negotiate for the information. In this article, the authors discuss deep Web searching techniques. A survey of the technical literature on deep Web searching contributes to the development of a general framework. Existing frameworks and mechanisms of present web crawlers are taxonomically classified into four steps and analyzed to find their limitations in searching the deep Web.
2009 International Conference on Intelligent Agent & Multi-Agent Systems | 2009
Dilip Kumar Sharma; A. K. Sharma
A new Query Intensive Interface Information Extraction Protocol (QIIIEP) for deep web retrieval process is proposed. Auto query word extraction and auto form unification procedure are newly proposed in order to comprehend various functions of the proposed protocol. Proposed protocol offers great advantages in deep web crawling without over burdening the requesting server. However, conventional deep web crawling procedures result in heavy communication processing loads and procedural complexity for applying either schema matching or improper otology based query. This makes it difficult to crawl entire contents of deep web. In the proposed protocol, the tradeoff between correct query response and communication loads is solved by generating knowledge base at QIIIEP server. Therefore, the proposed protocol can realize flexible and highly efficient data extraction mechanism after deploying QIIIEP server on deep web domain. It enables not only the one stop information retrieval process but also provides auto authentication mechanism for supplied domain.
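The abstract does not publish a wire format, so the sketch below only illustrates the core trade-off it describes: rather than guessing query words via schema matching on every request, the crawler consults a server-side knowledge base of query words per form field. All names, the table layout, and the example domain are hypothetical.

```python
# Hypothetical illustration of the QIIIEP idea: a server-side knowledge base
# maps (domain, form field) pairs to known-good query words, sparing the
# crawler expensive schema matching or ontology lookups per request.

KNOWLEDGE_BASE = {
    ("books.example.com", "author"): ["tolkien", "asimov"],
    ("books.example.com", "title"): ["hobbit", "foundation"],
}

def query_words_for(domain: str, field_name: str) -> list[str]:
    """What a QIIIEP server might return for one form field (assumed API)."""
    return KNOWLEDGE_BASE.get((domain, field_name), [])

def build_form_submissions(domain: str, field_names: list[str]):
    """Crawler side: pair each field with server-supplied query words."""
    for field in field_names:
        for word in query_words_for(domain, field):
            yield {field: word}

for submission in build_form_submissions("books.example.com", ["author"]):
    print(submission)  # e.g. {'author': 'tolkien'}
```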
Computer Methods and Programs in Biomedicine | 2016
Anushikha Singh; Malay Kishore Dutta; Dilip Kumar Sharma
BACKGROUND AND OBJECTIVE Identification of fundus images during transmission and storage in databases is an important issue for modern tele-ophthalmology applications. The proposed work presents a novel, accurate method for generating a unique identification code that identifies fundus images for tele-ophthalmology applications and database storage. Unlike existing methods of steganography and watermarking, this method does not tamper with the medical image: nothing is embedded, and no medical information is lost. METHODS A strategic combination of the unique blood vessel pattern and the patient ID is used to generate a unique identification code for each digital fundus image; the segmented blood vessel pattern near the optic disc is strategically combined with the patient ID. RESULTS The proposed method of medical image identification was tested on the publicly available DRIVE and MESSIDOR databases of fundus images, and the results are encouraging. CONCLUSIONS Experimental results indicate the uniqueness of the identification code and the lossless recovery of the patient identity from it for integrity verification of fundus images.
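The paper's exact combination scheme is not given in the abstract; the toy sketch below, assuming NumPy and a pre-segmented binary vessel mask, only illustrates the two stated properties: the code combines vessel bits with the patient ID, and the patient identity is losslessly recoverable from the code.

```python
import numpy as np

def to_bits(text: str) -> str:
    """Patient ID as a bit string."""
    return "".join(f"{b:08b}" for b in text.encode("ascii"))

def from_bits(bits: str) -> str:
    return bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8)).decode("ascii")

def make_code(vessel_patch: np.ndarray, patient_id: str) -> str:
    """Toy reversible combination: vessel bits followed by patient-ID bits.

    vessel_patch is assumed to be a binary (0/1) segmentation of the blood
    vessels near the optic disc; the paper's real scheme is more elaborate.
    """
    vessel_bits = "".join(map(str, vessel_patch.flatten()))
    return vessel_bits + to_bits(patient_id)

def recover_patient_id(code: str, patch_size: int) -> str:
    """Lossless recovery of the patient identity from the code."""
    return from_bits(code[patch_size:])

patch = np.random.randint(0, 2, size=(8, 8))  # stand-in vessel mask
code = make_code(patch, "PAT-0042")
assert recover_patient_id(code, patch.size) == "PAT-0042"
```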
international conference on contemporary computing | 2014
Risha Gaur; Dilip Kumar Sharma
The World Wide Web (WWW) is nowadays considered the most important source of information, but it is difficult to decide which resources are useful and which are more important. A focused crawler therefore crawls only a specific part of the Web to retrieve the relevant resources. In this paper, a focused crawler is applied to a social network with ontology-dependent tags. The ontology is also used in the preprocessing step of the focused crawler to make the search more specific by expanding the search topic semantically. The relevance of manually tagged and semi-automatically tagged resources is then compared. Finally, the harvest rate is evaluated for the focused crawler with ontology and semi-automatic tagging to check the relevance of the retrieved pages.
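A minimal sketch of the two measurable pieces named above, ontology-based topic expansion and harvest rate, under assumptions: the ontology is stood in by a hand-written related-terms table, and relevance is reduced to simple term matching rather than the paper's scoring.

```python
# Hypothetical "related term" entries standing in for a real ontology.
ONTOLOGY = {
    "machine learning": ["classification", "clustering", "neural network"],
}

def expand_topic(topic: str) -> set[str]:
    """Semantic expansion of the search topic (preprocessing step)."""
    return {topic, *ONTOLOGY.get(topic, [])}

def is_relevant(page_text: str, terms: set[str]) -> bool:
    """Crude relevance test: any expanded term appears in the page."""
    text = page_text.lower()
    return any(term in text for term in terms)

def harvest_rate(pages: list[str], topic: str) -> float:
    """Fraction of fetched pages judged relevant to the expanded topic."""
    terms = expand_topic(topic)
    relevant = sum(is_relevant(p, terms) for p in pages)
    return relevant / len(pages) if pages else 0.0

print(harvest_rate(["intro to clustering", "cooking recipes"], "machine learning"))  # 0.5
```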
international conference & workshop on emerging trends in technology | 2011
Dilip Kumar Sharma; A. K. Sharma
A web crawler is required for context-based surfing of the World Wide Web in a systematic and automatic manner. The World Wide Web consists of interlinked documents and resources that are easily crawled by a general web crawler, known as a surface web crawler. Crawling hidden web data, which lies behind HTML forms, requires a special type of crawler, known as a hidden web crawler. For efficient crawling of hidden web data, discovering relevant and proper HTML forms is a very important step. For this purpose, a technique for a domain-specific hidden web crawler is proposed in this paper. The technique is based on domain-specific crawling of the World Wide Web, in which links are followed step by step, leading to a large source of hidden web databases. Experimental results verify that the proposed approach is quite effective in crawling hidden web data contents.
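Building on the crawl loop sketched earlier, the domain-specific part can be illustrated as a link filter plus a form-discovery heuristic. This is an assumed reading of "domain specific", restricting the crawl to one network domain, and the FORM_HINTS keywords are invented for illustration, not taken from the paper.

```python
from urllib.parse import urljoin, urlparse

import requests
from bs4 import BeautifulSoup

TARGET_DOMAIN = "example.org"              # assumed crawl domain
FORM_HINTS = ("search", "query", "find")   # heuristic, not from the paper

def in_domain(url: str) -> bool:
    """Keep the crawl inside the target domain."""
    return urlparse(url).netloc.endswith(TARGET_DOMAIN)

def find_searchable_forms(url: str):
    """Fetch one page; return in-domain links and forms that look like
    query interfaces to a hidden web database."""
    soup = BeautifulSoup(requests.get(url, timeout=10).text, "html.parser")
    links = [urljoin(url, a["href"]) for a in soup.find_all("a", href=True)]
    forms = [f for f in soup.find_all("form")
             if any(h in str(f).lower() for h in FORM_HINTS)]
    return [l for l in links if in_domain(l)], forms
```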
international conference on next generation computing technologies | 2016
Mayank Agrawal; Dilip Kumar Sharma
Plagiarism is becoming a serious problem for the intellectual community, and detecting it at various levels is a major issue. The problem becomes more complex when plagiarism is sought in source code, which may be in the same language or may have been translated into another language. This type of plagiarism is found not only in academic work but also in industries dealing with software design. A major difficulty with source code plagiarism is that different programming languages have different syntax. In this paper, the authors explain various techniques and algorithms for detecting plagiarism in source code, so that organizations and academic institutions can readily detect it, and they compare the given techniques to show how they differ from one another.
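One representative technique from this family, not necessarily one the paper covers in this form, is token-based n-gram comparison: source code is tokenized, word-like tokens are normalized so that renamed identifiers still match, and similarity is measured over token n-grams. A minimal sketch:

```python
import re

def tokens(source: str) -> list[str]:
    """Crude language-agnostic tokenizer; word-like tokens (identifiers and
    keywords alike) collapse to ID so variable renaming does not hide a copy."""
    raw = re.findall(r"[A-Za-z_]\w*|\d+|[^\s\w]", source)
    return ["ID" if re.match(r"[A-Za-z_]", t) else t for t in raw]

def ngrams(seq, n=4):
    """Set of token n-grams of the sequence."""
    return {tuple(seq[i:i + n]) for i in range(len(seq) - n + 1)}

def similarity(src_a: str, src_b: str, n: int = 4) -> float:
    """Jaccard similarity over token n-grams: one of many detection techniques."""
    a, b = ngrams(tokens(src_a), n), ngrams(tokens(src_b), n)
    return len(a & b) / len(a | b) if a | b else 0.0

# Identical structure under renaming scores 1.0 despite different identifiers:
print(similarity("int total = x + y;", "int sum = a + b;"))
```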
Archive | 2016
Shivam Rathi; Shashi Shekhar; Dilip Kumar Sharma
Opinion mining is the field of study that analyses people's thoughts, sentiments, emotions, and attitudes towards entities, products, services, issues, topics, events, and their attributes. It comprises many different tasks, such as opinion extraction, sentiment mining, emotional analysis, and review mining. An important aspect of opinion mining is to gather information from reviews, blogs, etc., and then determine its orientation, i.e., whether the information carries a positive or negative context. Each positive or negative review or blog post is assigned a numerical value, calculated here using SentiWordNet 3.0. Opinion words are mainly adjectives such as "good," "better," and "awesome," but several problems arise because negation words and extensions of opinion words such as "very very good" are often not considered. In this paper, opinion mining, how the polarity value captures positive and negative sentiment, and how to deal with reviews and blogs written in Roman script are discussed.
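A minimal sketch of SentiWordNet-based polarity scoring with the two issues the abstract raises, negation and intensifier stacking, handled crudely. It assumes NLTK's SentiWordNet corpus reader; the negation and intensifier word lists and the boost weights are invented for illustration, not the paper's method.

```python
import nltk
from nltk.corpus import sentiwordnet as swn

nltk.download("wordnet", quiet=True)
nltk.download("sentiwordnet", quiet=True)

NEGATIONS = {"not", "no", "never"}             # assumed small negation list
INTENSIFIERS = {"very", "really", "extremely"}  # assumed intensifier list

def word_polarity(word: str) -> float:
    """Average (positive - negative) score over the word's adjective senses."""
    synsets = list(swn.senti_synsets(word, "a"))
    if not synsets:
        return 0.0
    return sum(s.pos_score() - s.neg_score() for s in synsets) / len(synsets)

def review_polarity(text: str) -> float:
    """Sum word polarities, flipping on negation and boosting on intensifiers."""
    score, flip, boost = 0.0, 1.0, 1.0
    for w in text.lower().split():
        if w in NEGATIONS:
            flip = -1.0
        elif w in INTENSIFIERS:
            boost += 0.5          # "very very good" boosts twice
        else:
            score += flip * boost * word_polarity(w)
            flip, boost = 1.0, 1.0
    return score

print(review_polarity("very very good"))  # larger than plain "good"
print(review_polarity("not good"))        # negative
```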
International Journal of Rough Sets and Data Analysis archive | 2016
Avinash Samuel; Dilip Kumar Sharma
Summary generation is important when a user needs the key points of a document without reading the whole document. Summarization is of basically two types: (1) single-document summarization and (2) multi-document summarization. Here, however, the microblogging environment is considered, where posts are restricted to a limited number of characters, so single-document summarizers are not applicable. A microblog post can be summarized along many features, for example the post's topic, its posting time, and the occurrence of the event. This paper proposes a method that uses the temporal features of microblog posts to build an extractive summary of an event from the posts, which further increases the quality of the summary, as it includes all of the event's key features.
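A minimal sketch of temporally weighted extractive summarization under assumptions: posts are (text, timestamp) pairs, content salience is reduced to corpus term frequency, and the temporal feature is an exponential decay around an event peak with an assumed half-life. This is not the paper's formula, only an illustration of combining content and time.

```python
from collections import Counter
from datetime import datetime, timedelta

def score_post(text, posted_at, corpus_tf, event_peak, half_life_hours=6.0):
    """Content score (sum of corpus term frequencies) decayed by the post's
    time distance from the event peak."""
    content = sum(corpus_tf[w] for w in text.lower().split())
    hours = abs((posted_at - event_peak).total_seconds()) / 3600
    return content * 0.5 ** (hours / half_life_hours)

def summarize(posts, event_peak, k=3):
    """Extractive summary: the k highest-scoring posts."""
    corpus_tf = Counter(w for text, _ in posts for w in text.lower().split())
    ranked = sorted(posts, reverse=True,
                    key=lambda p: score_post(p[0], p[1], corpus_tf, event_peak))
    return [text for text, _ in ranked[:k]]

peak = datetime(2016, 1, 1, 12, 0)
posts = [("earthquake hits the city center", peak + timedelta(minutes=10)),
         ("my lunch today was great", peak + timedelta(hours=20))]
print(summarize(posts, peak, k=1))  # the timely, on-topic post wins
```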
soft computing | 2014
Risha Gaur; Dilip Kumar Sharma
The Web has become a great source of information, but this information is not stored sequentially; rather, it is stored in hyperlinked form. Generic crawlers are unable to differentiate between relevant and merely related pages, so topic-specific crawlers, called focused crawlers, were designed to improve on the results returned by generic web crawlers. A focused crawler crawls a specific part of the Web to retrieve the relevant resources. This paper presents a comparative analysis of various focused crawlers. It distinguishes two types: learning focused crawlers and classical focused crawlers. The classical crawlers are further divided into semantic focused crawlers and social semantic focused crawlers. Semantic focused crawlers combine an ontology with focused crawling to capture the semantics of the search topic in relation to web pages, while social semantic focused crawlers work on social networks to retrieve semantically related web pages shared among people with common interests.