A. K. Sharma

YMCA University of Science and Technology

Publication

Featured research published by A. K. Sharma.

IEEE International Advance Computing Conference | 2009

Page Ranking Algorithms: A Survey

Neelam Duhan; A. K. Sharma; Komal Kumar Bhatia

Web mining is currently an active research area. Web mining is defined as the application of data mining techniques to the World Wide Web to find hidden information. This hidden information, i.e. knowledge, could be contained in the content of web pages, in the link structure of the WWW, or in web server logs. Based on the type of knowledge, web mining is usually divided into three categories: web content mining, web structure mining and web usage mining. An application of web mining can be seen in search engines. Most search engines rank their search results in response to users' queries to make navigation easier. In this paper, a survey of page ranking algorithms and a performance comparison of some important algorithms have been carried out.

International Conference on Computer and Communication Technology | 2011

Page ranking based on number of visits of links of Web page

Gyanendra Kumar; Neelam Duhan; A. K. Sharma

Search engines generally return a large number of pages in response to user queries. To help users navigate the result list, ranking methods are applied to the search results. Most of the ranking algorithms proposed in the literature are either link or content oriented and do not consider user usage trends. In this paper, a page ranking mechanism called Page Ranking based on Visits of Links (VOL) is devised for search engines. It builds on Google's basic ranking algorithm, PageRank, and takes the number of visits of inbound links of Web pages into account. This concept is useful for displaying the most valuable pages at the top of the result list on the basis of user browsing behavior, which greatly reduces the search space. The paper also presents a method to find link-visit counts of Web pages and a comparison between VOL and the PageRank algorithm.
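The idea of weighting inbound links by their visit counts can be illustrated with a minimal sketch. It assumes one plausible reading of the abstract: each page passes on its rank to the pages it links to in proportion to how often each outgoing link was visited, instead of splitting the rank evenly as in plain PageRank. The function name `vol_rank`, the `link_visits` input format, and the damping value are illustrative assumptions, not details taken from the paper.

```python
from collections import defaultdict


def vol_rank(link_visits, damping=0.85, iterations=50):
    """Visit-weighted PageRank sketch (hypothetical helper, not the paper's code).

    link_visits maps (source, target) -> number of recorded visits of that link.
    """
    pages = set()
    out_total = defaultdict(int)   # total visits over all outgoing links of a page
    inbound = defaultdict(list)    # target -> list of (source, visits)
    for (src, dst), visits in link_visits.items():
        pages.update((src, dst))
        out_total[src] += visits
        inbound[dst].append((src, visits))

    rank = {page: 1.0 for page in pages}
    for _ in range(iterations):
        new_rank = {}
        for page in pages:
            # Each inbound link contributes the source's rank, scaled by the
            # fraction of the source's link visits that went through this link.
            share = sum(
                rank[src] * visits / out_total[src]
                for src, visits in inbound[page]
                if out_total[src] > 0
            )
            new_rank[page] = (1 - damping) + damping * share
        rank = new_rank
    return rank


# Page A links to B and C; the link to B is visited far more often.
visits = {("A", "B"): 90, ("A", "C"): 10, ("B", "A"): 5, ("C", "A"): 5}
ranks = vol_rank(visits)
print(sorted(ranks, key=ranks.get, reverse=True))
```

With equal visit counts on every outgoing link, this sketch reduces to the usual even split of PageRank, which is why visit data is the only extra input it needs.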

International Journal of Computer Applications | 2010

Design of a Priority Based Frequency Regulated Incremental Crawler

Niraj Singhal; Ashutosh Dixit; A. K. Sharma

The World Wide Web is a huge source of hyperlinked information contained in hypertext documents. Search engines use web crawlers to collect these documents from the web for storage and indexing. However, many of these documents contain dynamic information that changes on a daily, weekly, monthly or yearly basis, so the search engine's storage must be refreshed to make the latest information available to the user. An incremental crawler revisits the web repeatedly after a specific interval to update its collection. In this paper, a novel mechanism for regulating the revisit frequency and a novel architecture for an incremental crawler are proposed.

International Journal of Information Technology and Web Engineering | 2011

A Novel Architecture for Deep Web Crawler

Dilip Kumar Sharma; A. K. Sharma

A traditional crawler picks up a URL, retrieves the corresponding page and extracts various links, adding them to the queue. A deep Web crawler, after adding links to the queue, checks for forms. If forms are present, it processes them and retrieves the required information. Various techniques have been proposed for crawling deep Web information, but much remains undiscovered. In this paper, the authors analyze and compare important deep Web information crawling techniques to find their relative limitations and advantages. To minimize the limitations of existing deep Web crawlers, a novel architecture is proposed based on QIIIEP specifications (Sharma & Sharma, 2009). The proposed architecture is cost effective and has features of privatized search and general search for deep Web data hidden behind HTML forms.
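The crawl loop described in the first sentences can be sketched as follows. This is a minimal illustration under stated assumptions, not the proposed architecture: pages come from a caller-supplied `fetch` function (here an in-memory dictionary, so nothing touches the network), and the deep Web step is limited to detecting that a page contains a form. Actually filling in and submitting forms is left out. The names `LinkAndFormParser`, `crawl`, and `SITE` are hypothetical.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkAndFormParser(HTMLParser):
    """Collects hyperlinks and notes whether the page contains a <form>."""

    def __init__(self):
        super().__init__()
        self.links = []
        self.has_form = False

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)
        elif tag == "form":
            self.has_form = True


def crawl(seed, fetch, max_pages=100):
    """Breadth-first crawl; returns (visited URLs, URLs of pages with forms)."""
    queue = deque([seed])
    seen = {seed}
    visited, pages_with_forms = [], []
    while queue and len(visited) < max_pages:
        url = queue.popleft()
        html = fetch(url)
        if html is None:        # page could not be retrieved
            continue
        visited.append(url)
        parser = LinkAndFormParser()
        parser.feed(html)
        # Traditional crawler step: extract links and add new ones to the queue.
        for href in parser.links:
            link = urljoin(url, href)
            if link not in seen:
                seen.add(link)
                queue.append(link)
        # Deep Web crawler step: after queuing links, check for forms.
        if parser.has_form:
            pages_with_forms.append(url)
    return visited, pages_with_forms


# Illustrative in-memory site standing in for real HTTP fetches.
SITE = {
    "http://example.test/": '<a href="/search">Search</a><a href="/about">About</a>',
    "http://example.test/search": '<form action="/results"><input name="q"></form>',
    "http://example.test/about": '<a href="/">Home</a>',
}
visited, forms = crawl("http://example.test/", SITE.get)
print(visited)
print(forms)
```

Passing `fetch` as a parameter keeps the loop independent of any HTTP library; a real crawler would swap in a network fetcher and add politeness rules such as robots.txt handling and rate limiting.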

IEEE International Advance Computing Conference | 2010

A mathematical model for crawler revisit frequency

Ashutosh Dixit; A. K. Sharma

The WWW's expansion, coupled with the high change frequency of web pages, poses a challenge for maintaining and fetching up-to-date information. Traditional crawling methods can no longer keep up with this updating and growing web. Alternative distributed crawling schemes that use migrating crawlers try to maximize network utilization by minimizing network load, but they are hampered by deficiencies in their web page refresh techniques. The absence of effective measures to verify whether a web page has changed is another challenge. In this paper, an efficient approach for computing revisit frequency is proposed. Web pages that are updated frequently are detected, and the revisit frequency for those pages is computed dynamically.
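The two ingredients named above, detecting whether a page has changed and adjusting its revisit frequency accordingly, can be illustrated with a simple heuristic. The sketch below is not the paper's mathematical model: it assumes change detection by comparing a SHA-256 digest of the page content, and it halves or doubles the revisit interval within fixed bounds. The class name `RevisitScheduler` and all default values are assumptions made for illustration.

```python
import hashlib


class RevisitScheduler:
    """Adaptive revisit-interval heuristic (illustrative, not the paper's model)."""

    def __init__(self, initial_hours=24.0, min_hours=1.0, max_hours=24.0 * 30):
        self.initial_hours = initial_hours
        self.min_hours = min_hours
        self.max_hours = max_hours
        self.state = {}  # url -> (content digest, current interval in hours)

    def record_visit(self, url, content):
        """Store the page digest and return the interval until the next visit."""
        digest = hashlib.sha256(content.encode("utf-8")).hexdigest()
        previous = self.state.get(url)
        if previous is None:
            interval = self.initial_hours
        else:
            old_digest, interval = previous
            if digest != old_digest:
                # Page changed since the last visit: come back sooner.
                interval = max(self.min_hours, interval / 2)
            else:
                # Page unchanged: back off and revisit less often.
                interval = min(self.max_hours, interval * 2)
        self.state[url] = (digest, interval)
        return interval


scheduler = RevisitScheduler()
for snapshot in ["version 1", "version 2", "version 2"]:
    print(scheduler.record_visit("http://example.test/news", snapshot))
```

A digest comparison flags any byte-level difference, including changes to advertisements or timestamps, so a practical crawler would likely normalize the content before hashing.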

International Conference on Computational Intelligence and Communication Networks | 2011

Information Retrieval from the Web and Application of Migrating Crawler

Niraj Singhal; R. P. Agarwal; Ashutosh Dixit; A. K. Sharma

A study reports that about 40% of current internet traffic and bandwidth consumption is due to web crawlers that retrieve pages for indexing by the different search engines. As the size of the web continues to grow, searching it for useful information has become increasingly difficult. Centralized crawling techniques are unable to cope with the constantly growing web. This paper argues that distributed crawling methods based on migrating crawlers are an essential tool for providing such access while minimizing network utilization and keeping up with document changes.

International Journal of Information Technology and Web Engineering | 2010

Deep Web Information Retrieval Process: A Technical Survey

Dilip Kumar Sharma; A. K. Sharma

Web crawlers specialize in downloading, analyzing and indexing content from the surface web, which consists of interlinked HTML pages. Web crawlers are limited when the data is behind a query interface. The response depends on the querying party's context in order to engage in dialogue and negotiate for the information. In this article, the authors discuss deep web searching techniques. A survey of technical literature on deep web searching contributes to the development of a general framework. Existing frameworks and mechanisms of present web crawlers are taxonomically classified into four steps and analyzed to find limitations in searching the deep web.

2009 International Conference on Intelligent Agent & Multi-Agent Systems | 2009

Query Intensive Interface Information Extraction Protocol for deep web

Dilip Kumar Sharma; A. K. Sharma

A new Query Intensive Interface Information Extraction Protocol (QIIIEP) for the deep web retrieval process is proposed. Auto query word extraction and auto form unification procedures are newly proposed in order to comprehend various functions of the proposed protocol. The proposed protocol offers great advantages in deep web crawling without overburdening the requesting server. Conventional deep web crawling procedures, however, result in heavy communication processing loads and procedural complexity from applying either schema matching or improper ontology-based queries. This makes it difficult to crawl the entire contents of the deep web. In the proposed protocol, the tradeoff between correct query response and communication load is resolved by generating a knowledge base at the QIIIEP server. The proposed protocol can therefore realize a flexible and highly efficient data extraction mechanism after a QIIIEP server is deployed on a deep web domain. It not only enables a one-stop information retrieval process but also provides an auto authentication mechanism for the supplied domain.

Grid Computing | 2010

AKSHR: A novel framework for a Domain-specific Hidden Web Crawler

Komal Kumar Bhatia; A. K. Sharma; Rosy Madaan

Existing search engines crawl and index the surface web, ignoring the hidden web, which contains more than 500 times as much information as the publicly indexable web (PIW). In this paper, a Domain-specific Hidden Web Crawler (AKSHR) is proposed. The framework extracts hidden web pages by drawing on its three unique features: 1) automatic downloading of search interfaces to crawl hidden web databases, 2) identification of semantic mappings between search interface elements using a novel approach called DSIM (Domain-specific Interface Mapper), and 3) the capability to fill in search interfaces automatically. The effectiveness of the proposed framework has been evaluated through experiments using real web sites, and encouraging preliminary results were obtained.

International Conference on Information Systems | 2013

Relevant document crawling with usage pattern and domain profile based page ranking

Ashlesha Gupta; Ashutosh Dixit; A. K. Sharma

The WWW is a distributed heterogeneous information resource. With the exponential growth of the WWW, it has become difficult to access desired information that matches user needs and interests. In spite of strong crawling, indexing and page ranking techniques, the result sets returned by search engines lack accuracy and precision. A large number of irrelevant links, topic drift, and load on servers are some of the other issues that need to be addressed in developing an efficient search engine. In this paper, a crawling technique is proposed that attempts to reduce server load by using migrants to download only the relevant pages pertaining to a specific topic. The downloaded documents are then ranked considering user preferences and past usage patterns of the web page, thereby improving the quality of the returned result sets.
