Advances in Data Mining and Database Management | 2021

Information Retrieval in the Hidden Web

 
 
 

Abstract


Information retrieval is finding material of unstructured nature within large collections stored on computers. Surface web consists of indexed content accessible by traditional browsers whereas deep or hidden web content cannot be found with traditional search engines and requires a password or network permissions. In deep web, dark web is also growing as new tools make it easier to navigate hidden content and accessible with special software like Tor. According to a study by Nature, Google indexes no more than 16% of the surface web and misses all of the deep web. Any given search turns up just 0.03% of information that exists online. So, the key part of the hidden web remains inaccessible to the users. This chapter deals with positing some questions about this research. Detailed definitions, analogies are explained, and the chapter discusses related work and puts forward all the advantages and limitations of the existing work proposed by researchers. The chapter identifies the need for a system that will process the surface and hidden web data and return integrated results to the users.

Volume None
Pages None
DOI 10.4018/978-1-7998-8061-5.ch003
Language English
Journal Advances in Data Mining and Database Management

Full Text