Rifat Ozcan | Researchain

Publication

Featured research published by Rifat Ozcan.

ACM Transactions on The Web | 2011

Cost-Aware Strategies for Query Result Caching in Web Search Engines

Rifat Ozcan; Ismail Sengor Altingovde; Özgür Ulusoy

Search engines and large-scale IR systems need to cache query results for efficiency and scalability purposes. Static and dynamic caching techniques (as well as their combinations) are employed to effectively cache query results. In this study, we propose cost-aware strategies for static and dynamic caching setups. Our research is motivated by two key observations: (i) query processing costs may vary significantly among different queries, and (ii) the processing cost of a query is not proportional to its popularity (i.e., frequency in the previous logs). The first observation implies that cache misses have different, that is, nonuniform, costs in this context. The latter observation implies that typical caching policies, based solely on query popularity, cannot always minimize the total cost. Therefore, we propose to explicitly incorporate the query costs into the caching policies. Simulation results using two large Web crawl datasets and a real query log reveal that the proposed approach improves overall system performance in terms of the average query execution time.
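
The two observations above can be illustrated with a minimal sketch of static cache selection. This is not the authors' exact policy: the scoring rule (frequency times cost), the function names, and the toy statistics are all assumptions made for illustration.

```python
def select_static_cache(stats, capacity, cost_aware=True):
    """Pick `capacity` queries to cache.

    stats maps query -> (frequency, cost_ms). The score used here
    (frequency * cost) is an illustrative assumption, not the paper's formula.
    """
    def score(item):
        _, (freq, cost) = item
        return freq * cost if cost_aware else freq

    ranked = sorted(stats.items(), key=score, reverse=True)
    return {query for query, _ in ranked[:capacity]}


def total_miss_cost(stats, cached):
    """Total processing time spent on queries that are not in the cache."""
    return sum(freq * cost for q, (freq, cost) in stats.items() if q not in cached)


# Hypothetical query statistics: (frequency in the log, processing cost in ms).
stats = {
    "weather": (100, 5.0),
    "news": (80, 4.0),
    "rare long query": (30, 60.0),
    "maps": (50, 3.0),
}

by_frequency = select_static_cache(stats, 2, cost_aware=False)
by_cost = select_static_cache(stats, 2, cost_aware=True)
print(total_miss_cost(stats, by_frequency), total_miss_cost(stats, by_cost))
```

In this toy log the popularity-only cache leaves the expensive query uncached, so its total miss cost is higher even though its hit count is larger.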

International World Wide Web Conferences | 2008

Static query result caching revisited

Rifat Ozcan; Ismail Sengor Altingovde; Özgür Ulusoy

Query result caching is an important mechanism for search engine efficiency. In this study, we first review several query features that are used to determine the contents of a static result cache. Next, we introduce a new feature that more accurately represents the popularity of a query by measuring the stability of query frequency over a set of time intervals. Experimental results show that this new feature achieves hit ratios better than those of the previously proposed features.
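
The abstract does not give the formula for the stability feature, so the following is only a hypothetical sketch of the idea: a query whose frequency is steady across time intervals scores higher than a bursty query with the same total frequency. The `stability_score` function and its definition are assumptions.

```python
from statistics import mean, pstdev


def stability_score(interval_counts):
    """Hypothetical stability feature: mean frequency damped by its spread.

    interval_counts holds the query's frequency in each time interval.
    """
    average = mean(interval_counts)
    if average == 0:
        return 0.0
    return average / (1.0 + pstdev(interval_counts))


steady = [10, 10, 10, 10]   # asked at the same rate in every interval
bursty = [40, 0, 0, 0]      # same total frequency, concentrated in one interval

print(stability_score(steady), stability_score(bursty))
```

Ranking by plain frequency cannot separate these two queries, while the stability-based score prefers the steady one for a static cache.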

International Conference on Information Technology: Coding and Computing | 2005

Concept-based information access

Rifat Ozcan; Y.A. Aslandogan

Concept-based access to information promises important benefits over keyword-based access. One of these benefits is the ability to take advantage of semantic relationships among concepts in finding relevant documents. Another benefit is the elimination of irrelevant documents by identifying conceptual mismatches. Concepts are mental structures. Words and phrases are the linguistic representatives of concepts. Due to the inherent conciseness of natural language, words can represent multiple concepts and different words may represent the same or very similar concepts. Word sense disambiguation attempts to resolve this ambiguity using contextual information. The use of an ontology facilitates identification of related concepts and their linguistic representatives given a key concept. Latent semantic analysis, on the other hand, attempts to reveal the hidden conceptual relationships among words and phrases based on linguistic usage patterns. In this work we explore the potential of concept-based information access via these two methods. We examine under what circumstances concept-based access becomes feasible and improves user experience.

European Conference on Information Retrieval | 2012

Adaptive time-to-live strategies for query result caching in web search engines

Sadiye Alici; Ismail Sengor Altingovde; Rifat Ozcan; Berkant Barla Cambazoglu; Özgür Ulusoy

An important research problem that has recently started to receive attention is the freshness issue in search engine result caches. In current techniques in the literature, the cached search result pages are associated with a fixed time-to-live (TTL) value in order to bound the staleness of search results presented to the users, potentially as part of a more complex cache refresh or invalidation mechanism. In this paper, we propose techniques where the TTL values are set in an adaptive manner, on a per-query basis. Our results show that the proposed techniques reduce the fraction of stale results served by the cache and also decrease the fraction of redundant query evaluations on the search engine backend compared to a strategy using a fixed TTL value for all queries.
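
A per-query TTL can be sketched as follows. The update rule used here (grow the TTL when a refreshed result is unchanged, reset it when the result changed) is an assumption chosen for illustration, as are the class name and parameters; the abstract does not specify the authors' actual strategies.

```python
class AdaptiveTTLCache:
    """Result cache with a per-query TTL (illustrative sketch only)."""

    def __init__(self, backend, min_ttl=1.0, max_ttl=16.0, growth=2.0):
        self.backend = backend      # callable: query -> list of results
        self.min_ttl = min_ttl
        self.max_ttl = max_ttl
        self.growth = growth
        self.entries = {}           # query -> {"results", "ttl", "expires"}

    def get(self, query, now):
        entry = self.entries.get(query)
        if entry is not None and now < entry["expires"]:
            return entry["results"]             # still within its TTL

        fresh = self.backend(query)             # expired or missing: re-evaluate
        if entry is None or fresh != entry["results"]:
            ttl = self.min_ttl                  # results changed: be cautious
        else:
            ttl = min(entry["ttl"] * self.growth, self.max_ttl)  # stable: relax
        self.entries[query] = {"results": fresh, "ttl": ttl, "expires": now + ttl}
        return fresh
```

With this rule, queries whose results rarely change are re-evaluated less often, while queries with volatile results keep a short TTL.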

International ACM SIGIR Conference on Research and Development in Information Retrieval | 2011

Timestamp-based result cache invalidation for web search engines

Sadiye Alici; Ismail Sengor Altingovde; Rifat Ozcan; Berkant Barla Cambazoglu; Özgür Ulusoy

The result cache is a vital component for the efficiency of large-scale web search engines, and maintaining the freshness of cached query results is a current research challenge. As a remedy to this problem, our work proposes a new mechanism to identify queries whose cached results are stale. The basic idea behind our mechanism is to maintain the generation time of query results and compare it with the update times of posting lists and documents to decide whether the query results are stale. The proposed technique is evaluated using a Wikipedia document collection with real update information and a real-life query log. We show that our technique has good prediction accuracy, relative to a baseline based on the time-to-live mechanism. Moreover, it is easy to implement and incurs less processing overhead on the system relative to a recently proposed, more sophisticated invalidation mechanism.
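
The timestamp comparison described above can be sketched in a few lines. The data layout (a cache entry holding its query terms, result document ids, and generation time) and the `is_stale` function are assumptions; the paper's actual staleness rules may be more refined than this simple "any newer update" test.

```python
def is_stale(entry, term_updated_at, doc_updated_at):
    """Return True if a cached result predates a relevant index update.

    entry: {"terms": [...], "doc_ids": [...], "generated_at": float}
    term_updated_at: term -> last update time of its posting list
    doc_updated_at: doc id -> last update time of the document
    """
    generated_at = entry["generated_at"]
    terms_changed = any(
        term_updated_at.get(term, 0.0) > generated_at for term in entry["terms"]
    )
    docs_changed = any(
        doc_updated_at.get(doc, 0.0) > generated_at for doc in entry["doc_ids"]
    )
    return terms_changed or docs_changed


# Hypothetical cached entry generated at time 100.
entry = {"terms": ["cache", "search"], "doc_ids": [7, 12], "generated_at": 100.0}

unchanged = is_stale(entry, {"cache": 90.0}, {7: 95.0})
stale_by_term = is_stale(entry, {"search": 120.0}, {})
stale_by_doc = is_stale(entry, {}, {12: 101.0})
print(unchanged, stale_by_term, stale_by_doc)
```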

European Conference on Information Retrieval | 2009

A Cost-Aware Strategy for Query Result Caching in Web Search Engines

Ismail Sengor Altingovde; Rifat Ozcan; Özgür Ulusoy

Search engines and large-scale IR systems need to cache query results for efficiency and scalability purposes. In this study, we propose to explicitly incorporate the query costs in the static caching policy. To this end, a query's cost is represented by its execution time, which involves CPU time to decompress the postings and compute the query-document similarities to obtain the final top-N answers. Simulation results using a large Web crawl dataset and a real query log reveal that the proposed strategy improves overall system performance in terms of the total query execution time.

ACM Transactions on Information Systems | 2012

Static index pruning in web search engines: Combining term and document popularities with query views

Ismail Sengor Altingovde; Rifat Ozcan; Özgür Ulusoy

Static index pruning techniques permanently remove a presumably redundant part of an inverted file, to reduce the file size and query processing time. These techniques differ in deciding which parts of an index can be removed safely; that is, without changing the top-ranked query results. As defined in the literature, the query view of a document is the set of query terms that access this particular document, that is, that retrieve this document among the top results. In this paper, we first propose using query views to improve the quality of the top results compared against the original results. We incorporate query views in a number of static pruning strategies, namely term-centric, document-centric, term popularity based and document access popularity based approaches, and show that the new strategies considerably outperform their counterparts, especially for the higher levels of pruning and for both disjunctive and conjunctive query processing. Additionally, we combine the notions of term and document access popularity to form new pruning strategies, and further extend these strategies with the query views. The new strategies improve the result quality especially for conjunctive query processing, which is the default and most common search mode of a search engine.
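
One way to read the query-view idea is that a posting survives pruning whenever its term belongs to the query view of its document. The sketch below combines that rule with a simple top-k-per-term pruning; the function names, the index layout, and the choice of base pruning are assumptions, not the paper's exact algorithms.

```python
from collections import defaultdict


def build_query_views(logged_results):
    """logged_results: iterable of (query_terms, top_doc_ids) from a query log."""
    views = defaultdict(set)
    for terms, doc_ids in logged_results:
        for doc_id in doc_ids:
            views[doc_id].update(terms)
    return views


def prune_with_query_views(index, keep_per_term, views):
    """index: term -> list of (doc_id, score).

    Keeps the `keep_per_term` highest-scoring postings of each term, plus any
    posting whose term is in the query view of its document.
    """
    pruned = {}
    for term, postings in index.items():
        ranked = sorted(postings, key=lambda p: p[1], reverse=True)
        top_docs = {doc_id for doc_id, _ in ranked[:keep_per_term]}
        pruned[term] = [
            (doc_id, score)
            for doc_id, score in postings
            if doc_id in top_docs or term in views.get(doc_id, ())
        ]
    return pruned


# Hypothetical data: document 3 scores low for "web" but was a top result
# of a logged query containing "web", so its posting is preserved.
index = {"web": [(1, 0.9), (2, 0.2), (3, 0.1)]}
views = build_query_views([(["web"], [3])])
pruned = prune_with_query_views(index, 1, views)
print(pruned)
```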

European Conference on Information Retrieval | 2011

Second chance: a hybrid approach for dynamic result caching in search engines

Ismail Sengor Altingovde; Rifat Ozcan; Berkant Barla Cambazoglu; Özgür Ulusoy

Result caches are vital for the efficiency of search engines. In this work, we propose a novel caching strategy in which a dynamic result cache is split into two layers: an HTML cache and a docID cache. The HTML cache in the first layer stores the result pages computed for queries. The docID cache in the second layer stores ids of documents in search results. Experiments under various scenarios show that, in terms of average query processing time, this hybrid caching approach outperforms the traditional approach, which relies only on the HTML cache.
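
The two-layer layout can be sketched with two LRU caches. How entries are admitted to and evicted from each layer is not stated in the abstract, so the policy below (store every answer in both layers, evict by LRU) is an assumption, as are the class names and the `rank` and `render` callables.

```python
from collections import OrderedDict


class LRU:
    """Small LRU map built on OrderedDict."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.data = OrderedDict()

    def get(self, key):
        if key not in self.data:
            return None
        self.data.move_to_end(key)          # mark as most recently used
        return self.data[key]

    def put(self, key, value):
        self.data[key] = value
        self.data.move_to_end(key)
        if len(self.data) > self.capacity:
            self.data.popitem(last=False)   # evict least recently used


class TwoLayerResultCache:
    """HTML cache in front of a docID cache (illustrative sketch)."""

    def __init__(self, html_capacity, docid_capacity, rank, render):
        self.html = LRU(html_capacity)
        self.docids = LRU(docid_capacity)
        self.rank = rank        # expensive: query -> list of doc ids
        self.render = render    # cheaper: (query, doc ids) -> result page

    def lookup(self, query):
        page = self.html.get(query)
        if page is not None:
            return page, "html_hit"
        doc_ids = self.docids.get(query)
        if doc_ids is None:
            doc_ids = self.rank(query)      # full query processing
            self.docids.put(query, doc_ids)
            outcome = "miss"
        else:
            outcome = "docid_hit"           # only the page must be rebuilt
        page = self.render(query, doc_ids)
        self.html.put(query, page)
        return page, outcome
```

Because a docID entry is much smaller than a result page, the second layer can hold many more queries in the same space, which gives evicted queries a cheaper path than full re-evaluation.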

European Conference on Information Retrieval | 2009

A Practitioner's Guide for Static Index Pruning

Ismail Sengor Altingovde; Rifat Ozcan; Özgür Ulusoy

We compare the term- and document-centric static index pruning approaches as described in the literature and investigate their sensitivity to the scoring functions employed during the pruning and actual retrieval stages.
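
The difference between the two families can be shown on a toy index. This is a simplified sketch, assuming a fixed number of postings kept per term (term-centric) or per document (document-centric) and precomputed scores; the published algorithms use more elaborate selection criteria.

```python
from collections import defaultdict


def term_centric_prune(index, k):
    """Keep the k highest-scoring postings in each term's list."""
    return {
        term: sorted(postings, key=lambda p: p[1], reverse=True)[:k]
        for term, postings in index.items()
    }


def document_centric_prune(index, k):
    """Keep, for each document, the postings of its k highest-scoring terms."""
    by_doc = defaultdict(list)
    for term, postings in index.items():
        for doc_id, score in postings:
            by_doc[doc_id].append((term, score))

    pruned = defaultdict(list)
    for doc_id, terms in by_doc.items():
        best = sorted(terms, key=lambda t: t[1], reverse=True)[:k]
        for term, score in best:
            pruned[term].append((doc_id, score))
    return dict(pruned)


# Hypothetical index: term -> list of (doc_id, score).
index = {"a": [(1, 3.0), (2, 2.5)], "b": [(1, 0.5), (2, 2.0)]}

by_term = term_centric_prune(index, 1)
by_doc = document_centric_prune(index, 1)
print(by_term, by_doc)
```

On this data the term-centric variant keeps one posting for every term, while the document-centric variant drops term "b" entirely because it is never a document's best term.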

International ACM SIGIR Conference on Research and Development in Information Retrieval | 2007

Large-scale cluster-based retrieval experiments on Turkish texts

Ismail Sengor Altingovde; Rifat Ozcan; Huseyin Cagdas Ocalan; Fazli Can; Özgür Ulusoy

We present cluster-based retrieval (CBR) experiments on the largest available Turkish document collection. Our experiments evaluate retrieval effectiveness and efficiency on both an automatically generated clustering structure and a manual classification of documents. In particular, we compare CBR effectiveness with full-text search (FS) and evaluate several implementation alternatives for CBR. Our findings reveal that CBR yields effectiveness figures comparable to those of FS. Furthermore, by using a specifically tailored cluster-skipping inverted index, we significantly improve the in-memory query processing efficiency of CBR in comparison to other traditional CBR techniques and even FS.
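
The intuition behind a cluster-skipping index can be sketched as posting lists grouped into per-cluster blocks, so that blocks of clusters not selected for the query are skipped without touching their postings. The layout and the term-frequency scoring below are assumptions for illustration; the abstract does not describe the actual index format.

```python
def score_in_best_clusters(index, query_terms, best_clusters):
    """Accumulate scores only for documents in the selected clusters.

    index: term -> list of (cluster_id, [(doc_id, tf), ...]) blocks.
    Returns (scores, number of postings actually visited).
    """
    scores = {}
    visited = 0
    for term in query_terms:
        for cluster_id, postings in index.get(term, []):
            if cluster_id not in best_clusters:
                continue                    # skip the whole block
            for doc_id, tf in postings:
                visited += 1
                scores[doc_id] = scores.get(doc_id, 0) + tf
    return scores, visited


# Hypothetical index with two clusters per term.
index = {
    "kitap": [(0, [(1, 2), (4, 1)]), (1, [(7, 3)])],
    "okul": [(0, [(4, 2)]), (1, [(7, 1), (9, 5)])],
}

scores, visited = score_in_best_clusters(index, ["kitap", "okul"], {0})
print(scores, visited)
```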