Anna Shtok
Technion – Israel Institute of Technology
Publication
Featured research published by Anna Shtok.
ACM Transactions on Information Systems | 2012
Anna Shtok; Oren Kurland; David Carmel; Fiana Raiber; Gad Markovits
Predicting query performance, that is, the effectiveness of a search performed in response to a query, is a highly important and challenging problem. We present a novel approach to this task that is based on measuring the standard deviation of the retrieval scores of the most highly ranked documents in the result list. We argue that for retrieval methods based on document-query surface-level similarities, the standard deviation can serve as a surrogate for estimating the presumed amount of query drift in the result list, that is, the presence (and dominance) in the list's documents of aspects or topics not related to the query. Empirical evaluation demonstrates the prediction effectiveness of our approach for several retrieval models; the prediction quality often surpasses that of current state-of-the-art prediction methods.
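The central quantity in this approach is simple to compute. Below is a minimal sketch of a score-dispersion predictor in this spirit: it takes the retrieval scores of a result list and returns the standard deviation of the top-k scores. The cutoff k and the absence of any score normalization are illustrative assumptions, not the paper's exact formulation.

```python
# Hedged sketch: the dispersion of the top-k retrieval scores serves as the
# query-performance prediction value (higher dispersion -> less presumed drift).
import statistics

def score_std_predictor(retrieval_scores, k=100):
    """Predict query performance from the standard deviation of the top-k scores."""
    top_k = sorted(retrieval_scores, reverse=True)[:k]
    if len(top_k) < 2:
        return 0.0
    return statistics.pstdev(top_k)

# Example: a result list with widely spread top scores yields a higher prediction.
print(score_std_predictor([12.1, 11.8, 9.4, 7.2, 7.1, 6.9, 6.8], k=5))
```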
International ACM SIGIR Conference on Research and Development in Information Retrieval | 2010
Anna Shtok; Oren Kurland; David Carmel
We present a novel framework for the query-performance prediction task, that is, estimating the effectiveness of a search performed in response to a query in the absence of relevance judgments. Our approach is based on using statistical decision theory to estimate the utility that a document ranking provides with respect to the information need expressed by the query. To address the uncertainty in inferring the information need, we estimate utility as the expected similarity between the given ranking and rankings induced by relevance models; the impact of a relevance model is based on its presumed representativeness of the information need. Specific query-performance predictors instantiated from the framework substantially outperform state-of-the-art predictors over five TREC corpora.
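As a rough illustration of the expected-utility idea, the sketch below scores a candidate ranking as a weighted average of its similarity to rankings induced by several relevance models, where the weights stand in for each model's presumed representativeness of the information need. The top-k overlap similarity and the weights are assumptions made for the example, not the framework's actual estimates.

```python
# Hedged sketch: utility of a candidate ranking as a representativeness-weighted
# expectation of its similarity to rankings induced by relevance models.
def top_k_overlap(ranking_a, ranking_b, k=50):
    """Fraction of shared documents among the top-k of two rankings."""
    return len(set(ranking_a[:k]) & set(ranking_b[:k])) / k

def expected_utility(candidate, rm_rankings, weights, k=50):
    """rm_rankings: rankings induced by relevance models;
    weights: presumed representativeness values that sum to 1."""
    return sum(w * top_k_overlap(candidate, r, k) for r, w in zip(rm_rankings, weights))
```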
Conference on Information and Knowledge Management | 2016
Saar Kuzi; Anna Shtok; Oren Kurland
We present a suite of query expansion methods that are based on word embeddings. Using Word2Vec's CBOW embedding approach, applied over the entire corpus on which search is performed, we select terms that are semantically related to the query. Our methods either use the terms to expand the original query or integrate them with the effective pseudo-feedback-based relevance model. In the former case, retrieval performance is significantly better than that of using only the query; in the latter case, the performance is significantly better than that of the relevance model.
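To make the term-selection step concrete, here is a small sketch using gensim's Word2Vec in CBOW mode (sg=0): expansion candidates are the vocabulary terms closest to the centroid of the query-term vectors. The toy corpus, the model parameters, and the centroid heuristic are assumptions for illustration, not necessarily the paper's exact procedure.

```python
# Illustrative sketch: select expansion terms whose CBOW embeddings are closest
# to the centroid of the query-term embeddings. In practice the model would be
# trained on the full search corpus rather than this toy corpus.
import numpy as np
from gensim.models import Word2Vec

corpus = [["query", "performance", "prediction", "for", "retrieval"],
          ["query", "expansion", "with", "semantic", "term", "similarity"],
          ["relevance", "model", "based", "pseudo", "feedback"]]

model = Word2Vec(corpus, vector_size=50, window=5, min_count=1, sg=0)  # sg=0 -> CBOW

def expansion_terms(query_terms, topn=5):
    """Return corpus terms semantically closest to the centroid of the query terms."""
    in_vocab = [t for t in query_terms if t in model.wv]
    if not in_vocab:
        return []
    centroid = np.mean([model.wv[t] for t in in_vocab], axis=0)
    candidates = model.wv.similar_by_vector(centroid, topn=topn + len(in_vocab))
    return [term for term, _ in candidates if term not in query_terms][:topn]

print(expansion_terms(["query", "expansion"]))
```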
Conference on Information and Knowledge Management | 2012
Oren Kurland; Anna Shtok; Shay Hummel; Fiana Raiber; David Carmel; Ofri Rom
The query-performance prediction task is estimating the effectiveness of a search performed in response to a query when no relevance judgments are available. Although there exist many effective prediction methods, these differ substantially in their basic principles, and rely on diverse hypotheses about the characteristics of effective retrieval. We present a novel fundamental probabilistic prediction framework. Using the framework, we derive and explain various previously proposed prediction methods that might seem completely different, but turn out to share the same formal basis. The derivations provide new perspectives on several predictors (e.g., Clarity). The framework is also used to devise new prediction approaches that outperform the state-of-the-art.
International Conference on the Theory of Information Retrieval | 2011
Oren Kurland; Anna Shtok; David Carmel; Shay Hummel
The query-performance prediction task is estimating the effectiveness of a search performed in response to a query in the absence of relevance judgments. Post-retrieval predictors analyze the result list of top-retrieved documents. While many of these previously proposed predictors are supposedly based on different principles, we show that they can actually be derived from a novel unified prediction framework that we propose. The framework is based on using a pseudo-effective and/or ineffective ranking as a reference comparison to the ranking at hand, the quality of which we want to predict. Empirical exploration provides support for the underlying principles, and the potential merits, of our framework.
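A bare-bones illustration of the reference-comparison idea follows: the predicted quality of a ranking is its top-k overlap with a pseudo-effective reference ranking (for example, one produced by a different retrieval method presumed to be reasonably effective), optionally penalized by overlap with a pseudo-ineffective reference. The overlap measure and cutoff are illustrative choices, not a specific predictor instantiated from the framework.

```python
# Hedged sketch: predict the quality of a ranking by its agreement with a
# pseudo-effective reference ranking; agreement with a pseudo-ineffective
# reference lowers the prediction.
def reference_list_predictor(candidate, effective_ref, ineffective_ref=None, k=50):
    """Top-k overlap with reference rankings as a prediction value."""
    overlap = lambda a, b: len(set(a[:k]) & set(b[:k])) / k
    score = overlap(candidate, effective_ref)
    if ineffective_ref is not None:
        score -= overlap(candidate, ineffective_ref)
    return score

print(reference_list_predictor(["d3", "d1", "d7", "d2"], ["d1", "d2", "d9", "d4"], k=4))
```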
ACM Transactions on Information Systems | 2016
Anna Shtok; Oren Kurland; David Carmel
The task of query performance prediction is to estimate the effectiveness of a search performed in response to a query when no relevance judgments are available. We present a novel probabilistic analysis of the performance prediction task. The analysis gives rise to a general prediction framework that uses pseudo-effective or ineffective document lists that are retrieved in response to the query. These lists serve as references for the result list at hand, whose effectiveness we want to predict. We show that many previously proposed prediction methods can be explained using our framework. More generally, we shed new light on existing prediction methods and establish formal common grounds for seemingly different prediction approaches. In addition, we formally demonstrate the connection between prediction using reference lists and fusion of retrieved lists, and provide empirical support for this connection. Through an extensive empirical exploration, we study various factors that affect the quality of prediction using reference lists.
Conference on Information and Knowledge Management | 2012
Oren Kurland; Fiana Raiber; Anna Shtok
We show that two tasks which were independently addressed in the information retrieval literature actually amount to the exact same task. The first is query performance prediction; i.e., estimating the effectiveness of a search performed in response to a query in the absence of relevance judgments. The second task is cluster ranking, that is, ranking clusters of similar documents by their presumed effectiveness (i.e., relevance) with respect to the query. Furthermore, we show that several state-of-the-art methods that were independently devised for each of the two tasks are based on the same principles. Finally, we empirically demonstrate that using insights gained in work on query-performance prediction can help, in many cases, to improve the performance of a previously proposed cluster ranking method.
Conference on Information and Knowledge Management | 2012
Gad Markovits; Anna Shtok; Oren Kurland; David Carmel
Estimating the effectiveness of a search performed in response to a query in the absence of relevance judgments is the goal of query-performance prediction methods. Post-retrieval predictors analyze the result list of the most highly ranked documents. We address the prediction challenge for retrieval approaches wherein the final result list is produced by fusing document lists that were retrieved in response to a query. To that end, we present a novel fundamental prediction framework that accounts for this special characteristic of the fusion setting, namely, the use of intermediate retrieved lists. The framework is based on integrating prediction performed upon the final result list with that performed upon the lists that were fused to create it; prediction integration is controlled based on inter-list similarities. We empirically demonstrate the merits of various predictors instantiated from the framework. Case in point: their prediction quality substantially surpasses that of applying state-of-the-art predictors to the final result list.
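The integration step can be pictured with a small sketch: a prediction computed on the final fused list is linearly combined with predictions computed on the intermediate lists, each weighted by its list's similarity to the fused list. The linear interpolation, the overlap-based similarity, and the parameter alpha are assumptions made for this illustration, not the specific instantiations evaluated in the paper.

```python
# Hedged sketch of fusion-aware prediction: combine the prediction for the fused
# list with similarity-weighted predictions for the intermediate lists.
def top_k_overlap(list_a, list_b, k=50):
    """Fraction of shared documents among the top-k of two lists."""
    return len(set(list_a[:k]) & set(list_b[:k])) / k

def fusion_aware_prediction(final_pred, intermediate, fused_list, alpha=0.5, k=50):
    """intermediate: iterable of (predicted_quality, retrieved_list) pairs for the
    lists that were fused to produce fused_list."""
    weighted = [p * top_k_overlap(lst, fused_list, k) for p, lst in intermediate]
    inter_term = sum(weighted) / len(weighted) if weighted else 0.0
    return alpha * final_pred + (1 - alpha) * inter_term
```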
European Conference on Information Retrieval | 2016
Liora Braunstain; Oren Kurland; David Carmel; Idan Szpektor; Anna Shtok
In many questions on Community Question Answering sites, users look for the advice or opinion of other users who might offer diverse perspectives on the topic at hand. The novel task we address is providing supportive evidence for human answers to such questions, which will potentially help the asker in choosing answers that fit her needs. We present a support retrieval model that ranks sentences from Wikipedia by their presumed support for a human answer. The model outperforms a state-of-the-art textual entailment system designed to infer factual claims from texts. An important aspect of the model is the integration of relevance-oriented and support-oriented features.
Conference on Information and Knowledge Management | 2016
Yael Anava; Anna Shtok; Oren Kurland; Ella Rabinovich
There are numerous methods for fusing document lists retrieved from the same corpus in response to a query. Many of these methods are based on seemingly unrelated techniques and heuristics. Herein we present a probabilistic framework for the fusion task. The framework provides a formal basis for deriving and explaining many fusion approaches and the connections between them. Instantiating the framework using various estimates yields novel fusion methods, some of which significantly outperform state-of-the-art approaches.
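For readers unfamiliar with the fusion task itself, here is the standard CombSUM baseline: each document receives the sum of its scores across the retrieved lists. This is only a generic illustration of list fusion (and assumes the scores are comparable across lists); it is not the probabilistic framework introduced in the paper.

```python
# Generic CombSUM fusion baseline, shown only to make the fusion task concrete.
from collections import defaultdict

def combsum(runs):
    """runs: list of dicts mapping doc_id -> retrieval score (assumed comparable)."""
    fused = defaultdict(float)
    for run in runs:
        for doc_id, score in run.items():
            fused[doc_id] += score
    return sorted(fused.items(), key=lambda kv: kv[1], reverse=True)

print(combsum([{"d1": 0.9, "d2": 0.4}, {"d2": 0.8, "d3": 0.5}]))
```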