Olga Vechtomova
University of Waterloo
Publication
Featured research published by Olga Vechtomova.
International ACM SIGIR Conference on Research and Development in Information Retrieval | 2008
Charles L. A. Clarke; Maheedhar Kolla; Gordon V. Cormack; Olga Vechtomova; Azin Ashkan; Stefan Büttcher; Ian MacKinnon
Evaluation measures act as objective functions to be optimized by information retrieval systems. Such objective functions must accurately reflect user requirements, particularly when tuning IR systems and learning ranking functions. Ambiguity in queries and redundancy in retrieved documents are poorly reflected by current evaluation measures. In this paper, we present a framework for evaluation that systematically rewards novelty and diversity. We develop this framework into a specific evaluation measure, based on cumulative gain. We demonstrate the feasibility of our approach using a test collection based on the TREC question answering track.
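The idea of a cumulative-gain measure that rewards novelty can be sketched as follows. This is a minimal illustration, not necessarily the exact measure defined in the paper: it assumes each document is represented by the set of information nuggets it contains, that a nugget's gain decays by a hypothetical factor `alpha` each time it has already appeared higher in the ranking, and that gain is discounted by the logarithm of the rank. The function name and parameters are illustrative.

```python
import math


def novelty_discounted_gain(ranking, alpha=0.5):
    """Cumulative gain that pays less for nuggets already seen.

    ranking: list of sets of nugget ids, one set per ranked document.
    alpha:   assumed redundancy penalty in [0, 1]; 0 means no penalty.
    """
    seen = {}  # nugget id -> number of earlier documents containing it
    total = 0.0
    for rank, nuggets in enumerate(ranking, start=1):
        # Each repeated sighting of a nugget multiplies its gain by (1 - alpha).
        gain = sum((1 - alpha) ** seen.get(n, 0) for n in nuggets)
        # Standard logarithmic rank discount (an assumption of this sketch).
        total += gain / math.log2(1 + rank)
        for n in nuggets:
            seen[n] = seen.get(n, 0) + 1
    return total


diverse = novelty_discounted_gain([{"a"}, {"b"}])
redundant = novelty_discounted_gain([{"a"}, {"a"}])
print(diverse > redundant)
```

Under these assumptions a ranking that covers two different nuggets scores higher than one that repeats the same nugget, which is the behaviour the abstract describes.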
International Conference on the Theory of Information Retrieval | 2009
Charles L. A. Clarke; Maheedhar Kolla; Olga Vechtomova
Building upon simple models of user needs and behavior, we propose a new measure of novelty and diversity for information retrieval evaluation. We combine ideas from three recently proposed effectiveness measures in an attempt to achieve a balance between the complexity of genuine user needs and the simplicity required for feasible evaluation.
Information Retrieval | 2003
Olga Vechtomova; Stephen E. Robertson; Susan Jones
The paper presents two novel approaches to query expansion with long-span collocates: words that co-occur significantly with query terms in topic-size windows. In the first approach, global collocation analysis, collocates of query terms are extracted from the entire collection; in the second, local collocation analysis, they are extracted from a subset of retrieved documents. The significance of association between collocates was estimated using modified Mutual Information and Z score. The techniques were tested using the Okapi IR system. The effect of different parameters on performance was evaluated: window size, number of expansion terms, measures of collocation significance and types of expansion terms. We present performance results of these techniques and provide a comparison with related approaches.
Journal of Information Science | 2006
Olga Vechtomova; Ying Wang
Query expansion terms are often used to enhance original query formulations in document retrieval. Such terms are usually selected from entire documents or from windows or passages surrounding query term occurrences. Arguably, the semantic relatedness between terms weakens as the distance separating them increases. In this paper we report a study that was conducted to systematically evaluate different distance functions for selecting query expansion terms. We propose a distance factor that can be effectively combined with the statistical term association measure of mutual information for selecting query expansion terms. Evaluation on the TREC collection shows that distance-weighted mutual information is more effective than mutual information alone in selecting terms for query expansion.
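One way to combine a distance factor with mutual information can be sketched as follows. The abstract does not give the distance function, so this example assumes a simple inverse-distance weight (1/d) applied to each co-occurrence before computing a pointwise mutual information style score from raw counts. The function name, the `window` parameter, and the weighting are all hypothetical choices for illustration, not the authors' formulation.

```python
import math
from collections import Counter


def distance_weighted_mi(tokens, query_term, window=10):
    """Score candidate expansion terms near query_term.

    Each co-occurrence contributes 1/d, where d is the token distance
    to the query term (an assumed distance factor). The weighted count
    replaces the joint count in a PMI-style ratio.
    """
    n = len(tokens)
    freq = Counter(tokens)
    weighted = Counter()
    for i, tok in enumerate(tokens):
        if tok != query_term:
            continue
        lo, hi = max(0, i - window), min(n, i + window + 1)
        for j in range(lo, hi):
            if tokens[j] != query_term:  # also guarantees j != i
                weighted[tokens[j]] += 1.0 / abs(i - j)
    return {
        word: math.log2(w * n / (freq[query_term] * freq[word]))
        for word, w in weighted.items()
    }


scores = distance_weighted_mi(["q", "near", "pad", "pad", "far"], "q", window=4)
print(sorted(scores, key=scores.get, reverse=True))
```

With equal term frequencies, a word adjacent to the query term receives a higher score than one at the edge of the window, reflecting the intuition that relatedness weakens with distance.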
Information Processing and Management | 2008
Olga Vechtomova; Murat Karamuftuoglu
We demonstrate effective new methods of document ranking based on lexical cohesive relationships between query terms. The proposed methods rely solely on the lexical relationships between original query terms, and do not involve query expansion or relevance feedback. Two types of lexical cohesive relationship information between query terms are used in document ranking: short-distance collocation relationship between query terms, and long-distance relationship, determined by the collocation of query terms with other words. The methods are evaluated on TREC corpora, and show improvements over baseline systems.
Information Processing and Management | 2007
Olga Vechtomova; Murat Karamuftuoglu
We present new methods of query expansion using terms that form lexical cohesive links between the contexts of distinct query terms in documents (i.e., words surrounding the query terms in text). The link-forming terms (link-terms) and short snippets of text surrounding them are evaluated in both interactive and automatic query expansion (QE). We explore the effectiveness of snippets in providing context in interactive query expansion, compare query expansion from snippets vs. whole documents, and query expansion following snippet selection vs. full document relevance judgements. The evaluation, conducted on the HARD track data of TREC 2005, suggests that there are considerable advantages in using link-terms and their surrounding short text snippets in QE compared to terms selected from full-texts of documents.
European Conference on Information Retrieval | 2008
Ian MacKinnon; Olga Vechtomova
When the objective of an information retrieval task is to return a nugget rather than a document, query terms that exist in a document will often not be used in the most relevant information nugget in the document. In this paper, a new method of query expansion is proposed based on the Wikipedia link structure surrounding the most relevant articles selected automatically. Evaluated with the Nuggeteer automatic scoring software, the method yields an increase in F-scores on the TREC Complex Interactive Question Answering task when this expansion is integrated into an already high-performing baseline system.
European Conference on Information Retrieval | 2005
Olga Vechtomova
The paper presents several techniques for selecting noun phrases for interactive query expansion following pseudo-relevance feedback and a new phrase search method. A combined syntactico-statistical method was used for the selection of phrases. First, noun phrases were selected using a part-of-speech tagger and a noun-phrase chunker, and secondly, different statistical measures were applied to select phrases for query expansion. Experiments were also conducted studying the effectiveness of noun phrases in document ranking. We analyse the problems of phrase weighting and suggest new ways of addressing them. A new method of phrase matching and weighting was developed, which specifically addresses the problem of weighting overlapping and non-contiguous word sequences in documents.
Information Processing and Management | 2012
Olga Vechtomova; Stephen E. Robertson
We propose an approach to the retrieval of entities that have a specific relationship with the entity given in a query. Our research goal is to investigate whether the related entity finding problem can be addressed by combining a measure of relatedness of candidate answer entities to the query with the likelihood that the candidate answer entity belongs to the target entity category specified in the query. An initial list of candidate entities, extracted from top ranked documents retrieved for the query, is refined using a number of statistical and linguistic methods. The proposed method extracts the category of the target entity from the query, identifies instances of this category as seed entities, and computes similarity between candidate and seed entities. The evaluation was conducted on the Related Entity Finding task of the Entity Track of TREC 2010, as well as the QA list questions from TREC 2005 and 2006. Evaluation results demonstrate that the proposed methods are effective in finding related entities.
European Conference on Information Retrieval | 2006
Olga Vechtomova
The paper presents several techniques for selecting noun phrases for interactive query expansion following pseudo-relevance feedback and a new phrase-based document ranking method. A combined syntactico-statistical method was used for the selection of phrases for query expansion. Several statistical measures of phrase selection were evaluated. Experiments were also conducted studying the effectiveness of noun phrases in document ranking. One of the major problems in phrase-based document retrieval is the weighting of overlapping and non-contiguous word sequences in documents. The paper presents a new method of phrase weighting, which addresses this problem, and its evaluation on the TREC dataset.