M. Catherine McCabe | Researchain

Publication

Featured research published by M. Catherine McCabe.

International ACM SIGIR Conference on Research and Development in Information Retrieval | 2002

Document normalization revisited

Abdur Chowdhury; M. Catherine McCabe; David A. Grossman; Ophir Frieder

Cosine Pivoted Document Length Normalization has reached a point of stability where many researchers indiscriminately apply a specific value of 0.2 regardless of the collection. Our efforts, however, demonstrate that applying this specific value without tuning for the document collection degrades average precision by as much as 20%.
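As a rough illustration of what is being tuned, the sketch below uses the commonly cited form of pivoted normalization, in which the original normalization factor is replaced by (1 - slope) * pivot + slope * old_norm. The abstract does not spell out the formula, so reading 0.2 as the slope, taking the pivot to be the collection average, and the function names used here are all assumptions.

```python
def pivoted_norm(old_norm, pivot, slope=0.2):
    """Pivoted normalization factor (assumed form, not quoted from the paper).

    old_norm: the document's original (e.g. cosine) normalization factor
    pivot:    the point where old and new normalization agree
    slope:    the tunable parameter; 0.2 is the value the abstract warns about
    """
    return (1.0 - slope) * pivot + slope * old_norm


def average_pivot(old_norms):
    """Assumption: use the collection's mean normalization factor as the pivot."""
    return sum(old_norms) / len(old_norms)


def normalized_weight(raw_weight, old_norm, pivot, slope=0.2):
    """Divide a raw term weight by the pivoted normalization factor."""
    return raw_weight / pivoted_norm(old_norm, pivot, slope)


if __name__ == "__main__":
    norms = [5.0, 10.0, 15.0]          # hypothetical per-document norms
    pivot = average_pivot(norms)
    for slope in (0.1, 0.2, 0.5):      # candidate slopes to tune per collection
        print(slope, [round(pivoted_norm(n, pivot, slope), 3) for n in norms])
```

A smaller slope flattens the penalty on long documents, which is why a value that suits one collection can hurt another.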

International ACM SIGIR Conference on Research and Development in Information Retrieval | 2000

On the design and evaluation of a multi-dimensional approach to information retrieval (poster session)

M. Catherine McCabe; Jinho Lee; Abdur Chowdhury; David A. Grossman; Ophir Frieder

We present a method of searching text collections that takes advantage of hierarchical information within documents and integrates searches of structured and unstructured data. We show that multidimensional databases (MDB), designed for accessing data along hierarchical dimensions, are effective for information retrieval. We demonstrate a method of using On-Line Analytic Processing (OLAP) techniques on a text collection. This combines traditional information retrieval with the slicing, dicing, drill-down, and roll-up of OLAP. We demonstrate the use of a prototype for searching documents from the TREC collection.
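To make the combination of keyword search with OLAP-style operations concrete, here is a toy sketch. It is not the authors' prototype: the documents, the `date` and `location` dimensions, and every function name are hypothetical, and a real multidimensional database would precompute these aggregates rather than scan a list.

```python
from collections import Counter

# Hypothetical documents; each dimension is a tuple ordered from the
# coarsest level to the finest, e.g. (year, month) or (region, country).
DOCS = [
    {"id": "d1", "text": "oil prices rise", "date": (1990, 3), "location": ("Europe", "France")},
    {"id": "d2", "text": "oil exports fall", "date": (1990, 7), "location": ("Asia", "Japan")},
    {"id": "d3", "text": "wheat harvest report", "date": (1990, 7), "location": ("Europe", "France")},
    {"id": "d4", "text": "oil pipeline opens", "date": (1991, 1), "location": ("Europe", "Germany")},
]


def matches(doc, query_terms):
    """Unstructured part: keep documents containing every query term."""
    words = set(doc["text"].lower().split())
    return all(term.lower() in words for term in query_terms)


def slice_docs(docs, dimension, prefix):
    """Slice: keep documents whose dimension value starts with the given prefix."""
    prefix = tuple(prefix)
    return [d for d in docs if d[dimension][:len(prefix)] == prefix]


def roll_up(docs, dimension, level):
    """Roll up: count documents per value at the given hierarchy depth."""
    return Counter(d[dimension][:level] for d in docs)


if __name__ == "__main__":
    hits = [d for d in DOCS if matches(d, ["oil"])]
    print(roll_up(hits, "date", 1))                           # counts per year
    print(roll_up(slice_docs(hits, "location", ["Europe"]), "location", 2))  # drill down
```

Calling `roll_up` with a larger `level` is the drill-down direction; a smaller `level` aggregates back up the hierarchy.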

Conference on Information and Knowledge Management | 1999

A unified environment for fusion of information retrieval approaches

M. Catherine McCabe; Abdur Chowdhury; David A. Grossman; Ophir Frieder

Prior work has shown that combining results of various retrieval approaches and query representations can improve search effectiveness. Today, many meta-search engines exist which combine the results of various search engines in the hope of improving overall effectiveness. However, the combination of results from different search engines masks variations in parsers and other indexing techniques (stemming, stop words, etc.). This makes it difficult to assess the utility of the fusion technique. We have implemented the two most prevalent retrieval strategies, probabilistic and vector space, using the same parser and the same relational retrieval engine. First, we identified a model that enables the fusion of an arbitrary number of sources. Next, we tested various linear combinations of these two methods as well as various thresholds for identifying retrieved documents. Our results show some improvement in effectiveness, but they also provide us with a baseline from which we can continue with other retrieval strategies and test the effect of fusing these strategies.
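The linear combination with a retrieval threshold described above might look roughly like the following sketch. The abstract does not say how scores are made comparable across strategies, so the min-max scaling, the function names, and the sample scores are all assumptions rather than the authors' model.

```python
def min_max(scores):
    """Scale one run's scores to [0, 1] so runs are comparable (an assumption)."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {doc: 1.0 for doc in scores}
    return {doc: (s - lo) / (hi - lo) for doc, s in scores.items()}


def fuse_linear(runs, weights, threshold=0.0):
    """Fuse any number of runs as a weighted sum of normalized scores.

    runs:      list of {doc_id: score} dicts, one per retrieval strategy
    weights:   one weight per run
    threshold: fused documents scoring below this are treated as not retrieved
    """
    if len(runs) != len(weights):
        raise ValueError("need exactly one weight per run")
    fused = {}
    for run, weight in zip(runs, weights):
        for doc, score in min_max(run).items():
            fused[doc] = fused.get(doc, 0.0) + weight * score
    # Sort by descending fused score, breaking ties by document id.
    ranked = sorted(fused.items(), key=lambda item: (-item[1], item[0]))
    return [(doc, score) for doc, score in ranked if score >= threshold]


if __name__ == "__main__":
    probabilistic = {"d1": 2.0, "d2": 1.0, "d3": 0.0}   # hypothetical scores
    vector_space = {"d2": 0.9, "d3": 0.1, "d4": 0.5}
    for doc, score in fuse_linear([probabilistic, vector_space], [0.5, 0.5], threshold=0.2):
        print(doc, round(score, 3))
```

Because both strategies would run over the same parser and index in the setup the abstract describes, any change in the fused ranking can be attributed to the weights and threshold rather than to indexing differences.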

International ACM SIGIR Conference on Research and Development in Information Retrieval | 2001

Analyses of multiple-evidence combinations for retrieval strategies

Abdur Chowdhury; Ophir Frieder; David A. Grossman; M. Catherine McCabe

Multiple-evidence techniques are touted as a means to improve the effectiveness of systems. Belkin et al. [1] examined the effects of various query representations. Fox et al. [2] proposed several combination algorithms and found that combinations of the same types of runs (long and short queries within the vector space model) did not yield improvement and sometimes even degraded performance. They did achieve improvement over individual runs when merging different retrieval strategies (e.g., vector space and pnorm Boolean). Lee [3] further examined various combination algorithms for fusing result sets to improve effectiveness. He identified that, for multiple-evidence to improve system effectiveness, the retrieved sets must have higher relevance overlap than non-relevance overlap. Lee did not identify the exact difference needed to improve effectiveness. His results had a 125% difference in relevant and non-relevant overlap. While Lee's experiments focused on different system result sets, we focus on effective ranking strategies, removing systemic differences of parsers, stemmers, phrase processing and weighting factors. We show that the improvements shown by Lee were likely produced by fusing ranking strategies less tuned than today’s measures, and current improvements are likely to be produced by systemic differences rather than ranking strategies.
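The overlap condition can be checked with a few lines of code. The sketch below assumes the usual Dice-style definition, in which overlap is twice the number of shared documents divided by the combined size of the two sets; the abstract does not give the formula, and the function names and sample runs are hypothetical.

```python
def overlap(set_a, set_b):
    """Dice-style overlap: 2 * |A ∩ B| / (|A| + |B|); 0.0 if both are empty."""
    total = len(set_a) + len(set_b)
    if total == 0:
        return 0.0
    return 2.0 * len(set_a & set_b) / total


def relevance_overlaps(run_a, run_b, relevant):
    """Return (relevant overlap, non-relevant overlap) for two result sets.

    run_a, run_b: iterables of retrieved document ids
    relevant:     iterable of document ids judged relevant for the query
    """
    a, b, rel = set(run_a), set(run_b), set(relevant)
    return overlap(a & rel, b & rel), overlap(a - rel, b - rel)


if __name__ == "__main__":
    run_a = ["d1", "d2", "d3", "d4"]    # hypothetical result sets
    run_b = ["d1", "d2", "d5", "d6"]
    r_overlap, n_overlap = relevance_overlaps(run_a, run_b, relevant={"d1", "d2", "d5"})
    print(r_overlap, n_overlap)
    # Under the condition attributed to Lee, fusion is expected to help
    # only when r_overlap is higher than n_overlap.
```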

International ACM SIGIR Conference on Research and Development in Information Retrieval | 2000