Publication


Featured research published by Chris Buckley.


Information Processing and Management | 1988

Term-weighting approaches in automatic text retrieval

Gerard Salton; Chris Buckley

The experimental evidence accumulated over the past 20 years indicates that text indexing systems based on the assignment of appropriately weighted single terms produce retrieval results that are superior to those obtainable with other more elaborate text representations. These results depend crucially on the choice of effective term weighting systems. This paper summarizes the insights gained in automatic term weighting, and provides baseline single term indexing models with which other more elaborate content analysis procedures can be compared.
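
As a rough illustration of what weighting single terms can look like, the sketch below uses one common scheme: raw term frequency times a log inverse document frequency, followed by cosine length normalization. The exact formula, the function name `tfidf_vectors`, and the toy documents are assumptions made for illustration; the abstract does not say which weighting variants the paper recommends.

```python
import math
from collections import Counter

def tfidf_vectors(docs):
    """Return one {term: weight} dict per tokenized document (illustrative scheme)."""
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        # Assumed weighting: tf * idf, with idf = log(N / df).
        weights = {t: tf[t] * math.log(n_docs / df[t]) for t in tf}
        # Cosine normalization: scale the vector to unit length.
        norm = math.sqrt(sum(w * w for w in weights.values()))
        if norm > 0:
            weights = {t: w / norm for t, w in weights.items()}
        vectors.append(weights)
    return vectors

# Hypothetical toy collection, already tokenized.
docs = [["term", "weighting", "retrieval"],
        ["retrieval", "feedback"],
        ["term", "term", "indexing"]]
vectors = tfidf_vectors(docs)
```

In the first toy document, "weighting" occurs in only one document of the collection, so it receives a higher weight than "term" or "retrieval", which each occur in two.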


Journal of the Association for Information Science and Technology | 1997

Improving retrieval performance by relevance feedback

Gerard Salton; Chris Buckley

Relevance feedback is an automatic process, introduced over 20 years ago, designed to produce improved query formulations following an initial retrieval operation. The principal relevance feedback methods described over the years are examined briefly, and evaluation data are included to demonstrate the effectiveness of the various methods. Prescriptions are given for conducting text retrieval operations iteratively using relevance feedback.


International ACM SIGIR Conference on Research and Development in Information Retrieval | 1996

Pivoted document length normalization

Amit Singhal; Chris Buckley; Mandar Mitra

Automatic information retrieval systems have to deal with documents of varying lengths in a text collection. Document length normalization is used to fairly retrieve documents of all lengths. In this study, we observe that a normalization scheme that retrieves documents of all lengths with similar chances as their likelihood of relevance will outperform another scheme which retrieves documents with chances very different from their likelihood of relevance. We show that the retrieval probabilities for a particular normalization method deviate systematically from the relevance probabilities across different collections. We present pivoted normalization, a technique that can be used to modify any normalization function thereby reducing the gap between the relevance and the retrieval probabilities. Training pivoted normalization on one collection, we can successfully use it on other (new) text collections, yielding a robust, collection independent normalization technique. We use the idea of pivoting with the well known cosine normalization function. We point out some shortcomings of the cosine function and present two new normalization functions: pivoted unique normalization and pivoted byte size normalization.
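
The abstract does not spell out the formula, but pivoted normalization is commonly stated as a linear "tilt" of an existing normalization factor around a pivot point. The sketch below assumes that form; the function name `pivoted_norm`, the slope value 0.2, and the choice of the average old normalization as the pivot are illustrative assumptions, not values taken from the abstract.

```python
def pivoted_norm(old_norm, pivot, slope):
    """Tilt an existing normalization factor around `pivot`.

    Commonly stated form: (1 - slope) * pivot + slope * old_norm.
    With slope < 1, documents whose old factor is below the pivot get a
    larger factor (lower scores) and those above it get a smaller factor.
    """
    return (1.0 - slope) * pivot + slope * old_norm

# Hypothetical old normalization factors (e.g. cosine norms) for three documents.
old_norms = [5.0, 10.0, 20.0]
pivot = 10.0   # assumed pivot for this toy example
slope = 0.2    # assumed slope; in practice it would be tuned on a training collection

new_norms = [pivoted_norm(n, pivot, slope) for n in old_norms]
```

A document exactly at the pivot keeps its factor, the short document's factor rises from 5 to about 9, and the long document's factor falls from 20 to about 12, which shifts retrieval chances toward longer documents.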


International ACM SIGIR Conference on Research and Development in Information Retrieval | 1994

OHSUMED: an interactive retrieval evaluation and new large test collection for research

William R. Hersh; Chris Buckley; T. J. Leone; David H. Hickam

A series of information retrieval experiments was carried out with a computer installed in a medical practice setting for relatively inexperienced physician end-users. Using a commercial MEDLINE product based on the vector space model, these physicians searched just as effectively as more experienced searchers using Boolean searching. The results of this experiment were subsequently used to create a new large medical test collection, which was used in experiments with the SMART retrieval system to obtain baseline performance data as well as compare SMART with the other searchers.


International ACM SIGIR Conference on Research and Development in Information Retrieval | 1998

Improving automatic query expansion

Mandar Mitra; Amit Singhal; Chris Buckley

Most casual users of IR systems type short queries. Recent research has shown that adding new words to these queries via ad hoc feedback improves the retrieval effectiveness of such queries. We investigate ways to improve this query expansion process by refining the set of documents used in feedback. We start by using manually formulated Boolean filters along with proximity constraints. Our approach is similar to the one proposed by Hearst [12]. Next, we investigate a completely automatic method that makes use of term co-occurrence information to estimate word correlation. Experimental results show that refining the set of documents used in query expansion often prevents the query drift caused by blind expansion and yields substantial improvements in retrieval effectiveness, both in terms of average precision and precision in the top twenty documents. More importantly, the fully automatic approach developed in this study performs competitively with the best manual approach and requires little computational overhead.


International ACM SIGIR Conference on Research and Development in Information Retrieval | 2004

Retrieval evaluation with incomplete information

Chris Buckley; Ellen M. Voorhees

This paper examines whether the Cranfield evaluation methodology is robust to gross violations of the completeness assumption (i.e., the assumption that all relevant documents within a test collection have been identified and are present in the collection). We show that current evaluation measures are not robust to substantially incomplete relevance judgments. A new measure is introduced that is both highly correlated with existing measures when complete judgments are available and more robust to incomplete judgment sets. This finding suggests that substantially larger or dynamic test collections built using current pooling practices should be viable laboratory tools, despite the fact that the relevance information will be incomplete and imperfect.
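
The abstract does not name or define the new measure; it is generally known as bpref. The sketch below assumes one commonly cited formulation, in which each retrieved relevant document is penalized by the number of judged nonrelevant documents ranked above it (capped at R, the number of judged relevant documents) and unjudged documents are ignored. The function name and toy data are illustrative, and evaluation tools may use a slightly different denominator.

```python
def bpref(ranking, relevant, nonrelevant):
    """Preference-based measure over judged documents only (assumed formulation).

    ranking:     list of document ids, best first
    relevant:    set of ids judged relevant
    nonrelevant: set of ids judged nonrelevant
    """
    r_count = len(relevant)
    if r_count == 0:
        return 0.0
    nonrel_seen = 0
    total = 0.0
    for doc in ranking:
        if doc in relevant:
            # Penalize by judged nonrelevant documents ranked above, capped at R.
            total += 1.0 - min(nonrel_seen, r_count) / r_count
        elif doc in nonrelevant:
            nonrel_seen += 1
        # Unjudged documents are skipped entirely.
    return total / r_count

# Hypothetical run: "u1" has no relevance judgment.
ranking = ["d1", "u1", "n1", "d2", "n2"]
score = bpref(ranking, relevant={"d1", "d2"}, nonrelevant={"n1", "n2"})
score_without_unjudged = bpref(["d1", "n1", "d2", "n2"], {"d1", "d2"}, {"n1", "n2"})
```

Because unjudged documents are skipped, removing "u1" from the ranking leaves the score unchanged, which is the property that makes this kind of measure less sensitive to incomplete judgments.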


International ACM SIGIR Conference on Research and Development in Information Retrieval | 2000

Evaluating Evaluation Measure Stability

Chris Buckley; Ellen M. Voorhees

This paper presents a novel way of examining the accuracy of the evaluation measures commonly used in information retrieval experiments. It validates several of the rules-of-thumb experimenters use, such as the number of queries needed for a good experiment is at least 25 and 50 is better, while challenging other beliefs, such as the common evaluation measures are equally reliable. As an example, we show that Precision at 30 documents has about twice the average error rate as Average Precision has. These results can help information retrieval researchers design experiments that provide a desired level of confidence in their results. In particular, we suggest researchers using Web measures such as Precision at 10 documents will need to use many more than 50 queries or will have to require two methods to have a very large difference in evaluation scores before concluding that the two methods are actually different.
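
For readers unfamiliar with the two measures compared above, here is a minimal sketch of their standard definitions for a single query. The function names and the toy ranking are illustrative assumptions; the sketch does not reproduce the paper's error-rate experiments.

```python
def precision_at_k(ranking, relevant, k):
    """Fraction of the top k ranked documents that are relevant."""
    return sum(1 for doc in ranking[:k] if doc in relevant) / k

def average_precision(ranking, relevant):
    """Mean of the precision values at the rank of each relevant document.

    Relevant documents that are never retrieved contribute zero.
    """
    if not relevant:
        return 0.0
    hits = 0
    total = 0.0
    for rank, doc in enumerate(ranking, start=1):
        if doc in relevant:
            hits += 1
            total += hits / rank
    return total / len(relevant)

# Hypothetical ranking with relevant documents at ranks 1 and 3.
ranking = ["a", "b", "c", "d"]
relevant = {"a", "c"}
p_at_2 = precision_at_k(ranking, relevant, 2)
ap = average_precision(ranking, relevant)
```

Precision at k looks only at a fixed cutoff, while Average Precision uses the position of every relevant document; that difference is one intuition for why the two can vary in stability across query sets.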


ACM Conference on Hypertext | 1997

Automatic text structuring and summarization

Gerard Salton; Amit Singhal; Mandar Mitra; Chris Buckley

In recent years, information retrieval techniques have been used for automatic generation of semantic hypertext links. This study applies the ideas from the automatic link generation research to attack another important problem in text processing: automatic text summarization. An automatic “general purpose” text summarization tool would be of immense utility in this age of information overload. Using the techniques used (by most automatic hypertext link generation algorithms) for inter-document link generation, we generate intra-document links between passages of a document. Based on the intra-document linkage pattern of a text, we characterize the structure of the text. We apply the knowledge of text structure to do automatic text summarization by passage extraction. We evaluate a set of fifty summaries generated using our techniques by comparing them to paragraph extracts constructed by humans. The automatic summarization methods perform well, especially in view of the fact that the summaries generated by two humans for the same article are surprisingly dissimilar.


International ACM SIGIR Conference on Research and Development in Information Retrieval | 1994

The effect of adding relevance information in a relevance feedback environment

Chris Buckley; Gerard Salton; James Allan

The effects of adding information from relevant documents are examined in the TREC routing environment. A modified Rocchio relevance feedback approach is used, with a varying number of relevant documents retrieved by an initial SMART search, and a varying number of terms from those relevant documents used to expand the initial query. Recall-precision evaluation reveals that as the amount of expansion of the query due to adding terms from relevant documents increases, so does the effectiveness. There appears to be a linear relationship between the log of the number of terms added and the recall-precision effectiveness. There also appears to be a linear relationship between the log of the number of known relevant documents and the recall-precision effectiveness.
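
The abstract mentions a modified Rocchio approach without giving its details. The sketch below shows the textbook Rocchio update with a cap on the number of expansion terms, which is only an approximation of the idea. The coefficients `alpha`, `beta`, `gamma`, the parameter `n_expand`, and the toy vectors are all assumptions for illustration.

```python
from collections import defaultdict

def rocchio(query, relevant, nonrelevant,
            alpha=1.0, beta=0.75, gamma=0.15, n_expand=10):
    """Textbook Rocchio update over sparse {term: weight} vectors.

    Keeps all original query terms and adds at most `n_expand` new terms,
    chosen by highest positive weight after the update.
    """
    scores = defaultdict(float)
    for term, w in query.items():
        scores[term] += alpha * w
    for doc in relevant:
        for term, w in doc.items():
            scores[term] += beta * w / len(relevant)
    for doc in nonrelevant:
        for term, w in doc.items():
            scores[term] -= gamma * w / len(nonrelevant)
    # Candidate expansion terms: not in the original query, positive weight.
    candidates = [t for t in scores if t not in query and scores[t] > 0]
    candidates.sort(key=lambda t: scores[t], reverse=True)
    keep = set(query) | set(candidates[:n_expand])
    # Negative weights are clipped to zero.
    return {t: max(scores[t], 0.0) for t in keep}

# Hypothetical query and judged documents.
query = {"retrieval": 1.0}
relevant_docs = [{"retrieval": 0.5, "feedback": 0.8},
                 {"feedback": 0.4, "query": 0.2}]
nonrelevant_docs = [{"boolean": 0.9}]
expanded = rocchio(query, relevant_docs, nonrelevant_docs, n_expand=1)
```

Varying `n_expand` and the number of entries in `relevant_docs` corresponds loosely to the two quantities the study varies: the number of terms added and the number of known relevant documents.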


Science | 1994

Automatic analysis, theme generation, and summarization of machine-readable texts

Gerard Salton; James Allan; Chris Buckley; Amit Singhal

Vast amounts of text material are now available in machine-readable form for automatic processing. Here, approaches are outlined for manipulating and accessing texts in arbitrary subject areas in accordance with user needs. In particular, methods are given for determining text themes, traversing texts selectively, and extracting summary statements that reflect text content.

Collaboration


Dive into Chris Buckley's collaborations.

Top Co-Authors

James Allan

University of Massachusetts Amherst

Ellen M. Voorhees

National Institute of Standards and Technology

Donna Harman

National Institute of Standards and Technology

Norbert Fuhr

University of Duisburg-Essen
