Bruce Croft

University of Massachusetts Amherst

Publication

Featured research published by Bruce Croft.

web search and data mining | 2010

Query reformulation using anchor text

Van Dang; Bruce Croft

Query reformulation techniques based on query logs have been studied as a method of capturing user intent and improving retrieval effectiveness. The evaluation of these techniques, however, has focused primarily on proprietary query logs and selected samples of queries. In this paper, we suggest that anchor text, which is readily available, can be an effective substitute for a query log and study the effectiveness of a range of query reformulation techniques (including log-based stemming, substitution, and expansion) using standard TREC collections. Our results show that log-based query reformulation techniques are indeed effective with standard collections, but expansion is a much safer form of query modification than word substitution. We also show that using anchor text as a simulated query log is at least as effective as a real log for these techniques.
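To make the distinction between substitution and expansion concrete, here is a minimal sketch. It is not the method from the paper: the function names and the `candidates` mapping (a hypothetical table of replacement terms, such as one mined from anchor text) are assumptions for illustration only.

```python
def substitute(query_terms, candidates):
    """Replace each query term that has a candidate with that candidate."""
    return [candidates.get(term, term) for term in query_terms]


def expand(query_terms, candidates):
    """Keep every original term and add its candidate right after it."""
    expanded = []
    for term in query_terms:
        expanded.append(term)
        if term in candidates:
            expanded.append(candidates[term])
    return expanded


# Hypothetical candidate table; in practice it would be mined from a log
# or from anchor text, which this sketch does not attempt.
candidates = {"cheap": "discount"}
query = ["cheap", "flights"]

substituted = substitute(query, candidates)  # ["discount", "flights"]
expanded = expand(query, candidates)         # ["cheap", "discount", "flights"]
```

One plausible reading of why expansion is the safer choice: the original terms are always retained, so a poor candidate adds noise but cannot remove what the user actually typed.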

international acm sigir conference on research and development in information retrieval | 2013

Term level search result diversification

Van Dang; Bruce Croft

Current approaches for search result diversification have been categorized as either implicit or explicit. The implicit approach assumes each document represents its own topic, and promotes diversity by selecting documents for different topics based on the difference of their vocabulary. On the other hand, the explicit approach models the set of query topics, or aspects. While the former approach is generally less effective, the latter usually depends on a manually created description of the query aspects, the automatic construction of which has proven difficult. This paper introduces a new approach: term-level diversification. Instead of modeling the set of query aspects, which are typically represented as coherent groups of terms, our approach uses terms without the grouping. Our results on the ClueWeb collection show that the grouping of topic terms provides very little benefit to diversification compared to simply using the terms themselves. Consequently, we demonstrate that term-level diversification, with topic terms identified automatically from the search results using a simple greedy algorithm, significantly outperforms methods that attempt to create a full topic structure for diversification.
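The idea of treating each topic term as its own aspect, without grouping terms into topics, can be illustrated with a generic greedy coverage re-ranker. This is a simplified sketch under stated assumptions, not the algorithm evaluated in the paper: the function name `diversify`, the set-of-terms document representation, and the unweighted coverage gain are all hypothetical choices.

```python
def diversify(docs, topic_terms, k):
    """Greedily pick up to k documents that cover new topic terms.

    docs: dict mapping doc id -> set of terms, in original ranking order.
    topic_terms: set of terms, each treated as an independent aspect.
    """
    remaining = dict(docs)  # copy so the caller's ranking is untouched
    covered = set()
    selected = []
    while remaining and len(selected) < k:
        # Gain = number of topic terms this document adds to the coverage.
        # max() keeps the first maximum, so ties go to the higher-ranked doc.
        best = max(
            remaining,
            key=lambda d: len((remaining[d] & topic_terms) - covered),
        )
        covered |= remaining.pop(best) & topic_terms
        selected.append(best)
    return selected


# Toy data, invented for illustration.
docs = {
    "d1": {"jaguar", "car", "speed"},
    "d2": {"jaguar", "car", "price"},
    "d3": {"jaguar", "cat", "habitat"},
}
result = diversify(docs, {"car", "cat", "price"}, 2)  # ["d2", "d3"]
```

In the toy data, `d2` is chosen first because it covers two topic terms, and `d3` follows because `d1` adds nothing new once `car` is covered.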

international acm sigir conference on research and development in information retrieval | 1987

An approach to natural language for document retrieval

Bruce Croft

Document retrieval systems have been restricted, by the nature of the task, to techniques that can be used with large numbers of documents and broad domains. The most effective techniques that have been developed are based on the statistics of word occurrences in text. In this paper, we describe an approach to using natural language processing (NLP) techniques for what is essentially a natural language problem - the comparison of a request text with the text of document titles and abstracts. The proposed NLP techniques are used to develop a request model based on “conceptual case frames” and to compare this model with the texts of candidate documents. The request model is also used to provide information to statistical search techniques that identify the candidate documents. As part of a preliminary evaluation of this approach, case frame representations of a set of requests from the CACM collection were constructed. Statistical searches carried out using dependency and relative importance information derived from the request models indicate that performance benefits can be obtained.

ACSC '02 Proceedings of the twenty-fifth Australasian conference on Computer science - Volume 4 | 2002

The future of web search

Bruce Croft

The battle for pre-eminence between search engines seems to have reached a point where a few systems and techniques dominate the market. As more careful evaluations have been carried out, our level of understanding of which techniques work for the Web environment has increased significantly. Does this mean that researchers should declare victory and move on to other areas? In fact, substantial opportunities for improving the performance of Web search still exist, and research in this field continues to expand rather than shrink. Areas such as question answering, the Semantic Web, the Hidden Web, cross-lingual search, personalized search, and peer-to-peer search are generating considerable interest. This talk will describe some of this research and discuss its potential.

international acm sigir conference on research and development in information retrieval | 2012

Dependency trigram model for social relation extraction from news articles

Maengsik Choi; Harksoo Kim; Bruce Croft

We propose a kernel-based model to automatically extract social relations such as economic relations and political relations between two people from news articles. To determine whether two people are structurally associated with each other, the proposed model uses an SVM (support vector machine) tree kernel based on trigrams of head-dependent relations between them. In the experiments with the automatic content extraction (ACE) corpus and a Korean news corpus, the proposed model outperformed the previous systems based on SVM tree kernels even though it used more shallow linguistic knowledge.

Frontiers, Challenges, and Opportunities for Information Retrieval | 2012

Frontiers, Challenges, and Opportunities for Information Retrieval

James Allan; Bruce Croft; Alistair Moffat; Mark Sanderson

international acm sigir conference on research and development in information retrieval | 2005

A Markov random field for term dependencies

Donald Metzler; Bruce Croft

Archive | 2003

The lemur toolkit for language modeling and information retrieval

James Allan; Jamie Callan; Kevyn Collins-Thompson; Bruce Croft; Fengpin Feng; David Fisher; John D. Lafferty; Leah S. Larkey; Thanh N. Truong; Paul Ogilvie; Ligeng Si; Trevor Strohman; Howard R. Turtle; ChengXiang Zhai

international acm sigir conference on research and development in information retrieval | 2018