Bodo Billerbeck
Microsoft
Network
Latest external collaboration on country level. Dive into details by clicking on the dots.
Publication
Featured researches published by Bodo Billerbeck.
conference on information and knowledge management | 2003
Bodo Billerbeck; Falk Scholer; Hugh E. Williams; Justin Zobel
Hundreds of millions of users each day use web search engines to meet their information needs. Advances in web search effectiveness are therefore perhaps the most significant public outcomes of IR research. Query expansion is one such method for improving the effectiveness of ranked retrieval by adding additional terms to a query. In previous approaches to query expansion, the additional terms are selected from highly ranked documents returned from an initial retrieval run. We propose a new method of obtaining expansion terms, based on selecting terms from past user queries that are associated with documents in the collection. Our scheme is effective for query expansion for web retrieval: our results show relative improvements over unexpanded full text retrieval of 26%--29%, and 18%--20% over an optimised, conventional expansion approach.
web search and data mining | 2012
David Sontag; Kevyn Collins-Thompson; Paul N. Bennett; Ryen W. White; Susan T. Dumais; Bodo Billerbeck
We present a new approach for personalizing Web search results to a specific user. Ranking functions for Web search engines are typically trained by machine learning algorithms using either direct human relevance judgments or indirect judgments obtained from click-through data from millions of users. The rankings are thus optimized to this generic population of users, not to any specific user. We propose a generative model of relevance which can be used to infer the relevance of a document to a specific user for a search query. The user-specific parameters of this generative model constitute a compact user profile. We show how to learn these profiles from a users long-term search history. Our algorithm for computing the personalized ranking is simple and has little computational overhead. We evaluate our personalization approach using historical search data from thousands of users of a major Web search engine. Our findings demonstrate gains in retrieval performance for queries with high ambiguity, with particularly large improvements for acronym queries.
string processing and information retrieval | 2004
Bodo Billerbeck; Justin Zobel
Query expansion is a well-known method for improving average effectiveness in information retrieval. However, the most effective query expansion methods rely on costly retrieval and processing of feedback documents. We explore alternative methods for reducing query-evaluation costs, and propose a new method based on keeping a brief summary of each document in memory. This method allows query expansion to proceed three times faster than previously, while approximating the effectiveness of standard expansion.
european conference on information retrieval | 2008
Falk Scholer; Milad Shokouhi; Bodo Billerbeck; Andrew Turpin
Clickthrough data has been the subject of increasing popularity as an implicit indicator of user feedback. Previous analysis has suggested that user click behaviour is subject to a quality bias--that is, users click at different rank positions when viewing effective search results than when viewing less effective search results. Based on this observation, it should be possible to use click data to infer the quality of the underlying search system. In this paper we carry out a user study to systematically investigate how click behaviour changes for different levels of search system effectiveness as measured by information retrieval performance metrics. Our results show that click behaviour does not vary systematically with the quality of search results. However, click behaviour does vary significantly between individual users, and between search topics. This suggests that using direct click behaviour--click rank and click frequency--to infer the quality of the underlying search system is problematic. Further analysis of our user click data indicates that the correspondence between clicks in a search result list and subsequent confirmation that the clicked resource is actually relevant is low. Using clicks as an implicit indication of relevance should therefore be done with caution.
international acm sigir conference on research and development in information retrieval | 2003
Bodo Billerbeck; Justin Zobel
The effectiveness of queries in information retrieval can be improved through query expansion. This technique automatically introduces additional query terms that are statistically likely to match documents on the intended topic. However, query expansion techniques rely on fixed parameters. Our investigation of the effect of varying these parameters shows that the strategy of using fixed values is questionable.
Information Systems | 2006
Bodo Billerbeck; Justin Zobel
Query expansion is a well-known method for improving average effectiveness in information retrieval. The most effective query expansion methods rely on retrieving documents which are used as a source of expansion terms. Retrieving those documents is costly. We examine the bottlenecks of a conventional approach and investigate alternative methods aimed at reducing query evaluation time. We propose a new method that draws candidate terms from brief document summaries that are held in memory for each document. While approximately maintaining the effectiveness of the conventional approach, this method significantly reduces the time required for query expansion by a factor of 5-10.
european conference on research and advanced technology for digital libraries | 2010
Bodo Billerbeck; Gianluca Demartini; Claudiu S. Firan; Tereza Iofciu; Ralf Krestel
Searching for entities is an emerging task in Information Retrieval for which the goal is finding well defined entities instead of documents matching the query terms. In this paper we propose a novel approach to Entity Retrieval by using Web search engine query logs. We use Markov random walks on (1) Click Graphs - built from clickthrough data - and on (2) Session Graphs - built from user session information. We thus provide semantic bridges between different query terms, and therefore indicate meaningful connections between Entity Retrieval queries and related entities.
web search and data mining | 2013
Nick Craswell; Bodo Billerbeck; Dennis Fetterly; Marc Najork
Query rewriting algorithms can be used as a form of query expansion, by combining the users original query with automatically generated rewrites. Rewriting algorithms bring linguistic datasets to bear without the need for iterative relevance feedback, but most studies of rewriting have used proprietary datasets such as large-scale search logs. By contrast this paper uses readily available data, particularly ClueWeb09 link text with over 1.2 billion anchor phrases, to generate rewrites. To avoid overfitting, our initial analysis is performed using Million Query Track queries, leading us to identify three algorithms which perform well. We then test the algorithms on Web and newswire data. Results show good properties in terms of robustness and early precision.
australasian database conference | 2004
Bodo Billerbeck; Justin Zobel
Archive | 2005
Bodo Billerbeck