Leif Azzopardi | Researchain

Archive Network Publication Hotspot Collaboration

Network

Latest external collaboration on country level. Dive into details by clicking on the dots.

Explore More

Hotspot

Dive into the research topics where Leif Azzopardi is active.

Explore More

Publication

Featured researches published by Leif Azzopardi.

international acm sigir conference on research and development in information retrieval | 2006

Formal models for expert finding in enterprise corpora

Krisztian Balog; Leif Azzopardi; Maarten de Rijke

Searching an organizations document repositories for experts provides a cost effective solution for the task of expert finding. We present two general strategies to expert searching given a document collection which are formalized using generative probabilistic models. The first of these directly models an experts knowledge based on the documents that they are associated with, whilst the second locates documents on topic, and then finds the associated expert. Forming reliable associations is crucial to the performance of expert finding systems. Consequently, in our evaluation we compare the different approaches, exploring a variety of associations along with other operational parameters (such as topicality). Using the TREC Enterprise corpora, we show that the second strategy consistently outperforms the first. A comparison against other unsupervised techniques, reveals that our second model delivers excellent performance.

Information Processing and Management | 2009

A language modeling framework for expert finding

Krisztian Balog; Leif Azzopardi; Maarten de Rijke

Statistical language models have been successfully applied to many information retrieval tasks, including expert finding: the process of identifying experts given a particular topic. In this paper, we introduce and detail language modeling approaches that integrate the representation, association and search of experts using various textual data sources into a generative probabilistic framework. This provides a simple, intuitive, and extensible theoretical framework to underpin research into expertise search. To demonstrate the flexibility of the framework, two search strategies to find experts are modeled that incorporate different types of evidence extracted from the data, before being extended to also incorporate co-occurrence information. The models proposed are evaluated in the context of enterprise search systems within an intranet environment, where it is reasonable to assume that the list of experts is known, and that data to be mined is publicly accessible. Our experiments show that excellent performance can be achieved by using these models in such environments, and that this theoretical and empirical work paves the way for future principled extensions.

international acm sigir conference on research and development in information retrieval | 2007

Broad expertise retrieval in sparse data environments

Krisztian Balog; Toine Bogers; Leif Azzopardi; Maarten de Rijke; Antal van den Bosch

Expertise retrieval has been largely unexplored on data other than the W3C collection. At the same time, many intranets of universities and other knowledge-intensive organisations offer examples of relatively small but clean multilingual expertise data, covering broad ranges of expertise areas. We first present two main expertise retrieval tasks, along with a set of baseline approaches based on generative language modeling, aimed at finding expertise relations between topics and people. For our experimental evaluation, we introduce (and release) a new test set based on a crawl of a university site. Using this test set, we conduct two series of experiments. The first is aimed at determining the effectiveness of baseline expertise retrieval methods applied to the new test set. The second is aimed at assessing refined models that exploit characteristic features of the new test set, such as the organizational structure of the university, and the hierarchical structure of the topics in the test set. Expertise retrieval models are shown to be robust with respect to environments smaller than the W3C collection, and current techniques appear to be generalizable to other settings.

international acm sigir conference on research and development in information retrieval | 2007

Building simulated queries for known-item topics: an analysis using six european languages

Leif Azzopardi; Maarten de Rijke; Krisztian Balog

There has been increased interest in the use of simulated queries for evaluation and estimation purposes in Information Retrieval. However, there are still many unaddressed issues regarding their usage and impact on evaluation because their quality, in terms of retrieval performance, is unlike real queries. In this paper, wefocus on methods for building simulated known-item topics and explore their quality against real known-item topics. Using existing generation models as our starting point, we explore factors which may influence the generation of the known-item topic. Informed by this detailed analysis (on six European languages) we propose a model with improved document and term selection properties, showing that simulated known-item topics can be generated that are comparable to real known-item topics. This is a significant step towards validating the potential usefulness of simulated queries: for evaluation purposes, and becausebuilding models of querying behavior provides a deeper insight into the querying process so that better retrieval mechanisms can be developed to support the user.

international acm sigir conference on research and development in information retrieval | 2003

Investigating the relationship between language model perplexity and IR precision-recall measures

Leif Azzopardi; Mark A. Girolami; Keith van Risjbergen

An empirical study has been conducted investigating the relationship between the performance of an aspect based language model in terms of perplexity and the corresponding information retrieval performance obtained. It is observed, on the corpora considered, that the perplexity of the language model has a systematic relationship with the achievable precision recall performance though it is not statistically significant.

international acm sigir conference on research and development in information retrieval | 2014

Modelling interaction with economic models of search

Leif Azzopardi

Understanding how people interact when searching is central to the study of Interactive Information Retrieval (IIR). Most of the prior work has either been conceptual, observational or empirical. While this has led to numerous insights and findings regarding the interaction between users and systems, the theory has lagged behind. In this paper, we extend the recently proposed search economic theory to make the model more realistic. We then derive eight interaction based hypotheses regarding search behaviour. To validate the model, we explore whether the search behaviour of thirty-six participants from a lab based study is consistent with the theory. Our analysis shows that observed search behaviours are in line with predicted search behaviours and that it is possible to provide credible explanations for such behaviours. This work describes a concise and compact representation of search behaviour providing a strong theoretical basis for future IIR research.

international acm sigir conference on research and development in information retrieval | 2013

How query cost affects search behavior

Leif Azzopardi; Diane Kelly; Kathy Brennan

affects how users interact with a search system. Microeconomic theory is used to generate the cost-interaction hypothesis that states as the cost of querying increases, users will pose fewer queries and examine more documents per query. A between-subjects laboratory study with 36 undergraduate subjects was conducted, where subjects were randomly assigned to use one of three search interfaces that varied according to the amount of physical cost required to query: Structured (high cost), Standard (medium cost) and Query Suggestion (low cost). Results show that subjects who used the Structured interface submitted significantly fewer queries, spent more time on search results pages, examined significantly more documents per query, and went to greater depths in the search results list. Results also showed that these subjects spent longer generating their initial queries, saved more relevant documents and rated their queries as more successful. These findings have implications for the usefulness of microeconomic theory as a way to model and explain search interaction, as well as for the design of query facilities.

european conference on information retrieval | 2009

The Combination and Evaluation of Query Performance Prediction Methods

Claudia Hauff; Leif Azzopardi; Djoerd Hiemstra

In this paper, we examine a number of newly applied methods for combining pre-retrieval query performance predictors in order to obtain a better prediction of the querys performance. However, in order to adequately and appropriately compare such techniques, we critically examine the current evaluation methodology and show how using linear correlation coefficients (i) do not provide an intuitive measure indicative of a methods quality, (ii) can provide a misleading indication of performance, and (iii) overstate the performance of combined methods. To address this, we extend the current evaluation methodology to include cross validation, report a more intuitive and descriptive statistic, and apply statistical testing to determine significant differences. During the course of a comprehensive empirical study over several TREC collections, we evaluate nineteen pre-retrieval predictors and three combination methods.

Information Retrieval | 2008

An analysis on document length retrieval trends in language modeling smoothing

David E. Losada; Leif Azzopardi

Document length is widely recognized as an important factor for adjusting retrieval systems. Many models tend to favor the retrieval of either short or long documents and, thus, a length-based correction needs to be applied for avoiding any length bias. In Language Modeling for Information Retrieval, smoothing methods are applied to move probability mass from document terms to unseen words, which is often dependant upon document length. In this article, we perform an in-depth study of this behavior, characterized by the document length retrieval trends, of three popular smoothing methods across a number of factors, and its impact on the length of documents retrieved and retrieval performance. First, we theoretically analyze the Jelinek–Mercer, Dirichlet prior and two-stage smoothing strategies and, then, conduct an empirical analysis. In our analysis we show how Dirichlet prior smoothing caters for document length more appropriately than Jelinek–Mercer smoothing which leads to its superior retrieval performance. In a follow up analysis, we posit that length-based priors can be used to offset any bias in the length retrieval trends stemming from the retrieval formula derived by the smoothing technique. We show that the performance of Jelinek–Mercer smoothing can be significantly improved by using such a prior, which provides a natural and simple alternative to decouple the query and document modeling roles of smoothing. With the analysis of retrieval behavior conducted in this article, it is possible to understand why the Dirichlet Prior smoothing performs better than the Jelinek–Mercer, and why the performance of the Jelinek–Mercer method is improved by including a length-based prior.

information interaction in context | 2010

A survey of patent users: an analysis of tasks, behavior, search functionality and system requirements

Hideo Joho; Leif Azzopardi; Wim Vanderbauwhede

With a growing interest in Patent Information Retrieval, there is a need to better understand the context associated with patent users, their tasks, needs and expectations of patent search systems and applications. Patent search is known to be a complex, difficult and challenging activity, usually requiring expert Patent Information Specialists to spend a substantial amount of time sourcing (or not) documents relevant to their particular task. Information Retrieval provides a whole array of possible techniques and tools which could be applied to ease the burden of such retrieval tasks, and also make searching patents more accessible to non-Patent Information Specialists. In this paper, we report the findings from a survey of patent users conducted to ascertain information about patent users and their search requirements with respect to Information Retrieval systems and applications.

Explore More