John Dunnion
University College Dublin
Publications
Featured research published by John Dunnion.
International ACM SIGIR Conference on Research and Development in Information Retrieval | 2006
David Lillis; Fergus Toolan; Rem W. Collier; John Dunnion
Data fusion is the combination of the results of independent searches on a document collection into a single output result set. It has been shown in the past that this can greatly improve retrieval effectiveness over that of the individual results. This paper presents probFuse, a probabilistic approach to data fusion. ProbFuse assumes that the performance of the individual input systems on a number of training queries is indicative of their future performance. The fused result set is based on probabilities of relevance calculated during this training process. Retrieval experiments using data from the TREC ad hoc collection demonstrate that probFuse achieves results superior to those of the popular CombMNZ fusion algorithm.
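The abstract describes probFuse's training-based mechanism but not its scoring formula. As a minimal sketch of the published segment-based idea (each result set is split into segments, per-segment relevance probabilities are learned from training queries, and a document's score is the sum of those probabilities discounted by segment number), the following Python is illustrative; the segment count and input data structures are assumptions, not the authors' exact formulation:

```python
from collections import defaultdict

def train_probfuse(training_runs, judgements, num_segments=25):
    """Estimate P(relevant | segment k, system m) from training queries.

    training_runs: {system: {query: [doc ids in rank order]}}
    judgements:    {query: set of relevant doc ids}
    Returns {system: [p_1, ..., p_x]}, one probability per segment.
    """
    probs = {}
    for system, runs in training_runs.items():
        seg_totals = [0.0] * num_segments
        for query, ranking in runs.items():
            relevant = judgements.get(query, set())
            seg_size = max(1, len(ranking) // num_segments)
            for k in range(num_segments):
                segment = ranking[k * seg_size:(k + 1) * seg_size]
                if segment:
                    seg_totals[k] += sum(d in relevant for d in segment) / len(segment)
        n = max(1, len(runs))
        probs[system] = [t / n for t in seg_totals]
    return probs

def probfuse(result_sets, probs, num_segments=25):
    """Fuse one query's result sets: score(d) = sum over systems of P(m,k)/k,
    where k is the (1-indexed) segment in which system m returned d."""
    scores = defaultdict(float)
    for system, ranking in result_sets.items():
        seg_size = max(1, len(ranking) // num_segments)
        for pos, doc in enumerate(ranking):
            k = min(pos // seg_size, num_segments - 1)
            scores[doc] += probs[system][k] / (k + 1)
    return sorted(scores, key=scores.get, reverse=True)
```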
Conference on Intelligent Text Processing and Computational Linguistics | 2004
William P. Doran; Nicola Stokes; Joe Carthy; John Dunnion
We present a comparative study of lexical chain-based summarisation techniques. The aim of this paper is to highlight the effect of lexical chain scoring metrics and sentence extraction techniques on summary generation. We present our own lexical chain-based summarisation system and compare it to other chain-based summarisation systems. We also compare the chain scoring and extraction techniques of our system to those of several other baseline systems, including a random summariser and one based on tf.idf statistics. We use a task-oriented summarisation evaluation scheme that determines summary quality based on TDT story link detection performance.
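To illustrate how chain scoring and sentence extraction interact, here is a hedged Python sketch using one well-known scoring metric, length weighted by homogeneity, in the style of Barzilay and Elhadad; the paper compares several such metrics, and this is not necessarily the one its system uses:

```python
def score_chain(chain):
    """Length x homogeneity: a common lexical-chain scoring heuristic.
    chain is a list of the (possibly repeated) terms in the chain."""
    if not chain:
        return 0.0
    homogeneity = 1.0 - len(set(chain)) / len(chain)
    return len(chain) * homogeneity

def extract_sentences(sentences, chains, n=3):
    """Score each sentence by the strongest chains its words belong to and
    return the top n in document order (a simple extraction strategy)."""
    term_weight = {}
    for chain in chains:
        w = score_chain(chain)
        for term in set(chain):
            term_weight[term] = max(term_weight.get(term, 0.0), w)
    top = sorted(sentences,
                 key=lambda s: sum(term_weight.get(t.lower(), 0.0)
                                   for t in s.split()),
                 reverse=True)[:n]
    return [s for s in sentences if s in top]
```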
Systems and Information Engineering Design Symposium | 2003
Doireann Cassidy; Joe Carthy; Anne Drummond; John Dunnion; John Sheppard
Incident reporting is becoming increasingly important in large organizations. Legislation is progressively being introduced to deal with this information. One example is European Directive No. 94/95/EC, which obliges airlines and national bodies to collect and collate reports of incidents. Typically these organizations use manual files and standard databases to store and retrieve incident reports. However, research has established that database technology needs to be enhanced in order to deal with incidents. We describe the design and implementation of In-Ret, an incident report retrieval system that endeavours to find similarities and patterns between incidents by combining the strengths of case-based reasoning and information retrieval techniques in an integrated system. Preliminary results from In-Ret are presented and are encouraging.
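The abstract does not give In-Ret's similarity measure, but the stated combination of case-based reasoning with information retrieval suggests blending free-text similarity over the incident narrative with attribute matching over structured fields. A minimal sketch under those assumptions; the field names and weighting are hypothetical, not taken from the paper:

```python
import math
from collections import Counter

def cosine(a, b):
    """Cosine similarity between two bags of words."""
    ca, cb = Counter(a), Counter(b)
    dot = sum(ca[t] * cb[t] for t in ca)
    norm = (math.sqrt(sum(v * v for v in ca.values())) *
            math.sqrt(sum(v * v for v in cb.values())))
    return dot / norm if norm else 0.0

def incident_similarity(query, case, lam=0.6):
    """Blend IR-style narrative similarity with CBR-style matching over
    structured fields (field names here are illustrative)."""
    text_sim = cosine(query["narrative"].lower().split(),
                      case["narrative"].lower().split())
    fields = ["aircraft_type", "phase_of_flight", "location"]
    attr_sim = sum(query.get(f) is not None and query.get(f) == case.get(f)
                   for f in fields) / len(fields)
    return lam * text_sim + (1 - lam) * attr_sim
```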
Artificial Intelligence Review | 2006
David Lillis; Fergus Toolan; Angel Mur; Liu Peng; Rem W. Collier; John Dunnion
Information Retrieval (IR) forms the basis of many information management tasks. Information management itself has become an extremely important area as the amount of electronically available information increases dramatically. There are numerous methods of performing the IR task, both by utilising different techniques and by using different representations of the information available to us. It has been shown that some algorithms outperform others on certain tasks. Combining the results produced by different algorithms has resulted in superior retrieval performance, and this has become an important research area. This paper introduces a probability-based fusion technique, probFuse, that shows initial promise in addressing this problem. It also compares probFuse with the common CombMNZ data fusion technique.
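For reference, CombMNZ, the comparison technique named here and throughout these papers, is well documented: each input system's scores are min-max normalised, summed per document, and the sum is multiplied by the number of systems that retrieved the document. A short Python sketch (the input representation is an assumption):

```python
def combmnz(result_sets):
    """CombMNZ data fusion.

    result_sets: {system: {doc: raw retrieval score}}
    Returns docs ranked by (sum of normalised scores) * (number of systems
    that retrieved the doc).
    """
    fused, hits = {}, {}
    for scores in result_sets.values():
        lo, hi = min(scores.values()), max(scores.values())
        span = (hi - lo) or 1.0
        for doc, s in scores.items():
            norm = (s - lo) / span          # min-max normalisation
            fused[doc] = fused.get(doc, 0.0) + norm
            hits[doc] = hits.get(doc, 0) + 1
    return sorted(fused, key=lambda d: fused[d] * hits[d], reverse=True)
```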
International ACM SIGIR Conference on Research and Development in Information Retrieval | 2010
David Lillis; Lusheng Zhang; Fergus Toolan; Rem W. Collier; David Leonard; John Dunnion
Data fusion is the combination of a number of independent search results, relating to the same document collection, into a single result to be presented to the user. A number of probabilistic data fusion models have been shown to be effective in empirical studies. These typically attempt to estimate the probability that particular documents will be relevant, based on training data. However, little attempt has been made to gauge how the accuracy of these estimations affects fusion performance. The focus of this paper is twofold: firstly, to show that accurate estimation of the probability of relevance results in effective data fusion; and secondly, to show that an effective approximation of this probability can be made based on less training data than has previously been employed. This is based on the observation that the distribution of relevant documents follows a similar pattern in most high-quality result sets. Curve fitting suggests that this can be modelled by a simple function that is less complex than other models that have been proposed. The use of existing IR evaluation metrics is proposed as a substitute for probability calculations. Mean Average Precision is used to demonstrate the effectiveness of this approach, with evaluation results demonstrating competitive performance when compared with related algorithms that have more onerous requirements for training data.
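The abstract names Mean Average Precision as the substitute for trained probability estimates but not the exact scoring formula. A hedged sketch of one natural reading, where a document's contribution from each input system is that system's MAP discounted by rank; the true functional form in the paper may differ:

```python
def map_fuse(result_sets, map_scores):
    """Approximate P(relevant | rank r, system m) by MAP_m / r and sum the
    approximations across systems.

    result_sets: {system: [docs in rank order]} for one query
    map_scores:  {system: Mean Average Precision on training queries}
    """
    scores = {}
    for system, ranking in result_sets.items():
        for rank, doc in enumerate(ranking, start=1):
            scores[doc] = scores.get(doc, 0.0) + map_scores[system] / rank
    return sorted(scores, key=scores.get, reverse=True)
```

The appeal of this substitution is that MAP needs far less training data than per-position probability estimation, which is the second claim the paper sets out to support.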
European Conference on Information Retrieval | 2008
David Lillis; Fergus Toolan; Rem W. Collier; John Dunnion
Recent developments in the field of data fusion have seen a focus on techniques that use training queries to estimate the probability that various documents are relevant to a given query, and use that information to assign scores to those documents, by which they are subsequently ranked. This paper introduces SlideFuse, which builds on these techniques, introducing a sliding window in order to compensate for situations where little relevance information is available to aid in the estimation of probabilities. SlideFuse is shown to perform favourably in comparison with CombMNZ, ProbFuse and SegFuse. CombMNZ is the standard baseline technique against which data fusion algorithms are compared, whereas ProbFuse and SegFuse represent the state of the art in probabilistic data fusion methods.
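SlideFuse's defining feature, per the abstract, is the sliding window over result positions. A minimal sketch of that smoothing step, assuming per-position relevance probabilities learned from training queries in the probFuse style; the window size, ranking depth, and fusion details are illustrative assumptions:

```python
def positional_probs(training_runs, judgements, depth=500):
    """P(relevant | position p) for one system: the fraction of training
    queries whose document at position p was relevant."""
    counts = [0.0] * depth
    for query, ranking in training_runs.items():
        relevant = judgements.get(query, set())
        for p, doc in enumerate(ranking[:depth]):
            counts[p] += doc in relevant
    n = max(1, len(training_runs))
    return [c / n for c in counts]

def slide(probs, window=5):
    """Share sparse relevance evidence between neighbours by averaging each
    position's probability over a window of surrounding positions."""
    out = []
    for p in range(len(probs)):
        lo, hi = max(0, p - window), min(len(probs), p + window + 1)
        out.append(sum(probs[lo:hi]) / (hi - lo))
    return out

def slidefuse(result_sets, smoothed):
    """Fuse by summing each system's windowed probability at the position
    where it ranked the document. smoothed: {system: [probabilities]}."""
    scores = {}
    for system, ranking in result_sets.items():
        for p, doc in enumerate(ranking[:len(smoothed[system])]):
            scores[doc] = scores.get(doc, 0.0) + smoothed[system][p]
    return sorted(scores, key=scores.get, reverse=True)
```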
Text, Speech and Dialogue | 2004
Svetlana Hensman; John Dunnion
This paper describes a technique that uses existing linguistic resources (VerbNet and WordNet) to construct conceptual graph representations of texts. We use a two-step approach, first identifying the semantic roles in a sentence, and then using these roles, together with semi-automatically compiled domain-specific knowledge, to construct the conceptual graph representation.
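The two-step pipeline can be illustrated with a toy mapping from identified semantic roles to conceptual-graph relations. The role labels and relation names below follow common conventions (VerbNet-style roles, Sowa-style agnt/thme relations) and are illustrative, not the paper's compiled domain knowledge:

```python
def roles_to_graph(verb, roles):
    """Map a verb and its semantic roles to conceptual-graph triples.
    roles: {role label: filler concept}, e.g. from a role-labelling step."""
    relation_map = {"Agent": "agnt", "Theme": "thme",
                    "Instrument": "inst", "Location": "loc"}
    return [(verb, relation_map.get(role, role.lower()), filler)
            for role, filler in roles.items()]

# Example: roles_to_graph("open", {"Agent": "John", "Theme": "door"})
# -> [("open", "agnt", "John"), ("open", "thme", "door")]
```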
European Conference on Information Retrieval | 2005
Ruichao Wang; Nicola Stokes; William P. Doran; Eamonn Newman; Joe Carthy; John Dunnion
In this paper we compare a number of Topiary-style headline generation systems. The Topiary system, developed at the University of Maryland with BBN, was the top-performing headline generation system at DUC 2004. Topiary-style headlines consist of a number of general topic labels followed by a compressed version of the lead sentence of a news story. The Topiary system uses a statistical learning approach to find topic labels for headlines, while our approach, the LexTrim system, identifies key summary words by analysing the lexical cohesive structure of a text. The performance of these systems is evaluated using the ROUGE evaluation suite on the DUC 2004 news story collection. The results of these experiments show that a baseline system that identifies topic descriptors for headlines using term frequency counts outperforms the LexTrim and Topiary systems. A manual evaluation of the headlines also confirms this result.
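The baseline that outperformed both systems is described only as choosing topic descriptors by term frequency. A hedged sketch of such a Topiary-style baseline, using naive truncation in place of real sentence compression; the stopword list and parameters are assumptions:

```python
from collections import Counter

STOPWORDS = {"the", "a", "an", "of", "to", "in", "and", "is", "was", "on",
             "for", "with", "that", "by", "at", "as"}

def tf_headline(story, num_labels=3, lead_words=8):
    """Topiary-style baseline headline: high-frequency content words as
    topic labels, followed by a crudely compressed lead sentence."""
    tokens = [w.strip(".,;:\"'").lower() for w in story.split()]
    content = [w for w in tokens if w.isalpha() and w not in STOPWORDS]
    labels = [w.upper() for w, _ in Counter(content).most_common(num_labels)]
    lead = " ".join(story.split(".")[0].split()[:lead_words])
    return " ".join(labels) + ": " + lead
```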
Lecture Notes in Computer Science | 2005
Liu Peng; Rem W. Collier; Angel Mur; David Lillis; Fergus Toolan; John Dunnion
This paper describes an extensible and scalable approach to indexing documents that is utilized within the Highly Organised Team of Agents for Information Retrieval (HOTAIR) architecture.
Artificial Intelligence Review | 2006
David Lillis; Fergus Toolan; Rem W. Collier; John Dunnion
Data fusion is the process of combining the output of a number of Information Retrieval (IR) algorithms into a single result set, to achieve greater retrieval performance. ProbFuse is a data fusion algorithm that uses the history of the underlying IR algorithms to estimate the probability that subsequent result sets will include relevant documents in particular positions. It has been shown to outperform CombMNZ, the standard data fusion algorithm against which performance is compared, in a number of previous experiments. This paper builds upon this previous work and applies probFuse to the much larger Web Track document collection from the 2004 Text REtrieval Conference. The performance of probFuse is compared against that of CombMNZ using a number of evaluation measures and is shown to achieve substantial performance improvements.
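Among the evaluation measures used in this line of work, Mean Average Precision is the most common and is worth spelling out; a compact, standard implementation (the input representation is an assumption):

```python
def average_precision(ranking, relevant):
    """Average Precision for one query: mean of the precision values at the
    ranks where relevant documents appear."""
    hits, total = 0, 0.0
    for i, doc in enumerate(ranking, start=1):
        if doc in relevant:
            hits += 1
            total += hits / i
    return total / len(relevant) if relevant else 0.0

def mean_average_precision(runs, judgements):
    """MAP over a query set; runs: {query: [docs in rank order]},
    judgements: {query: set of relevant docs}."""
    aps = [average_precision(r, judgements.get(q, set()))
           for q, r in runs.items()]
    return sum(aps) / len(aps) if aps else 0.0
```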