Paul B. Thistlewaite
Australian National University
Publications
Featured research published by Paul B. Thistlewaite.
international world wide web conferences | 1999
David Hawking; Nick Craswell; Paul B. Thistlewaite; Donna Harman
A frozen 18.5 million page snapshot of part of the Web has been created to enable and encourage meaningful and reproducible evaluation of Web search systems and techniques. This collection is being used in an evaluation framework within the Text Retrieval Conference (TREC) and will hopefully provide convincing answers to questions such as, “Can link information result in better rankings?”, “Do longer queries result in better answers?”, and, “Do TREC systems work well on Web data?” The snapshot and associated evaluation methods are described and an invitation is extended to participate. Preliminary results are presented for an effectiveness comparison of six TREC systems working on the snapshot collection against five well-known Web search systems working over the current Web. These suggest that the standard of document rankings produced by public Web search engines is by no means state-of-the-art.
ACM Transactions on Information Systems | 1999
David Hawking; Paul B. Thistlewaite
The problem of using a broker to select a subset of available information servers in order to achieve a good trade-off between document retrieval effectiveness and cost is addressed. Server selection methods which are capable of operating in the absence of global information, and where servers have no knowledge of brokers, are investigated. A novel method using Lightweight Probe queries (LWP method) is compared with several methods based on data from past query processing, while Random and Optimal server rankings serve as controls. Methods are evaluated, using TREC data and relevance judgments, by computing ratios, both empirical and ideal, of recall and early precision for the subset versus the complete set of available servers. Estimates are also made of the best-possible performance of each of the methods. LWP and Topic Similarity methods achieved best results, each being capable of retrieving about 60% of the relevant documents for only one-third of the cost of querying all servers. Subject to the applicable cost model, the LWP method is likely to be preferred because it is suited to dynamic environments. The good results obtained with a simple automatic LWP implementation were replicated using different data and a larger set of query topics.
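The broker-side idea behind the Lightweight Probe (LWP) method can be sketched as follows. This is a minimal illustration, not the paper's implementation: the function names, the probe size, and the crude scoring rule are all assumptions. The essential point it shows is that a cheap probe sent to every server yields a server ranking, and the full query is then forwarded only to the top-ranked subset.

```python
# A minimal sketch (names and scoring hypothetical) of probe-based server
# selection: the broker sends a cheap probe query to every server, ranks the
# servers by their probe results, and sends the full query only to the best.

def select_servers(servers, probe, full_query, budget, send):
    """servers: list of server ids.
    send(server, query, limit): returns a list of (doc_id, score) pairs.
    budget: how many servers to query in full (e.g. one third of them).
    The probe asks each server for only a handful of results, so its cost
    is small compared with running the full query everywhere."""
    scored = []
    for s in servers:
        hits = send(s, probe, 5)                    # lightweight probe
        goodness = sum(score for _, score in hits)  # crude server score
        scored.append((goodness, s))
    scored.sort(reverse=True)
    chosen = [s for _, s in scored[:budget]]
    results = []
    for s in chosen:
        results.extend(send(s, full_query, 100))    # full query, top servers only
    return chosen, results
```

Under this sketch's cost model, querying a third of the servers costs roughly a third of the full broadcast, plus the small fixed cost of the probes; the paper's result is that such a subset can still recover around 60% of the relevant documents.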
acm conference on hypertext | 1997
Paul B. Thistlewaite
Many researchers have noted the problems associated with manually created or maintained hyperdocument links, and the consequent need for automated methods. A number of techniques have been applied to the problem, including pattern-matching, information retrieval, and natural language processing. This paper describes a system for the automatic detection and management of structural and referential links. The paper also addresses the issues of link-set soundness and completeness, open link management, and the particular problems engendered by large volatile hyperbases.
Information Retrieval | 1999
David Hawking; Paul B. Thistlewaite; Donna Harman
Due to the popularity of Web search engines, a large proportion of real text retrieval queries are now processed over collections measured in tens or hundreds of gigabytes. A new Very Large Collection (VLC) has been created to support qualification, measurement and comparison of systems operating at this level and to permit the study of the properties of very large collections. The VLC is an extension of the well-known TREC collection and has been distributed under the same conditions. A simple set of efficiency and effectiveness measures has been defined to encourage comparability of reporting. The 20 gigabyte first edition of the VLC and a representative 10% sample have been used in a special interest track of the 1997 Text Retrieval Conference (TREC-6). The unaffordable cost of obtaining complete relevance assessments over collections of this scale is avoided by concentrating on early precision and relying on the core TREC collection to support detailed effectiveness studies. Results obtained by TREC-6 VLC track participants are presented here. All groups observed a significant increase in early precision as collection size increased. Explanatory hypotheses are advanced for future empirical testing. A 100 gigabyte second edition VLC (VLC2) has recently been compiled and distributed for use in TREC-7 in 1998.
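The early-precision measure relied on above is, in essence, precision at a fixed cutoff: only the top of each ranking needs relevance judgments, which is what makes evaluation affordable at this scale. A minimal sketch (the cutoff and data are illustrative, not the track's actual settings):

```python
# Precision at cutoff k: the fraction of the top-k retrieved documents
# that are judged relevant. Only the top k documents need assessment,
# so the measure scales to collections too large to judge exhaustively.

def precision_at_k(ranked_docs, relevant, k):
    """ranked_docs: system ranking, best first; relevant: judged-relevant ids."""
    top = ranked_docs[:k]
    return sum(1 for d in top if d in relevant) / k
```

For example, a ranking whose top five documents contain two relevant ones scores P@5 = 0.4, regardless of how the remaining thousands of documents are ordered.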
conference on automated deduction | 1988
Michael A. McRobbie; Robert K. Meyer; Paul B. Thistlewaite
In this paper we give an introduction to a technique for greatly increasing the efficiency of automated theorem provers for non-standard logics. This technique takes advantage of the fact that while most important non-standard logics do not have finite characteristic models (in the sense that truth tables are a finite characteristic model for classical propositional logic), they do have finite models. These models validate all the theorems of a given logic, though they validate some non-theorems as well; they invalidate the remaining non-theorems. Hence this technique involves using the models to direct a search by an automated theorem prover for a proof, by filtering out or pruning the items of the search space which the models invalidate.
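The pruning idea can be sketched concretely. The sketch below is not the authors' prover: the three-valued tables are illustrative, not the matrices of any particular relevant logic. What it shows is the one-directional filter the abstract describes: a formula the finite model invalidates cannot be a theorem and is pruned from the search space, while a formula the model validates might still be a non-theorem and must be kept for the prover to settle.

```python
# Model-based pruning sketch: evaluate candidate formulas in a small finite
# model and discard any that fail to take a designated value under every
# assignment. The 3-valued tables here are illustrative only.
from itertools import product

NEG = [2, 1, 0]            # negation table, indexed by value 0..2
IMP = [[2, 2, 2],          # implication table: IMP[a][b]
       [0, 2, 2],
       [0, 0, 2]]
DESIGNATED = {1, 2}        # "true-like" values

def eval_formula(f, env):
    """Formulas are nested tuples: ('var', name), ('neg', g), ('imp', g, h)."""
    op = f[0]
    if op == 'var':
        return env[f[1]]
    if op == 'neg':
        return NEG[eval_formula(f[1], env)]
    if op == 'imp':
        return IMP[eval_formula(f[1], env)][eval_formula(f[2], env)]
    raise ValueError(op)

def variables(f):
    if f[0] == 'var':
        return {f[1]}
    return set().union(*(variables(g) for g in f[1:]))

def model_validates(f):
    """True iff f takes a designated value under every assignment."""
    vs = sorted(variables(f))
    return all(
        eval_formula(f, dict(zip(vs, vals))) in DESIGNATED
        for vals in product(range(3), repeat=len(vs))
    )

def prune(candidates):
    """Keep only the candidates the finite model validates; the rest are
    provably non-theorems and need never enter the proof search."""
    return [f for f in candidates if model_validates(f)]
```

For instance, the identity a → a survives the filter (it is designated under every assignment in these tables), while a bare variable a is invalidated by the assignment giving it value 0 and is pruned immediately, without any proof search.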
Journal of Automated Reasoning | 1991
Paul B. Thistlewaite; Michael A. McRobbie
The introduction to this issue mentions three approaches that might be taken to non-classical theorem proving. During development of theorem-proving techniques for the non-classical relevant logic LR, reported in Thistlewaite, McRobbie and Meyer [1], the second approach was taken: namely, a Gentzen-based deductive formulation of the logic was implemented. The hardest set of problems decided in this way involved proving that certain formulas in fact define binary connectives that are associative in LR. The motivation for doing so was to show the relevant logic R undecidable by defining an appropriately free associative connective within R to act like the desired semigroup operation in the manner of Post. This work was interrupted by Alistair Urquhart's proof of the undecidability of R using techniques from projective geometry. Nonetheless, we believe that this set of problems probably marks the limits of competence of proof-theoretic theorem provers for LR. An unresolved question is whether either of the other two approaches to non-classical ATP might prove more efficacious for LR, and by implication, perhaps for non-classical logics generally. A first step towards resolving this question would be proving, using either of the other two approaches, that the set of formulas (F1) to (F16) below define binary connectives that are associative in LR.
text retrieval conference | 1995
David Hawking; Paul B. Thistlewaite
Archive | 1988
Paul B. Thistlewaite; Michael A. McRobbie; Robert K. Meyer
Archive | 1996
David Hawking; Paul B. Thistlewaite
international world wide web conferences | 1996
Paul B. Thistlewaite; Steve Ball