
Publication


Featured research published by Andrew Turpin.


American Journal of Ophthalmology | 2003

A computerized method of visual acuity testing: adaptation of the early treatment of diabetic retinopathy study testing protocol.

Roy W. Beck; Pamela S. Moke; Andrew Turpin; Frederick L. Ferris; John Paul SanGiovanni; Chris A. Johnson; Eileen E. Birch; Danielle L. Chandler; Terry A. Cox; R. Clifford Blair; Raymond T. Kraker

PURPOSE To develop a computerized method of visual acuity testing for clinical research as an alternative to the standard Early Treatment for Diabetic Retinopathy Study (ETDRS) testing protocol, and to evaluate its test-retest reliability and concordance with standard ETDRS testing. DESIGN Test-retest reliability study. METHODS Multicenter setting of a study population of 265 patients at three clinical sites. Visual acuity was measured with both the electronic visual acuity testing algorithm (E-ETDRS) and standard ETDRS protocol (S-ETDRS) twice on one eye of each patient. E-ETDRS testing was conducted using the electronic visual acuity tester (EVA), which utilizes a programmed Palm (Palm, Inc, Santa Clara, California, USA) hand-held device communicating with a personal computer and 17-inch monitor at a test distance of 3 meters. RESULTS For the E-ETDRS protocol, test-retest reliability was high (r = 0.99; with 89% and 98% of retests within 0.1 logMAR and 0.2 logMAR of initial tests, respectively) and comparable with that of S-ETDRS testing (r = 0.99; with 87% and 98% of retests within 0.1 logMAR and 0.2 logMAR of initial test, respectively). The E-ETDRS and S-ETDRS scores were highly correlated (r = 0.96 for initial tests and r = 0.97 for repeat tests). Based on estimates of 95% confidence intervals, a change in visual acuity of 0.2 logMAR (10 letters) from a baseline level is unlikely to be related to measurement variability using either the E-ETDRS or the S-ETDRS visual acuity testing protocol. CONCLUSIONS The E-ETDRS protocol has high test-retest reliability and good concordance with S-ETDRS testing. The computerized method has advantages over the S-ETDRS testing in electronically capturing the data for each tested letter, requiring only a single distance for testing from 20/12 to 20/800, potentially reducing testing time, and potentially decreasing technician-related bias.
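
The reliability figures above (Pearson r and the fraction of retests within 0.1 and 0.2 logMAR) can be reproduced from paired acuity scores with a few lines of code. The sketch below is illustrative only; the scores are invented and are not study data.

```python
# Illustrative sketch: computing test-retest agreement statistics of the kind
# reported above (Pearson r and the fraction of retests within 0.1 / 0.2 logMAR).
# The scores are invented for illustration; they are not study data.
from statistics import correlation  # Python 3.10+

test1 = [0.30, 0.10, 0.52, 0.00, 0.74]   # initial logMAR scores (illustrative)
test2 = [0.28, 0.14, 0.50, 0.06, 0.70]   # retest logMAR scores (illustrative)

r = correlation(test1, test2)
diffs = [abs(a - b) for a, b in zip(test1, test2)]
within_01 = sum(d <= 0.1 for d in diffs) / len(diffs)
within_02 = sum(d <= 0.2 for d in diffs) / len(diffs)
print(f"r = {r:.2f}, within 0.1 logMAR = {within_01:.0%}, within 0.2 logMAR = {within_02:.0%}")
```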


International ACM SIGIR Conference on Research and Development in Information Retrieval | 2006

User performance versus precision measures for simple search tasks

Andrew Turpin; Falk Scholer

Several recent studies have demonstrated that the types of improvements in information retrieval system effectiveness reported in forums such as SIGIR and TREC do not translate into a benefit for users. Two of the studies used an instance recall task, and a third used a question answering task, so perhaps it is unsurprising that the precision-based measures of IR system effectiveness on one-shot query evaluation do not correlate with user performance on these tasks. In this study, we evaluate two different information retrieval tasks on TREC Web-track data: a precision-based user task, measured by the length of time that users need to find a single document that is relevant to a TREC topic; and a simple recall-based task, represented by the total number of relevant documents that users can identify within five minutes. Users employ search engines with controlled mean average precision (MAP) of between 55% and 95%. Our results show that there is no significant relationship between system effectiveness measured by MAP and the precision-based task. A significant but weak relationship is present for the precision at one document returned metric. A weak relationship is present between MAP and the simple recall-based task.
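
For readers unfamiliar with the effectiveness measures used here, the sketch below shows how average precision (and its mean over queries, MAP) and precision at one document returned are computed. The ranked lists and relevance judgements are illustrative, not TREC data.

```python
# Sketch of the effectiveness measures referenced above: average precision (AP),
# its mean over queries (MAP), and precision at one document returned (P@1).
# Rankings and relevance judgements are illustrative only.

def average_precision(ranked_docs, relevant):
    hits, precision_sum = 0, 0.0
    for i, doc in enumerate(ranked_docs, start=1):
        if doc in relevant:
            hits += 1
            precision_sum += hits / i
    return precision_sum / len(relevant) if relevant else 0.0

def precision_at_1(ranked_docs, relevant):
    return 1.0 if ranked_docs and ranked_docs[0] in relevant else 0.0

runs = {
    "q1": (["d3", "d1", "d7"], {"d1", "d3"}),
    "q2": (["d2", "d9", "d4"], {"d9"}),
}
map_score = sum(average_precision(r, rel) for r, rel in runs.values()) / len(runs)
p1 = sum(precision_at_1(r, rel) for r, rel in runs.values()) / len(runs)
print(f"MAP = {map_score:.3f}, P@1 = {p1:.3f}")
```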


American Journal of Ophthalmology | 2001

Computerized method of visual acuity testing: Adaptation of the Amblyopia Treatment Study visual acuity testing protocol

Pamela S. Moke; Andrew Turpin; Roy W. Beck; Jonathan M. Holmes; Michael X. Repka; Eileen E. Birch; Richard W. Hertle; Raymond T. Kraker; Joseph M. Miller; Chris A. Johnson

PURPOSE To report a computerized method for determining visual acuity in children using the Amblyopia Treatment Study visual acuity testing protocol. METHODS A computerized visual acuity tester was developed that uses a programmed handheld device running the Palm operating system (Palm, Inc, Santa Clara, California). The handheld device communicates with a personal computer running a Linux operating system and a 17-inch monitor. At a test distance of 3 m, single letters can be displayed from 20/800 to 20/12. A C program on the handheld device runs the Amblyopia Treatment Study visual acuity testing protocol. Using this method, visual acuity was tested in both the right and left eyes, and the testing was then repeated, in 156 children aged 3 to 7 years at four clinical sites. RESULTS Test-retest reliability was high (r = 0.92 and 0.95 for the right and left eyes, respectively), with 88% of right eye retests and 94% of left eye retests within 0.1 logarithm of the minimal angle of resolution (logMAR) units of the initial test. The 95% confidence interval for an acuity score was calculated to be the score +/- 0.13 logMAR units. For a change between two acuity scores, the 95% confidence interval was the difference +/- 0.19 logMAR units. CONCLUSIONS We have developed a computerized method for measurement of visual acuity. Automation of the Amblyopia Treatment Study visual acuity testing protocol is an effective method of testing visual acuity in children 3 to 7 years of age.
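
Acuity in this protocol is scored in logMAR units. As a point of reference, the sketch below applies the standard Snellen-to-logMAR conversion (logMAR equals the base-10 logarithm of the Snellen denominator over the numerator) together with the +/- 0.13 logMAR confidence interval reported above; the example acuity is illustrative.

```python
# Minimal sketch of logMAR scoring as used in the abstract above.
# logMAR = log10(Snellen denominator / Snellen numerator), so 20/20 -> 0.0,
# 20/200 -> 1.0, 20/12 -> about -0.22.  The +/- 0.13 logMAR figure is the
# 95% confidence interval reported for a single acuity score.
import math

def snellen_to_logmar(numerator: int, denominator: int) -> float:
    return math.log10(denominator / numerator)

score = snellen_to_logmar(20, 40)          # 0.30 logMAR (illustrative acuity)
ci_low, ci_high = score - 0.13, score + 0.13
print(f"20/40 = {score:.2f} logMAR, 95% CI ({ci_low:.2f}, {ci_high:.2f})")
```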


International ACM SIGIR Conference on Research and Development in Information Retrieval | 2001

Why batch and user evaluations do not give the same results

Andrew Turpin; William R. Hersh

Much system-oriented evaluation of information retrieval systems has used the Cranfield approach based upon queries run against test collections in a batch mode. Some researchers have questioned whether this approach can be applied to the real world, but little data exists for or against that assertion. We have studied this question in the context of the TREC Interactive Track. Previous results demonstrated that improved performance as measured by relevance-based metrics in batch studies did not correspond with the results of outcomes based on real user searching tasks. The experiments in this paper analyzed those results to determine why this occurred. Our assessment showed that while the queries entered by real users into systems yielding better results in batch studies gave comparable gains in ranking of relevant documents for those users, they did not translate into better performance on specific tasks. This was most likely due to users being able to adequately find and utilize relevant documents ranked further down the output list.


International ACM SIGIR Conference on Research and Development in Information Retrieval | 2007

Fast generation of result snippets in web search

Andrew Turpin; Yohannes Tsegay; David Hawking; Hugh E. Williams

The presentation of query-biased document snippets as part of results pages presented by search engines has become an expectation of search engine users. In this paper we explore the algorithms and data structures required as part of a search engine to allow efficient generation of query-biased snippets. We begin by proposing and analysing a document compression method that reduces snippet generation time by 58% over a baseline using the zlib compression library. These experiments reveal that finding documents on secondary storage dominates the total cost of generating snippets, and so caching documents in RAM is essential for a fast snippet generation process. Using simulation, we examine snippet generation performance for different RAM cache sizes. Finally, we propose and analyse document reordering and compaction, revealing a scheme that increases the number of document cache hits with only a marginal effect on snippet quality. This scheme effectively doubles the number of documents that can fit in a fixed-size cache.
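
As background, the sketch below illustrates the general idea of query-biased snippet selection (scoring sentences by the query terms they contain). It is a generic illustration only and is not the compression, caching, or reordering scheme analysed in the paper.

```python
# Illustrative sketch of query-biased snippet selection: score each sentence
# by the number of distinct query terms it contains and return the best few.
# This shows the flavour of snippet generation only; it is not the compression
# or RAM-caching scheme analysed in the paper.
import re

def query_biased_snippet(document: str, query: str, max_sentences: int = 2) -> str:
    terms = set(query.lower().split())
    sentences = re.split(r"(?<=[.!?])\s+", document)
    scored = sorted(
        sentences,
        key=lambda s: len(terms & set(s.lower().split())),
        reverse=True,
    )
    return " ... ".join(scored[:max_sentences])

doc = ("Wavelet trees support rank queries. Snippets summarise documents. "
       "Query biased snippets highlight the query terms in context.")
print(query_biased_snippet(doc, "query biased snippets"))
```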


IEEE Transactions on Communications | 1997

On the implementation of minimum redundancy prefix codes

Alistair Moffat; Andrew Turpin

Minimum redundancy coding (also known as Huffman coding) is one of the enduring techniques of data compression. Many efforts have been made to improve the efficiency of minimum redundancy coding, the majority based on the use of improved representations for explicit Huffman trees. In this paper, we examine how minimum redundancy coding can be implemented efficiently by divorcing coding from a code tree, with emphasis on the situation when the alphabet size n is large, perhaps on the order of 10^6. We review techniques for devising minimum redundancy codes, and consider in detail how encoding and decoding should be accomplished. In particular, we describe a modified decoding method that allows improved decoding speed, requiring just a few machine operations per output symbol (rather than for each decoded bit), and uses just a few hundred bytes of memory above and beyond the space required to store an enumeration of the source alphabet.
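
The sketch below illustrates canonical code assignment, the arrangement of codewords that underpins tree-free minimum-redundancy coding of the kind discussed in the paper. It is a simplified bit-at-a-time illustration with made-up code lengths, not the fast table-driven decoder the paper describes.

```python
# Sketch of canonical (minimum-redundancy) code assignment and decoding without
# an explicit code tree.  The code lengths are illustrative; the paper's decoder
# is far faster, processing whole codewords per step rather than single bits.

def canonical_codes(lengths):
    """Assign canonical codewords (value, length) to symbols given code lengths."""
    order = sorted(range(len(lengths)), key=lambda s: (lengths[s], s))
    codes, code, prev_len = {}, 0, 0
    for sym in order:
        code <<= lengths[sym] - prev_len
        codes[sym] = (code, lengths[sym])
        prev_len = lengths[sym]
        code += 1
    return codes

def decode(bits, codes):
    """Decode a bit string using only the (value, length) table, no tree nodes."""
    by_key = {(v, l): s for s, (v, l) in codes.items()}
    out, value, length = [], 0, 0
    for b in bits:
        value, length = (value << 1) | int(b), length + 1
        if (value, length) in by_key:
            out.append(by_key[(value, length)])
            value, length = 0, 0
    return out

lengths = [2, 2, 2, 3, 3]                     # code lengths for symbols 0..4
codes = canonical_codes(lengths)              # {0:(0,2), 1:(1,2), 2:(2,2), 3:(6,3), 4:(7,3)}
bits = "".join(format(v, f"0{l}b") for v, l in (codes[s] for s in [0, 3, 4, 1]))
print(decode(bits, codes))                    # [0, 3, 4, 1]
```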


String Processing and Information Retrieval | 2009

Range Quantile Queries: Another Virtue of Wavelet Trees

Travis Gagie; Simon J. Puglisi; Andrew Turpin

We show how to use a balanced wavelet tree as a data structure that stores a list of numbers and supports efficient range quantile queries. A range quantile query takes a rank and the endpoints of a sublist and returns the number with that rank in that sublist. For example, if the rank is half the sublist's length, then the query returns the sublist's median. We also show how these queries can be used to support space-efficient coloured range reporting and document listing.
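
The query itself is simple to state in code. The sketch below is a plain-Python wavelet tree supporting quantile(l, r, k), the k-th smallest value in seq[l:r]; it uses prefix sums where the paper's structure would use succinct rank/select bitmaps, and the input list is illustrative.

```python
# Sketch of a balanced wavelet tree supporting range quantile queries:
# quantile(l, r, k) returns the k-th smallest value (0-based) in seq[l:r].
# Plain-Python illustration with O(log sigma) levels; the paper's structure
# uses succinct rank/select bitmaps instead of the prefix sums used here.

class WaveletTree:
    def __init__(self, seq, lo=None, hi=None):
        if lo is None:
            lo, hi = min(seq), max(seq)
        self.lo, self.hi = lo, hi
        if lo == hi or not seq:
            self.left = self.right = None       # leaf: a single value
            return
        mid = (lo + hi) // 2
        bits = [v > mid for v in seq]
        # zeros[i] = number of elements <= mid among the first i elements
        self.zeros = [0]
        for b in bits:
            self.zeros.append(self.zeros[-1] + (0 if b else 1))
        self.left = WaveletTree([v for v in seq if v <= mid], lo, mid)
        self.right = WaveletTree([v for v in seq if v > mid], mid + 1, hi)

    def quantile(self, l, r, k):
        """k-th smallest (0-based) value among positions l..r-1."""
        if self.lo == self.hi:
            return self.lo
        zl, zr = self.zeros[l], self.zeros[r]
        if k < zr - zl:                          # answer lies in the "<= mid" half
            return self.left.quantile(zl, zr, k)
        return self.right.quantile(l - zl, r - zr, k - (zr - zl))

wt = WaveletTree([3, 1, 4, 1, 5, 9, 2, 6])
print(wt.quantile(2, 7, 2))   # 3rd smallest of [4, 1, 5, 9, 2] -> 4
```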


Foundations of Software Engineering | 2007

Efficient token based clone detection with flexible tokenization

Hamid Abdul Basit; Simon J. Puglisi; William F. Smyth; Andrew Turpin; Stan Jarzabek

Code clones are similar code fragments that occur at multiple locations in a software system. Detection of code clones provides useful information for maintenance, reengineering, program understanding and reuse. Several techniques have been proposed to detect code clones. These techniques differ in the code representation used for analysis of clones, ranging from plain text to parse trees and program dependence graphs. Clone detection based on lexical tokens involves minimal code transformation and gives good results, but is computationally expensive because of the large number of tokens that need to be compared. We explored string algorithms to find suitable data structures and algorithms for efficient token-based clone detection and implemented them in our tool Repeated Tokens Finder (RTF). Instead of using a suffix tree for string matching, we use the more memory-efficient suffix array. RTF incorporates a suffix-array-based linear-time algorithm to detect string matches. It also provides a simple and customizable tokenization mechanism. Initial analysis and experiments show that our clone detection is simple, scalable, and performs better than previous well-known tools.
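
The core idea, that adjacent entries of a suffix array over the token stream sharing a long common prefix indicate repeated (cloned) token sequences, can be sketched briefly. The construction below is a naive O(n^2 log n) sort for illustration only; RTF uses a linear-time suffix array algorithm, and the token stream here is a made-up example.

```python
# Illustrative sketch of token-based repeat detection with a suffix array:
# adjacent suffixes in sorted order that share a long common prefix correspond
# to repeated (cloned) token sequences.  The naive construction is for
# illustration; the paper's tool uses a linear-time suffix array algorithm.

def repeated_token_runs(tokens, min_len=3):
    n = len(tokens)
    sa = sorted(range(n), key=lambda i: tokens[i:])          # suffix array
    repeats = set()
    for a, b in zip(sa, sa[1:]):                             # adjacent suffixes
        lcp = 0
        while a + lcp < n and b + lcp < n and tokens[a + lcp] == tokens[b + lcp]:
            lcp += 1
        if lcp >= min_len:
            repeats.add(tuple(tokens[a:a + lcp]))
    return repeats

tokens = "id = id + num ; id = id + num ; return id ;".split()
for run in repeated_token_runs(tokens):
    print(" ".join(run))
```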


International ACM SIGIR Conference on Research and Development in Information Retrieval | 2011

Quantifying test collection quality based on the consistency of relevance judgements

Falk Scholer; Andrew Turpin; Mark Sanderson

Relevance assessments are a key component for test collection-based evaluation of information retrieval systems. This paper reports on a feature of such collections that is used as a form of ground truth data to allow analysis of human assessment error. A wide range of test collections are retrospectively examined to determine how accurately assessors judge the relevance of documents. Our results demonstrate a high level of inconsistency across the collections studied. The level of irregularity is shown to vary across topics, with some showing a very high level of assessment error. We investigate possible influences on the error, and demonstrate that inconsistency in judging increases with time. While the level of detail in a topic specification does not appear to influence the errors that assessors make, judgements are significantly affected by the decisions made on previously seen similar documents. Assessors also display an assessment inertia. Alternate approaches to generating relevance judgements appear to reduce errors. A further investigation of the way that retrieval systems are ranked using sets of relevance judgements produced early and late in the judgement process reveals a consistent influence measured across the majority of examined test collections. We conclude that there is a clear value in examining, even inserting, ground truth data in test collections, and propose ways to help minimise the sources of inconsistency when creating future test collections.
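
One simple way to quantify the kind of judging consistency discussed here is to compare an assessor's original and repeat judgements of the same documents using raw agreement and Cohen's kappa. The sketch below is an assumed analysis of that sort, not the paper's exact method, and the paired labels are illustrative.

```python
# Hedged sketch of one way to quantify judging consistency: simple agreement
# and Cohen's kappa over documents that an assessor judged twice.  The paired
# labels are illustrative; the paper's own error analysis may differ.

def cohens_kappa(first, second):
    n = len(first)
    agree = sum(a == b for a, b in zip(first, second)) / n
    labels = set(first) | set(second)
    expected = sum(
        (first.count(l) / n) * (second.count(l) / n) for l in labels
    )
    return (agree - expected) / (1 - expected)

first  = [1, 0, 1, 1, 0, 0, 1, 0]   # original judgements (illustrative)
second = [1, 0, 0, 1, 0, 1, 1, 0]   # repeat judgements of the same documents
agreement = sum(a == b for a, b in zip(first, second)) / len(first)
print(f"agreement = {agreement:.2f}, kappa = {cohens_kappa(first, second):.2f}")
```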


International ACM SIGIR Conference on Research and Development in Information Retrieval | 2009

Including summaries in system evaluation

Andrew Turpin; Falk Scholer; Kalervo Järvelin; Mingfang Wu; J. Shane Culpepper

In batch evaluation of retrieval systems, performance is calculated based on predetermined relevance judgements applied to a list of documents returned by the system for a query. This evaluation paradigm, however, ignores the current standard operation of search systems which require the user to view summaries of documents prior to reading the documents themselves. In this paper we modify the popular IR metrics MAP and P@10 to incorporate the summary reading step of the search process, and study the effects on system rankings using TREC data. Based on a user study, we establish likely disagreements between relevance judgements of summaries and of documents, and use these values to seed simulations of summary relevance in the TREC data. Re-evaluating the runs submitted to the TREC Web Track, we find the average correlation between system rankings and the original TREC rankings is 0.8 (Kendall τ), which is lower than commonly accepted for system orderings to be considered equivalent. The system that has the highest MAP in TREC generally remains amongst the highest MAP systems when summaries are taken into account, but other systems become equivalent to the top ranked system depending on the simulated summary relevance. Given that system orderings alter when summaries are taken into account, the small amount of effort required to judge summaries in addition to documents (19 seconds vs 88 seconds on average in our data) should be undertaken when constructing test collections.
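
Kendall's tau, the rank correlation used above to compare system orderings, is straightforward to compute. The sketch below compares two illustrative system rankings (best system first); the system names and orderings are made up.

```python
# Sketch of the Kendall tau rank correlation used above to compare system
# orderings with and without the summary-reading step.  The two rankings are
# illustrative lists of system identifiers, best system first (no ties).
from itertools import combinations

def kendall_tau(rank_a, rank_b):
    pos_a = {s: i for i, s in enumerate(rank_a)}
    pos_b = {s: i for i, s in enumerate(rank_b)}
    concordant = discordant = 0
    for x, y in combinations(rank_a, 2):
        if (pos_a[x] - pos_a[y]) * (pos_b[x] - pos_b[y]) > 0:
            concordant += 1
        else:
            discordant += 1
    return (concordant - discordant) / (concordant + discordant)

original = ["sysA", "sysB", "sysC", "sysD", "sysE"]
with_summaries = ["sysA", "sysC", "sysB", "sysE", "sysD"]
print(f"Kendall tau = {kendall_tau(original, with_summaries):.2f}")
```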
