Publication


Featured research published by Mark Sanderson.


International ACM SIGIR Conference on Research and Development in Information Retrieval | 2003

Challenges in information retrieval and language modeling: report of a workshop held at the Center for Intelligent Information Retrieval, University of Massachusetts Amherst, September 2002

James Allan; Jay Aslam; Nicholas J. Belkin; Chris Buckley; James P. Callan; W. Bruce Croft; Susan T. Dumais; Norbert Fuhr; Donna Harman; David J. Harper; Djoerd Hiemstra; Thomas Hofmann; Eduard H. Hovy; Wessel Kraaij; John D. Lafferty; Victor Lavrenko; David Lewis; Liz Liddy; R. Manmatha; Andrew McCallum; Jay M. Ponte; John M. Prager; Dragomir R. Radev; Philip Resnik; Stephen E. Robertson; Ron G. Rosenfeld; Salim Roukos; Mark Sanderson; Richard M. Schwartz; Amit Singhal

Information retrieval (IR) research has reached a point where it is appropriate to assess progress and to define a research agenda for the next five to ten years. This report summarizes a discussion of IR research challenges that took place at a recent workshop. The attendees of the workshop considered information retrieval research in a range of areas chosen to give broad coverage of topic areas that engage information retrieval researchers. Those areas are retrieval models, cross-lingual retrieval, Web search, user modeling, filtering, topic detection and tracking, classification, summarization, question answering, metasearch, distributed retrieval, multimedia retrieval, information extraction, as well as testbed requirements for future work. The potential use of language modeling techniques in these areas was also discussed. The workshop identified major challenges within each of those areas. The following are recurring themes that ran throughout:

• User and context sensitive retrieval
• Multi-lingual and multi-media issues
• Better target tasks
• Improved objective evaluations
• Substantially more labeled data
• Greater variety of data sources
• Improved formal models

Contextual retrieval and global information access were identified as particularly important long-term challenges.


International ACM SIGIR Conference on Research and Development in Information Retrieval | 2011

Evaluating multi-query sessions

Evangelos Kanoulas; Ben Carterette; Paul D. Clough; Mark Sanderson

The standard system-based evaluation paradigm has focused on assessing the performance of retrieval systems in serving the best results for a single query. Real users, however, often begin an interaction with a search engine with a sufficiently under-specified query that they will need to reformulate before they find what they are looking for. In this work we consider the problem of evaluating retrieval systems over test collections of multi-query sessions. We propose two families of measures: a model-free family that makes no assumption about the user's behavior over a session, and a model-based family with a simple model of user interactions over the session. In both cases we generalize traditional evaluation metrics such as average precision to multi-query session evaluation. We demonstrate the behavior of the proposed metrics by using the new TREC 2010 Session track collection and simulations over the TREC-9 Query track collection.
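
For reference, the traditional single-query metric named in this abstract, average precision, can be sketched as follows. This is a minimal illustration of the standard definition only, not the session-level measures proposed in the paper; the function name `average_precision` and the boolean-list input format are assumptions made for this example.

```python
from typing import Optional, Sequence


def average_precision(
    ranked_relevance: Sequence[bool],
    total_relevant: Optional[int] = None,
) -> float:
    """Average precision for one ranked result list.

    ranked_relevance[i] is True when the document at rank i + 1 is relevant.
    total_relevant is the number of relevant documents for the query; when it
    is omitted, only the relevant documents present in the list are counted
    (an assumption of this sketch).
    """
    if total_relevant is None:
        total_relevant = sum(ranked_relevance)
    if total_relevant == 0:
        return 0.0

    hits = 0
    precision_sum = 0.0
    for rank, is_relevant in enumerate(ranked_relevance, start=1):
        if is_relevant:
            hits += 1
            # Precision at this rank, accumulated only at relevant documents.
            precision_sum += hits / rank
    return precision_sum / total_relevant
```

A session-level measure would have to combine information from several such ranked lists, one per query reformulation; how that is done is the subject of the paper and is not reproduced here.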


International ACM SIGIR Conference on Research and Development in Information Retrieval | 2011

Quantifying test collection quality based on the consistency of relevance judgements

Falk Scholer; Andrew Turpin; Mark Sanderson

Relevance assessments are a key component for test collection-based evaluation of information retrieval systems. This paper reports on a feature of such collections that is used as a form of ground truth data to allow analysis of human assessment error. A wide range of test collections are retrospectively examined to determine how accurately assessors judge the relevance of documents. Our results demonstrate a high level of inconsistency across the collections studied. The level of irregularity is shown to vary across topics, with some showing a very high level of assessment error. We investigate possible influences on the error, and demonstrate that inconsistency in judging increases with time. While the level of detail in a topic specification does not appear to influence the errors that assessors make, judgements are significantly affected by the decisions made on previously seen similar documents. Assessors also display an assessment inertia. Alternate approaches to generating relevance judgements appear to reduce errors. A further investigation of the way that retrieval systems are ranked using sets of relevance judgements produced early and late in the judgement process reveals a consistent influence measured across the majority of examined test collections. We conclude that there is a clear value in examining, even inserting, ground truth data in test collections, and propose ways to help minimise the sources of inconsistency when creating future test collections.


Information Processing and Management | 2011

The effect of user characteristics on search effectiveness in information retrieval

Azzah Al-Maskari; Mark Sanderson

This paper investigates the influence of user characteristics (e.g. search experience and cognitive skills) on user effectiveness. A user study was conducted to investigate this effect: 56 participants completed searches for 56 topics using the TREC test collection. Results indicated that participants with search experience and high cognitive skills were more effective than those with less experience and slower perceptual abilities. However, all users rated themselves with the same level of satisfaction with the search results despite the fact that they varied substantially in their effectiveness. Therefore, information retrieval evaluators should take these factors into consideration when investigating the impact of system effectiveness on user effectiveness.


ACM Transactions on Information Systems | 2011

Content redundancy in YouTube and its application to video tagging

Jose San Pedro; Stefan Siersdorfer; Mark Sanderson

The emergence of large-scale social Web communities has enabled users to share online vast amounts of multimedia content. An analysis of YouTube reveals a high amount of redundancy, in the form of videos with overlapping or duplicated content. We use robust content-based video analysis techniques to detect overlapping sequences between videos. Based on the output of these techniques, we present an in-depth study of duplication and content overlap in YouTube, and analyze various dependencies between content overlap and meta data such as video titles, views, video ratings, and tags. As an application, we show that content-based links provide useful information for generating new tag assignments. We propose different tag propagation methods for automatically obtaining richer video annotations. Experiments on video clustering and classification as well as a user evaluation demonstrate the viability of our approach.


Information Processing and Management | 1997

Information seeking in electronic environment

Mark Sanderson

an audience of business and government leaders, presenting to them a tapestry of general engineering issues to face and overcome if the blueprint for the future network is to be passed through this brief window of opportunity. The powerful message is woven too subtly through the main body of the work; consequently the chance to raise critical social and historical issues to buttress the book's theme is missed. The plan is provided before the purpose, without the strong threads between them clearly shown. Will readers of Information Processing & Management find this book worth the time and work required to give to it its full due? This reviewer believes it merits the effort. Heldman is a veteran of the industry, is deeply knowledgeable, and is presenting a bold and carefully reasoned plan for an extraordinarily important undertaking. The book submits as coherent and visionary a blueprint for the coming information millennium as this reviewer has seen. The topics are current, the perspective is refreshingly far-sighted, and the details are included. As the reviewer began reading the book, the Telecommunications Reform Act of 1996 (P.L. 104-104) was signed into law. Will this legislation help or hinder the formation of SONET? Will users have a voice in the outcome? Will broadband emerge above the din of broadcast? These are interesting times, as signal events that are setting the stage for centuries to come unfurl. Can civilization advance by cultivating the growth of electronic commerce and communications through at least the next generation? Put differently, can we afford not to? Heldman's message is worth heeding. This is one tale of suspense, however, where it is advisable to peek at the ending before reading on.


IEEE Internet Computing | 2013

Examining the Limits of Crowdsourcing for Relevance Assessment

Paul D. Clough; Mark Sanderson; Jiayu Tang; Tim Gollins; Amy Warner

Evaluation is instrumental to developing and managing effective information retrieval systems. For this process, enlisting crowdsourcing has proven viable. However, less understood are crowdsourcing's limits for evaluation, particularly for domain-specific search. The authors compare relevance assessments gathered using crowdsourcing with those from a domain expert to evaluate different search engines in a large government archive. Although crowdsourced judgments rank the tested search engines in the same order as expert judgments, crowdsourced workers appear unable to distinguish different levels of highly accurate search results the way expert assessors can.
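
One common way to check whether two sets of judgments rank systems in the same order is a rank correlation such as Kendall's tau. The abstract does not say which measure the authors used, so the sketch below is only an illustration; the function name `kendall_tau`, the system names, and the scores are all hypothetical.

```python
from itertools import combinations
from typing import Dict


def kendall_tau(scores_a: Dict[str, float], scores_b: Dict[str, float]) -> float:
    """Kendall's tau-a between two score tables keyed by system name.

    Returns 1.0 when both tables order every pair of systems the same way
    and -1.0 when every pair is ordered oppositely. Tied pairs count as
    neither concordant nor discordant.
    """
    systems = sorted(set(scores_a) & set(scores_b))
    total_pairs = len(systems) * (len(systems) - 1) // 2
    if total_pairs == 0:
        return 0.0

    concordant = 0
    discordant = 0
    for s, t in combinations(systems, 2):
        # Positive product: both tables agree on which system is better.
        agreement = (scores_a[s] - scores_a[t]) * (scores_b[s] - scores_b[t])
        if agreement > 0:
            concordant += 1
        elif agreement < 0:
            discordant += 1
    return (concordant - discordant) / total_pairs


# Hypothetical effectiveness scores for three search engines.
expert_scores = {"engine_a": 0.62, "engine_b": 0.55, "engine_c": 0.41}
crowd_scores = {"engine_a": 0.80, "engine_b": 0.78, "engine_c": 0.60}
print(kendall_tau(expert_scores, crowd_scores))  # 1.0: the two orderings agree on every pair
```

Note that identical orderings can hide the effect the abstract describes: the crowd scores above are closer together than the expert scores, yet the correlation is still perfect.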


Human Factors in Computing Systems | 2012

How do we find personal files?: the effect of OS, presentation & depth on file navigation

Ofer Bergman; Steve Whittaker; Mark Sanderson; Rafi Nachmias; Anand Ramamoorthy

Folder navigation is the main way that computer users retrieve their personal files. However, we know surprisingly little about navigation, particularly about how it is affected by the operating system used, the interface presentation and the folder structure. To investigate this, we asked 289 participants to retrieve 1,109 of their own active files. We analyzed the 4,948 resulting retrieval steps, i.e. moves through the hierarchical folder tree. Results show: (a) significant differences in overall retrieval time between PC and Mac that arise from different organizational strategies rather than interface design; (b) the default Windows presentation is suboptimal: if changed, retrieval time could be reduced substantially; and (c) contrary to our expectations, folder depth did not affect step duration. We discuss possible reasons for these results and suggest directions for future research.


European Conference on Information Retrieval | 2014

User Perception of Information Credibility of News on Twitter

Shafiza Mohd Shariff; Xiuzhen Zhang; Mark Sanderson

In this paper, we examine user perception of credibility for news-related tweets. We conduct a user study on a crowd-sourcing platform to judge the credibility of such tweets. By analysing user judgments and comments, we find that eight features, including some that cannot be automatically identified from tweets, are perceived by users as important for judging information credibility. Moreover, distinct features like link in tweet, display name and user belief consistently lead users to judge tweets as credible. We also find that users cannot consistently judge, and sometimes misjudge, the credibility of some tweets on political news.


Archive | 2004

Distributed multimedia information retrieval

Jamie Callan; Fabio Crestani; Mark Sanderson

During recent years, huge efforts have been made to establish digital libraries, in a variety of media, offered from a variety of sources, and intended for a variety of professional and private user communities. As digital data collections proliferate, problems of resource selection and data fusion become major issues. Traditional search engines, even the best ones, are unable to provide access to the hidden web of information that is only available via digital library search interfaces. Originating from the SIGIR 2003 Workshop on Distributed Information Retrieval, held in Toronto, Canada in August 2003, this book presents extended and revised workshop papers as well as several invited papers on the topic to round off coverage of the core issues. The papers are devoted to recent research on the design and implementation of methods and tools for resource discovery, resource description, resource selection, data fusion, and user interaction.

Collaboration

Top Co-Authors

James Allan

University of Massachusetts Amherst
