Stewart Whiting
University of Glasgow
Network
Latest external collaboration on country level. Dive into details by clicking on the dots.
Publication
Featured researches published by Stewart Whiting.
international world wide web conferences | 2014
Stewart Whiting; Joemon M. Jose
Query auto-completion (QAC) is a common interactive feature that assists users in formulating queries by providing completion suggestions as they type. In order for QAC to minimise the users cognitive and physical effort, it must: (i) suggest the users intended query after minimal input keystrokes, and (ii) rank the users intended query highly in completion suggestions. Typically, QAC approaches rank completion suggestions by their past popularity. Accordingly, QAC is usually very effective for previously seen and consistently popular queries. Users are increasingly turning to search engines to find out about unpredictable emerging and ongoing events and phenomena, often using previously unseen or unpopular queries. Consequently, QAC must be both robust and time-sensitive -- that is, able to sufficiently rank both consistently and recently popular queries in completion suggestions. To address this trade-off, we propose several practical completion suggestion ranking approaches, including: (i) a sliding window of query popularity evidence from the past 2-28 days, (ii) the query popularity distribution in the last N queries observed with a given prefix, and (iii) short-range query popularity prediction based on recently observed trends. Using real-time simulation experiments, we extensively investigated the parameters necessary to maximise QAC effectiveness for three openly available query log datasets with prefixes of 2-5 characters: MSN and AOL (both English), and Sogou 2008 (Chinese). Optimal parameters vary for each query log, capturing the differing temporal dynamics and querying distributions. Results demonstrate consistent and language-independent improvements of up to 9.2% over a non-temporal QAC baseline for all query logs with prefix lengths of 2-3 characters. This work is an important step towards more effective QAC approaches.
Information Retrieval | 2013
Guido Zuccon; Teerapong Leelanupab; Stewart Whiting; Emine Yilmaz; Joemon M. Jose; Leif Azzopardi
In the field of information retrieval (IR), researchers and practitioners are often faced with a demand for valid approaches to evaluate the performance of retrieval systems. The Cranfield experiment paradigm has been dominant for the in-vitro evaluation of IR systems. Alternative to this paradigm, laboratory-based user studies have been widely used to evaluate interactive information retrieval (IIR) systems, and at the same time investigate users’ information searching behaviours. Major drawbacks of laboratory-based user studies for evaluating IIR systems include the high monetary and temporal costs involved in setting up and running those experiments, the lack of heterogeneity amongst the user population and the limited scale of the experiments, which usually involve a relatively restricted set of users. In this paper, we propose an alternative experimental methodology to laboratory-based user studies. Our novel experimental methodology uses a crowdsourcing platform as a means of engaging study participants. Through crowdsourcing, our experimental methodology can capture user interactions and searching behaviours at a lower cost, with more data, and within a shorter period than traditional laboratory-based user studies, and therefore can be used to assess the performances of IIR systems. In this article, we show the characteristic differences of our approach with respect to traditional IIR experimental and evaluation procedures. We also perform a use case study comparing crowdsourcing-based evaluation with laboratory-based evaluation of IIR systems, which can serve as a tutorial for setting up crowdsourcing-based IIR evaluations.
international world wide web conferences | 2014
Stewart Whiting; Joemon M. Jose; Omar Alonso
Wikipedia encyclopaedia projects, which consist of vast collections of user-edited articles covering a wide range of topics, are among some of the most popular websites on internet. With so many users working collaboratively, mainstream events are often very quickly reflected by both authors editing content and users reading articles. With temporal signals such as changing article content, page viewing activity and the link graph readily available, Wikipedia has gained attention in recent years as a source of temporal event information. This paper serves as an overview of the characteristics and past work which support Wikipedia (English, in this case) for time-aware information retrieval research. Furthermore, we discuss the main content and meta-data temporal signals available along with illustrative analysis. We briefly discuss the source and nature of each signal, and any issues that may complicate extraction and use. To encourage further temporal research based on Wikipedia, we have released all the distilled datasets referred to in this paper.
european conference on information retrieval | 2012
Stewart Whiting; Iraklis A. Klampanos; Joemon M. Jose
Twitter has become a major outlet for news, discussion and commentary of on-going events and trends. Effective searching of Twitter collections poses a number of issues for traditional document-based information retrieval (IR) approaches, such as limited document term statistics and spam. In this paper we propose a novel approach to pseudo-relevance feedback, based upon the temporal profiles of n-grams extracted from the top N relevance feedback tweets. A weighted graph is used to model temporal correlation between n-grams, with a PageRank variant employed to combine both pseudo-relevant document term distribution and temporal collection evidence. Preliminary experiments with the TREC Microblogging 2011 Twitter corpus indicate that through parameter optimisation, retrieval effectiveness can be improved.
conference on information and knowledge management | 2012
Stewart Whiting; Ke Zhou; Joemon M. Jose; Omar Alonso; Teerapong Leelanupab
Time plays a central role in many web search information needs relating to recent events. For recency queries where fresh information is most desirable, there is likely to be a great deal of highly-relevant information created very recently by crowds of people across the world, particularly on platforms such as Wikipedia and Twitter. With so many users, mainstream events are often very quickly reflected in these sources. The English Wikipedia encyclopedia consists of a vast collection of user-edited articles covering a range of topics. During events, users collaboratively create and edit existing articles in near real-time. Simultaneously, users on Twitter disseminate and discuss event details, with a small number of users becoming influential for the topic. In this demo, we propose a novel approach to presenting a summary of new information and users related to recent or ongoing events associated with the users search topic, therefore aiding most recent information discovery. We outline methods to detect search topics which are driven by events, identify and extract changing Wikipedia article passages and find influential Twitter users. Using these, we provide a system which displays familiar tiles in search results to present recent changes in the event-related Wikipedia articles, as well as Twitter users who have tweeted recent relevant information about the event topics.
international acm sigir conference on research and development in information retrieval | 2011
Stewart Whiting; Yashar Moshfeghi; Joemon M. Jose
As digital collections expand, the importance of the temporal aspect of information has become increasingly apparent. The aim of this paper is to investigate the effect of using long-term temporal profiles of terms in information retrieval by enhancing the term selection process of pseudo-relevance feedback (PRF). For this purpose, two temporal PRF approaches were introduced considering only temporal aspect and temporal along with textual aspect. Experiments used the AP88-89 and WSJ87-92 test collections with TREC Ad-Hoc Topics 51-100. Term temporal profiles are extracted from the Google Books n-grams dataset. The results show that the long-term temporal aspects of terms are capable of enhancing retrieval effectiveness.
international acm sigir conference on research and development in information retrieval | 2013
Stewart Whiting; Ke Zhou; Joemon M. Jose; Mounia Lalmas
Time is often important for understanding user intent during search activity, especially for information needs related to event-driven topics. Diversity for multi-faceted information needs ensures that ranked documents optimally cover multiple facets when a users intent is uncertain. Effective diversity is reliant on methods to (i) discover and represent facets, and (ii) determine how likely each facet is the users intent (i.e., its popularity). Past work has developed several techniques addressing these issues, however, they have concentrated on static approaches which do not consider the temporal nature of new and evolving intents and their popularity. In many cases, what a user expects may change dramatically over time as events develop. In this work we study the temporal variance of search intents for event-driven information needs using Wikipedia. First, we model intents based upon the structure represented by the section hierarchy of Wikipedia articles closely related to the information need. Using this technique, we investigate whether temporal changes in the content structure, i.e. in a sections text, reflect the temporal popularity of the intent. We map intents taken from a query-log (as ground-truth) to Wikipedia article sections and found that a large proportion are indeed reflected in topic-related article structure. By correlating the change activity of each section with the use of the intent query over time, we found that section change activity does reflect temporal popularity of many intents. Furthermore, we show that popularity between intents changes over time for event-driven topics.
Proceedings of the 3rd international workshop on Collaborative information retrieval | 2011
Jesus A. Rodriguez Perez; Stewart Whiting; Joemon M. Jose
This paper outlines the preliminary design of a novel synchronous collaborative web browsing system named CoFox. CoFox differs from other collaborative web browsing systems in that it provides a live video stream of web content being viewed by the remote user as the main awareness supporting mechanism. Additionally, it allows users to scroll through the remote user video screen capture to re-watch any part of the search session. As well as screen capture features, also included is a set of collaboration tools including video segment annotations, shared links and instant messaging. CoFox provides a platform for a pair of users to tackle collaborative tasks; which may greatly benefit from the expertise of more than one individual. We outline the graphical user interface components from which CoFox will be composed, the hypothetical benefits provided by such a system and finally describe a possible user-based evaluation methodology.
conference on multimedia modeling | 2013
Philip J. McParlane; Stewart Whiting; Joemon M. Jose
Existing automatic image annotation (AIA) systems that depend solely on low-level image features often produce poor results, particularly when annotating real-life collections. Tag co-occurrence has been shown to improve image annotation by identifying additional keywords associated with user-provided keywords. However, existing approaches have treated tag co-occurrence as a static measure over time, thereby ignoring the temporal trends of many tags. The temporal distribution of tags, however, caused by events, seasons and memes, etc, provides a strong source of evidence beyond keywords for AIA. In this paper we propose a temporal tag co-occurrence approach to improve AIA accuracy. By segmenting collection tags into multiple co-occurrence matrices, each covering an interval of time, we are able to give precedence to tags which not only co-occur each other, but also have temporal significance. We evaluate our approach on a real-life timestamped image collection from Flickr by performing experiments over a number of temporal interval sizes. Results show statistically significant improvements to annotation accuracy compared to a non-temporal co-occurrence baseline.
international acm sigir conference on research and development in information retrieval | 2012
Stewart Whiting
Real-time news and social media quickly reflect large-scale phenomena and events. As users become exposed to this information, time plays a central role in prompting both information authorship and seeking activities. The objective of this research is to develop a retrieval system which can anticipate a users likely temporal intent(s), considering recent or ongoing real-world events. Such a system should not only provide recent news when relevant, but also higher rank non-timestamped or even older documents which are temporally pertinent as they cover aspects related to recent event topics. Key challenges to be addressed in this work include: a suitable source and method for event detection and tracking, an intent-aware ranking approach and an evaluation methodology.