Stefan M. Rüger
Open University
Publications
Featured research published by Stefan M. Rüger.
conference on image and video retrieval | 2004
Peter Howarth; Stefan M. Rüger
We have carried out a detailed evaluation of the use of texture features in a query-by-example approach to image retrieval. We used 3 radically different texture feature types motivated by i) statistical, ii) psychological and iii) signal processing points of view. The features were evaluated and tested on retrieval tasks from the Corel and TRECVID2003 image collections. For the latter we also looked at the effects of combining texture features with a colour feature.
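As a minimal illustration of one of the statistical texture descriptors this line of work relies on, the sketch below computes a grey-level co-occurrence matrix and two standard statistics (contrast and energy) with plain NumPy. The function names, quantisation level and offset are illustrative assumptions, not the feature set evaluated in the paper.

```python
# Sketch of a statistical texture descriptor: a grey-level co-occurrence
# matrix (GLCM) with contrast and energy statistics. Parameters are
# illustrative, not the paper's exact configuration.
import numpy as np

def glcm(image, levels=8, dx=1, dy=0):
    """Co-occurrence matrix of quantised grey levels at offset (dx, dy)."""
    quantised = (image.astype(float) / 256 * levels).astype(int)
    counts = np.zeros((levels, levels))
    h, w = quantised.shape
    for y in range(h - dy):
        for x in range(w - dx):
            counts[quantised[y, x], quantised[y + dy, x + dx]] += 1
    return counts / counts.sum()

def texture_features(image):
    p = glcm(image)
    i, j = np.indices(p.shape)
    contrast = np.sum((i - j) ** 2 * p)   # local grey-level variation
    energy = np.sum(p ** 2)               # uniformity of the texture
    return np.array([contrast, energy])

# Example: descriptors for a random 8-bit image patch
patch = np.random.randint(0, 256, size=(64, 64))
print(texture_features(patch))
```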
multimedia information retrieval | 2010
Stefanie Nowak; Stefan M. Rüger
The creation of gold standard datasets is a costly business. Ideally, more than one judgment per document is obtained to ensure high-quality annotations. In this context, we explore how much annotations from experts differ from each other, how different sets of annotations influence the ranking of systems, and whether these annotations can be obtained with a crowdsourcing approach. This study is applied to annotations of images with multiple concepts. A subset of the images employed in the latest ImageCLEF Photo Annotation competition was manually annotated by expert annotators and by non-experts with Mechanical Turk. The inter-annotator agreement is computed at an image-based and concept-based level using majority vote, accuracy and kappa statistics. Further, the Kendall τ and Kolmogorov-Smirnov correlation tests are used to compare the ranking of systems across different ground truths and different evaluation measures in a benchmark scenario. Results show that while the agreement between experts and non-experts varies depending on the measure used, its influence on the ranked lists of the systems is rather small. To sum up, the majority vote applied to generate one annotation set out of several opinions is able to filter noisy judgments of non-experts to some extent. The resulting annotation set is of comparable quality to the annotations of experts.
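To make two of the statistics mentioned above concrete, the sketch below computes Cohen's kappa for a pair of binary annotation vectors and Kendall τ between two system rankings. The annotation and ranking data are hypothetical; the paper's actual evaluation scripts are not reproduced here.

```python
# Illustrative computation of two agreement/correlation statistics named
# in the abstract; data and helper names are hypothetical.
import numpy as np
from scipy.stats import kendalltau

def cohens_kappa(a, b):
    """Chance-corrected agreement between two binary annotation vectors."""
    a, b = np.asarray(a), np.asarray(b)
    observed = np.mean(a == b)
    p_a, p_b = a.mean(), b.mean()
    expected = p_a * p_b + (1 - p_a) * (1 - p_b)   # agreement expected by chance
    return (observed - expected) / (1 - expected)

# Two annotators labelling the same images with one concept (1 = present)
expert = [1, 0, 1, 1, 0, 1, 0, 0]
turker = [1, 0, 0, 1, 0, 1, 1, 0]
print("kappa:", cohens_kappa(expert, turker))

# Kendall tau between system rankings produced under two different ground truths
ranking_expert_gt = [1, 2, 3, 4, 5]
ranking_turk_gt = [2, 1, 3, 5, 4]
tau, p_value = kendalltau(ranking_expert_gt, ranking_turk_gt)
print("Kendall tau:", tau)
```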
IEEE Transactions on Knowledge and Data Engineering | 2012
Chenghua Lin; Yulan He; Richard M. Everson; Stefan M. Rüger
Sentiment analysis or opinion mining aims to use automated tools to detect subjective information such as opinions, attitudes, and feelings expressed in text. This paper proposes a novel probabilistic modeling framework called the joint sentiment-topic (JST) model based on latent Dirichlet allocation (LDA), which detects sentiment and topic simultaneously from text. A reparameterized version of the JST model called Reverse-JST, obtained by reversing the sequence of sentiment and topic generation in the modeling process, is also studied. Although JST is equivalent to Reverse-JST without a hierarchical prior, extensive experiments show that when sentiment priors are added, JST performs consistently better than Reverse-JST. In addition, unlike supervised approaches to sentiment classification, which often fail to produce satisfactory performance when shifting to other domains, the weakly supervised nature of JST makes it highly portable to other domains. This is verified by the experimental results on data sets from five different domains, where the JST model even outperforms existing semi-supervised approaches on some of the data sets despite using no labeled documents. Moreover, the topics and topic sentiment detected by JST are indeed coherent and informative. We hypothesize that the JST model can readily meet the demand of large-scale sentiment analysis from the web in an open-ended fashion.
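The generative story behind JST (document → sentiment → topic → word) can be sketched in a few lines. The snippet below samples a toy document from that process; dimensions, prior values and the fitting procedure (the real model is trained with Gibbs sampling, omitted here) are illustrative assumptions, not the paper's implementation.

```python
# Sketch of the JST generative process with made-up dimensions; Gibbs
# sampling inference is omitted.
import numpy as np

rng = np.random.default_rng(0)
V, S, T, doc_len = 1000, 3, 10, 50   # vocabulary, sentiment labels, topics, words

# Symmetric Dirichlet priors (values chosen only for illustration)
alpha, beta, gamma = 0.1, 0.01, 1.0

# Corpus-level word distributions, one per (sentiment, topic) pair
phi = rng.dirichlet(np.full(V, beta), size=(S, T))

def generate_document():
    pi = rng.dirichlet(np.full(S, gamma))              # sentiment mixture
    theta = rng.dirichlet(np.full(T, alpha), size=S)   # topic mixture per sentiment
    words = []
    for _ in range(doc_len):
        s = rng.choice(S, p=pi)                        # draw sentiment label
        t = rng.choice(T, p=theta[s])                  # draw topic given sentiment
        words.append(rng.choice(V, p=phi[s, t]))       # draw word given both
    return words

print(generate_document()[:10])
```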
conference on image and video retrieval | 2005
Alexei Yavlinsky; Edward James Schofield; Stefan M. Rüger
This paper describes a simple framework for automatically annotating images using non-parametric models of distributions of image features. We show that under this framework quite simple image properties such as global colour and texture distributions provide a strong basis for reliably annotating images. We report results on subsets of two photographic libraries, the Corel Photo Archive and the Getty Image Archive. We also show how the popular Earth Mover’s Distance measure can be effectively incorporated within this framework.
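A minimal sketch of non-parametric annotation scoring in this spirit: the likelihood of a keyword given an image is estimated with a Parzen window over the training images that carry the keyword. The toy colour features, bandwidth and data below are assumptions for illustration only, not the paper's feature set or the Earth Mover's Distance variant.

```python
# Parzen-window (kernel density) keyword scoring over simple global features.
# Feature choice, bandwidth and data are illustrative assumptions.
import numpy as np

def keyword_score(query_feature, training_features, bandwidth=0.1):
    """Kernel density of the query under one keyword's training images."""
    diffs = training_features - query_feature          # (n, d)
    sq_dist = np.sum(diffs ** 2, axis=1)
    kernels = np.exp(-sq_dist / (2 * bandwidth ** 2))
    return kernels.mean()

# Toy global colour features (e.g. mean RGB, rescaled to [0, 1])
rng = np.random.default_rng(1)
training = {
    "sky":   rng.normal([0.3, 0.5, 0.9], 0.05, size=(20, 3)),
    "grass": rng.normal([0.2, 0.7, 0.2], 0.05, size=(20, 3)),
}
query = np.array([0.28, 0.52, 0.88])

scores = {kw: keyword_score(query, feats) for kw, feats in training.items()}
print(max(scores, key=scores.get), scores)   # highest-scoring keyword wins
```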
intelligent information systems | 2003
Shyamala Doraisamy; Stefan M. Rüger
In this paper we investigate the retrieval performance of monophonic and polyphonic queries made on a polyphonic music database. We extend the n-gram approach for full-music indexing of monophonic music data to polyphonic music using both rhythm and pitch information. We define an experimental framework for a comparative and fault-tolerance study of various n-gramming strategies and encoding levels. For monophonic queries, we focus in particular on query-by-humming systems, and for polyphonic queries on query-by-example. Error models addressed in several studies are surveyed for the fault-tolerance study. Our experiments show that different n-gramming strategies and encoding precisions differ widely in their effectiveness. We present the results of our study on a collection of 6366 polyphonic MIDI-encoded music pieces.
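The sketch below illustrates one plausible way to form pitch and rhythm n-gram index terms from a monophonic note sequence: intervals between consecutive pitches and quantised ratios of consecutive onset gaps. The particular encodings and term format are assumptions, not the paper's exact n-gramming strategies.

```python
# Illustrative pitch/rhythm n-gramming for a monophonic note sequence.
notes = [(60, 0.0), (62, 0.5), (64, 1.0), (62, 2.0), (60, 2.5)]  # (MIDI pitch, onset)

def ngrams(sequence, n=3):
    return [tuple(sequence[i:i + n]) for i in range(len(sequence) - n + 1)]

intervals = [b[0] - a[0] for a, b in zip(notes, notes[1:])]   # pitch deltas
gaps = [b[1] - a[1] for a, b in zip(notes, notes[1:])]        # inter-onset intervals
ratios = [round(b / a, 1) for a, b in zip(gaps, gaps[1:])]    # rhythm ratios

# Index terms: pitch-interval trigrams plus rhythm-ratio bigrams
terms = [f"p{'_'.join(map(str, g))}" for g in ngrams(intervals, 3)]
terms += [f"r{'_'.join(map(str, g))}" for g in ngrams(ratios, 2)]
print(terms)
```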
asia information retrieval symposium | 2008
Haiming Liu; Dawei Song; Stefan M. Rüger; Rui Hu; Victoria S. Uren
Dissimilarity measurement plays a crucial role in content-based image retrieval, where data objects and queries are represented as vectors in high-dimensional content feature spaces. Given the large number of dissimilarity measures that exist in many fields, a crucial research question arises: does the retrieval performance of a dissimilarity measure depend on the feature space, and if so, how? In this paper, we summarize fourteen core dissimilarity measures and classify them into three categories. A systematic performance comparison is carried out to test the effectiveness of these dissimilarity measures with six different feature spaces and some of their combinations on the Corel image collection. From our experimental results, we have drawn a number of observations and insights on dissimilarity measurement in content-based image retrieval, which will lay a foundation for developing more effective image search technologies.
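For concreteness, the sketch below implements three commonly used dissimilarity measures of broadly different flavours (geometric, angular, and histogram-based). These are generic textbook formulations chosen for illustration; the paper's fourteen measures and its three categories are not reproduced here.

```python
# Three generic dissimilarity measures between feature vectors/histograms.
import numpy as np

def euclidean(x, y):
    return np.sqrt(np.sum((x - y) ** 2))

def cosine_dissimilarity(x, y):
    return 1 - np.dot(x, y) / (np.linalg.norm(x) * np.linalg.norm(y))

def chi_squared(x, y, eps=1e-12):
    """Histogram dissimilarity; x and y are non-negative feature histograms."""
    return 0.5 * np.sum((x - y) ** 2 / (x + y + eps))

query = np.array([0.2, 0.5, 0.3])
candidate = np.array([0.1, 0.6, 0.3])
for measure in (euclidean, cosine_dissimilarity, chi_squared):
    print(measure.__name__, measure(query, candidate))
```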
european conference on information retrieval | 2004
Daniel Heesch; Stefan M. Rüger
This paper describes a novel interaction technique to support content-based image search in large image collections. The idea is to represent each image as a vertex in a directed graph. Given a set of image features, an arc is established between two images if there exists at least one combination of features for which one image is retrieved as the nearest neighbour of the other. Each arc is weighted by the proportion of feature combinations for which the nearest neighbour relationship holds. By thus integrating the retrieval results over all possible feature combinations, the resulting network helps expose the semantic richness of images and thus provides an elegant solution to the problem of feature weighting in content-based image retrieval. We give details of the method used for network generation and describe the ways a user can interact with the structure. We also provide an analysis of the network’s topology and provide quantitative evidence for the usefulness of the technique.
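The network construction described above can be sketched directly: an arc i → j is added whenever j is i's nearest neighbour under at least one sampled feature weighting, and its weight is the fraction of weightings for which this holds. The random per-feature distances and Monte Carlo sampling of weightings below are simplifying assumptions; the paper enumerates feature combinations rather than sampling them at random.

```python
# Sketch of nearest-neighbour network construction over feature weightings.
import numpy as np
from collections import defaultdict

rng = np.random.default_rng(0)
n_images, n_features = 20, 3
# One distance value per (feature, image, image); random here for illustration
feature_distances = rng.random((n_features, n_images, n_images))

arcs = defaultdict(int)
n_combinations = 500
for _ in range(n_combinations):
    w = rng.dirichlet(np.ones(n_features))          # random convex feature weighting
    combined = np.tensordot(w, feature_distances, axes=1)
    np.fill_diagonal(combined, np.inf)              # an image is not its own neighbour
    for i in range(n_images):
        j = int(np.argmin(combined[i]))             # nearest neighbour of i
        arcs[(i, j)] += 1

# Arc weight = proportion of feature weightings with the NN relationship
network = {edge: count / n_combinations for edge, count in arcs.items()}
print(len(network), "arcs; example:", next(iter(network.items())))
```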
Computer Vision and Image Understanding | 2003
Marcus Jerome Pickering; Stefan M. Rüger
We investigate the application of a variety of content-based image retrieval techniques to the problem of video retrieval. We generate large numbers of features for each of the key frames selected by a highly effective shot boundary detection algorithm to facilitate query-by-example search. The retrieval performance of two learning methods, boosting and k-nearest neighbours, is compared against a vector space model. We carry out a novel and extensive evaluation to demonstrate and compare the usefulness of these algorithms for video retrieval tasks using a carefully created test collection of over 6000 still images, where performance is measured against relevance judgements based on human image annotations. Three types of experiment are carried out: classification tasks, category searches (both related to automated annotation and summarisation of video material) and real world searches (for navigation and entry point finding). We also show graphical results of real video search tasks using the algorithms, which have not previously been applied to video material in this way.
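As a minimal sketch of the k-nearest-neighbour route mentioned above, the snippet below assigns a key frame the majority label of its k closest training frames. The features, labels and value of k are placeholders, not the paper's configuration.

```python
# k-nearest-neighbour labelling of a key frame; all data are toy placeholders.
import numpy as np
from collections import Counter

def knn_label(query, train_features, train_labels, k=5):
    distances = np.linalg.norm(train_features - query, axis=1)
    nearest = np.argsort(distances)[:k]
    votes = Counter(train_labels[i] for i in nearest)
    return votes.most_common(1)[0][0]

rng = np.random.default_rng(2)
train_features = rng.random((100, 16))            # toy key-frame feature vectors
train_labels = np.array(["sport", "news"] * 50)
query_frame = rng.random(16)
print(knn_label(query_frame, train_features, train_labels))
```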
International Journal of Geographical Information Science | 2008
Simon E. Overell; Stefan M. Rüger
This paper describes the generation of a model capturing information on how placenames co-occur together. The advantages of the co-occurrence model over traditional gazetteers are discussed and the problem of placename disambiguation is presented as a case study. We begin by outlining the problem of ambiguous placenames. We demonstrate how analysis of Wikipedia can be used in the generation of a co-occurrence model. The accuracy of our model is compared to a handcrafted ground truth; then we evaluate alternative methods of applying this model to the disambiguation of placenames in free text (using the GeoCLEF evaluation forum). We conclude by showing how the inclusion of placenames in both the text and geographic parts of a query provides the maximum mean average precision and outline the benefits of a co-occurrence model as a data source for the wider field of geographic information retrieval (GIR).
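A toy illustration of disambiguation with a co-occurrence model: each candidate referent of an ambiguous placename is scored by how often it co-occurs with the other placenames found in the same text. The counts and the simple sum-of-counts scoring rule below are invented for illustration and stand in for the Wikipedia-derived model and the methods compared in the paper.

```python
# Toy placename disambiguation against a hypothetical co-occurrence model.
cooccurrence = {                      # (candidate referent, context placename) -> count
    ("Cambridge, UK", "Oxford"): 120,
    ("Cambridge, UK", "London"): 300,
    ("Cambridge, MA", "Boston"): 450,
    ("Cambridge, MA", "London"): 40,
}

def disambiguate(candidates, context_places):
    scores = {
        c: sum(cooccurrence.get((c, p), 0) for p in context_places)
        for c in candidates
    }
    return max(scores, key=scores.get), scores

print(disambiguate(["Cambridge, UK", "Cambridge, MA"], ["Oxford", "London"]))
```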
conference on image and video retrieval | 2007
João Magalhães; Stefan M. Rüger
To solve the problem of indexing collections with diverse text documents, image documents, or documents with both text and images, one needs to develop a model that supports heterogeneous types of documents. In this paper, we show how information theory supplies us with the tools necessary to develop a unique model for text, image, and text/image retrieval. In our approach, for each possible query keyword we estimate a maximum entropy model based exclusively on preprocessed continuous features. The unique continuous feature space of text and visual data is constructed by using a minimum description length criterion to find the optimal feature-space representation (optimal from an information theory point of view). We evaluate our approach in three experiments: text-only retrieval, image-only retrieval, and combined text and image retrieval.
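The per-keyword maximum entropy step can be sketched with logistic regression, the binary maximum entropy classifier, over a joint continuous feature space. The synthetic features and labels below are assumptions; the MDL-based construction of the feature space and the paper's preprocessing are omitted.

```python
# Per-keyword maximum entropy (logistic regression) scoring over a joint
# text/visual feature space; data here are synthetic.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(3)
n_docs, n_dims = 200, 10
features = rng.normal(size=(n_docs, n_dims))        # joint continuous feature space
labels = (features[:, 0] + 0.5 * features[:, 1] > 0).astype(int)  # keyword present?

model = LogisticRegression(max_iter=1000).fit(features, labels)

# Rank unseen documents by the probability that the query keyword applies
new_docs = rng.normal(size=(5, n_dims))
scores = model.predict_proba(new_docs)[:, 1]
print(np.argsort(-scores))                          # best-matching documents first
```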