Robert Wetzker
Technical University of Berlin
Publication
Featured research published by Robert Wetzker.
web search and data mining | 2010
Robert Wetzker; Carsten Zimmermann; Christian Bauckhage; Sahin Albayrak
Collaborative tagging services (folksonomies) have been among the stars of the Web 2.0 era. They allow their users to label diverse resources with freely chosen keywords (tags). Our studies of two real-world folksonomies unveil that individual users develop highly personalized vocabularies of tags. While these meet individual needs and preferences, the considerable differences between personal tag vocabularies (personomies) impede services such as social search or customized tag recommendation. In this paper, we introduce a novel user-centric tag model that allows us to derive mappings between personal tag vocabularies and the corresponding folksonomies. Using these mappings, we can infer the meaning of user-assigned tags and can predict choices of tags a user may want to assign to new items. Furthermore, our translational approach helps in reducing common problems related to tag ambiguity, synonymous tags, or multilingualism. We evaluate the applicability of our method in tag recommendation and tag-based social search. Extensive experiments show that our translational model improves the prediction accuracy in both scenarios.
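The idea of mapping a personal tag vocabulary onto the folksonomy can be illustrated with a minimal sketch. This is not the model from the paper: it assumes a plain co-occurrence estimate, in which a user's personal tag is translated into the tags that other users assigned to the same items. The function name `translate_tags` and the toy bookmarks are hypothetical.

```python
from collections import Counter, defaultdict

def translate_tags(bookmarks, user):
    """Map each personal tag of `user` to community tags by co-occurrence.

    `bookmarks` is a list of (user, item, tag) triples.
    Returns {personal_tag: [(community_tag, probability), ...]}, sorted by
    descending probability. Assumed estimator, not the paper's model.
    """
    # Tags that everyone except `user` assigned to each item.
    community = defaultdict(Counter)
    for u, item, tag in bookmarks:
        if u != user:
            community[item][tag] += 1

    # For each personal tag, pool the community tags of the items it was used on.
    counts = defaultdict(Counter)
    for u, item, tag in bookmarks:
        if u == user:
            counts[tag].update(community[item])

    translations = {}
    for tag, counter in counts.items():
        total = sum(counter.values())
        if total:  # skip personal tags with no community evidence
            translations[tag] = [(t, c / total) for t, c in counter.most_common()]
    return translations

# Toy data: alice writes "py" where others write "python".
bookmarks = [
    ("alice", "python.org", "py"),
    ("alice", "numpy.org", "py"),
    ("bob", "python.org", "python"),
    ("carol", "python.org", "python"),
    ("carol", "python.org", "programming"),
    ("bob", "numpy.org", "python"),
]
result = translate_tags(bookmarks, "alice")
```

In this toy example, `result["py"]` ranks `python` ahead of `programming`, which is the kind of translation that could feed tag recommendation or query expansion in social search.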
web search and data mining | 2009
Robert Wetzker; Winfried Umbrath; Alan Said
In this paper we consider the problem of item recommendation in collaborative tagging communities, so-called folksonomies, where users annotate interesting items with tags. Rather than following a collaborative filtering or annotation-based approach to recommendation, we extend the probabilistic latent semantic analysis (PLSA) approach and present a unified recommendation model which evolves from item-user and item-tag co-occurrences in parallel. The inclusion of tags reduces known collaborative filtering problems related to overfitting and allows for higher-quality recommendations. Experimental results on a large snapshot of the delicious bookmarking service show the scalability of our approach and an improved recommendation quality compared to two-mode collaborative or annotation-based methods.
web intelligence | 2007
Robert Wetzker; Tansu Alpcan; Christian Bauckhage; Winfried Umbrath; Sahin Albayrak
We propose a hierarchical approach to document categorization that requires no pre-configuration and maps the semantic document space to a predefined taxonomy. The utilization of search engines to train a hierarchical classifier makes our approach more flexible than existing solutions which rely on (human) labeled data and are bound to a specific domain. We show that the structural information given by the taxonomy allows for a context-aware construction of search queries and leads to higher tagging accuracy. We test our approach on different benchmark datasets and evaluate its performance on the single- and multi-tag assignment tasks. The experimental results show that our solution is as accurate as supervised classifiers for web page classification and still performs well when categorizing domain-specific documents.
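One way to picture context-aware query construction is to qualify a category label with its ancestors in the taxonomy. The sketch below is an assumption about what such a construction could look like, not the procedure used in the paper; `build_query` and the toy taxonomy are hypothetical.

```python
def build_query(taxonomy, category):
    """Build a search query from a category label and its ancestors.

    `taxonomy` maps each category to its parent (None for the root).
    Hypothetical helper; the paper's actual query construction is not
    described in the abstract.
    """
    terms = []
    node = category
    while node is not None:
        terms.append(node)
        node = taxonomy[node]
    # Root-first order, dropping the root itself since it adds no context.
    path = list(reversed(terms))[1:]
    return " ".join(f'"{t}"' for t in path)

taxonomy = {"Top": None, "Sports": "Top", "Computers": "Top", "Java": "Computers"}
query = build_query(taxonomy, "Java")
```

Here the ambiguous label `Java` is sent to the search engine together with `Computers`, so the retrieved training pages are more likely to concern the programming language than the island.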
international conference on pattern recognition | 2008
Robert Wetzker; Till Plumbaum; Alexander Korth; Christian Bauckhage; Tansu Alpcan; Florian Metze
We propose a method for the detection of trends in social bookmarking systems. Compared to other work in this emerging field, our approach has a more sound statistical basis. In order to cope with the problem of vanishing probabilities due to data sparsity, we apply smoothing and show that it allows for an easy calibration of our trend detector resulting in better generalization and scalability. We test our approach on a collection of 105,000,000 bookmarks collected from the del.icio.us bookmarking service. To our knowledge, this is the largest corpus of a real-world bookmarking service analyzed in this context. The results show that our method outperforms previously proposed methods and successfully detects trends in the data.
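The role of smoothing can be shown with a small sketch. The abstract does not state which smoothing scheme or test statistic is used, so the code below assumes additive (Laplace) smoothing and a log-ratio of relative tag frequencies between two time windows; `trend_scores` and `alpha` are hypothetical names.

```python
import math
from collections import Counter

def trend_scores(previous, current, alpha=1.0):
    """Score tags by the log-ratio of smoothed relative frequencies.

    `previous` and `current` are lists of tags observed in two time windows.
    `alpha` (> 0) is the additive smoothing constant, an assumed parameter.
    """
    prev_counts, cur_counts = Counter(previous), Counter(current)
    vocabulary = set(prev_counts) | set(cur_counts)
    v = len(vocabulary)
    prev_total, cur_total = len(previous), len(current)

    scores = {}
    for tag in vocabulary:
        # Smoothing keeps both probabilities strictly positive, so a tag
        # that is new in the current window still gets a finite score.
        p_prev = (prev_counts[tag] + alpha) / (prev_total + alpha * v)
        p_cur = (cur_counts[tag] + alpha) / (cur_total + alpha * v)
        scores[tag] = math.log(p_cur / p_prev)
    return scores

previous = ["news"] * 5 + ["python"] * 5
current = ["news"] * 5 + ["python"] * 4 + ["iphone"] * 6
scores = trend_scores(previous, current)
top = max(scores, key=scores.get)
```

Without smoothing, the tag `iphone` would have probability zero in the earlier window and the ratio would be undefined; with it, the tag simply receives the highest score. Larger values of `alpha` damp the scores of rare tags, which is one way such a detector could be calibrated.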
systems, man and cybernetics | 2007
Christian Bauckhage; Tansu Alpcan; Sachin Agarwal; Florian Metze; Robert Wetzker; M. Ilic; Sahin Albayrak
This paper presents an expert peering system for information exchange in the knowledge society. Our system realizes an intelligent, real-time search engine for enterprise Intranets or online communities that automatically relays user queries to knowledgeable specialists. According to its very nature, the system requires a sound integration of concepts drawn from various areas of Computer Science. In addition to our solutions to problems in scalable processing, data transfer, and networking, we also address issues of interface design and usability, as well as aspects of machine intelligence. Results obtained from extensive experiments demonstrate the efficiency and robustness of our system and convey its potential for next-generation web services.
adversarial information retrieval on the web | 2009
Nicolas Neubauer; Robert Wetzker; Klaus Obermayer
Spammers in social bookmarking systems try to mimic the bookmarking behaviour of real users to gain the attention of other users or search engines. Several methods have been proposed for the detection of such spam, including domain-specific features (like URL terms) or similarity of users to previously identified spammers. However, as shown in our previous work, it is possible to identify a large fraction of spam users based on purely structural features. The hypergraph connecting documents, users, and tags can be decomposed into connected components, and any large, but non-giant components turned out to be almost entirely inhabited by spam users in the examined dataset. Here, we test to what degree the decomposition of the complete hypergraph is really necessary, examining the component structure of the induced user/document and user/tag graphs. While the user/tag graph's connectivity does not help in classifying spammers, the user/document graph's connectivity is already highly informative. It can however be augmented with connectivity information from the hypergraph. In our view, spam detection based on structural features, like the one proposed here, requires complex adaptation strategies from spammers and may complement other, more traditional detection approaches.
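The component-based heuristic on the user/document graph can be sketched with a union-find structure. This is an illustration under stated assumptions rather than the authors' implementation: the size threshold `min_users` for a "large" component and the function name `suspicious_users` are hypothetical, and the toy data is invented.

```python
from collections import defaultdict

def find(parent, x):
    # Union-find lookup with path halving.
    while parent[x] != x:
        parent[x] = parent[parent[x]]
        x = parent[x]
    return x

def suspicious_users(posts, min_users=3):
    """Return users in large but non-giant components of the user/document graph.

    `posts` is a list of (user, document) pairs. `min_users` is an assumed
    threshold for what counts as a "large" component.
    """
    parent = {}
    for user, doc in posts:
        # Prefix nodes so that a user and a document never share a key.
        u, d = ("user", user), ("doc", doc)
        parent.setdefault(u, u)
        parent.setdefault(d, d)
        ru, rd = find(parent, u), find(parent, d)
        if ru != rd:
            parent[ru] = rd

    # Group users by the root of their component.
    components = defaultdict(set)
    for node in parent:
        if node[0] == "user":
            components[find(parent, node)].add(node[1])

    if not components:
        return set()
    giant = max(components.values(), key=len)
    flagged = set()
    for users in components.values():
        if users is not giant and len(users) >= min_users:
            flagged |= users
    return flagged

posts = (
    [(f"u{i}", "popular.example") for i in range(5)]   # giant component
    + [(f"s{i}", "spam.example") for i in range(3)]    # large, non-giant
    + [("loner", "blog.example")]                      # small component
)
flagged = suspicious_users(posts)
```

In the toy data, only the three users who share `spam.example` are flagged: they form a component that is large enough to be notable yet disconnected from the giant component where regular users cluster.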
international conference on image processing | 2008
Christian Bauckhage; Tansu Alpcan; Robert Wetzker; Winfried Umbrath
Compared to only a few years ago, today there is an abundance of annotated image data available on the Internet. For researchers on image retrieval, this is an unforeseen but welcome consequence of the rise of Web 2.0 technologies. Popular social networking and content sharing services seem to hold the key to the integration of context and semantics into retrieval. However, at least for now, it appears that this promise has to be taken with a grain of salt. In this paper, we present preliminary empirical results on the tagging behavior of power users of content sharing and social bookmarking services. Our findings suggest different promising research directions for image retrieval and we briefly discuss some of them.
web intelligence | 2008
Robert Wetzker; Winfried Umbrath; Leonhard Hennig; Christian Bauckhage; Tansu Alpcan; Florian Metze
Automatic content categorization by means of taxonomies is a powerful tool for information retrieval and search technologies as it improves the accessibility of data both for humans and machines. While research on automatic categorization has mainly focused on the problem of classifier design, hardly any effort has been spent on the optimization of the taxonomy size itself. However, taxonomy tailoring may significantly improve computational efficiency and scalability of modern retrieval systems where taxonomies often consist of tens of thousands of non-uniformly distributed categories. In this paper we demonstrate empirically that small subtrees of a taxonomy already enable reliable categorization. We compare several measures for the optimal selection of sub-taxonomies and investigate to what extent a reduction affects the classification quality. We consider applications in classical document categorization and in the upcoming area of expert finding and report corresponding results obtained from experiments with standard benchmark data.
Archive | 2009
Alan Said; Robert Wetzker; Winfried Umbrath; Leonhard Hennig
International Journal of Data Warehousing and Mining | 2010
Robert Wetzker; Carsten Zimmermann; Christian Bauckhage