Arjen P. de Vries | Researchain

Publication

Featured research published by Arjen P. de Vries.

International ACM SIGIR Conference on Research and Development in Information Retrieval | 2006

Unifying user-based and item-based collaborative filtering approaches by similarity fusion

Jun Wang; Arjen P. de Vries; Marcel J. T. Reinders

Memory-based methods for collaborative filtering predict new ratings by averaging (weighted) ratings between, respectively, pairs of similar users or items. In practice, a large number of ratings from similar users or similar items are not available, due to the sparsity inherent to rating data. Consequently, prediction quality can be poor. This paper re-formulates the memory-based collaborative filtering problem in a generative probabilistic framework, treating individual user-item ratings as predictors of missing ratings. The final rating is estimated by fusing predictions from three sources: predictions based on ratings of the same item by other users, predictions based on different item ratings made by the same user, and, third, ratings predicted based on data from other but similar users rating other but similar items. Existing user-based and item-based approaches correspond to the two simple cases of our framework. The complete model is however more robust to data sparsity, because the different types of ratings are used in concert, while additional ratings from similar users towards similar items are employed as a background model to smooth the predictions. Experiments demonstrate that the proposed methods are indeed more robust against data sparsity and give better recommendations.
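As a rough illustration of the fusion idea in this abstract, the sketch below combines three rating predictions with two interpolation weights. The function name, the weight names `lam` and `delta`, and their default values are assumptions made for this example and are not taken from the paper. Setting `delta` to 0 and `lam` to 0 or 1 recovers a purely user-based or purely item-based prediction, which mirrors the claim that the existing approaches are simple cases of the framework.

```python
def fuse_predictions(user_based, item_based, background, lam=0.5, delta=0.2):
    """Convex combination of three rating predictions.

    user_based: prediction from other users' ratings of the same item.
    item_based: prediction from the same user's ratings of other items.
    background: prediction from similar users rating similar items.
    lam and delta are hypothetical interpolation weights in [0, 1].
    """
    if not (0.0 <= lam <= 1.0 and 0.0 <= delta <= 1.0):
        raise ValueError("lam and delta must lie in [0, 1]")
    # First balance the two direct sources of evidence.
    direct = (1.0 - lam) * user_based + lam * item_based
    # Then smooth the result with the background estimate.
    return (1.0 - delta) * direct + delta * background
```

With a larger `delta`, the prediction leans more on the background estimate, which is the smoothing role the abstract assigns to ratings from similar users for similar items.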

INEX'09 Proceedings of the Focused retrieval and evaluation, and 8th international conference on Initiative for the evaluation of XML retrieval | 2009

Overview of the INEX 2009 entity ranking track

Gianluca Demartini; Tereza Iofciu; Arjen P. de Vries

In some situations search engine users would prefer to retrieve entities instead of just documents. Example queries include Italian Nobel prize winners, Formula 1 drivers that won the Monaco Grand Prix, or German spoken Swiss cantons. The XML Entity Ranking (XER) track at INEX creates a discussion forum aimed at standardizing evaluation procedures for entity retrieval. This paper describes the XER tasks and the evaluation procedure used at the XER track in 2009, where a new version of Wikipedia was used as underlying collection; and summarizes the approaches adopted by the participants.

European Conference on Information Retrieval | 2006

A user-item relevance model for log-based collaborative filtering

Jun Wang; Arjen P. de Vries; Marcel J. T. Reinders

Implicit acquisition of user preferences makes log-based collaborative filtering attractive in practice for producing recommendations. In this paper, we follow a formal approach in text retrieval to re-formulate the problem. Based on the classic probability ranking principle, we propose a probabilistic user-item relevance model. Under this formal model, we show that user-based and item-based approaches are only two different factorizations with different independence assumptions. Moreover, we show that, because of data sparsity, smoothing is an important aspect of estimating the parameters of the models. By adding linear interpolation smoothing, the proposed model gives a probabilistic justification of using TF×IDF-like item ranking in collaborative filtering. Besides providing insight into the problem of collaborative filtering, we also show experiments in which the proposed method provides better recommendation performance on a music play-list data set.

Focused Access to XML Documents | 2008

Overview of the INEX 2007 Entity Ranking Track

Arjen P. de Vries; Anne-Marie Vercoustre; James A. Thom; Nick Craswell; Mounia Lalmas

Many realistic user tasks involve the retrieval of specific entities instead of just any type of documents. Examples of information needs include ‘Countries where one can pay with the euro’ or ‘Impressionist art museums in The Netherlands’. The Initiative for Evaluation of XML Retrieval (INEX) started the XML Entity Ranking track (INEX-XER) to create a test collection for entity retrieval in Wikipedia. Entities are assumed to correspond to Wikipedia entries. The goal of the track is to evaluate how well systems can rank entities in response to a query; the set of entities to be ranked is assumed to be loosely defined either by a generic category (entity ranking) or by some example entities (list completion). This track overview introduces the track setup, and discusses the implications of the new relevance notion for entity ranking in comparison to ad hoc retrieval.

ACM Transactions on Information Systems | 2008

Unified relevance models for rating prediction in collaborative filtering

Jun Wang; Arjen P. de Vries; Marcel J. T. Reinders

Collaborative filtering aims at predicting a user's interest for a given item based on a collection of user profiles. This article views collaborative filtering as a problem highly related to information retrieval, drawing an analogy between the concepts of users and items in recommender systems and queries and documents in text retrieval. We present a probabilistic user-to-item relevance framework that introduces the concept of relevance into the related problem of collaborative filtering. Three different models are derived, namely, a user-based, an item-based, and a unified relevance model, and we estimate their rating predictions from three sources: the user's own ratings for different items, other users' ratings for the same item, and ratings from different but similar users for other but similar items. To reduce the data sparsity encountered when estimating the probability density function of the relevance variable, we apply the nonparametric (data-driven) density estimation technique known as the Parzen-window method (or kernel-based density estimation). Using a Gaussian window function, the similarity between users and/or items would, however, be based on Euclidean distance. Because the collaborative filtering literature has reported improved prediction accuracy when using cosine similarity, we generalize the Parzen-window method by introducing a projection kernel. Existing user-based and item-based approaches correspond to two simplified instantiations of our framework. User-based and item-based collaborative filtering each represent only a partial view of the prediction problem, whereas the unified relevance model brings these partial views together under the same umbrella. Experimental results complement the theoretical insights with improved recommendation accuracy. The unified model is more robust to data sparsity because the different types of ratings are used in concert.
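To make the Parzen-window step more concrete, here is a minimal sketch of textbook Gaussian kernel density estimation. It is not the article's estimator: the function name and the bandwidth parameter `h` are assumptions for this example, and the projection kernel the authors introduce is not reproduced. The sketch does show why a Gaussian window ties similarity to Euclidean distance, since each sample contributes only through its squared distance to the query point.

```python
import math


def gaussian_parzen_density(x, samples, h):
    """Parzen-window density estimate at point x with a Gaussian window.

    x: query point as a sequence of floats.
    samples: list of observed points with the same dimensionality as x.
    h: window width (bandwidth), a hypothetical tuning parameter.
    """
    if h <= 0 or not samples:
        raise ValueError("need h > 0 and at least one sample")
    d = len(x)
    # Normalising constant of an isotropic d-dimensional Gaussian.
    norm = (2.0 * math.pi * h * h) ** (d / 2.0)
    total = 0.0
    for s in samples:
        # Each sample contributes only through its squared Euclidean distance.
        sq_dist = sum((a - b) ** 2 for a, b in zip(x, s))
        total += math.exp(-sq_dist / (2.0 * h * h))
    return total / (len(samples) * norm)
```

Two samples at the same Euclidean distance from `x` contribute equally regardless of direction, which is the behaviour the abstract contrasts with cosine similarity.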

Advances in Focused Retrieval | 2009

Overview of the INEX 2008 Entity Ranking Track

Gianluca Demartini; Arjen P. de Vries; Tereza Iofciu; Jianhan Zhu

In many contexts a search engine user would prefer to retrieve entities instead of just documents. Example queries include Italian Nobel prize winners, Formula 1 drivers that won the Monaco Grand Prix, or German spoken Swiss cantons. The XML Entity Ranking (XER) track at INEX creates a discussion forum aimed at standardizing evaluation procedures for entity retrieval. This paper describes the XER tasks and the evaluation procedure used at the XER track in 2008, focusing specifically on the sampled pooling strategy applied for the first time this year. We conclude with a brief discussion of the predominant participant approaches and their effectiveness.

Conference on Image and Video Retrieval | 2009

Image annotation using clickthrough data

Theodora Tsikrika; Christos Diou; Arjen P. de Vries; Anastasios Delopoulos

Automatic image annotation using supervised learning is performed by concept classifiers trained on labelled example images. This work proposes the use of clickthrough data collected from search logs as a source for the automatic generation of concept training data, thus avoiding the expensive manual annotation effort. We investigate and evaluate this approach using a collection of 97,628 photographic images. The results indicate that the contribution of search log based training data is positive; in particular, the combination of manual and automatically generated training data outperforms the use of manual data alone. It is therefore possible to use clickthrough data to perform large-scale image annotation with little manual annotation effort or, depending on performance, using only the automatically generated training data. The datasets used as well as an extensive presentation of the experimental results can be accessed at http://olympus.ee.auth.gr/~diou/civr2009/.

Information Retrieval | 2005

TIJAH: Embracing IR Methods in XML Databases

Vojkan Mihajlovic; Johan A. List; Georgina Ramirez; Arjen P. de Vries; Djoerd Hiemstra; Henk Ernst Blok

This paper discusses our participation in INEX (the Initiative for the Evaluation of XML Retrieval) using the TIJAH XML-IR system. TIJAH’s system design follows a ‘standard’ layered database architecture, carefully separating the conceptual, logical and physical levels. At the conceptual level, we classify the INEX XPath-based query expressions into three different query patterns. For each pattern, we present its mapping into a query execution strategy. The logical layer exploits score region algebra (SRA) as the basis for query processing. We discuss the region operators used to select and manipulate XML document components. The logical algebra expressions are mapped into efficient relational algebra expressions over a physical representation of the XML document collection using the ‘pre-post numbering scheme’. The paper concludes with an analysis of experiments performed with the INEX test collection.
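The pre-post numbering scheme mentioned in this abstract can be illustrated with a short sketch. The code below shows the generic scheme only, not TIJAH's physical representation: the helper names and the toy document are assumptions for this example. Each element gets a preorder and a postorder rank from one depth-first traversal, after which an ancestor test reduces to two integer comparisons, the kind of predicate that maps naturally onto relational algebra.

```python
import xml.etree.ElementTree as ET


def pre_post_numbering(root):
    """Return {element: (pre, post)} ranks from one depth-first traversal."""
    ranks = {}
    counters = {"pre": 0, "post": 0}

    def visit(node):
        # Preorder rank: assigned when the node is first reached.
        pre = counters["pre"]
        counters["pre"] += 1
        for child in node:
            visit(child)
        # Postorder rank: assigned after all descendants are finished.
        ranks[node] = (pre, counters["post"])
        counters["post"] += 1

    visit(root)
    return ranks


def is_ancestor(ranks, a, d):
    """a is a proper ancestor of d iff a starts before d and ends after it."""
    return ranks[a][0] < ranks[d][0] and ranks[a][1] > ranks[d][1]


# Toy document (hypothetical), used only to exercise the helpers.
doc = ET.fromstring("<article><sec><p/><p/></sec><bib/></article>")
ranks = pre_post_numbering(doc)
sec = doc.find("sec")
first_p = sec[0]
bib = doc.find("bib")
```

Here `is_ancestor(ranks, sec, first_p)` holds while `is_ancestor(ranks, bib, first_p)` does not, because `bib` is reached only after the paragraph has been fully visited.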

International Joint Conference on Natural Language Processing | 2015

Describing Images using Inferred Visual Dependency Representations

Desmond Elliott; Arjen P. de Vries

The Visual Dependency Representation (VDR) is an explicit model of the spatial relationships between objects in an image. In this paper we present an approach to training a VDR Parsing Model without the extensive human supervision used in previous work. Our approach is to find the objects mentioned in a given description using a state-of-the-art object detector, and to use successful detections to produce training data. The description of an unseen image is produced by first predicting its VDR over automatically detected objects, and then generating the text with a template-based generation model using the predicted VDR. The performance of our approach is comparable to a state-of-the-art multimodal deep neural network in images depicting actions.

European Conference on Information Retrieval | 2010

Finding wormholes with flickr geotags

Maarten Clements; Pavel Serdyukov; Arjen P. de Vries; Marcel J. T. Reinders

We propose a kernel convolution method to predict similar locations (wormholes) based on human travel behaviour. A scaling parameter can be used to define a set of users relevant to the target location, and we show how the geotags of these users can effectively be aggregated to predict a ranking of similar locations. We evaluate results at world and city level using several independent test collections.
