Lars Schmidt-Thieme | Researchain

Archive Network Publication Hotspot Collaboration

Network

Latest external collaboration on country level. Dive into details by clicking on the dots.

Explore More

Hotspot

Dive into the research topics where Lars Schmidt-Thieme is active.

Explore More

Publication

Featured researches published by Lars Schmidt-Thieme.

web search and data mining | 2010

Pairwise interaction tensor factorization for personalized tag recommendation

Steffen Rendle; Lars Schmidt-Thieme

Tagging plays an important role in many recent websites. Recommender systems can help to suggest a user the tags he might want to use for tagging a specific item. Factorization models based on the Tucker Decomposition (TD) model have been shown to provide high quality tag recommendations outperforming other approaches like PageRank, FolkRank, collaborative filtering, etc. The problem with TD models is the cubic core tensor resulting in a cubic runtime in the factorization dimension for prediction and learning. In this paper, we present the factorization model PITF (Pairwise Interaction Tensor Factorization) which is a special case of the TD model with linear runtime both for learning and prediction. PITF explicitly models the pairwise interactions between users, items and tags. The model is learned with an adaption of the Bayesian personalized ranking (BPR) criterion which originally has been introduced for item recommendation. Empirically, we show on real world datasets that this model outperforms TD largely in runtime and even can achieve better prediction quality. Besides our lab experiments, PITF has also won the ECML/PKDD Discovery Challenge 2009 for graph-based tag recommendation.

international world wide web conferences | 2010

Factorizing personalized Markov chains for next-basket recommendation

Steffen Rendle; Christoph Freudenthaler; Lars Schmidt-Thieme

Recommender systems are an important component of many websites. Two of the most popular approaches are based on matrix factorization (MF) and Markov chains (MC). MF methods learn the general taste of a user by factorizing the matrix over observed user-item preferences. On the other hand, MC methods model sequential behavior by learning a transition graph over items that is used to predict the next action based on the recent actions of a user. In this paper, we present a method bringing both approaches together. Our method is based on personalized transition graphs over underlying Markov chains. That means for each user an own transition matrix is learned - thus in total the method uses a transition cube. As the observations for estimating the transitions are usually very limited, our method factorizes the transition cube with a pairwise interaction model which is a special case of the Tucker Decomposition. We show that our factorized personalized MC (FPMC) model subsumes both a common Markov chain and the normal matrix factorization model. For learning the model parameters, we introduce an adaption of the Bayesian Personalized Ranking (BPR) framework for sequential basket data. Empirically, we show that our FPMC model outperforms both the common matrix factorization and the unpersonalized MC model both learned with and without factorization.

conference on recommender systems | 2011

MyMediaLite: a free recommender system library

Zeno Gantner; Steffen Rendle; Christoph Freudenthaler; Lars Schmidt-Thieme

MyMediaLite is a fast and scalable, multi-purpose library of recommender system algorithms, aimed both at recommender system researchers and practitioners. It addresses two common scenarios in collaborative filtering: rating prediction (e.g. on a scale of 1 to 5 stars) and item prediction from positive-only implicit feedback (e.g. from clicks or purchase actions). The library offers state-of-the-art algorithms for those two tasks. Programs that expose most of the librarys functionality, plus a GUI demo, are included in the package. Efficient data structures and a common API are used by the implemented algorithms, and may be used to implement further algorithms. The API also contains methods for real-time updates and loading/storing of already trained recommender models. MyMediaLite is free/open source software, distributed under the terms of the GNU General Public License (GPL). Its methods have been used in four different industrial field trials of the MyMedia project, including one trial involving over 50,000 households.

international acm sigir conference on research and development in information retrieval | 2011

Fast context-aware recommendations with factorization machines

Steffen Rendle; Zeno Gantner; Christoph Freudenthaler; Lars Schmidt-Thieme

The situation in which a choice is made is an important information for recommender systems. Context-aware recommenders take this information into account to make predictions. So far, the best performing method for context-aware rating prediction in terms of predictive accuracy is Multiverse Recommendation based on the Tucker tensor factorization model. However this method has two drawbacks: (1) its model complexity is exponential in the number of context variables and polynomial in the size of the factorization and (2) it only works for categorical context variables. On the other hand there is a large variety of fast but specialized recommender methods which lack the generality of context-aware methods. We propose to apply Factorization Machines (FMs) to model contextual information and to provide context-aware rating predictions. This approach results in fast context-aware recommendations because the model equation of FMs can be computed in linear time both in the number of context variables and the factorization size. For learning FMs, we develop an iterative optimization method that analytically finds the least-square solution for one parameter given the other ones. Finally, we show empirically that our approach outperforms Multiverse Recommendation in prediction quality and runtime.

Ai Communications | 2008

Tag recommendations in social bookmarking systems

Leandro Balby Marinho; Andreas Hotho; Lars Schmidt-Thieme; Gerd Stumme

Collaborative tagging systems allow users to assign keywords - so called “tags” - to resources. Tags are used for navigation, finding resources and serendipitous browsing and thus provide an immediate benefit for users. These systems usually include tag recommendation mechanisms easing the process of finding good tags for a resource, but also consolidating the tag vocabulary across users. In practice, however, only very basic recommendation strategies are applied. In this paper we evaluate and compare several recommendation algorithms on large-scale real life datasets: an adaptation of user-based collaborative filtering, a graph-based recommender built on top of the FolkRank algorithm, and simple methods based on counting tag occurrences. We show that both FolkRank and collaborative filtering provide better results than non-personalized baseline methods. Moreover, since methods based on counting tag occurrences are computationally cheap, and thus usually preferable for real time scenarios, we discuss simple approaches for improving the performance of such methods. We show, how a simple recommender based on counting tags from users and resources can perform almost as good as the best recommender.

international conference on data mining | 2010

Learning Attribute-to-Feature Mappings for Cold-Start Recommendations

Zeno Gantner; Lucas Drumond; Christoph Freudenthaler; Steffen Rendle; Lars Schmidt-Thieme

Cold-start scenarios in recommender systems are situations in which no prior events, like ratings or clicks, are known for certain users or items. To compute predictions in such cases, additional information about users (user attributes, e.g. gender, age, geographical location, occupation) and items (item attributes, e.g. genres, product categories, keywords) must be used. We describe a method that maps such entity (e.g. user or item) attributes to the latent features of a matrix (or higher-dimensional) factorization model. With such mappings, the factors of a MF model trained by standard techniques can be applied to the new-user and the new-item problem, while retaining its advantages, in particular speed and predictive accuracy. We use the mapping concept to construct an attribute-aware matrix factorization model for item recommendation from implicit, positive-only feedback. Experiments on the new-item problem show that this approach provides good predictive accuracy, while the prediction time only grows by a constant factor.

conference on information and knowledge management | 2004

Taxonomy-driven computation of product recommendations

Cai-Nicolas Ziegler; Georg Lausen; Lars Schmidt-Thieme

Recommender systems have been subject to an enormous rise in popularity and research interest over the last ten years. At the same time, very large taxonomies for product classification are becoming increasingly prominent among e-commerce systems for diverse domains, rendering detailed machine-readable content descriptions feasible. Amazon.com makes use of an entire plethora of hand-crafted taxonomies classifying books, movies, apparel, and various other goods. We exploit such taxonomic background knowledge for the computation of personalized recommendations. Hereby, relationships between super-concepts and sub-concepts constitute an important cornerstone of our novel approach, providing powerful inference opportunities for profile generation based upon the classification of products that customers have chosen. Ample empirical analysis, both offline and online, demonstrates our proposals superiority over common existing approaches when user information is sparse and implicit ratings prevail.

conference on recommender systems | 2008

Online-updating regularized kernel matrix factorization models for large-scale recommender systems

Steffen Rendle; Lars Schmidt-Thieme

Regularized matrix factorization models are known to generate high quality rating predictions for recommender systems. One of the major drawbacks of matrix factorization is that once computed, the model is static. For real-world applications dynamic updating a model is one of the most important tasks. Especially when ratings on new users or new items come in, updating the feature matrices is crucial. In this paper, we generalize regularized matrix factorization (RMF) to regularized kernel matrix factorization (RKMF). Kernels provide a flexible method for deriving new matrix factorization methods. Furthermore with kernels nonlinear interactions between feature vectors are possible. We propose a generic method for learning RKMF models. From this method we derive an online-update algorithm for RKMF models that allows to solve the new-user/new-item problem. Our evaluation indicates that our proposed online-update methods are accurate in approximating a full retrain of a RKMF model while the runtime of online-updating is in the range of milliseconds even for huge datasets like Netflix.

Archive | 2008

Data Analysis, Machine Learning, and Applications

Christine Preisach; Hans Burkhardt; Lars Schmidt-Thieme; Reinhold Decker

We consider distance-based similarity measures for real-valued vectors of interest in kernel-based machine learning algorithms. In particular, a truncated Euclidean similarity measure and a self-normalized similarity measure related to the Canberra distance. It is proved that they are positive semi-definite (p.s.d.), thus facilitating their use in kernel-based methods, like the Support Vector Machine, a very popular machine learning tool. These kernels may be better suited than standard kernels (like the RBF) in certain situations, that are described in the paper. Some rather general results concerning positivity properties are presented in detail as well as some interesting ways of proving the p.s.d. property.

international symposium on neural networks | 2010

Cost-sensitive learning methods for imbalanced data

Nguyen Thai-Nghe; Zeno Gantner; Lars Schmidt-Thieme

Class imbalance is one of the challenging problems for machine learning algorithms. When learning from highly imbalanced data, most classifiers are overwhelmed by the majority class examples, so the false negative rate is always high. Although researchers have introduced many methods to deal with this problem, including resampling techniques and cost-sensitive learning (CSL), most of them focus on either of these techniques. This study presents two empirical methods that deal with class imbalance using both resampling and CSL. The first method combines and compares several sampling techniques with CSL using support vector machines (SVM). The second method proposes using CSL by optimizing the cost ratio (cost matrix) locally. Our experimental results on 18 imbalanced datasets from the UCI repository show that the first method can reduce the misclassification costs, and the second method can improve the classifier performance.

Explore More