Rob Potharst
Erasmus University Rotterdam
Publications
Featured research published by Rob Potharst.
SIGKDD Explorations | 2002
Rob Potharst; Ad Feelders
In many classification problems with ordinal attributes, the class attribute should increase with some or all of the explanatory attributes. These are called classification problems with monotonicity constraints. Classical decision tree algorithms such as CART or C4.5 generally do not produce monotone trees, even if the dataset is completely monotone. This paper surveys the methods that have so far been proposed for generating decision trees that satisfy monotonicity constraints. A distinction is made between methods that work only for monotone datasets and methods that work for monotone and non-monotone datasets alike.
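The monotonicity constraint itself is easy to state in code. The sketch below is an illustration of the constraint, not code from the paper: it checks whether a labeled dataset is monotone, i.e. whenever one example dominates another componentwise, its class label is at least as large.

```python
import numpy as np

def is_monotone(X, y):
    """Check the monotonicity constraint on a labeled dataset.

    X : (n, d) array of ordinal attribute values
    y : (n,) array of ordinal class labels

    The dataset is monotone iff for every pair i, j with
    X[i] <= X[j] componentwise, we also have y[i] <= y[j].
    """
    X, y = np.asarray(X), np.asarray(y)
    for i in range(len(X)):
        for j in range(len(X)):
            if np.all(X[i] <= X[j]) and y[i] > y[j]:
                return False  # j dominates i but has a smaller label
    return True

# A monotone toy dataset: the label never decreases as attributes increase.
X = [[1, 1], [1, 2], [2, 2]]
print(is_monotone(X, [0, 1, 1]))  # True
print(is_monotone(X, [1, 0, 1]))  # False: (1,2) dominates (1,1) but 0 < 1
```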
European Journal of Operational Research | 2007
Michiel van Wezel; Rob Potharst
In this paper various ensemble learning methods from machine learning and statistics are considered and applied to the customer choice modeling problem. Ensemble learning usually improves the prediction quality of flexible models such as decision trees. We give experimental results for two real-life marketing datasets using decision trees, ensemble versions of decision trees, and the logistic regression model, which is a standard approach for this problem. The ensemble models are found to improve upon individual decision trees and to outperform logistic regression. Next, an additive decomposition of the prediction error of a model, the bias/variance decomposition, is considered. A model with a high bias lacks the flexibility to fit the data well; a high variance indicates that a model is unstable across different datasets. Decision trees have a high variance component and a low bias component in the prediction error, whereas logistic regression has a high bias component and a low variance component. It is shown that ensemble methods aim at minimizing the variance component of the prediction error while leaving the bias component unaltered. Bias/variance decompositions of all models for both customer choice datasets are given to illustrate these concepts.
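For squared loss, the bias/variance decomposition can be estimated empirically by retraining a model on many independently drawn training sets. The sketch below is illustrative only (the paper's experiments use real marketing data and classification models; all names and data here are invented): it contrasts a single decision tree with a bagged ensemble on synthetic regression data, where bagging typically shrinks the variance term while leaving the bias term roughly unchanged.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor
from sklearn.ensemble import BaggingRegressor

rng = np.random.default_rng(0)

def true_f(x):
    # Noise-free target the models try to recover.
    return np.sin(3 * x).ravel()

def bias_variance(make_model, n_sets=50, n=100):
    """Estimate squared bias and variance at fixed test points by
    retraining the model on many independently drawn training sets."""
    x_test = np.linspace(0, 2, 200).reshape(-1, 1)
    preds = []
    for _ in range(n_sets):
        x = rng.uniform(0, 2, size=(n, 1))
        y = true_f(x) + rng.normal(0, 0.3, size=n)
        preds.append(make_model().fit(x, y).predict(x_test))
    preds = np.array(preds)                       # (n_sets, n_test)
    avg = preds.mean(axis=0)                      # average prediction
    bias2 = np.mean((avg - true_f(x_test)) ** 2)  # squared bias term
    var = np.mean(preds.var(axis=0))              # variance term
    return bias2, var

print("single tree :", bias_variance(DecisionTreeRegressor))
print("bagged trees:", bias_variance(
    lambda: BaggingRegressor(DecisionTreeRegressor(), n_estimators=25)))
```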
Decision Support Systems | 2008
Dennis van Heijst; Rob Potharst; Michiel van Wezel
We create a support system for predicting end prices on eBay. The end price predictions are based on the item descriptions found in the item listings of eBay, and on some numerical item features. The system uses text mining and boosting algorithms from the field of machine learning. Our system substantially outperforms the naive method of predicting the category mean price. Moreover, interpretation of the model enables us to identify influential terms in the item descriptions and shows that the item description is more influential than the seller feedback rating, which was shown to be influential in earlier studies.
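A minimal sketch of this general approach, with hypothetical data and names (the paper's actual pipeline and feature set may differ): TF-IDF term features from the item description are combined with numeric item features and fed to a boosting regressor, whose feature importances can then point to influential terms.

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import Pipeline

# Hypothetical listing data: description text plus numeric item features.
listings = pd.DataFrame({
    "description": ["apple iphone 4 16gb unlocked mint",
                    "iphone 4 cracked screen for parts",
                    "sealed apple iphone 4 32gb"],
    "seller_feedback": [1250, 40, 980],
    "start_price": [100.0, 20.0, 150.0],
})
end_price = [310.0, 60.0, 400.0]

features = ColumnTransformer([
    ("text", TfidfVectorizer(), "description"),  # term features from text
    ("num", "passthrough", ["seller_feedback", "start_price"]),
])
model = Pipeline([("features", features),
                  ("boost", GradientBoostingRegressor())])
model.fit(listings, end_price)
print(model.predict(listings[:1]))  # predicted end price for first listing
```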
Decision Support Systems | 2006
Jedid-Jah Jonker; Nanda Piersma; Rob Potharst
Direct marketing firms want to convey their message as efficiently as possible in order to build a profitable long-term relationship with individual customers. Much attention has been paid to selecting addresses of existing customers and to identifying new profitable prospects; less attention has been paid to the optimal frequency of contacts with customers. We provide a decision support system that helps the direct mailer determine the mailing frequency for active customers. The system observes the mailing pattern of these customers in terms of the well-known R(ecency), F(requency) and M(onetary) variables. The underlying model is an optimization model for the frequency of direct mailings. The system provides the direct mailer with tools to define preferred response behavior and advises on the mailing strategy that will steer customers towards this behavior.
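The R, F and M variables the system observes can be computed directly from a transaction log. A small sketch with invented data (the optimization model itself is not reproduced here):

```python
import pandas as pd

# Hypothetical purchase log: one row per order.
tx = pd.DataFrame({
    "customer": ["a", "a", "b", "b", "b", "c"],
    "date": pd.to_datetime(["2006-01-10", "2006-03-02", "2005-11-20",
                            "2006-02-14", "2006-03-30", "2006-01-05"]),
    "amount": [25.0, 40.0, 10.0, 15.0, 30.0, 80.0],
})
now = pd.Timestamp("2006-04-01")

rfm = tx.groupby("customer").agg(
    recency=("date", lambda d: (now - d.max()).days),  # days since last order
    frequency=("date", "count"),                       # number of orders
    monetary=("amount", "sum"),                        # total spend
)
print(rfm)
```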
Engineering Applications of Artificial Intelligence | 2009
Rob Potharst; Arie Ben-David; Michiel van Wezel
Monotonicity constraints are very common in multi-attribute ordinal problems. Grinding wheel hardness selection, timely replacement of costly laser sensors in silicon wafer manufacturing, and the selection of the right personnel for sensitive production facilities are just a few examples of ordinal problems where monotonicity makes sense. In order to evaluate the performance of various ordinal classifiers, one needs both artificially generated and real-world data sets. Two algorithms are presented for generating monotone ordinal data sets. The first can be used for generating random monotone ordinal data sets without an underlying structure. The second algorithm, which is the main contribution of this paper, describes for the first time how structured monotone data sets can be generated.
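One simple way to generate an unstructured random monotone dataset (a sketch of the general idea, not necessarily the paper's exact procedure) is to label points one at a time, drawing each label uniformly from the interval permitted by the labels already assigned; monotonicity then holds by construction.

```python
import numpy as np

rng = np.random.default_rng(1)

def random_monotone_dataset(n=20, d=3, levels=4, classes=3):
    """Generate a random monotone ordinal dataset.

    Each new point's label is drawn uniformly from the range allowed
    by the already-labeled points it dominates or is dominated by,
    so the monotonicity constraint can never be violated.
    """
    X = rng.integers(0, levels, size=(n, d))
    y = np.empty(n, dtype=int)
    for i in range(n):
        lo, hi = 0, classes - 1
        for j in range(i):
            if np.all(X[j] <= X[i]):     # X[i] dominates X[j]
                lo = max(lo, y[j])
            if np.all(X[i] <= X[j]):     # X[j] dominates X[i]
                hi = min(hi, y[j])
        y[i] = rng.integers(lo, hi + 1)  # any label in [lo, hi] is safe
    return X, y

X, y = random_monotone_dataset()
print(X[:5], y[:5])
```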
International Symposium on Neural Networks | 1996
Jan C. Bioch; O. van der Meer; Rob Potharst
Previously, Bayesian methods have been proposed for neural networks to solve regression and classification problems. These methods claim to overcome some difficulties encountered in the standard approach, such as overfitting. However, implementing the full Bayesian approach to neural networks, as suggested in the literature, for classification problems is not easy; in fact, we are not aware of applications of the full approach to real-world classification problems. In this paper we discuss how the Bayesian framework can improve the predictive performance of neural networks. We demonstrate the effects of this approach with an implementation of the full Bayesian framework applied to two real-world classification problems. We also discuss the idea of calibration to measure the predictive performance.
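The idea of calibration mentioned here can be made concrete with a reliability check: group predictions by their stated probability and compare with the observed frequency of the positive class. A minimal sketch (our illustration, not the paper's implementation):

```python
import numpy as np

def reliability(probs, labels, bins=10):
    """Compare predicted probabilities with observed frequencies.

    A well-calibrated classifier's predictions of, say, 0.8 should
    be correct about 80% of the time.
    """
    probs, labels = np.asarray(probs), np.asarray(labels)
    edges = np.linspace(0, 1, bins + 1)
    for k in range(bins):
        mask = (probs >= edges[k]) & (probs < edges[k + 1])
        if mask.any():
            print(f"predicted {edges[k]:.1f}-{edges[k+1]:.1f}: "
                  f"observed {labels[mask].mean():.2f} (n={mask.sum()})")

# Toy data: labels drawn with exactly the stated probabilities,
# so the classifier is perfectly calibrated by construction.
rng = np.random.default_rng(0)
p = rng.uniform(0, 1, 1000)
y = (rng.uniform(0, 1, 1000) < p).astype(int)
reliability(p, y)
```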
Intelligent Data Analysis | 1999
Rob Potharst; Jan C. Bioch
In many classification problems the domains of the attributes and the classes are linearly ordered. For such problems the classification rule often needs to be order-preserving, or monotone as we call it. Since the known decision tree methods generate non-monotone trees, these methods are not suitable for monotone classification problems. We provide an order-preserving tree-generation algorithm for multi-attribute classification problems with k linearly ordered classes, and an algorithm for repairing non-monotone decision trees. The performance of these algorithms is tested on random monotone datasets.
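An order-preserving classifier consistent with a monotone dataset always exists. A standard construction (shown below as an illustration, not the tree algorithm of the paper) is the minimal monotone extension, which assigns to a query point the largest label among the training points it dominates:

```python
import numpy as np

def minimal_extension(X, y, query):
    """Smallest monotone classifier consistent with a monotone dataset:
    return the largest label among training points dominated by `query`
    (the smallest observed class if it dominates none)."""
    X, y = np.asarray(X), np.asarray(y)
    dominated = np.all(X <= np.asarray(query), axis=1)
    return y[dominated].max() if dominated.any() else y.min()

X = [[1, 1], [1, 2], [2, 2]]  # a monotone dataset
y = [0, 1, 2]
print(minimal_extension(X, y, [2, 1]))  # 0: only (1,1) is dominated
print(minimal_extension(X, y, [3, 3]))  # 2: all points are dominated
```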
European Conference on Principles of Data Mining and Knowledge Discovery | 1997
Jan C. Bioch; Onno van der Meer; Rob Potharst
Decision tree methods constitute an important and much-used technique for classification problems. When such trees are used in a data mining and knowledge discovery context, ease of interpretation of the resulting trees is an important requirement. Decision trees with tests based on a single variable, as produced by methods such as ID3 and C4.5, often require a large number of tests to achieve an acceptable accuracy; this undermines the interpretability that is an important reason for their use. Recently, a number of methods for constructing decision trees with multivariate tests have been presented. Multivariate decision trees are often smaller and more accurate than univariate trees; however, the use of linear combinations of the variables may result in trees that are hard to interpret. In this paper we consider trees with tests based on combinations of at most two variables. We show that bivariate decision trees are an interesting alternative to both univariate and multivariate trees, especially with regard to ease of interpretation.
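A node test in such a tree is a linear inequality over just two variables. The sketch below is our illustration, not the paper's split-search procedure: for a two-class sample it finds the pair of features whose best linear combination separates the classes most accurately, which could serve as a candidate bivariate test.

```python
import numpy as np
from itertools import combinations
from sklearn.linear_model import LogisticRegression

def best_bivariate_test(X, y):
    """Search all feature pairs for the two-variable linear test that
    best separates two classes; returns (pair, weights, bias)."""
    X, y = np.asarray(X, float), np.asarray(y)
    best = None
    for i, j in combinations(range(X.shape[1]), 2):
        clf = LogisticRegression().fit(X[:, [i, j]], y)
        acc = clf.score(X[:, [i, j]], y)
        if best is None or acc > best[0]:
            best = (acc, (i, j), clf.coef_[0].copy(), clf.intercept_[0])
    _, pair, w, b = best
    return pair, w, b

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 4))
y = (X[:, 1] + X[:, 3] > 0).astype(int)  # class depends on two variables
pair, w, b = best_bivariate_test(X, y)
print(f"test: {w[0]:.2f}*x{pair[0]} + {w[1]:.2f}*x{pair[1]} + {b:.2f} > 0")
```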
International Conference on Mathematics of Neural Networks: Models, Algorithms and Applications | 1997
Jan C. Bioch; Robert Carsouw; Rob Potharst
Linear decision tree classifiers and LVQ-networks divide the input space into convex regions that can be represented by membership functions. These functions are then used to determine the weights of the first layer of a feedforward network. Subject classification: AMS(MOS) 68T05, 92B20
International Symposium on Neural Networks | 1995
Jan C. Bioch; Robert Carsouw; Rob Potharst
In this paper we show that the convex regions induced by a decision tree with linear decision functions cannot be represented by linear membership functions, as has been suggested in the literature. It appears that a faithful representation is only possible for subregions. We derive explicit expressions for the membership functions of these subregions. This approximation can be used to initialise a one-hidden-layer neural net.
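The initialization idea can be sketched as follows: give each hidden unit the hyperplane of one internal tree node, scaled so that its sigmoid acts as a soft version of the node test. This is our illustration of the general construction with invented hyperplanes; the paper derives the exact membership functions.

```python
import numpy as np

def init_hidden_from_tree(hyperplanes, steepness=5.0):
    """Initialize a one-hidden-layer net from a linear decision tree.

    Each internal node tests w.x + b > 0; every hidden unit receives
    one of these hyperplanes, scaled so its sigmoid approximates the
    hard node test.  (Sketch of the idea, not the paper's exact
    construction.)
    """
    W = steepness * np.array([w for w, b in hyperplanes])  # (h, d) weights
    b = steepness * np.array([b for w, b in hyperplanes])  # (h,) biases
    return W, b

def hidden_layer(x, W, b):
    return 1.0 / (1.0 + np.exp(-(W @ x + b)))  # soft versions of node tests

# Two hypothetical tree-node hyperplanes in a 2-D input space.
planes = [(np.array([1.0, -1.0]), 0.0), (np.array([0.5, 0.5]), -1.0)]
W, b = init_hidden_from_tree(planes)
print(hidden_layer(np.array([2.0, 0.0]), W, b))
```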