Koen W. De Bock | Researchain

Archive Network Publication Hotspot Collaboration

Network

Latest external collaboration on country level. Dive into details by clicking on the dots.

Explore More

Hotspot

Dive into the research topics where Koen W. De Bock is active.

Explore More

Publication

Featured researches published by Koen W. De Bock.

Expert Systems With Applications | 2011

An empirical evaluation of rotation-based ensemble classifiers for customer churn prediction

Koen W. De Bock; Dirk Van den Poel

Several studies have demonstrated the superior performance of ensemble classification algorithms, whereby multiple member classifiers are combined into one aggregated and powerful classification model, over single models. In this paper, two rotation-based ensemble classifiers are proposed as modeling techniques for customer churn prediction. In Rotation Forests, feature extraction is applied to feature subsets in order to rotate the input data for training base classifiers, while RotBoost combines Rotation Forest with AdaBoost. In an experimental validation based on data sets from four real-life customer churn prediction projects, Rotation Forest and RotBoost are compared to a set of well-known benchmark classifiers. Moreover, variations of Rotation Forest and RotBoost are compared, implementing three alternative feature extraction algorithms: principal component analysis (PCA), independent component analysis (ICA) and sparse random projections (SRP). The performance of rotation-based ensemble classifier is found to depend upon: (i) the performance criterion used to measure classification performance, and (ii) the implemented feature extraction algorithm. In terms of accuracy, RotBoost outperforms Rotation Forest, but none of the considered variations offers a clear advantage over the benchmark algorithms. However, in terms of AUC and top-decile lift, results clearly demonstrate the competitive performance of Rotation Forests compared to the benchmark algorithms. Moreover, ICA-based Rotation Forests outperform all other considered classifiers and are therefore recommended as a well-suited alternative classification technique for the prediction of customer churn that allows for improved marketing decision making.

Computational Statistics & Data Analysis | 2010

Ensemble classification based on generalized additive models

Koen W. De Bock; Kristof Coussement; Dirk Van den Poel

Generalized additive models (GAMs) are a generalization of generalized linear models (GLMs) and constitute a powerful technique which has successfully proven its ability to capture nonlinear relationships between explanatory variables and a response variable in many domains. In this paper, GAMs are proposed as base classifiers for ensemble learning. Three alternative ensemble strategies for binary classification using GAMs as base classifiers are proposed: (i) GAMbag based on Bagging, (ii) GAMrsm based on the Random Subspace Method (RSM), and (iii) GAMens as a combination of both. In an experimental validation performed on 12 data sets from the UCI repository, the proposed algorithms are benchmarked to a single GAM and to decision tree based ensemble classifiers (i.e. RSM, Bagging, Random Forest, and the recently proposed Rotation Forest). From the results a number of conclusions can be drawn. Firstly, the use of an ensemble of GAMs instead of a single GAM always leads to improved prediction performance. Secondly, GAMrsm and GAMens perform comparably, while both versions outperform GAMbag. Finally, the value of using GAMs as base classifiers in an ensemble instead of standard decision trees is demonstrated. GAMbag demonstrates performance comparable to ordinary Bagging. Moreover, GAMrsm and GAMens outperform RSM and Bagging, while these two GAM ensemble variations perform comparably to Random Forest and Rotation Forest. Sensitivity analyses are included for the number of member classifiers in the ensemble, the number of variables included in a random feature subspace and the number of degrees of freedom for GAM spline estimation.

Expert Systems With Applications | 2012

Reconciling performance and interpretability in customer churn prediction using ensemble learning based on generalized additive models

Koen W. De Bock; Dirk Van den Poel

To build a successful customer churn prediction model, a classification algorithm should be chosen that fulfills two requirements: strong classification performance and a high level of model interpretability. In recent literature, ensemble classifiers have demonstrated superior performance in a multitude of applications and data mining contests. However, due to an increased complexity they result in models that are often difficult to interpret. In this study, GAMensPlus, an ensemble classifier based upon generalized additive models (GAMs), in which both performance and interpretability are reconciled, is presented and evaluated in a context of churn prediction modeling. The recently proposed GAMens, based upon Bagging, the Random Subspace Method and semi-parametric GAMs as constituent classifiers, is extended to include two instruments for model interpretability: generalized feature importance scores, and bootstrap confidence bands for smoothing splines. In an experimental comparison on data sets of six real-life churn prediction projects, the competitive performance of the proposed algorithm over a set of well-known benchmark algorithms is demonstrated in terms of four evaluation metrics. Further, the ability of the technique to deliver valuable insight into the drivers of customer churn is illustrated in a case study on data from a European bank. Firstly, it is shown how the generalized feature importance scores allow the analyst to identify the relative importance of churn predictors in function of the criterion that is used to measure the quality of the model predictions. Secondly, the ability of GAMensPlus to identify nonlinear relationships between predictors and churn probabilities is demonstrated.

international conference industrial engineering other applications applied intelligent systems | 2010

Ensembles of probability estimation trees for customer churn prediction

Koen W. De Bock; Dirk Van den Poel

Customer churn prediction is one of the most important elements of a companys Customer Relationship Management (CRM) strategy. In this study, two strategies are investigated to increase the lift performance of ensemble classification models, i.e. (i) using probability estimation trees (PETs) instead of standard decision trees as base classifiers, and (ii) implementing alternative fusion rules based on lift weights for the combination of ensemble members outputs. Experiments are conducted for four popular ensemble strategies on five real-life churn data sets. In general, the results demonstrate how lift performance can be substantially improved by using alternative base classifiers and fusion rules. However, the effect varies for the different ensemble strategies. In particular, the results indicate an increase of lift performance of (i) Bagging by implementing C4.4 base classifiers, (ii) the Random Subspace Method (RSM) by using lift-weighted fusion rules, and (iii) AdaBoost by implementing both.

European Journal of Operational Research | 2018

A new hybrid classification algorithm for customer churn prediction based on logistic regression and decision trees

Arno De Caigny; Kristof Coussement; Koen W. De Bock

Abstract Decision trees and logistic regression are two very popular algorithms in customer churn prediction with strong predictive performance and good comprehensibility. Despite these strengths, decision trees tend to have problems to handle linear relations between variables and logistic regression has difficulties with interaction effects between variables. Therefore a new hybrid algorithm, the logit leaf model (LLM), is proposed to better classify data. The idea behind the LLM is that different models constructed on segments of the data rather than on the entire dataset lead to better predictive performance while maintaining the comprehensibility from the models constructed in the leaves. The LLM consists of two stages: a segmentation phase and a prediction phase. In the first stage customer segments are identified using decision rules and in the second stage a model is created for every leaf of this tree. This new hybrid approach is benchmarked against decision trees, logistic regression, random forests and logistic model trees with regards to the predictive performance and comprehensibility. The area under the receiver operating characteristics curve (AUC) and top decile lift (TDL) are used to measure the predictive performance for which LLM scores significantly better than its building blocks logistic regression and decision trees and performs at least as well as more advanced ensemble methods random forests and logistic model trees. Comprehensibility is addressed by a case study for which we observe some key benefits using the LLM compared to using decision trees or logistic regression.

european conference on information systems | 2015

Maximize What Matters: Predicting Customer Churn With Decision-Centric Ensemble Selection.

Annika Baumann; Stefan Lessmann; Kristof Coussement; Koen W. De Bock

Churn modeling is important to sustain profitable customer relationships in saturated consumer markets. A churn model predicts the likelihood of customer defection. This helps to target retention offers to the right customers and use marketing resources efficiently. Several statistical prediction methods exist in marketing, but all these suffer an important limitation: they do not allow the analyst to account for campaign planning objectives and constraints during model building. Our key proposition is that creating churn models in awareness of actual business requirements increases the performance of the final model for marketing decision support. To demonstrate this, we propose a decision-centric framework to create churn models. We test our modeling framework on eight real-life churn data sets and find that it performs significantly better than state-of-the-art churn models. We estimate that our approach increases the per customer profits of retention campaigns by

European Journal of Operational Research | 2018

A framework for configuring collaborative filtering-based recommendations derived from purchase data

Stijn Geuens; Kristof Coussement; Koen W. De Bock

.47 on average. Further analysis confirms that this improvement comes directly from maximizing business objectives during model building. The main implication of our study is thus that companies better shift from a purely statistical to a more business-driven modeling approach when predicting customer churn.

Expert Systems With Applications | 2017

The best of two worlds: Balancing model strength and comprehensibility in business failure prediction using spline-rule ensembles

Koen W. De Bock

Abstract This study proposes a decision support framework to help e-commerce companies select the best collaborative filtering algorithms (CF) for generating recommendations on the basis of online binary purchase data. To create this framework, an experimental design applies several CF configurations, which are characterized by different data-reduction techniques, CF methods, and similarity measures, to binary purchase data sets with distinct input data characteristics, i.e., sparsity level, purchase distribution, and item–user ratio. The evaluations in terms of accuracy, diversity, computation time, and trade-offs among these metrics reveal that the best-performing algorithm in terms of accuracy remains consistent regardless of the input-data characteristics. However, for diversity and computation time, the best-performing model varies with the input characteristics. This framework allows e-commerce companies to decide on the optimal CF configuration as a function of their specific binary purchase data sets. They also gain insight into the impact of changes in the input data set on the preferred algorithm configuration.

intelligent data analysis | 2010

Predicting Website Audience Demographics forWeb Advertising Targeting Using Multi-Website Clickstream Data

Koen W. De Bock; Dirk Van den Poel

Numerous organizations and companies rely upon business failure prediction to assess and minimize the risk of initiating business relationships with partners, clients, debtors or suppliers. Advances in research on business failure prediction have been largely dominated by algorithmic development and comparisons led by a focus on improvements in model accuracy. In this context, ensemble learning has recently emerged as a class of particularly well-performing methods, albeit often at the expense of increased model complexity. However, in practice, model choice is rarely based on predictive performance alone. Models should be comprehensible and justifiable to assess their compliance with common sense and business logic, and guarantee their acceptance throughout the organization. A promising ensemble classification algorithm that has been shown to reconcile performance and comprehensibility are rule ensembles. In this study, an extension entitled spline-rule ensembles is introduced and validated in the domain of business failure prediction. Spline-rule ensemble complement rules and linear terms found in conventional rule ensembles with smooth functions with the aim of better accommodating nonlinear simple effects of individual features on business failure. Experiments on a large selection of 21 datasets of European companies in various sectors and countries (i) demonstrate superior predictive performance of spline-rule ensembles over a set of well-established yet powerful benchmark methods, (ii) show the superiority of spline-rule ensembles over conventional rule ensembles and thus demonstrate the value of the incorporation of smoothing splines, (iii) investigate the impact of alternative term regularization procedures and (iv) illustrate the comprehensibility of the resulting models through a case study. In particular, the ability of the technique to reveal the extent and the way in which predictors impact business failure, and if and how variables interact, are exemplified.

Journal of Business Research | 2013