Thomas Verbraken | Researchain

Publication

Featured research published by Thomas Verbraken.

IEEE Transactions on Software Engineering | 2013

Toward Comprehensible Software Fault Prediction Models Using Bayesian Network Classifiers

Karel Dejaeger; Thomas Verbraken; Bart Baesens

Software testing is a crucial activity during software development, and fault prediction models assist practitioners by identifying faulty software code up front, drawing upon the machine learning literature. While the Naive Bayes classifier in particular is often applied in this regard, with predictive performance and comprehensibility cited as its major strengths, a number of alternative Bayesian algorithms that boost the possibility of constructing simpler networks with fewer nodes and arcs remain unexplored. This study contributes to the literature by considering 15 different Bayesian Network (BN) classifiers and comparing them to other popular machine learning techniques. Furthermore, the applicability of the Markov blanket principle for feature selection, which is a natural extension to BN theory, is investigated. The results, both in terms of the AUC and the recently introduced H-measure, are rigorously tested using the statistical framework of Demšar. It is concluded that simple and comprehensible networks with fewer nodes can be constructed using BN classifiers other than the Naive Bayes classifier. Furthermore, it is found that comprehensibility and predictive performance need to be balanced, and that the development context should also be taken into account during model selection.
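As a rough illustration of the Markov blanket principle mentioned in this abstract: in a directed acyclic graph, the Markov blanket of a node consists of its parents, its children, and the other parents of its children. The sketch below is not the authors' implementation; the function name `markov_blanket`, the dictionary-based graph encoding, and the toy variables are all assumptions made for illustration.

```python
def markov_blanket(parents, node):
    """Return the Markov blanket of `node` in a DAG.

    `parents` maps every variable to the set of its parents
    (a hypothetical encoding chosen for this sketch).
    """
    node_parents = set(parents.get(node, ()))
    # Children are the variables that list `node` among their parents.
    children = {child for child, ps in parents.items() if node in ps}
    # Spouses are the other parents of those children.
    spouses = set()
    for child in children:
        spouses |= set(parents[child])
    return (node_parents | children | spouses) - {node}


# Hypothetical toy network; "faulty" plays the role of the class node.
toy_parents = {
    "faulty": {"complexity"},
    "loc": {"faulty"},
    "churn_rate": {"faulty", "team_size"},
    "comments": {"loc"},
    "complexity": set(),
    "team_size": set(),
}

selected = markov_blanket(toy_parents, "faulty")
print(sorted(selected))  # ['churn_rate', 'complexity', 'loc', 'team_size']
```

Used as a feature selector, only the variables in the blanket of the class node would be kept; here "comments" is dropped because it is separated from "faulty" by "loc".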

IEEE Transactions on Knowledge and Data Engineering | 2013

A Novel Profit Maximizing Metric for Measuring Classification Performance of Customer Churn Prediction Models

Thomas Verbraken; Wouter Verbeke; Bart Baesens

Interest in data mining techniques has increased tremendously during the past decades, and numerous classification techniques have been applied in a wide range of business applications. Hence, the need for adequate performance measures has become more important than ever. In this paper, a cost-benefit analysis framework is formalized in order to define performance measures which are aligned with the main objective of the end users, i.e., profit maximization. A new performance measure is defined, the expected maximum profit criterion. This general framework is then applied to the customer churn problem with its particular cost-benefit structure. The advantage of this approach is that it assists companies in selecting the classifier which maximizes the profit. Moreover, it aids practical implementation in the sense that it provides guidance about the fraction of the customer base to be included in the retention campaign.
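The idea of choosing a cutoff by profit rather than by a statistical criterion can be sketched as follows. This does not reproduce the paper's expected maximum profit criterion; it uses a fixed, hypothetical `benefit` for targeting a true churner and a fixed, hypothetical `cost` for targeting a non-churner, and simply scans every cutoff. The function name `max_profit` and the example data are assumptions.

```python
def max_profit(scores, labels, benefit, cost):
    """Scan all cutoffs and return (best average profit per customer,
    fraction of the customer base targeted at that cutoff).

    labels: 1 = churner, 0 = non-churner.
    benefit: hypothetical net gain from targeting a true churner.
    cost: hypothetical net loss from targeting a non-churner.
    """
    n = len(scores)
    # Rank customers from highest to lowest churn score.
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    best_profit, best_fraction = 0.0, 0.0  # targeting nobody yields zero profit
    running = 0.0
    for k, (score, label) in enumerate(ranked, start=1):
        running += benefit if label == 1 else -cost
        # Only evaluate where the score changes, so tied scores stay together.
        is_boundary = k == n or ranked[k][0] != score
        if is_boundary and running / n > best_profit:
            best_profit, best_fraction = running / n, k / n
    return best_profit, best_fraction


# Hypothetical scores from some churn classifier, with true labels.
scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.1]
labels = [1, 1, 0, 1, 0, 0]
profit, fraction = max_profit(scores, labels, benefit=10.0, cost=2.0)
print(round(profit, 3), round(fraction, 3))  # 4.667 0.667
```

With these made-up numbers the best cutoff targets the four highest-scoring customers, which is the kind of guidance on campaign size that the abstract describes. Two classifiers could be compared by the profit each achieves at its own best cutoff.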

European Journal of Operational Research | 2014

Development and application of consumer credit scoring models using profit-based classification measures

Thomas Verbraken; Cristián Bravo; Richard Weber; Bart Baesens

This paper presents a new approach for consumer credit scoring, by tailoring a profit-based classification performance measure to credit risk modeling. This performance measure takes into account the expected profits and losses of credit granting and thereby better aligns the model developers’ objectives with those of the lending company. It is based on the Expected Maximum Profit (EMP) measure and is used to find a trade-off between the expected losses, driven by the exposure of the loan and the loss given default, and the operational income given by the loan. Additionally, one of the major advantages of the proposed measure is that it allows the optimal cutoff value, which is necessary for model implementation, to be calculated. To test the proposed approach, we use a dataset of loans granted by a government institution, and benchmark the accuracy and monetary gain of using EMP, accuracy, and the area under the ROC curve as measures for selecting model parameters and for determining the respective cutoff values. The results show that our proposed profit-based classification measure outperforms the alternative approaches in terms of both accuracy and monetary value in the test set, and that it facilitates model deployment.

European Journal of Operational Research | 2012

A new SOM-based method for profile generation: theory and an application in direct marketing

Alex Seret; Thomas Verbraken; Sébastien Versailles; Bart Baesens

The field of direct marketing is constantly searching for new data mining techniques in order to analyze the increasing amount of available data. Self-organizing maps (SOM) have been widely applied and discussed in the literature, since they make it possible to reduce the complexity of a high-dimensional attribute space while providing a powerful visual exploration facility. Combined with clustering techniques and the extraction of the so-called salient dimensions, they allow a direct marketer to gain high-level insight into a dataset of prospects. In this paper, a SOM-based profile generator is presented, consisting of a generic method leading to value-adding and business-oriented profiles for targeting individuals with predefined characteristics. Moreover, the proposed method is applied in detail to a concrete case study from the concert industry. The performance of the method is then illustrated and discussed, and possible future research tracks are outlined.

Applied Soft Computing | 2015

Profit-based feature selection using support vector machines - General framework and an application for customer retention

Sebastián Maldonado; Álvaro Flores; Thomas Verbraken; Bart Baesens; Richard Weber

Highlights: A novel profit-based feature selection method for churn prediction with SVM is presented. A backward elimination algorithm is performed to maximize the profit of a retention campaign. Our experiments on churn prediction datasets underline the potential of the proposed approaches.

Churn prediction is an important application of classification models that identify those customers most likely to attrite based on their respective characteristics, described by, e.g., socio-demographic and behavioral variables. Since more and more such features are nowadays captured and stored in the respective computational systems, appropriate handling of the resulting information overload becomes a highly relevant issue when it comes to building customer retention systems based on churn prediction models. As a consequence, feature selection is an important step of the classifier construction process. Most feature selection techniques, however, are based on statistically inspired validation criteria, which do not necessarily lead to models that optimize goals specified by the respective organization. In this paper we propose a profit-driven approach for classifier construction and simultaneous variable selection based on support vector machines. Experimental results show that our models outperform conventional techniques for feature selection, achieving superior performance with respect to business-related goals.

Intelligent Data Analysis | 2014

Profit optimizing customer churn prediction with Bayesian network classifiers

Thomas Verbraken; Wouter Verbeke; Bart Baesens

Customer churn prediction is becoming an increasingly important business analytics problem for telecom operators. In order to increase the efficiency of customer retention campaigns, churn prediction models need to be accurate as well as compact and interpretable. Although a myriad of techniques for churn prediction have been examined, little attention has been paid to the use of Bayesian Network classifiers. This paper investigates the predictive power of a number of Bayesian Network algorithms, ranging from the Naive Bayes classifier to General Bayesian Network classifiers. Furthermore, a feature selection method based on the concept of the Markov Blanket, which is genuinely related to Bayesian Networks, is tested. The performance of the classifiers is evaluated with both the Area under the Receiver Operating Characteristic Curve and the recently introduced Maximum Profit criterion. The Maximum Profit criterion performs an intelligent optimization by targeting the fraction of the customer base which would maximize the profit generated by a retention campaign. The results of the experiments are rigorously tested and indicate that most of the analyzed techniques have comparable performance. Some methods, however, are preferable since they lead to compact networks, which enhances the interpretability and comprehensibility of the churn prediction models.

Applied Soft Computing | 2014

A new knowledge-based constrained clustering approach

Alex Seret; Thomas Verbraken; Bart Baesens

Highlights: This paper proposes a straightforward way to apply constrained clustering. Soft attribute-level constraints are generated based on feature order preferences. Practitioners can formalize and use their a priori knowledge. A methodology implementing this approach is applied in a direct marketing context.

Clustering has always been an exploratory but critical step in the knowledge discovery process. Often unsupervised, the clustering task has received considerable interest when reinforced by different kinds of inputs provided by the user. This paper presents an approach that makes it possible to incorporate business knowledge in order to guide the clustering algorithm. A formalization of the fact that an intuitive a priori prioritization of the variables might exist is presented in this paper and applied in a direct marketing context using recent data. By providing the analyst with a new approach offering different clustering perspectives, this paper proposes a straightforward way to apply constrained clustering with soft attribute-level constraints based on feature order preferences.

Workshop on E-Business | 2011

Using Social Network Classifiers for Predicting E-Commerce Adoption

Thomas Verbraken; Frank Goethals; Wouter Verbeke; Bart Baesens

This paper indicates that knowledge about a person’s social network is valuable for predicting the intent to purchase books and computers online. Data was gathered about a network of 681 persons and their intent to buy products online. Results of a range of networked classification techniques are compared with the predictive power of logistic regression. This comparison indicates that information about a person’s social network is more valuable for predicting that person’s intent to buy online than the person’s own characteristics, such as age, gender, intensity of computer use, and enjoyment when working with the computer.

Decision Support Systems | 2014