Stijn Viaene | Researchain

Publication

Featured research published by Stijn Viaene.

Machine Learning | 2004

Benchmarking Least Squares Support Vector Machine Classifiers

Tony Van Gestel; Johan A. K. Suykens; Bart Baesens; Stijn Viaene; Jan Vanthienen; Guido Dedene; Bart De Moor; Joos Vandewalle

In Support Vector Machines (SVMs), the solution of the classification problem is characterized by a (convex) quadratic programming (QP) problem. In a modified version of SVMs, called Least Squares SVM classifiers (LS-SVMs), a least squares cost function is proposed so as to obtain a linear set of equations in the dual space. While the SVM classifier has a large margin interpretation, the LS-SVM formulation is related in this paper to a ridge regression approach for classification with binary targets and to Fisher's linear discriminant analysis in the feature space. Multiclass categorization problems are represented by a set of binary classifiers using different output coding schemes. While regularization is used to control the effective number of parameters of the LS-SVM classifier, the sparseness property of SVMs is lost due to the choice of the 2-norm. Sparseness can be imposed in a second stage by gradually pruning the support value spectrum and optimizing the hyperparameters during the sparse approximation procedure. In this paper, twenty public domain benchmark datasets are used to evaluate the test set performance of LS-SVM classifiers with linear, polynomial and radial basis function (RBF) kernels. Both the SVM and LS-SVM classifier with RBF kernel in combination with standard cross-validation procedures for hyperparameter selection achieve comparable test set performances. These SVM and LS-SVM performances are consistently very good when compared to a variety of methods described in the literature including decision tree based algorithms, statistical algorithms and instance based learning methods. We show on ten UCI datasets that the LS-SVM sparse approximation procedure can be successfully applied.
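
The "linear set of equations in the dual space" can be illustrated with a minimal sketch. It assumes the commonly stated LS-SVM classifier dual system, in which the bias b and support values alpha solve [[0, y^T], [y, Omega + I/gamma]] [b; alpha] = [0; 1] with Omega_ij = y_i y_j K(x_i, x_j). The function names, the toy data and the values of gamma and sigma are illustrative assumptions and are not taken from the paper.

```python
import numpy as np

def rbf_kernel(A, B, sigma=1.0):
    """RBF kernel matrix between the rows of A and the rows of B."""
    sq = np.sum(A**2, axis=1)[:, None] + np.sum(B**2, axis=1)[None, :] - 2.0 * A @ B.T
    return np.exp(-np.maximum(sq, 0.0) / (2.0 * sigma**2))

def lssvm_fit(X, y, gamma=10.0, sigma=1.0):
    """Solve the (n+1) x (n+1) LS-SVM dual system for bias b and support values alpha."""
    n = len(y)
    omega = np.outer(y, y) * rbf_kernel(X, X, sigma)
    A = np.zeros((n + 1, n + 1))
    A[0, 1:] = y                           # constraint row: sum_i alpha_i * y_i = 0
    A[1:, 0] = y
    A[1:, 1:] = omega + np.eye(n) / gamma  # ridge-like term from the 2-norm error cost
    rhs = np.concatenate(([0.0], np.ones(n)))
    solution = np.linalg.solve(A, rhs)
    return solution[0], solution[1:]

def lssvm_predict(X_train, y_train, b, alpha, X_new, sigma=1.0):
    """Classify new points with sign(sum_i alpha_i y_i K(x, x_i) + b)."""
    K = rbf_kernel(X_new, X_train, sigma)
    return np.sign(K @ (alpha * y_train) + b)

# Toy data (assumed for illustration): two well-separated clusters with labels -1 / +1.
X = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [3.0, 3.0], [3.0, 4.0], [4.0, 3.0]])
y = np.array([-1.0, -1.0, -1.0, 1.0, 1.0, 1.0])
b, alpha = lssvm_fit(X, y)
train_pred = lssvm_predict(X, y, b, alpha, X)
new_pred = lssvm_predict(X, y, b, alpha, np.array([[0.5, 0.5], [3.5, 3.5]]))
```

In this sketch every training point generally receives a nonzero support value, which is the loss of sparseness the abstract refers to. The pruning stage described there is not shown.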

Journal of the Operational Research Society | 2003

Benchmarking state-of-the-art classification algorithms for credit scoring

Bart Baesens; T. Van Gestel; Stijn Viaene; M Stepanova; Johan A. K. Suykens; Jan Vanthienen

In this paper, we study the performance of various state-of-the-art classification algorithms applied to eight real-life credit scoring data sets. Some of the data sets originate from major Benelux and UK financial institutions. Different types of classifiers are evaluated and compared. Besides the well-known classification algorithms (e.g., logistic regression, discriminant analysis, k-nearest neighbour, neural networks and decision trees), this study also investigates the suitability and performance of some recently proposed, advanced kernel-based classification algorithms such as support vector machines and least-squares support vector machines (LS-SVMs). The performance is assessed using the classification accuracy and the area under the receiver operating characteristic curve. Statistically significant performance differences are identified using the appropriate test statistics. It is found that both the LS-SVM and neural network classifiers yield a very good performance, but also simple classifiers such as logistic regression and linear discriminant analysis perform very well for credit scoring.
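
The two performance measures named in the abstract can be sketched as follows. This is a minimal illustration, not the paper's evaluation code: it computes the area under the ROC curve through its pairwise-ranking interpretation (the fraction of positive/negative pairs ranked correctly, with ties counted as one half), and the scores, labels and 0.5 threshold are made-up assumptions.

```python
def auroc(scores, labels):
    """Area under the ROC curve as the probability that a positive outranks a negative."""
    positives = [s for s, lab in zip(scores, labels) if lab == 1]
    negatives = [s for s, lab in zip(scores, labels) if lab == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in positives
        for n in negatives
    )
    return wins / (len(positives) * len(negatives))

def accuracy(scores, labels, threshold=0.5):
    """Share of cases classified correctly when scoring at or above the threshold means class 1."""
    predictions = [1 if s >= threshold else 0 for s in scores]
    return sum(p == lab for p, lab in zip(predictions, labels)) / len(labels)

# Hypothetical classifier scores and true classes for five applicants.
scores = [0.9, 0.8, 0.35, 0.6, 0.2]
labels = [1, 1, 1, 0, 0]
auc_value = auroc(scores, labels)     # 5 of 6 positive/negative pairs are ordered correctly
acc_value = accuracy(scores, labels)  # 3 of 5 cases are classified correctly at threshold 0.5
```

Accuracy depends on the chosen threshold, whereas the area under the curve summarizes ranking quality across all thresholds, which is one reason the two are often reported together.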

European Journal of Operational Research | 2002

Bayesian neural network learning for repeat purchase modelling in direct marketing

Bart Baesens; Stijn Viaene; Dirk Van den Poel; Jan Vanthienen; Guido Dedene

We focus on purchase incidence modelling for a European direct mail company. Response models based on statistical and neural network techniques are contrasted. The evidence framework of MacKay is used as an example implementation of Bayesian neural network learning, a method that is fairly robust with respect to problems typically encountered when implementing neural networks. The automatic relevance determination (ARD) method, an integrated feature of this framework, allows us to assess the relative importance of the inputs. The basic response models use operationalisations of the traditionally discussed Recency, Frequency and Monetary (RFM) predictor categories. In a second experiment, the RFM response framework is enriched by the inclusion of other (non-RFM) customer profiling predictors. We contribute to the literature by providing experimental evidence that: (1) Bayesian neural networks offer a viable alternative for purchase incidence modelling; (2) a combined use of all three RFM predictor categories is advocated by the ARD method; (3) the inclusion of non-RFM variables allows us to significantly augment the predictive power of the constructed RFM classifiers; (4) this rise is mainly attributed to the inclusion of customer/company interaction variables and a variable measuring whether a customer uses the credit facilities of the direct mailing company.

Journal of Risk and Insurance | 2002

A Comparison of State‐of‐the‐Art Classification Techniques for Expert Automobile Insurance Claim Fraud Detection

Stijn Viaene; Richard A. Derrig; Bart Baesens; Guido Dedene

Several state-of-the-art binary classification techniques are experimentally evaluated in the context of expert automobile insurance claim fraud detection. The predictive power of logistic regression, C4.5 decision tree, k-nearest neighbor, Bayesian learning multilayer perceptron neural network, least-squares support vector machine, naive Bayes, and tree-augmented naive Bayes classification is contrasted. For most of these algorithm types, we report on several operationalizations using alternative hyperparameter or design choices. We compare these in terms of mean percentage correctly classified (PCC) and mean area under the receiver operating characteristic (AUROC) curve using a stratified, blocked, ten-fold cross-validation experiment. We also contrast algorithm type performance visually by means of the convex hull of the receiver operating characteristic (ROC) curves associated with the alternative operationalizations per algorithm type. The study is based on a data set of 1,399 personal injury protection claims from 1993 accidents collected by the Automobile Insurers Bureau of Massachusetts. To stay as close to real-life operating conditions as possible, we consider only predictors that are known relatively early in the life of a claim. Furthermore, based on the qualification of each available claim by both a verbal expert assessment of suspicion of fraud and a ten-point-scale expert suspicion score, we can compare classification for different target/class encoding schemes. Finally, we also investigate the added value of systematically collecting nonflag predictors for suspicion of fraud modeling purposes. 
From the observed results, we may state that: (1) independent of the target encoding scheme and the algorithm type, the inclusion of nonflag predictors allows us to significantly boost predictive performance; (2) for all the evaluated scenarios, the performance difference in terms of mean PCC and mean AUROC between many algorithm type operationalizations turns out to be rather small; visual comparison of the algorithm type ROC curve convex hulls also shows limited difference in performance over the range of operating conditions; (3) relatively simple and efficient techniques such as linear logistic regression and linear kernel least-squares support vector machine classification show excellent overall predictive capabilities, and (smoothed) naive Bayes also performs well; and (4) the C4.5 decision tree operationalization results are rather disappointing; none of the tree operationalizations are capable of attaining mean AUROC performance in line with the best. Visual inspection of the evaluated scenarios reveals that the C4.5 algorithm type ROC curve convex hull is often dominated in large part by most of the other algorithm type hulls.
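
The stratified ten-fold set-up can be sketched as a fold assignment that keeps the class proportions roughly equal in every fold. This is a minimal illustration under assumptions: the function name, the seed and the toy class sizes are hypothetical, and reusing one fixed assignment for every algorithm is only one reading of "blocked" in the abstract.

```python
import random
from collections import defaultdict

def stratified_folds(labels, k=10, seed=0):
    """Return a fold index (0..k-1) per case, dealing each class out evenly over the folds."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, lab in enumerate(labels):
        by_class[lab].append(idx)
    fold_of = [None] * len(labels)
    for lab in sorted(by_class):
        indices = by_class[lab]
        rng.shuffle(indices)                # random order within the class
        for position, idx in enumerate(indices):
            fold_of[idx] = position % k     # deal cases round-robin over the folds
    return fold_of

# Toy example (assumed sizes): 20 suspicious and 80 non-suspicious claims.
labels = [1] * 20 + [0] * 80
fold_of = stratified_folds(labels, k=10)

# Computing the assignment once and passing it to every algorithm means that all
# algorithms are trained and tested on exactly the same partitions.
test_indices_fold_0 = [i for i, f in enumerate(fold_of) if f == 0]
```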

International Journal of Intelligent Systems | 2001

Knowledge Discovery In A Direct Marketing Case Using Least Squares Support Vector Machines

Stijn Viaene; Bart Baesens; T. Van Gestel; Johan A. K. Suykens; D. Van den Poel; Jan Vanthienen; B. De Moor; Guido Dedene

We study the problem of repeat‐purchase modeling in a direct marketing setting using Belgian data. More specifically, we investigate the detection and qualification of the most relevant explanatory variables for predicting purchase incidence. The analysis is based on a wrapped form of input selection using a sensitivity based pruning heuristic to guide a greedy, stepwise, and backward traversal of the input space. For this purpose, we make use of a powerful and promising least squares support vector machine (LS‐SVM) classifier formulation. This study extends beyond the standard recency frequency monetary (RFM) modeling semantics in two ways: (1) by including alternative operationalizations of the RFM variables, and (2) by adding several other (non‐RFM) predictors. Results indicate that elimination of redundant/irrelevant inputs allows significant reduction of model complexity. The empirical findings also highlight the importance of frequency and monetary variables, while the recency variable category seems to be of somewhat lesser importance to the case at hand. Results also point to the added value of including non‐RFM variables for improving customer profiling. More specifically, customer/company interaction, measured using indicators of information requests and complaints, and merchandise returns provide additional predictive power to purchase incidence modeling for database marketing. © 2001 John Wiley & Sons, Inc.
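
The greedy, stepwise, backward traversal of the input space can be sketched generically. Note the substitution: this sketch rescores every candidate subset directly instead of using the sensitivity-based pruning heuristic described in the abstract, and it takes an arbitrary `score` callable rather than an LS-SVM. The input names and the toy scoring function are hypothetical.

```python
def backward_elimination(features, score, tolerance=0.0):
    """Repeatedly drop the input whose removal hurts the score least, while the score holds up."""
    selected = list(features)
    best = score(selected)
    while len(selected) > 1:
        # Score every subset obtained by removing exactly one remaining input.
        candidates = [
            (score([f for f in selected if f != drop]), drop) for drop in selected
        ]
        candidate_score, drop = max(candidates, key=lambda c: c[0])
        if candidate_score < best - tolerance:
            break                           # every removal degrades performance: stop
        selected.remove(drop)
        best = max(best, candidate_score)
    return selected, best

# Hypothetical validation score: informative inputs add value, every input costs a little.
USEFUL = {"frequency": 0.20, "monetary": 0.15, "returns": 0.05}

def toy_score(subset):
    return 0.5 + sum(USEFUL.get(f, 0.0) for f in subset) - 0.01 * len(subset)

inputs = ["frequency", "monetary", "returns", "noise_a", "noise_b"]
kept, kept_score = backward_elimination(inputs, toy_score)
```

In a real wrapper the `score` callable would train and validate a classifier on the given inputs, so each step costs one model fit per remaining input. A pruning heuristic of the kind the abstract mentions avoids that cost.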

International Conference on Conceptual Structures | 2010

Formal concept analysis in knowledge discovery: a survey

Jonas Poelmans; Paul Elzinga; Stijn Viaene; Guido Dedene

In this paper, we analyze the literature on Formal Concept Analysis (FCA) using FCA. We collected 702 papers published between 2003 and 2009 mentioning Formal Concept Analysis in the abstract. We developed a knowledge browsing environment to support our literature analysis process. The PDF files containing the papers were converted to plain text and indexed by Lucene using a thesaurus containing terms related to FCA research. We use the visualization capabilities of FCA to explore the literature, to discover and conceptually represent the main research topics in the FCA community. As a case study, we zoom in on the 140 papers on using FCA in knowledge discovery and data mining and give an extensive overview of the contents of this literature.
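
As a minimal illustration of what FCA computes, the sketch below enumerates the formal concepts of a tiny paper-by-term context. A formal concept is a pair (extent, intent) in which the extent is exactly the set of objects sharing every attribute of the intent, and the intent is exactly the set of attributes shared by every object of the extent. The paper names and terms are made up, and the brute-force enumeration is an assumption chosen for clarity. It is exponential in the number of objects and is not the tooling used in the paper.

```python
from itertools import combinations

def formal_concepts(context):
    """Enumerate all (extent, intent) pairs of a context given as {object: set_of_attributes}."""
    objects = sorted(context)
    all_attributes = frozenset().union(*context.values())
    concepts = set()
    for size in range(len(objects) + 1):
        for group in combinations(objects, size):
            # Intent: attributes shared by every object in the group
            # (all attributes when the group is empty).
            intent = all_attributes
            for obj in group:
                intent = intent & context[obj]
            # Extent: every object that has all attributes of the intent.
            extent = frozenset(o for o in objects if intent <= context[o])
            concepts.add((extent, frozenset(intent)))
    return concepts

# Hypothetical context: which thesaurus terms occur in which paper.
context = {
    "paper1": {"kdd", "text"},
    "paper2": {"kdd", "ontology"},
    "paper3": {"kdd"},
}
concepts = formal_concepts(context)
```

Ordering these concepts by inclusion of their extents gives the concept lattice, the structure that FCA visualizations display.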

IEEE Transactions on Knowledge and Data Engineering | 2004

A case study of applying boosting naive Bayes to claim fraud diagnosis

Stijn Viaene; Richard A. Derrig; Guido Dedene

We apply the weight of evidence reformulation of AdaBoosted naive Bayes scoring due to Ridgeway et al. (1998) to the problem of diagnosing insurance claim fraud. The method effectively combines the advantages of boosting and the explanatory power of the weight of evidence scoring framework. We present the results of an experimental evaluation with an emphasis on discriminatory power, ranking ability, and calibration of probability estimates. The data to which we apply the method consists of closed personal injury protection (PIP) automobile insurance claims from accidents that occurred in Massachusetts (USA) during 1993 and were previously investigated for suspicion of fraud by domain experts. The data mimic the most commonly occurring data configuration, that is, claim records consisting of information pertaining to several binary fraud indicators. The findings of the study reveal the method to be a valuable contribution to the design of intelligible, accountable, and efficient fraud detection support.

Business Process Management Journal | 2014

Ten principles of good business process management

Jan vom Brocke; Theresa Schmiedel; Jan Recker; Peter Trkman; Willem Mertens; Stijn Viaene

Purpose – The purpose of this paper is to foster a common understanding of business process management (BPM) by proposing a set of ten principles that characterize BPM as a research domain and guide its successful use in organizational practice. Design/methodology/approach – The identification and discussion of the principles reflects the viewpoint, which was informed by extant literature and focus groups, including 20 BPM experts from academia and practice. Findings – The authors identify ten principles which represent a set of capabilities essential for mastering contemporary and future challenges in BPM. Their antonyms signify potential roadblocks and bad practices in BPM. The authors also identify a set of open research questions that can guide future BPM research. Research limitations/implications – The findings suggest several areas of research regarding each of the identified principles of good BPM. Also, the principles themselves should be systematically and empirically examined in future studies....

International Journal of Intelligent Systems in Accounting, Finance & Management | 2001

Wrapped input selection using multilayer perceptrons for repeat‐purchase modeling in direct marketing

Stijn Viaene; Bart Baesens; Dirk Van den Poel; Guido Dedene; Jan Vanthienen

In this paper, we try to validate existing theory on and develop additional insight into repeat-purchase behavior in a direct marketing setting by means of an illuminating case study. The case involves the detection and qualification of the most relevant RFM (Recency, Frequency and Monetary) variables, using a neural network wrapper as our input pruning method. Results indicate that elimination of redundant and/or irrelevant inputs by means of the discussed input selection method allows us to significantly reduce model complexity without degrading the predictive generalization ability. It is precisely this issue that will enable us to infer some interesting marketing conclusions concerning the relative importance of the RFM predictor categories and their operationalizations. The empirical findings highlight the importance of a combined use of RFM variables in predicting repeat-purchase behavior. However, the study also reveals the dominant role of the frequency category. Results indicate that a model including only frequency variables still yields satisfactory classification accuracy compared to the optimally reduced model.

Expert Systems With Applications | 2005

Auto claim fraud detection using Bayesian learning neural networks

Stijn Viaene; Guido Dedene; Richard A. Derrig

This article explores the explicative capabilities of neural network classifiers with automatic relevance determination weight regularization, and reports the findings from applying these networks to personal injury protection automobile insurance claim fraud detection. The automatic relevance determination objective function scheme provides us with a way to determine which inputs are most informative to the trained neural network model. An implementation of MacKay's (1992a,b) evidence framework approach to Bayesian learning is proposed as a practical way of training such networks. The empirical evaluation is based on a data set of closed claims from accidents that occurred in Massachusetts, USA during 1993.