
Publications


Featured research published by Christophe Mues.


IEEE Transactions on Software Engineering | 2008

Benchmarking Classification Models for Software Defect Prediction: A Proposed Framework and Novel Findings

Stefan Lessmann; Bart Baesens; Christophe Mues; Swantje Pietsch

Software defect prediction strives to improve software quality and testing efficiency by constructing predictive classification models from code attributes to enable a timely identification of fault-prone modules. Several classification models have been evaluated for this task. However, due to inconsistent findings regarding the superiority of one classifier over another and the usefulness of metric-based classification in general, more research is needed to improve convergence across studies and further advance confidence in experimental results. We consider three potential sources for bias: comparing classifiers over one or a small number of proprietary data sets, relying on accuracy indicators that are conceptually inappropriate for software defect prediction and cross-study comparisons, and, finally, limited use of statistical testing procedures to secure empirical findings. To remedy these problems, a framework for comparative software defect prediction experiments is proposed and applied in a large-scale empirical comparison of 22 classifiers over 10 public domain data sets from the NASA Metrics Data repository. Overall, an appealing degree of predictive accuracy is observed, which supports the view that metric-based classification is useful. However, our results indicate that the importance of the particular classification algorithm may be less than previously assumed since no significant performance differences could be detected among the top 17 classifiers.
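
As a rough illustration of the benchmarking workflow the abstract describes (many classifiers, many data sets, AUC as the accuracy indicator, and a non-parametric test over the resulting scores), the sketch below runs a cross-validated AUC comparison followed by a Friedman test. The synthetic data sets, the three classifiers, and the use of scikit-learn and SciPy are illustrative assumptions, not the paper's actual framework or data.

```python
# Sketch: cross-validated AUC for several classifiers over several data sets,
# followed by a Friedman test on the resulting score matrix.
import numpy as np
from scipy.stats import friedmanchisquare
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import GaussianNB

classifiers = {
    "logistic regression": LogisticRegression(max_iter=1000),
    "naive Bayes": GaussianNB(),
    "random forest": RandomForestClassifier(n_estimators=200, random_state=0),
}

# Synthetic stand-ins for the public-domain defect data sets used in the paper.
datasets = [make_classification(n_samples=1000, n_features=20,
                                weights=[0.85, 0.15], random_state=seed)
            for seed in range(5)]

# AUC matrix: rows are data sets, columns are classifiers.
scores = np.array([
    [cross_val_score(clf, X, y, cv=5, scoring="roc_auc").mean()
     for clf in classifiers.values()]
    for X, y in datasets
])

# Friedman test: do the classifiers' AUC ranks differ significantly?
stat, p_value = friedmanchisquare(*scores.T)
print(np.round(scores, 3))
print(f"Friedman chi-square = {stat:.2f}, p = {p_value:.3f}")
```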


Management Science | 2003

Using Neural Network Rule Extraction and Decision Tables for Credit-Risk Evaluation

Bart Baesens; Rudy Setiono; Christophe Mues; Jan Vanthienen

Credit-risk evaluation is a very challenging and important management science problem in the domain of financial analysis. Many classification methods have been suggested in the literature to tackle this problem. Neural networks, especially, have received a lot of attention because of their universal approximation property. However, a major drawback associated with the use of neural networks for decision making is their lack of explanation capability. While they can achieve a high predictive accuracy rate, the reasoning behind how they reach their decisions is not readily available. In this paper, we present the results from analysing three real-life credit-risk data sets using neural network rule extraction techniques. Clarifying the neural network decisions by explanatory rules that capture the learned knowledge embedded in the networks can help the credit-risk manager in explaining why a particular applicant is classified as either bad or good. Furthermore, we also discuss how these rules can be visualized as a decision table in a compact and intuitive graphical format that facilitates easy consultation. It is concluded that neural network rule extraction and decision tables are powerful management tools that allow us to build advanced and user-friendly decision-support systems for credit-risk evaluation.
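
A common, much simpler proxy for neural network rule extraction is to fit a shallow decision tree to the network's own predictions and read its paths as if-then rules; the sketch below illustrates that idea. It is not the rule-extraction algorithm used in the paper, and the data set, network size, and tree depth are assumptions.

```python
# Sketch: train a small neural network, then approximate it with a shallow
# decision tree fitted to the network's own predictions; the tree's paths act
# as explanatory if-then rules. This is a generic surrogate approach, not the
# rule-extraction algorithm used in the paper.
from sklearn.datasets import make_classification
from sklearn.neural_network import MLPClassifier
from sklearn.tree import DecisionTreeClassifier, export_text

X, y = make_classification(n_samples=2000, n_features=8, random_state=0)

net = MLPClassifier(hidden_layer_sizes=(5,), max_iter=2000, random_state=0)
net.fit(X, y)

# Fit the surrogate to the labels the network assigns (not the true labels),
# so the extracted rules describe the network's decision behaviour.
surrogate = DecisionTreeClassifier(max_depth=3, random_state=0)
surrogate.fit(X, net.predict(X))

# Each root-to-leaf path is a rule; laid out side by side, the rules could be
# presented as a decision table for the credit-risk manager.
print(export_text(surrogate, feature_names=[f"x{i}" for i in range(8)]))
print(f"fidelity to the network: {surrogate.score(X, net.predict(X)):.3f}")
```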


Expert Systems With Applications | 2012

An experimental comparison of classification algorithms for imbalanced credit scoring data sets

Iain Brown; Christophe Mues

In this paper, we set out to compare several techniques that can be used in the analysis of imbalanced credit scoring data sets. In a credit scoring context, imbalanced data sets frequently occur as the number of defaulting loans in a portfolio is usually much lower than the number of observations that do not default. As well as using traditional classification techniques such as logistic regression, neural networks and decision trees, this paper will also explore the suitability of gradient boosting, least-squares support vector machines and random forests for loan default prediction. Five real-world credit scoring data sets are used to build classifiers and test their performance. In our experiments, we progressively increase class imbalance in each of these data sets by randomly under-sampling the minority class of defaulters, so as to identify to what extent the predictive power of the respective techniques is adversely affected. The performance criterion chosen to measure this effect is the area under the receiver operating characteristic curve (AUC); Friedman's statistic and Nemenyi post hoc tests are used to test for significance of AUC differences between techniques. The results from this empirical study indicate that the random forest and gradient boosting classifiers perform very well in a credit scoring context and are able to cope comparatively well with pronounced class imbalances in these data sets. We also found that, when faced with a large class imbalance, the C4.5 decision tree algorithm, quadratic discriminant analysis and k-nearest neighbours perform significantly worse than the best performing classifiers.
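
The sketch below illustrates the experimental design of the abstract: progressively under-sample the defaulters to sharpen the class imbalance, then track the test-set AUC of a few classifiers. The synthetic data and the specific classifier settings are stand-ins for the paper's real-world credit scoring data sets.

```python
# Sketch: progressively under-sample the defaulters (the minority class) to
# sharpen the class imbalance, then track the test-set AUC of a few classifiers.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=20000, n_features=15,
                           weights=[0.9, 0.1], random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)

rng = np.random.default_rng(0)
minority_idx = np.flatnonzero(y_tr == 1)   # defaulters
majority_idx = np.flatnonzero(y_tr == 0)

for keep_frac in (1.0, 0.5, 0.25, 0.1):    # share of defaulters kept
    kept = rng.choice(minority_idx, size=int(keep_frac * len(minority_idx)),
                      replace=False)
    idx = np.concatenate([majority_idx, kept])
    for name, clf in [("logistic regression", LogisticRegression(max_iter=1000)),
                      ("random forest", RandomForestClassifier(n_estimators=200,
                                                               random_state=0)),
                      ("gradient boosting", GradientBoostingClassifier(random_state=0))]:
        clf.fit(X_tr[idx], y_tr[idx])
        auc = roc_auc_score(y_te, clf.predict_proba(X_te)[:, 1])
        print(f"keep {keep_frac:4.2f} of defaulters | {name:20s} AUC = {auc:.3f}")
```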


Journal of Systems and Software | 2008

Mining software repositories for comprehensible software fault prediction models

Olivier Vandecruys; David Martens; Bart Baesens; Christophe Mues; Manu De Backer; Raf Haesen

Software managers are routinely confronted with software projects that contain errors or inconsistencies and exceed budget and time limits. By mining software repositories with comprehensible data mining techniques, predictive models can be induced that offer software managers the insights they need to tackle these quality and budgeting problems in an efficient way. This paper deals with the role that the Ant Colony Optimization (ACO)-based classification technique AntMiner+ can play as a comprehensible data mining technique to predict erroneous software modules. In an empirical comparison on three real-world public datasets, the rule-based models produced by AntMiner+ are shown to achieve a predictive accuracy that is competitive with that of the models induced by several other included classification techniques, such as C4.5, logistic regression and support vector machines. In addition, we argue that the intuitiveness and comprehensibility of the AntMiner+ models can be considered superior to those of the latter models.


IEEE Transactions on Neural Networks | 2008

Recursive Neural Network Rule Extraction for Data With Mixed Attributes

Rudy Setiono; Bart Baesens; Christophe Mues

In this paper, we present a recursive algorithm for extracting classification rules from feedforward neural networks (NNs) that have been trained on data sets having both discrete and continuous attributes. The novelty of this algorithm lies in the conditions of the extracted rules: the rule conditions involving discrete attributes are disjoint from those involving continuous attributes. The algorithm starts by first generating rules with discrete attributes only to explain the classification process of the NN. If the accuracy of a rule with only discrete attributes is not satisfactory, the algorithm refines this rule by recursively generating more rules with discrete attributes not already present in the rule condition, or by generating a hyperplane involving only the continuous attributes. We show that for three real-life credit scoring data sets, the algorithm generates rules that are not only more accurate but also more comprehensible than those generated by other NN rule extraction methods.
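
The following sketch conveys the core idea of keeping discrete and continuous conditions disjoint: rules are first built on the discrete attributes alone, and only the leaves whose accuracy is unsatisfactory are refined with a hyperplane over the continuous attributes. It is a simplified illustration, not the paper's recursive extraction algorithm, and the toy data, scikit-learn models and thresholds are assumptions.

```python
# Sketch: rules are built on the discrete attributes first; leaves whose
# accuracy is unsatisfactory are refined with a hyperplane over the continuous
# attributes only. A simplified illustration, not the paper's algorithm.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier

def fit_mixed_rules(X_disc, X_cont, y, min_purity=0.95, max_depth=3):
    """Rules over discrete attributes; per-leaf hyperplanes on continuous ones."""
    tree = DecisionTreeClassifier(max_depth=max_depth, random_state=0)
    tree.fit(X_disc, y)
    leaves = tree.apply(X_disc)
    refinements = {}
    for leaf in np.unique(leaves):
        mask = leaves == leaf
        purity = np.bincount(y[mask]).max() / mask.sum()
        if purity < min_purity and len(np.unique(y[mask])) > 1:
            # The discrete-only rule is not accurate enough: refine it with a
            # hyperplane involving only the continuous attributes.
            refinements[leaf] = LogisticRegression(max_iter=1000).fit(
                X_cont[mask], y[mask])
    return tree, refinements

def predict_mixed(tree, refinements, X_disc, X_cont):
    pred = tree.predict(X_disc)
    leaves = tree.apply(X_disc)
    for leaf, clf in refinements.items():
        mask = leaves == leaf
        if mask.any():
            pred[mask] = clf.predict(X_cont[mask])
    return pred

# Toy mixed-attribute data: three binary attributes and two continuous ones.
rng = np.random.default_rng(0)
X_disc = rng.integers(0, 2, size=(2000, 3))
X_cont = rng.normal(size=(2000, 2))
y = ((X_disc[:, 0] == 1) & (X_cont[:, 0] + X_cont[:, 1] > 0)).astype(int)

tree, refinements = fit_mixed_rules(X_disc, X_cont, y)
accuracy = (predict_mixed(tree, refinements, X_disc, X_cont) == y).mean()
print(f"training accuracy: {accuracy:.3f}, refined leaves: {len(refinements)}")
```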


European Journal of Operational Research | 2007

Inferring descriptive and approximate fuzzy rules for credit scoring using evolutionary algorithms

Frank Hoffmann; Bart Baesens; Christophe Mues; T. Van Gestel; Jan Vanthienen

Generating classification rules that are both accurate and explanatory is becoming increasingly important in a knowledge discovery context. In this paper, we investigate the power and usefulness of fuzzy classification rules for data mining purposes. We propose two evolutionary fuzzy rule learners: an evolution strategy that generates approximate fuzzy rules, whereby each rule has its own specific definition of membership functions, and a genetic algorithm that extracts descriptive fuzzy rules, where all fuzzy rules share a common, linguistically interpretable definition of membership functions in disjunctive normal form. The performance of the evolutionary fuzzy rule learners is compared with that of Nefclass, a neurofuzzy classifier, and a selection of other well-known classification algorithms on a number of publicly available data sets and two real-life Benelux financial credit scoring data sets. It is shown that the genetic fuzzy classifiers compare favourably with the other classifiers in terms of classification accuracy. Furthermore, the approximate and descriptive fuzzy rules yield about the same classification accuracy across the different data sets.


Data & Knowledge Engineering | 1998

An illustration of verification and validation in the modelling phase of KBS development

Jan Vanthienen; Christophe Mues; A. Aerts

Reliability has become a key factor in KBS development. For this reason, it has been suggested that verification and validation (V&V) should become an integrated part of activities throughout the whole KBS development cycle. In this paper, it will be illustrated how the PROLOGA workbench integrates V&V aspects into its modelling environment, such that these techniques can be of assistance in the process of knowledge acquisition and representation. To this end, verification has to be performed incrementally and can no longer be delayed until after the system has been completed. It will be shown how this objective can be realised through an approach that uses the decision table formalism as a modelling instrument.
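
As an illustration of the kind of incremental verification such a workbench can perform, the sketch below checks a small decision table for completeness (every condition combination covered by some rule) and consistency (no combination mapped to conflicting actions). The table layout and the checks are illustrative assumptions, not PROLOGA's actual representation.

```python
# Sketch: check a small decision table for completeness (every combination of
# condition values is covered by some rule) and for consistency (no combination
# leads to conflicting actions). Illustrative only; not PROLOGA's representation.
from itertools import product

# Conditions and their possible values.
conditions = {"income_ok": ["yes", "no"], "prior_default": ["yes", "no"]}

# Each rule (table column) maps a tuple of condition values to an action;
# "-" means the condition is irrelevant and matches any value.
rules = [
    (("yes", "no"), "accept"),
    (("yes", "yes"), "reject"),
    (("no", "-"), "reject"),
    (("yes", "no"), "review"),   # deliberately contradicts the first rule
]

def matches(rule_values, combination):
    return all(r in ("-", c) for r, c in zip(rule_values, combination))

for combination in product(*conditions.values()):
    actions = {action for values, action in rules if matches(values, combination)}
    if not actions:
        print(f"incomplete: no rule covers {combination}")
    elif len(actions) > 1:
        print(f"inconsistent: {combination} maps to {sorted(actions)}")
```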


Ant Colony Optimization and Swarm Intelligence | 2006

Ant-based approach to the knowledge fusion problem

David Martens; Manu De Backer; Raf Haesen; Bart Baesens; Christophe Mues; Jan Vanthienen

Data mining involves the automated process of finding patterns in data and has been a research topic for decades. Although very powerful data mining techniques exist to extract classification models from data, the techniques often infer counter-intuitive patterns or lack patterns that are logical for domain experts. The problem of consolidating the knowledge extracted from the data with the knowledge representing the experience of domain experts is called the knowledge fusion problem. Providing a proper solution for this problem is a key success factor for any data mining application. In this paper, we explain how the AntMiner+ classification technique can be extended to incorporate such domain knowledge. By changing the environment and influencing the heuristic values, we can respectively limit and direct the search of the ants to those regions of the solution space that the expert believes to be logical and intuitive.


Journal of the Operational Research Society | 2010

Modelling LGD for unsecured personal loans: decision tree approach

Ania Matuszyk; Christophe Mues; Lyn C. Thomas

The New Basel Accord, which was implemented in 2007, has made a significant difference to the use of modelling within financial organisations. In particular it has highlighted the importance of Loss Given Default (LGD) modelling. We propose a decision tree approach to modelling LGD for unsecured consumer loans, where the uncertainty in some of the nodes is modelled using a mixture model whose parameters are obtained by regression. A case study based on default data from the in-house collections department of a UK financial organisation is used to show how such regression can be undertaken.
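
A simplified two-stage sketch of LGD modelling is shown below: a shallow tree estimates the probability of a full loss, a regression model predicts LGD for the remaining cases, and the two are combined into an expected LGD. This is a generic proxy rather than the paper's decision tree with mixture-model nodes, and the synthetic default data is made up.

```python
# Sketch of a two-stage LGD model: a shallow tree estimates the probability of
# a full loss (LGD = 1), a regression model predicts LGD for partial losses,
# and the two are combined into an expected LGD. A generic proxy only.
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.tree import DecisionTreeClassifier

# Made-up default data: four loan/borrower features and an observed LGD in [0, 1].
rng = np.random.default_rng(0)
n = 5000
X = rng.normal(size=(n, 4))
full_loss = (X[:, 0] + rng.normal(scale=0.5, size=n) > 1).astype(int)
lgd = np.where(full_loss == 1, 1.0,
               np.clip(0.4 + 0.2 * X[:, 1] + rng.normal(scale=0.1, size=n), 0, 1))

# Stage 1: probability that a defaulted loan ends in a full loss.
stage1 = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X, full_loss)
p_full = stage1.predict_proba(X)[:, 1]

# Stage 2: expected LGD given that the loss is only partial.
partial = full_loss == 0
stage2 = LinearRegression().fit(X[partial], lgd[partial])

# Combine: E[LGD] = P(full loss) * 1 + (1 - P(full loss)) * E[LGD | partial].
expected_lgd = p_full + (1 - p_full) * stage2.predict(X)
print(f"mean predicted LGD: {expected_lgd.mean():.3f}, "
      f"mean observed LGD: {lgd.mean():.3f}")
```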


International Journal of Neural Systems | 2011

Rule extraction from minimal neural networks for credit card screening

Rudy Setiono; Bart Baesens; Christophe Mues

While feedforward neural networks have been widely accepted as effective tools for solving classification problems, the issue of finding the best network architecture remains unresolved, particularly so in real-world problem settings. We address this issue in the context of credit card screening, where it is important to not only find a neural network with good predictive performance but also one that facilitates a clear explanation of how it produces its predictions. We show that minimal neural networks with as few as one hidden unit provide good predictive accuracy, while having the added advantage of making it easier to generate concise and comprehensible classification rules for the user. To further reduce model size, a novel approach is suggested in which network connections from the input units to this hidden unit are removed by a straightforward pruning procedure. In terms of predictive accuracy, both the minimized neural networks and the rule sets generated from them are shown to compare favorably with other neural-network-based classifiers. The rules generated from the minimized neural networks are concise and thus easier to validate in a real-life setting.
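
The sketch below illustrates the general idea of pruning input connections to a single hidden unit: train a one-hidden-unit network, then zero out the weakest input-to-hidden weights and re-check accuracy. Magnitude-based pruning is used here as a simple stand-in for the procedure described in the paper; the data, model library and the pruning threshold are assumptions.

```python
# Sketch: train a network with a single hidden unit, then prune the weakest
# input-to-hidden connections by weight magnitude and re-check accuracy.
# Magnitude pruning is a simple stand-in for the procedure in the paper.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.neural_network import MLPClassifier

X, y = make_classification(n_samples=2000, n_features=10, n_informative=3,
                           random_state=0)

net = MLPClassifier(hidden_layer_sizes=(1,), max_iter=3000, random_state=0)
net.fit(X, y)
print(f"accuracy before pruning: {net.score(X, y):.3f}")

# coefs_[0] has shape (n_features, 1): one weight per input connection.
weights = net.coefs_[0].ravel().copy()
threshold = np.quantile(np.abs(weights), 0.5)          # drop the weakest half
net.coefs_[0][np.abs(weights) < threshold, :] = 0.0

print(f"accuracy after pruning:  {net.score(X, y):.3f}")
print(f"remaining connections:   {int((np.abs(weights) >= threshold).sum())}")
```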

Collaboration


Dive into Christophe Mues's collaborations.

Top Co-Authors

Bart Baesens (Katholieke Universiteit Leuven)
Jan Vanthienen (Katholieke Universiteit Leuven)
Lyn C. Thomas (University of Southampton)
Rudy Setiono (National University of Singapore)
Geert Wets (Katholieke Universiteit Leuven)
A. Aerts (Katholieke Universiteit Leuven)
M.C. So (University of Southampton)
Johan Huysmans (Katholieke Universiteit Leuven)