Publication


Featured research published by Ralf Stecking.


Journal of the Operational Research Society | 2005

Support vector machines for classifying and describing credit applicants: detecting typical and critical regions

Klaus B. Schebesch; Ralf Stecking

Credit applicants are assigned to good or bad risk classes according to their record of defaulting. Each applicant is described by a high-dimensional input vector of situational characteristics and by an associated class label. A statistical model, which maps the inputs to the labels, can decide whether a new credit applicant should be accepted or rejected, by predicting the class label given the new inputs. Support vector machines (SVM) from statistical learning theory can build such models from the data, requiring extremely weak prior assumptions about the model structure. Furthermore, SVM divide a set of labelled credit applicants into subsets of ‘typical’ and ‘critical’ patterns. The correct class label of a typical pattern is usually very easy to predict, even with linear classification methods. Such patterns do not contain much information about the classification boundary. The critical patterns (the support vectors) contain the less trivial training examples. For instance, linear discriminant analysis with prior training subset selection via SVM also leads to improved generalization. Using non-linear SVM, more ‘surprising’ critical regions may be detected, but owing to the relative sparseness of the data, this potential seems to be limited in credit scoring practice.
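The split into typical and critical patterns can be illustrated directly, since the critical patterns are exactly the support vectors of a fitted SVM. The sketch below is not the authors' code: it uses scikit-learn and synthetic data as stand-ins for real credit applicant records.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.svm import SVC

# Synthetic stand-in for credit applicants (inputs X, good/bad labels y).
X, y = make_classification(n_samples=500, n_features=20, random_state=0)

svm = SVC(kernel="linear", C=1.0).fit(X, y)

critical_idx = svm.support_                                   # support vectors = 'critical' patterns
typical_idx = np.setdiff1d(np.arange(len(X)), critical_idx)   # the remaining 'typical' patterns

print(f"critical patterns (support vectors): {len(critical_idx)}")
print(f"typical patterns:                    {len(typical_idx)}")
```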


Archive | 2005

Support Vector Machines for Credit Scoring: Extension to Non Standard Cases

Klaus B. Schebesch; Ralf Stecking

Credit scoring is used to assign credit applicants to good and bad risk classes. This paper investigates the credit scoring performance of support vector machines (SVM) with weighted classes and moderated outputs. First, we consider the adjustment of support vector machines for credit scoring to a set of non-standard situations important to practitioners. Such more sophisticated credit scoring systems will adapt to vastly different proportions of creditworthiness between sample and population. Different costs for different types of misclassification will also be handled. Second, sigmoid output mapping is used to derive default probabilities, important for constructing rating systems and a step towards more “personalized” credit contracts.
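A minimal sketch of the two ingredients named above, assuming scikit-learn and synthetic data: class weights stand in for unequal class proportions and misclassification costs, and a sigmoid (Platt-style) calibration maps SVM outputs to default probabilities. The weights and sample sizes are illustrative, not the paper's settings.

```python
from sklearn.calibration import CalibratedClassifierCV
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.svm import LinearSVC

# Imbalanced synthetic data: defaulting clients (class 1) are rare.
X, y = make_classification(n_samples=1000, n_features=15, weights=[0.9, 0.1], random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Class weights compensate for unequal class proportions / misclassification costs.
base = LinearSVC(class_weight={0: 1.0, 1: 5.0})

# Sigmoid calibration maps the SVM decision values to default probabilities.
clf = CalibratedClassifierCV(base, method="sigmoid", cv=5).fit(X_train, y_train)

default_probability = clf.predict_proba(X_test)[:, 1]
print(default_probability[:5])
```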


Archive | 2003

Support Vector Machines for Credit Scoring: Comparing to and Combining With Some Traditional Classification Methods

Ralf Stecking; Klaus B. Schebesch

Credit scoring is used to assign credit applicants to good and bad risk classes. This paper investigates the credit scoring performance of a nonstandard neural network technique: support vector machines (SVM). Using empirical data, the results of the SVM are compared with those of more traditional methods, including linear discriminant analysis and logistic regression. Furthermore, a two-step approach is tested: first, SVM selects the most informative cases; subsequently, these are used as inputs to linear discriminant analysis and logistic regression. Extensive experiments show that SVM outperforms the more traditional, computationally less demanding methods.
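The two-step approach can be sketched as follows, again with scikit-learn and synthetic data rather than the bank data used in the paper: an SVM is fitted first, its support vectors are taken as the most informative cases, and LDA and logistic regression are then trained only on those cases.

```python
from sklearn.datasets import make_classification
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

X, y = make_classification(n_samples=800, n_features=20, random_state=1)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=1)

# Step 1: the SVM selects the most informative cases (its support vectors).
sv_idx = SVC(kernel="rbf", gamma="scale").fit(X_tr, y_tr).support_

# Step 2: the traditional methods are trained on the selected cases only.
lda = LinearDiscriminantAnalysis().fit(X_tr[sv_idx], y_tr[sv_idx])
logit = LogisticRegression(max_iter=1000).fit(X_tr[sv_idx], y_tr[sv_idx])

print("LDA on SVM-selected cases:  ", lda.score(X_te, y_te))
print("logit on SVM-selected cases:", logit.score(X_te, y_te))
```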


GfKl | 2006

Comparing and Selecting SVM-Kernels for Credit Scoring

Ralf Stecking; Klaus B. Schebesch

Kernel methods for classification problems map data points into feature spaces where linear separation is performed. Detecting linear relations has been the focus of much research in statistics and machine learning, resulting in efficient algorithms that are well understood, with many applications including credit scoring problems. However, choosing more appropriate kernel functions with nonlinear feature mappings may still improve classification performance. We show how different kernel functions contribute to the solution of a credit scoring problem, and we also show how to select and compare such kernels.
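One straightforward way to compare kernel functions, sketched below with scikit-learn: fit an SVM with each candidate kernel and score it by cross-validated AUC. The kernels, data and scoring metric are illustrative assumptions, not the paper's experimental setup.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X, y = make_classification(n_samples=600, n_features=20, random_state=2)

# Compare candidate kernels by cross-validated AUC.
for kernel in ("linear", "poly", "rbf", "sigmoid"):
    model = make_pipeline(StandardScaler(), SVC(kernel=kernel))
    auc = cross_val_score(model, X, y, cv=5, scoring="roc_auc").mean()
    print(f"{kernel:8s} mean AUC = {auc:.3f}")
```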


A Quarterly Journal of Operations Research | 2007

Combining Support Vector Machines for Credit Scoring

Ralf Stecking; Klaus B. Schebesch

Support vector machines (SVM) from statistical learning theory are powerful classification methods with a wide range of applications including credit scoring. The urgent need to further boost classification performance in many applications leads the machine learning community into developing SVM with multiple kernels and many other combined approaches. Owing to the huge size of the credit market, even small improvements in classification accuracy might considerably reduce effective misclassification costs experienced by banks. Under certain conditions, the combination of different models may reduce or at least stabilize the risk of misclassification. We report on combining several SVM with different kernel functions and variable credit client data sets. We present classification results produced by various combination strategies and we compare them to the results obtained earlier with more traditional single SVM credit scoring models.
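As a rough illustration of combining SVM with different kernel functions, the sketch below uses a soft-voting ensemble in scikit-learn; the paper's actual combination strategies may differ.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import VotingClassifier
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

X, y = make_classification(n_samples=600, n_features=20, random_state=3)

# Individual SVM with different kernels, combined by averaging class probabilities.
members = [
    ("linear", SVC(kernel="linear", probability=True)),
    ("rbf",    SVC(kernel="rbf",    probability=True)),
    ("poly",   SVC(kernel="poly",   probability=True)),
]
ensemble = VotingClassifier(estimators=members, voting="soft")

print("combined AUC:", cross_val_score(ensemble, X, y, cv=5, scoring="roc_auc").mean())
```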


GfKl | 2007

Selecting SVM Kernels and Input Variable Subsets in Credit Scoring Models

Klaus B. Schebesch; Ralf Stecking

We explore simultaneous variable subset selection and kernel selection within SVM classification models. First we apply results from SVM classification models with different kernel functions to a fixed subset of credit client variables provided by a German bank. Free variable subset selection for the bank data is discussed next. A simple stochastic search procedure for variable subset selection is also presented.
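A simple stochastic search of the kind mentioned can look like the sketch below: repeatedly draw a random variable subset and kernel, score the pair by cross-validation, and keep the best. This only illustrates the idea; the authors' procedure and data are not reproduced here.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

rng = np.random.default_rng(0)
X, y = make_classification(n_samples=500, n_features=25, n_informative=8, random_state=4)

best_score, best_kernel, best_subset = 0.0, None, None
for _ in range(50):                                           # random restarts
    kernel = str(rng.choice(["linear", "rbf"]))               # random kernel choice
    subset = rng.choice(X.shape[1], size=10, replace=False)   # random variable subset
    score = cross_val_score(SVC(kernel=kernel), X[:, subset], y, cv=3).mean()
    if score > best_score:
        best_score, best_kernel, best_subset = score, kernel, np.sort(subset)

print(f"best accuracy {best_score:.3f} with kernel {best_kernel} on variables {best_subset}")
```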


GfKl | 2012

Classification of Large Imbalanced Credit Client Data with Cluster Based SVM

Ralf Stecking; Klaus B. Schebesch

Credit client scoring on medium sized data sets can be accomplished by means of Support Vector Machines (SVM), a powerful and robust machine learning method. However, real life credit client data sets are usually huge, containing up to hundreds of thousands of records, with good credit clients vastly outnumbering the defaulting ones. Such data pose severe computational barriers for SVM and other kernel methods, especially if all pairwise data point similarities are requested. Hence, methods which avoid extensive training on the complete data are in high demand. A possible solution is clustering as preprocessing and classification on the more informative resulting data, such as cluster centers. Clustering variants which avoid the computation of all pairwise similarities robustly filter useful information from the large imbalanced credit client data set, especially when used in conjunction with a symbolic cluster representation. Subsequently, we construct credit client clusters representing both client classes, which are then used for training a non-standard SVM adaptable to our imbalanced class set sizes. We also show that SVM trained on symbolic cluster centers result in classification models which outperform traditional statistical models as well as SVM trained on all our original data.
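The cluster-based compression step can be sketched as follows, assuming scikit-learn, k-means as the clustering variant and illustrative cluster counts: each class is reduced to its cluster centres, and a class-weighted SVM is then trained on the centres instead of on all records.

```python
import numpy as np
from sklearn.cluster import MiniBatchKMeans
from sklearn.datasets import make_classification
from sklearn.svm import SVC

# Large, imbalanced synthetic data: defaulting clients (class 1) are rare.
X, y = make_classification(n_samples=50_000, n_features=15, weights=[0.95, 0.05], random_state=5)

centres, labels = [], []
for cls, k in ((0, 200), (1, 50)):                            # fewer centres for the rare class
    km = MiniBatchKMeans(n_clusters=k, n_init=10, random_state=5).fit(X[y == cls])
    centres.append(km.cluster_centers_)
    labels.append(np.full(k, cls))

X_c, y_c = np.vstack(centres), np.concatenate(labels)

# The class weights adapt the SVM to the unequal class set sizes.
svm = SVC(kernel="rbf", class_weight="balanced").fit(X_c, y_c)
print("training set reduced from", len(X), "to", len(X_c), "points")
```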


A Quarterly Journal of Operations Research | 2006

Variable Subset Selection for Credit Scoring with Support Vector Machines

Ralf Stecking; Klaus B. Schebesch

Support Vector Machines (SVM) are very successful kernel based classification methods with a broad range of applications including credit scoring and rating. SVM can use data sets with many variables even when the number of cases is small. However, we are often constrained to reduce the input space owing to changing data availability, cost and speed of computation. We first evaluate variable subsets in the context of credit scoring. Then we apply previous results of using SVM with different kernel functions to a specific subset of credit client variables. Finally, rating of the credit data pool is presented.
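The final rating step can be illustrated by ranking clients on the SVM decision value computed from a reduced variable subset and binning them into rating classes; the subset, bin edges and data below are purely hypothetical.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.svm import SVC

X, y = make_classification(n_samples=400, n_features=20, random_state=6)
subset = [0, 2, 3, 7, 11, 15]                                 # hypothetical reduced variable subset

svm = SVC(kernel="rbf").fit(X[:, subset], y)
score = svm.decision_function(X[:, subset])                   # signed distance from the boundary; positive = class 1 side

# Five equally sized rating classes, from one end of the score range to the other.
rating = np.digitize(score, np.quantile(score, [0.2, 0.4, 0.6, 0.8])) + 1
print(np.bincount(rating)[1:])                                # number of clients per rating class
```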


GfKl | 2005

Informative Patterns for Credit Scoring: Support Vector Machines Preselect Data Subsets for Linear Discriminant Analysis

Ralf Stecking; Klaus B. Schebesch

Pertinent statistical methods for credit scoring can be very simple, such as linear discriminant analysis (LDA), or more sophisticated, such as support vector machines (SVM). There is mounting evidence of the consistent superiority of SVM over LDA or related methods on real world credit scoring problems. Methods like LDA are preferred by practitioners owing to the simplicity of the resulting decision function and the ease of interpreting single input variables. Can one productively combine SVM and simpler methods? To this end, we use SVM as the preselection method. This subset preselection results in a final classification performance consistently above that of the simple methods used on the entire data.


A Quarterly Journal of Operations Research | 2017

Topological Data Analysis for Extracting Hidden Features of Client Data

Klaus B. Schebesch; Ralf Stecking

Computational Topological Data Analysis (TDA) is a collection of procedures which permits extracting certain robust features of high dimensional data, even when the number of data points is relatively small. Classical statistical data analysis is not very successful in such situations, or cannot handle them at all. Hidden features or structure in high dimensional data expresses some direct and indirect links between data points. This may be the case when there are no explicit links between persons, such as clients in a database, but there are still important implicit links which characterize client populations and which also make different such populations more comparable. We explore the potential usefulness of applying TDA to different versions of credit scoring data, where clients are credit takers with a known defaulting behavior.
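A toy example of extracting such robust topological features is sketched below; it assumes the third-party ripser package and random stand-in data, not the authors' credit data or pipeline.

```python
import numpy as np
from ripser import ripser  # persistent homology, assumed third-party dependency

rng = np.random.default_rng(7)
clients = rng.normal(size=(200, 12))                          # stand-in for credit client records

# Persistence diagrams for connected components (H0) and loops (H1).
diagrams = ripser(clients, maxdim=1)["dgms"]

for dim, dgm in enumerate(diagrams):
    finite = dgm[np.isfinite(dgm[:, 1])]                      # drop the infinitely persistent feature
    print(f"H{dim}: {len(dgm)} features ({len(finite)} with finite lifetime)")
```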

Collaboration


Dive into Ralf Stecking's collaboration.

Top Co-Authors

Klaus B. Schebesch

University of Western Ontario
