Gero Szepannek
Technical University of Dortmund
Publication
Featured research published by Gero Szepannek.
Engineering Applications of Artificial Intelligence | 2009
Gero Szepannek; Bernd Bischl; Claus Weihs
Classification methods generally rely on some idea about the data structure. If the specific assumptions are not met, a classifier may fail. In this paper, the possibility of combining classifiers in multi-class problems is investigated. Multi-class classification problems are split into two-class problems. For each of the latter problems an optimal classifier is determined. The results of applying the optimal classifiers to the two-class problems can be combined using a pairwise coupling algorithm. In this paper, exemplary situations are investigated where the respective assumptions of Naive Bayes or the classical Linear Discriminant Analysis (LDA) fail. It is investigated at which degree of violation of the assumptions it may be advantageous to use single methods or a classifier combination by pairwise coupling.
international conference on data mining | 2009
Claus Weihs; Gero Szepannek
The notion of distance is the most important basis for classification. This is especially true for unsupervised learning, i.e. clustering, since there is no validation mechanism by means of objects of known groups. But also for supervised learning standard distances often do not lead to appropriate results. For every individual problem the adequate distance is to be decided upon. This is demonstrated by means of three practical examples from very different application areas, namely social science, music science, and production economics. In social science, clustering is applied to spatial regions with very irregular borders. Then adequate spatial distances may have to be taken into account for clustering. In statistical musicology the main problem is often to find an adequate transformation of the input time series as an adequate basis for distance definition. Also, local modelling is proposed in order to account for different subpopulations, e.g. instruments. In production economics often many quality criteria have to be taken into account with very different scaling. In order to find a compromise optimum classification, this leads to a pre-transformation onto the same scale, called desirability.
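The pre-transformation onto a common scale can be illustrated with a minimal sketch. It assumes a Derringer-Suich style "larger is better" desirability and a geometric mean as the overall score; the abstract does not say which desirability functions were used, so the function names, the bounds `low` and `target`, and the `shape` exponent are illustrative assumptions.

```python
import numpy as np

def desirability_larger_is_better(y, low, target, shape=1.0):
    """Map a quality criterion onto [0, 1]: 0 at or below `low`, 1 at or above `target`.

    Assumption: Derringer-Suich style one-sided desirability, not necessarily
    the transformation used in the paper.
    """
    y = np.asarray(y, dtype=float)
    d = (y - low) / (target - low)
    return np.clip(d, 0.0, 1.0) ** shape

def overall_desirability(d):
    """Combine per-criterion desirabilities (last axis) by their geometric mean."""
    d = np.asarray(d, dtype=float)
    return np.prod(d, axis=-1) ** (1.0 / d.shape[-1])

# Two criteria on very different scales, brought onto the same [0, 1] scale.
strength = desirability_larger_is_better(450.0, low=300.0, target=500.0)
purity = desirability_larger_is_better(0.97, low=0.90, target=1.00)
score = overall_desirability([strength, purity])
```

Because the geometric mean is zero as soon as one criterion has desirability zero, a completely unacceptable criterion cannot be compensated by the others.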
international conference on data mining | 2006
Gero Szepannek; Claus Weihs
Sometimes one may be confronted with classification problems where classes are constituted of several subclasses that possess different distributions and therefore prevent accurate modelling of each entire class as one homogeneous group. One approach is modelling via local models of several subclasses. In this paper, a method is presented of how to handle such classification problems where the subclasses are furthermore characterized by different subsets of the variables. Situations are outlined and tested where such local models in different variable subspaces dramatically reduce the classification error.
biomedical circuits and systems conference | 2006
Tamás Harczos; Gero Szepannek; András Kátai; Frank Klefenz
Meaningful feature extraction is an important challenge and indispensable for good classification results. In Automatic Speech Recognition, human performance is still superior to technical solutions. In this paper a feature extraction for sound data is presented that is motivated by the neural processing of the human auditory system. The physiological mechanisms of signal transduction in the human ear and its neural representation are described. The generated pulse spiking trains of the auditory nerve fibers are connected to a feed-forward timing artificial Hubel-Wiesel network, which is a structured computational map for higher cognitive functions such as vowel recognition. According to former cochlea studies, a signal triggers a set of delay trajectories on the basilar membrane, which are projected further to connecting structures. In our approach this phenomenon is employed for classification of vowels from different speakers.
GfKl | 2009
Julia Schiffner; Gero Szepannek; Thierry Monthé; Claus Weihs
In localized logistic regression (cp. Loader, Local regression and likelihood, Springer, New York, 1999; Tutz and Binder, Statistics and Computing 15:155–166, 2005) at each target point where a prediction is required a logistic regression model is fitted locally. This is achieved by weighting the training observations in the log-likelihood based on their distances to the target observation. For interval-scaled influential factors these weights usually depend on Euclidean distances. This paper aims to combine localized logistic regression with dissimilarity measures more suitable for categorical data.
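The local fitting step can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes a Gaussian kernel on Euclidean distances, a small ridge term for numerical stability, and plain Newton iterations. The function name and the parameters `bandwidth`, `ridge` and `n_iter` are hypothetical. A dissimilarity measure for categorical data would replace the line that computes `dist`.

```python
import numpy as np

def local_logistic_predict(X, y, x0, bandwidth=1.0, ridge=1e-3, n_iter=25):
    """Estimate P(y = 1 | x0) from a logistic model fitted locally around x0."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    x0 = np.asarray(x0, dtype=float)

    # Weight each training observation by its distance to the target point
    # (assumption: Gaussian kernel on Euclidean distance).
    dist = np.linalg.norm(X - x0, axis=1)
    w = np.exp(-0.5 * (dist / bandwidth) ** 2)

    # Centre the predictors at x0 so that the intercept is the local logit at x0.
    Z = np.hstack([np.ones((X.shape[0], 1)), X - x0])
    beta = np.zeros(Z.shape[1])
    penalty = ridge * np.eye(Z.shape[1])

    # Newton iterations on the kernel-weighted, ridge-penalised log-likelihood.
    for _ in range(n_iter):
        eta = np.clip(Z @ beta, -30.0, 30.0)
        p = 1.0 / (1.0 + np.exp(-eta))
        grad = Z.T @ (w * (y - p)) - penalty @ beta
        hess = (Z * (w * p * (1.0 - p))[:, None]).T @ Z + penalty
        beta = beta + np.linalg.solve(hess, grad)

    return 1.0 / (1.0 + np.exp(-beta[0]))

# Toy data: about 75% of the labels are 1 for x > 0 and about 25% for x <= 0.
x = np.linspace(-3.0, 3.0, 121)
idx = np.arange(x.size)
labels = np.where(x > 0, idx % 4 != 0, idx % 4 == 0).astype(float)
X_train = x[:, None]

p_right = local_logistic_predict(X_train, labels, [1.5], bandwidth=0.5)
p_left = local_logistic_predict(X_train, labels, [-1.5], bandwidth=0.5)
```

A separate model is fitted for every target point, so the cost grows with the number of predictions; the bandwidth controls how local each fit is.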
Informatik Spektrum | 2005
Gero Szepannek; Frank Klefenz; Claus Weihs
Abstract: Motivated by the superiority of humans over the current state of technology in differentiating sound signals, this article describes the mechanisms and processes by which mechanical sound is converted into nerve impulses within the ear, as a basis for a possible technical replication, e.g. for recognizing pitch in music.
Technical reports | 2005
Gero Szepannek; Claus Weihs
In classification, with an increasing number of variables, the required number of observations grows drastically. In this paper we present an approach to put into effect the maximal possible variable selection, by splitting a K-class classification problem into pairwise problems. The principle makes use of the possibility that a variable that discriminates two classes will not necessarily do so for all such class pairs. We further present the construction of a classification rule based on the pairwise solutions by the Pairwise Coupling algorithm according to Hastie and Tibshirani (1998). The suggested procedure can be applied to any classification method. Finally, situations with lack of data in multidimensional spaces are investigated on different simulated data sets to illustrate the problem and the possible gain. The principle is compared to the classical approach of linear and quadratic discriminant analysis.
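The coupling step can be sketched as below. This is a minimal illustration of the iterative scheme of Hastie and Tibshirani (1998), assuming the pairwise estimates r[i, j] of P(class i | class i or j) lie strictly between 0 and 1 and that all class pairs carry equal weight unless pair sizes `n` are given. The function name and the parameters `max_iter` and `tol` are illustrative.

```python
import numpy as np

def pairwise_coupling(r, n=None, max_iter=1000, tol=1e-10):
    """Couple pairwise probabilities r[i, j] ~ P(i | i or j) into class probabilities."""
    r = np.asarray(r, dtype=float)
    k = r.shape[0]
    n = np.ones((k, k)) if n is None else np.asarray(n, dtype=float)
    off_diag = ~np.eye(k, dtype=bool)

    p = np.full(k, 1.0 / k)  # start from equal class probabilities
    for _ in range(max_iter):
        p_old = p.copy()
        for i in range(k):
            others = off_diag[i]
            # Pairwise probabilities implied by the current p: mu_ij = p_i / (p_i + p_j).
            mu_i = p[i] / (p[i] + p)
            # Rescale p_i so that the implied mu_ij match the observed r_ij on average.
            p[i] *= np.sum(n[i, others] * r[i, others]) / np.sum(n[i, others] * mu_i[others])
            p /= p.sum()
        if np.max(np.abs(p - p_old)) < tol:
            break
    return p

# Consistent pairwise estimates built from known class probabilities.
p_true = np.array([0.5, 0.3, 0.2])
r_pairs = p_true[:, None] / (p_true[:, None] + p_true[None, :])
p_hat = pairwise_coupling(r_pairs)
```

Each entry r[i, j] may come from a different classifier trained on a different variable subset, which is what allows the pairwise variable selection described above; the diagonal of `r` is ignored.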
Archive | 2010
Gero Szepannek; Matthias Gruhne; Bernd Bischl; Sebastian Krey; Tamás Harczos; Frank Klefenz; Christian Dittmar; Claus Weihs
Solving the task of phoneme recognition in music sound files may help in several practical applications: it enables lyrics transcription and, as a consequence, could provide further relevant information for the task of automatic song classification. Beyond that, it can be used for lyrics alignment, e.g. in karaoke applications. The effects of different feature signal representations as well as of the choice of an appropriate classifier are investigated. In addition, a unified R framework for classifier optimization is presented.
machine learning and data mining in pattern recognition | 2007
Gero Szepannek; Bernd Bischl; Claus Weihs
If their assumptions are not met, classifiers may fail. In this paper, the possibility of combining classifiers in multi-class problems is investigated. Multi-class classification problems are split into two-class problems. For each of the latter problems an optimal classifier is determined. The results of applying the optimal classifiers to the two-class problems can be combined using the Pairwise Coupling algorithm by Hastie and Tibshirani (1998). In this paper exemplary situations are investigated where the respective assumptions of Naive Bayes or the classical Linear Discriminant Analysis (LDA, Fisher, 1936) fail. It is investigated at which degree of violation of the assumptions it may be advantageous to use single methods or a classifier combination by Pairwise Coupling.
Archive | 2010
Tina Müller; Julia Schiffner; Holger Schwender; Gero Szepannek; Claus Weihs; Katja Ickstadt
SNP association studies investigate the relationship between complex diseases and one’s genetic predisposition through Single Nucleotide Polymorphisms. The studies provide the analyst with a wealth of data and many challenges, as the moderate to small risk changes are hard to detect and, moreover, the interest focusses not on the identification of single influential SNPs, but on (high-order) SNP interactions. In addition, the studies usually contain more variables than observations. A further problem arises as there might be alternative ways of developing a disease. To face the challenges of high dimension, interaction effects and local differences, we use associative classification and localised logistic regression to classify the observations into cases and controls. These methods have great potential for the local analysis of SNP data, as applications to both simulated and real-world whole-genome data show.