Network


Latest external collaborations at the country level.

Hotspot


Dive into the research topics where Friedhelm Schwenker is active.

Publication


Featured research published by Friedhelm Schwenker.


Neural Networks | 2001

Three learning phases for radial-basis-function networks

Friedhelm Schwenker; Hans A. Kestler; Günther Palm

In this paper, learning algorithms for radial basis function (RBF) networks are discussed. Whereas multilayer perceptrons (MLP) are typically trained with backpropagation algorithms, starting the training procedure with a random initialization of the MLP's parameters, an RBF network may be trained in many different ways. We categorize these RBF training methods into one-, two-, and three-phase learning schemes. Two-phase RBF learning is a very common learning scheme. The two layers of an RBF network are learnt separately; first the RBF layer is trained, including the adaptation of centers and scaling parameters, and then the weights of the output layer are adapted. RBF centers may be trained by clustering, vector quantization and classification tree algorithms, and the output layer by supervised learning (through gradient descent or a pseudo-inverse solution). Results from numerical experiments with RBF classifiers trained by two-phase learning are presented for three completely different pattern recognition applications: (a) the classification of 3D visual objects; (b) the recognition of hand-written digits (2D objects); and (c) the categorization of high-resolution electrocardiograms given as a time series (1D objects) and as a set of features extracted from these time series. In these applications, it can be observed that the performance of RBF classifiers trained with two-phase learning can be improved through a third backpropagation-like training phase of the RBF network, adapting the whole set of parameters (RBF centers, scaling parameters, and output layer weights) simultaneously. This we call three-phase learning in RBF networks. A practical advantage of two- and three-phase learning in RBF networks is the possibility of using unlabeled training data for the first training phase. Support vector (SV) learning in RBF networks is a different learning approach. SV learning can be considered, in this context, as a special type of one-phase learning, where only the output layer weights of the RBF network are calculated, and the RBF centers are restricted to be a subset of the training data. Numerical experiments with several classifier schemes including k-nearest-neighbor, learning vector quantization and RBF classifiers trained through two-phase, three-phase and support vector learning are given. The performance of RBF classifiers trained through SV learning and three-phase learning is superior to the results of two-phase learning, but SV learning often leads to complex network structures, since the number of support vectors is not a small fraction of the total number of data points.
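A minimal sketch of the two-phase scheme described above: the RBF centers are placed by clustering (phase 1) and the output weights are solved by the pseudo-inverse (phase 2), with a comment marking where the third, backpropagation-like phase would fine-tune all parameters jointly. The Gaussian basis, the width heuristic and all variable names are illustrative assumptions, not the authors' code.

```python
# Sketch of two-phase RBF training; a third phase would refine all parameters jointly.
import numpy as np
from sklearn.cluster import KMeans

def train_rbf_two_phase(X, Y, n_centers=20, rng=0):
    # Phase 1: unsupervised placement of RBF centers (unlabeled data suffices here).
    km = KMeans(n_clusters=n_centers, n_init=10, random_state=rng).fit(X)
    centers = km.cluster_centers_
    # Width heuristic (an assumption for this sketch): mean distance between centers.
    d = np.linalg.norm(centers[:, None, :] - centers[None, :, :], axis=-1)
    sigma = d[d > 0].mean()
    # Phase 2: supervised output weights via the pseudo-inverse solution.
    H = np.exp(-0.5 * (np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=-1) / sigma) ** 2)
    W = np.linalg.pinv(H) @ Y              # Y: one-hot encoded class targets
    return centers, sigma, W

def rbf_predict(X, centers, sigma, W):
    H = np.exp(-0.5 * (np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=-1) / sigma) ** 2)
    return (H @ W).argmax(axis=1)

# A third phase would treat (centers, sigma, W) as free parameters and adapt them
# simultaneously by gradient descent on a squared-error loss, as described above.
```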


Neural Networks | 1996

Iterative retrieval of sparsely coded associative memory patterns

Friedhelm Schwenker; Friedrich T. Sommer; Günther Palm

We investigate the pattern completion performance of neural auto-associative memories composed of binary threshold neurons for sparsely coded binary memory patterns. By focussing on iterative retrieval, we are able to introduce effective threshold control strategies. These are investigated by means of computer simulation experiments and analytical treatment. To evaluate the system's performance we consider the completion capacity C and the mean retrieval errors. The asymptotic completion capacity for the recall of sparsely coded binary patterns in one-step retrieval is known to be ln(2)/4 ≈ 17.3% for binary Hebbian learning, and 1/(8 ln 2) ≈ 18% for additive Hebbian learning. These values are achieved with vanishing error probability and yet are higher than those obtained in other known neural memory models. Recent investigations on binary Hebbian learning have proved that iterative retrieval, as a more refined retrieval method, does not improve the asymptotic completion capacity of one-step retrieval. In a finite-size auto-associative memory we show that iterative retrieval achieves higher capacity and better error correction than one-step retrieval. One-step retrieval produces high retrieval errors at optimal memory load. Iterative retrieval reduces the retrieval errors within a few iteration steps (t ⩽ 5). Experiments with additive Hebbian learning show that in the finite model, binary Hebbian learning exhibits much better performance. Thus the main concern of this paper is binary Hebbian learning. We examine iterative retrieval in experiments with up to n = 20,000 threshold neurons. With this system size, one-step retrieval yields a completion capacity of about 16%, the second retrieval step increases this value to 17.9%, and with iterative retrieval we obtain 19%. The first two retrieval steps in the finite system have also been treated analytically. For one-step retrieval the asymptotic capacity value is approached from below with growing system size. In the second retrieval step (and, as the experiments suggest, also for iterative retrieval) the finite-size behaviour is different: the capacity exceeds the asymptotic value, reaches an optimum for finite system size, and decreases to the asymptotic limit.
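A minimal sketch of binary (clipped) Hebbian storage and an iterative retrieval loop for an auto-associative memory of binary threshold neurons. The naive threshold strategy below (threshold equal to the number of active units in the current cue) is only an illustration of the mechanism; the paper's threshold control strategies are more refined.

```python
# Sketch of a binary auto-associative memory with clipped Hebbian learning
# and iterative retrieval; the threshold rule is a simplification for illustration.
import numpy as np

def store(patterns):
    # patterns: (M, n) binary array of sparsely coded memories.
    n = patterns.shape[1]
    W = np.zeros((n, n), dtype=np.uint8)
    for p in patterns:
        W |= np.outer(p, p).astype(np.uint8)       # clipped (binary) Hebbian rule
    return W

def retrieve(W, cue, steps=5):
    x = cue.astype(np.uint8).copy()
    for _ in range(steps):
        s = W.astype(int) @ x                      # dendritic sums
        theta = int(x.sum())                       # naive threshold: active units in the cue
        x_new = (s >= theta).astype(np.uint8)
        if np.array_equal(x_new, x):               # fixed point reached
            break
        x = x_new
    return x
```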


International Conference on Knowledge-Based and Intelligent Information and Engineering Systems | 2000

Hierarchical support vector machines for multi-class pattern recognition

Friedhelm Schwenker

Support vector machines (SVM) are learning algorithms derived from statistical learning theory. The SVM approach was originally developed for binary classification problems. In this paper SVM architectures for multi-class classification problems are discussed, in particular we consider binary trees of SVMs to solve the multi-class problem. Numerical results for different classifiers on a benchmark data set of handwritten digits are presented.
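A small sketch of one way to arrange binary SVMs in a tree for the multi-class problem: at each node the remaining classes are split into two groups, an SVM separates the groups, and classification descends the tree. The splitting heuristic used here (halving the class list) and the kernel settings are assumptions for illustration, not the construction studied in the paper.

```python
# Sketch of a binary tree of SVMs for multi-class classification.
# Usage (illustrative): tree = SVMTreeNode(np.unique(y_train)).fit(X_train, y_train)
import numpy as np
from sklearn.svm import SVC

class SVMTreeNode:
    def __init__(self, classes):
        self.classes = list(classes)
        self.svm = self.left = self.right = None

    def fit(self, X, y):
        if len(self.classes) == 1:                       # leaf: single class remains
            return self
        mid = len(self.classes) // 2
        left_c, right_c = self.classes[:mid], self.classes[mid:]
        mask = np.isin(y, self.classes)
        Xn, yn = X[mask], y[mask]
        target = np.isin(yn, right_c).astype(int)        # 0 -> left group, 1 -> right group
        self.svm = SVC(kernel="rbf", gamma="scale").fit(Xn, target)
        self.left = SVMTreeNode(left_c).fit(X, y)
        self.right = SVMTreeNode(right_c).fit(X, y)
        return self

    def predict_one(self, x):
        node = self
        while len(node.classes) > 1:                     # descend until a leaf is reached
            node = node.right if node.svm.predict(x[None, :])[0] == 1 else node.left
        return node.classes[0]
```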


Affective Computing and Intelligent Interaction | 2011

Multiple classifier systems for the classification of audio-visual emotional states

Michael Glodek; Stephan Tschechne; Georg Layher; Martin Schels; Tobias Brosch; Stefan Scherer; Markus Kächele; Miriam Schmidt; Heiko Neumann; Günther Palm; Friedhelm Schwenker

Research activities in the field of human-computer interaction have increasingly addressed the aspect of integrating some type of emotional intelligence. Human emotions are expressed through different modalities such as speech, facial expressions, and hand or body gestures, and therefore the classification of human emotions should be considered a multimodal pattern recognition problem. The aim of our paper is to investigate multiple classifier systems utilizing audio and visual features to classify human emotional states. For this purpose a variety of features have been derived. From the audio signal the fundamental frequency, LPC and MFCC coefficients, and RASTA-PLP features have been used. In addition, two types of visual features have been computed, namely form and motion features of intermediate complexity. The numerical evaluation has been performed on the four emotional labels Arousal, Expectancy, Power and Valence as defined in the AVEC data set. As classifier architectures, multiple classifier systems are applied; these have been proven to be accurate and robust against missing and noisy data.
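A minimal illustration of one decision-level fusion strategy in the spirit of the multiple-classifier idea above: train one classifier per modality and average their class posteriors. The feature matrices are assumed to be already extracted (e.g. audio and visual descriptors), and the base classifiers and the equal-weight sum rule are illustrative assumptions, not the system evaluated on AVEC.

```python
# Sketch of decision-level fusion over modality-specific classifiers (sum rule).
import numpy as np
from sklearn.ensemble import RandomForestClassifier

def fuse_predict(X_audio_tr, X_video_tr, y_tr, X_audio_te, X_video_te):
    clf_a = RandomForestClassifier(n_estimators=200, random_state=0).fit(X_audio_tr, y_tr)
    clf_v = RandomForestClassifier(n_estimators=200, random_state=0).fit(X_video_tr, y_tr)
    # Average the class posteriors of the two modality experts.
    p = 0.5 * clf_a.predict_proba(X_audio_te) + 0.5 * clf_v.predict_proba(X_video_te)
    return clf_a.classes_[p.argmax(axis=1)]
```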


Pattern Recognition Letters | 2014

Pattern classification and clustering: A review of partially supervised learning approaches

Friedhelm Schwenker; Edmondo Trentin

The paper categorizes and reviews the state-of-the-art approaches to the partially supervised learning (PSL) task. Special emphasis is put on the fields of pattern recognition and clustering involving partially (or weakly) labeled data sets. The major instances of PSL techniques are categorized into the following taxonomy: (i) active learning for training set design, where the learning algorithm has control over the training data; (ii) learning from fuzzy labels, whenever multiple and discordant human experts are involved in the (complex) data labeling process; (iii) semi-supervised learning (SSL) in pattern classification (further subdivided into self-training, SSL with generative models, semi-supervised support vector machines, and SSL with graphs); (iv) SSL in data clustering, using additional constraints to incorporate expert knowledge into the clustering process; (v) PSL in ensembles and learning by disagreement; (vi) PSL in artificial neural networks. In addition to providing the reader with the general background and categorization of the area, the paper aims at pointing out the main issues which are still open, motivating the ongoing investigations in PSL research.
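As a concrete instance of category (iii), self-training, here is a minimal sketch: a classifier is fitted on the labeled data and then repeatedly augments its training set with its most confident predictions on the unlabeled data. The confidence threshold, the number of rounds and the base classifier are illustrative choices, not prescribed by the review.

```python
# Minimal self-training sketch (one instance of semi-supervised learning).
import numpy as np
from sklearn.linear_model import LogisticRegression

def self_train(X_lab, y_lab, X_unlab, conf_threshold=0.95, max_rounds=10):
    X, y, pool = X_lab.copy(), y_lab.copy(), X_unlab.copy()
    clf = LogisticRegression(max_iter=1000)
    for _ in range(max_rounds):
        clf.fit(X, y)
        if len(pool) == 0:
            break
        proba = clf.predict_proba(pool)
        confident = proba.max(axis=1) >= conf_threshold
        if not confident.any():
            break
        # Move confidently pseudo-labeled points into the training set.
        X = np.vstack([X, pool[confident]])
        y = np.concatenate([y, clf.classes_[proba[confident].argmax(axis=1)]])
        pool = pool[~confident]
    return clf
```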


Computer Speech & Language | 2013

Investigating fuzzy-input fuzzy-output support vector machines for robust voice quality classification

Stefan Scherer; John Kane; Christer Gobl; Friedhelm Schwenker

The dynamic use of voice qualities in spoken language can reveal useful information on a speaker's attitude, mood and affective states. This information may be very desirable for a range of speech technology applications, both input and output. However, voice quality annotation of speech signals may frequently produce far from consistent labeling: groups of annotators may disagree on the perceived voice quality, but whom should one trust, or is the truth somewhere in between? The current study first describes a voice quality feature set that is suitable for differentiating voice qualities on a tense-to-breathy dimension. Further, the study includes these features as inputs to a fuzzy-input fuzzy-output support vector machine (F^2SVM) algorithm, which is in turn capable of softly categorizing voice quality recordings. The F^2SVM is compared in a thorough analysis to standard crisp approaches and shows promising results, outperforming, for example, standard support vector machines whose sole difference is that the F^2SVM approach receives fuzzy label information during training. Overall, it is possible to achieve accuracies of around 90% for both speaker-dependent (cross-validation) and speaker-independent (leave-one-speaker-out validation) experiments. Additionally, the F^2SVM approach performs at an accuracy of 82% in a cross-corpus experiment (i.e. training and testing on entirely different recording conditions) in a frame-wise analysis and of around 97% after temporally integrating over full sentences. Furthermore, the output of fuzzy measures gave performances close to that of human annotators.
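The F^2SVM is a dedicated formulation; the sketch below only approximates the "fuzzy input" idea with off-the-shelf tools by replicating each training frame once per class and weighting the copies by their soft label memberships via sample weights. This is a swapped-in simplification for illustration, not the algorithm studied in the paper.

```python
# Approximation of training with fuzzy labels via per-sample weights
# (a simplification of the F^2SVM idea, not the paper's method).
import numpy as np
from sklearn.svm import SVC

def fit_with_fuzzy_labels(X, memberships, classes):
    # memberships: (n_samples, n_classes) soft label matrix, rows summing to 1,
    # e.g. the fraction of annotators choosing each voice quality.
    X_rep, y_rep, w_rep = [], [], []
    for xi, mi in zip(X, memberships):
        for c, m in zip(classes, mi):
            if m > 0:
                X_rep.append(xi)
                y_rep.append(c)
                w_rep.append(m)
    clf = SVC(kernel="rbf", gamma="scale", probability=True)
    clf.fit(np.asarray(X_rep), np.asarray(y_rep), sample_weight=np.asarray(w_rep))
    return clf
```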


International Conference on Human-Computer Interaction | 2011

Multimodal emotion classification in naturalistic user behavior

Steffen Walter; Stefan Scherer; Martin Schels; Michael Glodek; David Hrabal; Miriam Schmidt; Ronald Böck; Kerstin Limbrecht; Harald C. Traue; Friedhelm Schwenker

The design of intelligent personalized interactive systems, having knowledge about the user's state, desires, needs and wishes, currently poses a great challenge to computer scientists. In this study we propose an information fusion approach combining acoustic and biophysiological data, comprising multiple sensors, to classify emotional states. For this purpose a multimodal corpus has been created, in which subjects undergo a controlled emotion-eliciting experiment, passing through several octants of the valence-arousal-dominance space. The temporal and decision-level fusion of the multiple modalities outperforms the single-modality classifiers and shows promising results.
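One simple way to realize the temporal part of such a fusion scheme is to smooth frame-wise class posteriors over a sliding window before taking the decision; the window length and the averaging rule below are illustrative assumptions, not the fusion used in the study.

```python
# Sketch of temporal integration of frame-wise posteriors before decision.
import numpy as np

def temporally_integrate(frame_posteriors, window=25):
    # frame_posteriors: (n_frames, n_classes) output of a frame-level classifier.
    kernel = np.ones(window) / window
    smoothed = np.column_stack([
        np.convolve(frame_posteriors[:, c], kernel, mode="same")
        for c in range(frame_posteriors.shape[1])
    ])
    return smoothed.argmax(axis=1)   # per-frame decision after temporal smoothing
```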


KSII Transactions on Internet and Information Systems | 2012

Spotting laughter in natural multiparty conversations: A comparison of automatic online and offline approaches using audiovisual data

Stefan Scherer; Michael Glodek; Friedhelm Schwenker; Nick Campbell; Günther Palm

It is essential for the advancement of human-centered multimodal interfaces to be able to infer the current user's state or communication state. In order to enable a system to do that, the recognition and interpretation of multimodal social signals (i.e., paralinguistic and nonverbal behavior) in real-time applications is required. Since we believe that laughs are one of the most important and widely understood social nonverbal signals indicating affect and discourse quality, we focus in this work on the detection of laughter in natural multiparty discourses. The conversations are recorded in a natural environment, without any specific constraint on the discourses, using unobtrusive recording devices. This setup ensures natural and unbiased behavior, which is one of the main foci of this work. Three methods, namely Gaussian Mixture Model (GMM) supervectors as input to a Support Vector Machine (SVM), so-called Echo State Networks (ESN), and a Hidden Markov Model (HMM) approach, are compared in online and offline detection experiments. The SVM approach proves very accurate in the offline classification task, but is outperformed by the ESN and HMM approaches in the online detection (F1 scores: GMM-SVM 0.45, ESN 0.63, HMM 0.72). Further, we were able to utilize the proposed HMM approach in a cross-corpus experiment without any retraining, with respectable generalization capability (F1 score: 0.49). The results and possible reasons for these outcomes are shown and discussed in the article. The proposed methods may be directly utilized in practical tasks such as the labeling or the online detection of laughter in conversational data and affect-aware applications.
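A simplified sketch of the GMM-supervector front end mentioned above: a background GMM is fitted on pooled frames, each utterance's frames are softly assigned to the components, and the per-component weighted means are stacked into one fixed-length vector for an SVM. Full MAP adaptation with a relevance factor is omitted, so treat this as an approximation for illustration rather than the system evaluated in the paper.

```python
# Simplified GMM-supervector extraction (no full MAP adaptation) feeding an SVM.
import numpy as np
from sklearn.mixture import GaussianMixture
from sklearn.svm import SVC

def fit_ubm(all_frames, n_components=32):
    # all_frames: pooled acoustic frames from background data, shape (N, d).
    return GaussianMixture(n_components=n_components, covariance_type="diag",
                           random_state=0).fit(all_frames)

def supervector(ubm, frames, eps=1e-8):
    resp = ubm.predict_proba(frames)              # (T, K) soft component assignments
    counts = resp.sum(axis=0)[:, None] + eps      # zero-order statistics
    means = (resp.T @ frames) / counts            # weighted per-component means
    return means.ravel()                          # fixed-length vector per utterance

def train_laughter_svm(ubm, utterances, labels):
    X = np.vstack([supervector(ubm, u) for u in utterances])
    return SVC(kernel="linear").fit(X, labels)
```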


3rd IET International Conference on Intelligent Environments (IE 07) | 2007

Classifier Fusion for Emotion Recognition from Speech

Stefan Scherer; Friedhelm Schwenker; Günther Palm

The intention of this work is to investigate the performance of an automatic emotion recognizer using biologically motivated features, comprising perceived loudness features proposed by Zwicker, robust RASTA-PLP features, and novel long-term modulation spectrum-based features. Single classifiers using only one type of feature and multi-classifier systems utilizing all three types are examined using two classifier fusion techniques. For all experiments the standard Berlin Database of Emotional Speech, comprising recordings of seven different emotions, is used to evaluate the performance of the proposed multi-classifier system. The performance is compared with earlier work as well as with human recognition performance. The results reveal that using simple fusion techniques can improve the performance significantly, outperforming other classifiers used in earlier work. The generalization ability of the proposed system is further investigated in a leave-one-speaker-out experiment, revealing a strong ability to recognize emotions expressed by unknown speakers. Moreover, similarities between earlier speech analysis and the automatic emotion recognition results were found.
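A rough sketch of long-term modulation-spectrum features of the kind referred to above: the amplitude envelope of the signal is taken via the Hilbert transform, and its slow temporal fluctuations are summarized by binning the low-frequency part of the envelope spectrum. The band decomposition, normalization and modulation range used in the paper are not reproduced here; all parameters below are assumptions for illustration.

```python
# Rough sketch of modulation-spectrum features: spectrum of the amplitude envelope.
import numpy as np
from scipy.signal import hilbert

def modulation_spectrum_features(signal, sample_rate, max_mod_hz=20.0, n_bins=16):
    envelope = np.abs(hilbert(signal))                   # amplitude envelope
    envelope -= envelope.mean()
    spectrum = np.abs(np.fft.rfft(envelope))
    freqs = np.fft.rfftfreq(len(envelope), d=1.0 / sample_rate)
    # Keep only slow modulations (roughly the syllabic-rate range) and pool into bins.
    mod = spectrum[freqs <= max_mod_hz]
    edges = np.linspace(0, mod.size, n_bins + 1, dtype=int)
    return np.array([mod[a:b].mean() if b > a else 0.0
                     for a, b in zip(edges[:-1], edges[1:])])
```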


Computing in Cardiology Conference | 1998

De-noising of high-resolution ECG signals by combining the discrete wavelet transform with the Wiener filter

Hans A. Kestler; M. Haschka; W. Kratz; Friedhelm Schwenker; Günther Palm; Vinzenz Hombach; Martin Höher

In this study the authors applied a combination of the discrete wavelet transform and the Wiener filter to the noise-reduction of high-resolution ECG signals. The procedure is optimal in the least squares sense in that it separates a signal from additive noise. It was compared to a popular de-noising algorithm by Donoho (1993) on artificially generated signals and on a high-resolution ECG signal corrupted by noise.
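A small sketch of the general idea: decompose the ECG with a discrete wavelet transform, apply Wiener-style filtering to the detail coefficients, and reconstruct. The wavelet family, decomposition depth and the use of SciPy's adaptive Wiener filter are illustrative assumptions, not the authors' exact procedure.

```python
# Sketch of wavelet-domain Wiener-style de-noising (illustrative parameters).
import numpy as np
import pywt
from scipy.signal import wiener

def denoise_ecg(signal, wavelet="db4", level=5, win=5):
    coeffs = pywt.wavedec(signal, wavelet, level=level)    # [cA_L, cD_L, ..., cD_1]
    approx, details = coeffs[0], coeffs[1:]
    # Wiener-filter each detail band; the approximation band is kept untouched.
    filtered = [wiener(d, mysize=win) for d in details]
    return pywt.waverec([approx] + filtered, wavelet)[: len(signal)]
```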

Collaboration


Dive into Friedhelm Schwenker's collaborations.

Top Co-Authors

Stefan Scherer (University of Southern California)