
Publications


Featured research published by Gernot A. Fink.


Speech Communication | 2002

Combining acoustic and articulatory feature information for robust speech recognition

Katrin Kirchhoff; Gernot A. Fink; Gerhard Sagerer

The idea of using articulatory representations for automatic speech recognition (ASR) continues to attract much attention in the speech community. Representations which are grouped under the label “articulatory” include articulatory parameters derived by means of acoustic-articulatory transformations (inverse filtering), direct physical measurements, or classification scores for pseudo-articulatory features. In this study, we revisit the use of features belonging to the third category. In particular, we concentrate on the potential benefits of pseudo-articulatory features in adverse acoustic environments and on their combination with standard acoustic features. Systems based on articulatory features only and combined acoustic-articulatory systems are tested on two different recognition tasks: telephone-speech continuous numbers recognition and conversational speech recognition. We show that articulatory feature (AF) systems are capable of achieving a superior performance at high noise levels and that the combination of acoustic and AFs consistently leads to a significant reduction of word error rate across all acoustic conditions.
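Stream combination of this kind is often realized as a weighted log-linear interpolation of per-stream likelihoods. The sketch below (plain Python, with invented scores and a hypothetical weight `lam`) illustrates that general idea; it is not the authors' exact combination method.

```python
import math

def combine_streams(p_acoustic, p_articulatory, lam=0.6):
    """Log-linear combination of two stream likelihoods for one state:
    p = p_ac^lam * p_af^(1-lam), computed in log space for stability."""
    return lam * math.log(p_acoustic) + (1.0 - lam) * math.log(p_articulatory)

# Toy example: two competing states scored by both streams.
scores = {
    "state_A": combine_streams(0.7, 0.4),
    "state_B": combine_streams(0.2, 0.6),
}
best = max(scores, key=scores.get)
```

In HMM decoding the same interpolation is applied per state and time frame; the stream weight trades off the two feature streams and is typically tuned on held-out data.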


International Journal on Document Analysis and Recognition | 2009

Markov models for offline handwriting recognition: a survey

Thomas Plötz; Gernot A. Fink

Since their inception more than half a century ago, automatic reading systems have evolved substantially, showing impressive performance on machine-printed text. The recognition of handwriting can, however, still be considered an open research problem due to its substantial variation in appearance. With the introduction of Markovian models to the field, a promising modeling and recognition paradigm was established for automatic offline handwriting recognition. However, no standard procedures for building Markov-model-based recognizers have so far been established, though trends toward unified approaches can be identified. It is therefore the goal of this survey to provide a comprehensive overview of the application of Markov models in the research field of offline handwriting recognition, covering both the widely used hidden Markov models and the less complex Markov-chain or n-gram models. First, we will introduce the typical architecture of a Markov-model-based offline handwriting recognition system and make the reader familiar with the essential theoretical concepts behind Markovian models. Then, we will give a thorough review of the solutions proposed in the literature for the open problem of how to apply Markov-model-based approaches to automatic offline handwriting recognition.


Robotics and Autonomous Systems | 2003

Multi-modal anchoring for human-robot interaction

Jannik Fritsch; Marcus Kleinehagenbrock; Sebastian Lang; Thomas Plötz; Gernot A. Fink; Gerhard Sagerer

This paper presents a hybrid approach for tracking humans with a mobile robot that integrates face and leg detection results extracted from image and laser range data, respectively. The different percepts are linked to their symbolic counterparts, legs and face, by anchors as defined by Coradeschi and Saffiotti [Anchoring symbols to sensor data: preliminary report, in: Proceedings of the Conference of the American Association for Artificial Intelligence, 2000, pp. 129–135]. In order to anchor the composite object person, we extend the anchoring framework to combine different component anchors belonging to the same person. This allows us to deal with perceptual algorithms having different spatio-temporal properties and provides a structured way of integrating anchor data from multiple modalities. An evaluation demonstrates the performance of our approach.


Robot and Human Interactive Communication | 2002

Person tracking with a mobile robot based on multi-modal anchoring

Marcus Kleinehagenbrock; Sebastian Lang; Jannik Fritsch; Frank Lömker; Gernot A. Fink; Gerhard Sagerer

The ability to robustly track a person is an important prerequisite for human-robot interaction. This paper presents a hybrid approach for integrating vision and laser range data to track a human. The legs of a person can be extracted from laser range data, while skin-colored faces are detectable in camera images showing the upper body of a person. As these algorithms provide different percepts originating from the same person, the perceptual results have to be combined. We link the percepts to their symbolic counterparts, legs and face, by anchoring processes as defined by Coradeschi and Saffiotti. To anchor the composite symbol person, we extend the anchoring framework with a fusion module integrating the individual anchors. This allows us to deal with perceptual algorithms having different spatio-temporal properties and provides a structured way of integrating anchors from multiple modalities. An example with a mobile robot tracking a person demonstrates the performance of our approach.
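The composite-anchor idea, with individual anchors linking percepts to the symbols legs and face and a fused person anchor on top, can be pictured with a small data-structure sketch (class and field names are hypothetical, not the authors' implementation):

```python
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class Anchor:
    """Links a symbol (e.g. 'legs', 'face') to its latest percept."""
    symbol: str
    percept: Optional[dict] = None   # e.g. {"pos": (x, y), "t": timestamp}

    def grounded(self) -> bool:
        return self.percept is not None

@dataclass
class CompositeAnchor:
    """Fuses component anchors from different modalities into one anchor."""
    symbol: str
    components: list = field(default_factory=list)

    def grounded(self) -> bool:
        # The person is tracked if at least one modality currently perceives it.
        return any(a.grounded() for a in self.components)

    def position(self):
        # Average the positions of all grounded component anchors.
        pts = [a.percept["pos"] for a in self.components if a.grounded()]
        if not pts:
            return None
        n = len(pts)
        return (sum(p[0] for p in pts) / n, sum(p[1] for p in pts) / n)

legs = Anchor("legs", {"pos": (1.0, 2.0), "t": 0.10})
face = Anchor("face", {"pos": (1.2, 2.2), "t": 0.12})
person = CompositeAnchor("person", [legs, face])
```

Because the person anchor stays grounded as long as any component anchor is, a modality with slower or intermittent detections (e.g. face detection) does not break the track.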


Intelligent Robots and Systems | 2002

Multi-modal human-machine communication for instructing robot grasping tasks

Patrick C. McGuire; Jannik Fritsch; Jochen J. Steil; Frank Röthling; Gernot A. Fink; Sven Wachsmuth; Gerhard Sagerer; Helge Ritter

A major challenge for the realization of intelligent robots is to supply them with cognitive abilities in order to allow ordinary users to program them easily and intuitively. One approach to such programming is teaching work tasks by interactive demonstration. To make this effective and convenient for the user, the machine must be capable of establishing a common focus of attention and be able to use and integrate spoken instructions, visual perception, and non-verbal cues like gestural commands. We report progress in building a hybrid architecture that combines statistical methods, neural networks, and finite state machines into an integrated system for instructing grasping tasks by man-machine interaction. The system combines the GRAVIS-robot for visual attention and gestural instruction with an intelligent interface for speech recognition and linguistic interpretation, and a modality fusion module to allow multi-modal task-oriented man-machine communication with respect to dextrous robot manipulation of objects.


Robot and Human Interactive Communication | 2002

Improving adaptive skin color segmentation by incorporating results from face detection

Jannik Fritsch; Sebastian Lang; A. Kleinehagenbrock; Gernot A. Fink; Gerhard Sagerer

The visual tracking of human faces is a basic functionality needed for human-machine interfaces. This paper describes an approach that explores the combined use of adaptive skin color segmentation and face detection for improved face tracking on a mobile robot. To cope with inhomogeneous lighting within a single image, the color of each tracked image region is modeled with an individual, unimodal Gaussian. Face detection is performed locally on all segmented skin-colored regions. If a face is detected, the appropriate color model is updated with the image pixels in an elliptical area around the face position. Updating is restricted to pixels that are contained in a global skin color distribution obtained off-line. The presented method allows us to track faces that undergo changes in lighting conditions while at the same time providing information about the attention of the user, i.e. whether the user looks at the robot. This forms the basis for developing more sophisticated human-machine interfaces capable of dealing with unrestricted environments.
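A minimal numpy sketch of the update scheme described above, i.e. a per-region unimodal Gaussian refreshed only with face-area pixels accepted by a global skin model (the color values and gating threshold are invented for illustration):

```python
import numpy as np

def mahalanobis_sq(pixels, mean, cov):
    """Squared Mahalanobis distance of each pixel (one per row) to a model."""
    d = pixels - mean
    inv = np.linalg.inv(cov)
    return np.einsum('ij,jk,ik->i', d, inv, d)

def update_region_model(mean, cov, face_pixels, global_mean, global_cov,
                        gate=9.0, alpha=0.5):
    """Update a per-region unimodal Gaussian with pixels from the detected
    face area, keeping only pixels inside the global skin distribution
    (gate on squared Mahalanobis distance). alpha blends old and new stats."""
    ok = mahalanobis_sq(face_pixels, global_mean, global_cov) < gate
    accepted = face_pixels[ok]
    if len(accepted) < 2:
        return mean, cov          # too few skin pixels: keep the old model
    new_mean = (1 - alpha) * mean + alpha * accepted.mean(axis=0)
    new_cov = (1 - alpha) * cov + alpha * np.cov(accepted, rowvar=False)
    return new_mean, new_cov

# Invented global skin model and face-area pixels (last one is an outlier).
gm = np.array([150., 110., 100.])
gc = np.eye(3) * 100.0
face_px = np.array([[152., 111., 99.], [149., 108., 102.], [0., 0., 255.]])
m, c = update_region_model(np.array([140., 100., 90.]), np.eye(3) * 50.0,
                           face_px, gm, gc)
```

The gate rejects the blue outlier pixel, so the region model drifts toward the accepted skin pixels without being corrupted by background colors.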


International Conference on Document Analysis and Recognition | 2013

Bag-of-Features HMMs for Segmentation-Free Word Spotting in Handwritten Documents

Leonard Rothacker; Marçal Rusiñol; Gernot A. Fink

Recent HMM-based approaches to handwritten word spotting require large numbers of training samples and mostly rely on a prior segmentation of the document. We propose to use Bag-of-Features HMMs, which can be estimated from a single sample, in a patch-based segmentation-free framework. Bag-of-Features HMMs use statistics of local image feature representatives. Therefore, they can be considered a variant of discrete HMMs that allows us to model the observation of a number of features at a point in time. The discrete nature enables us to estimate a query model with only a single example of the query provided by the user. This makes our method very flexible with respect to the availability of training data. Furthermore, we are able to outperform state-of-the-art results on the George Washington dataset.
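The core mechanism, quantizing local image descriptors against a learned codebook so that each patch yields a discrete observation symbol for the HMM, can be sketched in a few lines of numpy (a toy k-means stand-in, not the paper's implementation):

```python
import numpy as np

def build_codebook(descriptors, k, iters=10):
    """Tiny k-means (deterministic init from the first k descriptors):
    clusters local image descriptors into k representatives."""
    centers = descriptors[:k].astype(float).copy()
    for _ in range(iters):
        # assign each descriptor to its nearest center, then recompute means
        dists = np.linalg.norm(descriptors[:, None] - centers[None], axis=2)
        labels = dists.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = descriptors[labels == j].mean(axis=0)
    return centers

def quantize(descriptors, centers):
    """Map each descriptor to its nearest codebook index: these indices are
    the discrete observation symbols fed to the HMM."""
    dists = np.linalg.norm(descriptors[:, None] - centers[None], axis=2)
    return dists.argmin(axis=1)

# Toy local descriptors forming two clusters, then a quantized observation
# sequence for two query patches.
patches = np.array([[0., 0.], [10., 10.], [0.2, 0.1],
                    [9.8, 10.2], [0.1, 0.2], [10.1, 9.9]])
codebook = build_codebook(patches, k=2)
obs = quantize(np.array([[0.1, 0.1], [9.9, 9.9]]), codebook)
```

Because the symbols are discrete, the emission distribution of each HMM state is just a categorical over codebook indices, which is what makes estimation from a single query example feasible.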


Computer Analysis of Images and Patterns | 2009

Face Detection Using GPU-Based Convolutional Neural Networks

Fabian Nasse; Christian Thurau; Gernot A. Fink

In this paper, we consider the problem of face detection under pose variations. Unlike other contributions, the focus of this work is an efficient implementation utilizing the computational power of modern graphics cards. The proposed system consists of a parallelized implementation of convolutional neural networks (CNNs) with a special emphasis on also parallelizing the detection process. Experimental validation in a smart conference room with four active ceiling-mounted cameras shows a dramatic speed gain under real-life conditions.


International Conference on Acoustics, Speech, and Signal Processing | 2014

A Bag-of-Features approach to acoustic event detection

Axel Plinge; Rene Grzeszick; Gernot A. Fink

The classification of acoustic events in indoor environments is an important task for many practical applications in smart environments. In this paper a novel approach to classifying acoustic events based on a Bag-of-Features representation is proposed. Mel and gammatone frequency cepstral coefficients, which originate from psychoacoustic models, are used as input features for the Bag-of-Features representation. Rather than using a prior classification or segmentation step to eliminate silence and background noise, Bag-of-Features representations are learned for a background class. Supervised learning of codebooks and temporal coding are shown to improve the recognition rates. Three different databases are used for the experiments: the CLEAR sound event dataset, the D-CASE event dataset and a new set of smart room recordings.
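A Bag-of-Features representation of this kind pools quantized frame features into an order-free histogram. The numpy sketch below, with an invented two-entry codebook and hypothetical class prototypes, shows the pooling step and a simple nearest-prototype classification (the paper's actual classifier may differ):

```python
import numpy as np

def bag_of_features(frames, codebook):
    """Quantize each feature frame to its nearest codebook entry and pool
    the indices into an L1-normalized histogram (frame order is discarded)."""
    dists = np.linalg.norm(frames[:, None] - codebook[None], axis=2)
    idx = dists.argmin(axis=1)
    hist = np.bincount(idx, minlength=len(codebook)).astype(float)
    return hist / hist.sum()

def classify(hist, prototypes):
    """Nearest-prototype classification of an event by histogram distance."""
    names = list(prototypes)
    dists = [np.linalg.norm(hist - prototypes[n]) for n in names]
    return names[int(np.argmin(dists))]

# Invented codebook, feature frames, and per-class prototype histograms.
codebook = np.array([[0., 0.], [1., 1.]])
frames = np.array([[0., 0.1], [0.1, 0.], [0.9, 1.1]])
hist = bag_of_features(frames, codebook)
prototypes = {"speech": np.array([0.7, 0.3]),
              "background": np.array([0.1, 0.9])}
label = classify(hist, prototypes)
```

Learning a dedicated histogram for a background class, as the abstract describes, then amounts to adding one more prototype rather than a separate silence-removal stage.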


2011 Joint Workshop on Hands-free Speech Communication and Microphone Arrays | 2011

Towards acoustic self-localization of ad hoc smartphone arrays

Marius H. Hennecke; Gernot A. Fink

The advent of the smartphone in recent years opened new possibilities for the concept of ubiquitous computing. We propose to use multiple smartphones spontaneously assembled into an ad hoc microphone array as part of a teleconferencing system. The unknown spatial positions, the asynchronous sampling, and the unknown time offsets between the clocks of smartphones in the ad hoc array are the main problems for such an application, as well as for almost all other acoustic signal processing algorithms. A maximum likelihood approach using time-of-arrival measurements of short calibration pulses is proposed to solve this self-localization problem. The global orientation of each phone, obtained by means of the now-common built-in geomagnetic compasses, in combination with the constant microphone-loudspeaker distance, leads to a nonlinear optimization problem with a reduced dimensionality in contrast to former methods. The applicability of the proposed self-localization is shown in simulation and via recordings in a typical reverberant and noisy conference room.
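The time-of-arrival estimation can be illustrated with a simplified Gauss-Newton sketch that assumes synchronized clocks and known emission times; the unknown clock offsets the paper actually estimates are omitted, so `locate` below is an illustrative reduction, not the authors' method:

```python
import numpy as np

SPEED_OF_SOUND = 343.0  # m/s, approximate at room temperature

def locate(sources, toas, iters=20):
    """Gauss-Newton least squares for a device position from times of
    arrival of calibration pulses emitted at known source positions."""
    sources = np.asarray(sources, float)
    ranges = SPEED_OF_SOUND * np.asarray(toas, float)
    x = sources.mean(axis=0)              # start at the source centroid
    for _ in range(iters):
        diff = x - sources                # (n, dim)
        dist = np.linalg.norm(diff, axis=1)
        J = diff / dist[:, None]          # Jacobian of |x - s_i| w.r.t. x
        res = ranges - dist               # range residuals
        step, *_ = np.linalg.lstsq(J, res, rcond=None)
        x = x + step
    return x

# Four loudspeakers at known positions; recover the device's position
# from noise-free times of arrival.
spk = np.array([[0., 0.], [4., 0.], [0., 4.], [4., 4.]])
toas = np.linalg.norm(spk - np.array([1., 2.]), axis=1) / SPEED_OF_SOUND
est = locate(spk, toas)
```

The full problem in the paper additionally carries the per-device time offsets as unknowns in the optimization, which is what makes the reduced-dimensionality formulation via compass orientations attractive.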

Collaboration


An overview of Gernot A. Fink's collaborations.

Top Co-Authors

Thomas Plötz

Georgia Institute of Technology


Rene Grzeszick

Technical University of Dortmund


Leonard Rothacker

Technical University of Dortmund


Sebastian Sudholt

Technical University of Dortmund


Szilárd Vajda

National Institutes of Health


Axel Plinge

Technical University of Dortmund


Jan Richarz

Technical University of Dortmund
