Philippe Dreuw
Bosch
Publication
Featured research published by Philippe Dreuw.
British Machine Vision Conference | 2009
Philippe Dreuw; Pascal Steingrube; Harald Hanselmann; Hermann Ney
We analyze the usage of Speeded Up Robust Features (SURF) as local descriptors for face recognition. The effect of different feature extraction and viewpoint-consistency-constrained matching approaches is analyzed. Furthermore, a RANSAC-based outlier removal for system combination is proposed. The proposed approach allows faces to be matched under partial occlusions, even if they are not perfectly aligned or illuminated. Current approaches are sensitive to registration errors and usually rely on a very good initial alignment and illumination of the faces to be recognized. Because interest-point-based feature extraction approaches for face recognition often fail, a grid-based, dense extraction of local features is proposed in combination with a block-based matching that accounts for different viewpoint constraints. The proposed SURF descriptors are compared to SIFT descriptors. Experimental results on the AR-Face and CMU-PIE databases using manually aligned faces, unaligned faces, and partially occluded faces show that the proposed approach is robust and can outperform current generic approaches.
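As an illustration of RANSAC-style outlier removal on descriptor matches, the following is a minimal sketch rather than the authors' implementation. It assumes a pure-translation model between the two face images, which the abstract does not specify; the function name `ransac_translation_inliers` and the parameters `tol`, `iterations`, and `seed` are hypothetical.

```python
import random


def ransac_translation_inliers(matches, tol=1.0, iterations=50, seed=0):
    """Return indices of matches consistent with one dominant translation.

    matches: list of ((x1, y1), (x2, y2)) keypoint correspondences.
    Assumption: a translation-only model, chosen here for brevity.
    """
    if not matches:
        return []
    rng = random.Random(seed)
    best = []
    for _ in range(iterations):
        # A single correspondence is the minimal sample for a translation.
        (x1, y1), (x2, y2) = rng.choice(matches)
        dx, dy = x2 - x1, y2 - y1
        # Count every match whose displacement agrees with the hypothesis.
        inliers = [
            i
            for i, ((ax, ay), (bx, by)) in enumerate(matches)
            if abs((bx - ax) - dx) <= tol and abs((by - ay) - dy) <= tol
        ]
        if len(inliers) > len(best):
            best = inliers
    return best
```

Matches outside the returned index list would be discarded as outliers before the remaining correspondences are scored.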
International Conference on Automatic Face and Gesture Recognition | 2006
Philippe Dreuw; Thomas Deselaers; David Rybach; Daniel Keysers; Hermann Ney
We present a novel tracking algorithm that uses dynamic programming to determine the path of target objects and that is able to track an arbitrary number of different objects. The traceback method used to track the targets avoids taking possibly wrong local decisions and thus reconstructs the best tracking paths using the whole observation sequence. The tracking method can be compared to the nonlinear time alignment in automatic speech recognition (ASR), and it can analogously be integrated into a hidden Markov model-based recognition process. We show how the method can be applied to the tracking of hands and the face for automatic sign language recognition.
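The idea of deferring decisions until the whole sequence has been seen can be sketched as a Viterbi-style recursion with traceback. This is a simplified illustration, not the paper's algorithm: it assumes a single target, one-dimensional candidate positions, and a linear jump penalty, and the names `track_path`, `scores`, and `jump_penalty` are hypothetical.

```python
def track_path(scores, jump_penalty=1.0):
    """Find the best position sequence over all frames.

    scores[t][x]: local score of candidate position x in frame t
    (higher is better). Moving from p to x costs jump_penalty * |x - p|.
    """
    n_pos = len(scores[0])
    best = [list(scores[0])]  # best[t][x]: best accumulated score ending at x
    back = []                 # back[t-1][x]: best predecessor of x in frame t

    for t in range(1, len(scores)):
        row, ptr = [], []
        for x in range(n_pos):
            prev = max(
                range(n_pos),
                key=lambda p: best[-1][p] - jump_penalty * abs(x - p),
            )
            row.append(best[-1][prev] - jump_penalty * abs(x - prev) + scores[t][x])
            ptr.append(prev)
        best.append(row)
        back.append(ptr)

    # Traceback: start from the best final position and follow the pointers.
    x = max(range(n_pos), key=lambda p: best[-1][p])
    path = [x]
    for ptr in reversed(back):
        x = ptr[x]
        path.append(x)
    path.reverse()
    return path
```

With a sufficiently large `jump_penalty`, a single frame in which a distractor scores well does not pull the path away from the target, because the decision is made only after all frames have been scored.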
Conference on Image and Video Retrieval | 2009
Thomas Deselaers; Tobias Gass; Philippe Dreuw; Hermann Ney
In this paper we present a method to jointly optimise the relevance and the diversity of the results in image retrieval. Without considering diversity, image retrieval systems often find a set of very similar results, so-called near duplicates, which is often not the desired behaviour. From the user perspective, the ideal result consists of documents which are not only relevant but also diverse. Most approaches addressing diversity in image or information retrieval use a two-step approach: in a first step, a set of potentially relevant images is determined, and in a second step these images are reranked to be diverse among the first positions. In contrast to these approaches, our method addresses the problem directly and jointly optimises the diversity and the relevance of the images in the retrieval ranking using techniques inspired by dynamic programming algorithms. We quantitatively evaluate our method on the ImageCLEF 2008 photo retrieval data and obtain results which outperform the state of the art. Additionally, we perform a qualitative evaluation on a new product search task and observe that the diverse results are more attractive to an average user.
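To make the relevance/diversity trade-off concrete, here is a small greedy sketch. It is a simple stand-in and not the dynamic-programming-inspired method of the paper: each step picks the item that maximises a weighted sum of its relevance and its distance to the closest item already selected. The function name, the `alpha` weight, and the assumption that distances lie in [0, 1] are all hypothetical.

```python
def rank_relevant_and_diverse(relevance, distance, k, alpha=0.5):
    """Greedily build a ranking of k items.

    relevance[i]: relevance score of item i.
    distance[i][j]: dissimilarity between items i and j, assumed in [0, 1].
    alpha: 1.0 ranks by relevance only; lower values reward diversity.
    """
    selected = []
    remaining = list(range(len(relevance)))
    while remaining and len(selected) < k:

        def gain(i):
            # Novelty is the distance to the nearest already-selected item.
            novelty = min((distance[i][j] for j in selected), default=1.0)
            return alpha * relevance[i] + (1 - alpha) * novelty

        best = max(remaining, key=gain)
        selected.append(best)
        remaining.remove(best)
    return selected
```

With two near-duplicate items of high relevance and a third, dissimilar item, a moderate `alpha` places the dissimilar item ahead of the second duplicate.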
Computer Vision and Pattern Recognition | 2008
Thomas Deselaers; Philippe Dreuw; Hermann Ney
We present a method to fully automatically fit videos in 16:9 format on 4:3 screens and vice versa. It can be applied to arbitrary aspect ratios and can be used to make videos suitable for mobile viewing devices with small and possibly uncommonly sized displays. The cropping sequence is optimised over time to create smooth transitions and thus leads to an excellent viewing experience. Current televisions have simple and often disturbing methods which either show the centre region of the image, distort the image, or pad it with black borders. The technique presented here can fully automatically find the "right" viewing area for each image in a video sequence. It works in real-time with only very little time-shift. We employ different low-level features and a log-linear model to learn how to find the right area. The method is able to automatically decide whether padding with black borders is necessary or whether all relevant image areas fit on screen by cropping the image. Evaluation is done on ten videos from five different types of content and the baseline methods are clearly outperformed.
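A log-linear model scores each candidate by a weighted sum of its features and normalises the exponentiated scores. The sketch below shows only this generic form; the candidate names, feature values, and weights are hypothetical, and the abstract does not state which low-level features are actually used.

```python
import math


def log_linear_posterior(features, weights):
    """Return p(candidate) proportional to exp(sum_i weights[i] * f_i).

    features: dict mapping a candidate crop window to its feature values.
    weights: one weight per feature (learned in practice, fixed here).
    """
    scores = {
        cand: sum(w * f for w, f in zip(weights, feats))
        for cand, feats in features.items()
    }
    # Subtract the maximum score for numerical stability before exp().
    top = max(scores.values())
    unnorm = {cand: math.exp(s - top) for cand, s in scores.items()}
    z = sum(unnorm.values())
    return {cand: u / z for cand, u in unnorm.items()}
```

The candidate with the highest posterior would be the preferred viewing area for a single frame; smoothing the choice over time is a separate step.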
International Conference on Pattern Recognition | 2008
Philippe Dreuw; Stephan M. Jonas; Hermann Ney
We propose to explicitly model white-spaces for Arabic handwriting recognition within different writing variants. Position-dependent character shapes in Arabic handwriting allow for large white-spaces between characters even within words. Here, a separate character model for white-spaces in combination with a lexicon using different writing variants and character model length adaptation is proposed. Current handwriting recognition systems either model the white-spaces implicitly within the character models, which can degrade the models, or try to explicitly segment the Arabic words into pieces, which is prone to segmentation errors. Several white-space modeling approaches are analyzed on the well-known IFN/ENIT database and outperform the best reported error rates.
International Conference on Document Analysis and Recognition | 2009
Philippe Dreuw; David Rybach; Christian Gollan; Hermann Ney
We present a writer adaptive training and writer clustering approach for an HMM-based Arabic handwriting recognition system to handle different handwriting styles and their variations. Additionally, a writing variant model refinement for specific writing variants is proposed. Current approaches try to compensate for the impact of different writing styles during preprocessing and normalization steps. Writer adaptive training with a CMLLR-based feature adaptation is used to train writer-dependent models. An unsupervised writer clustering with a Bayesian information criterion based stopping condition for a CMLLR-based feature adaptation during a two-pass decoding process is used to cluster different handwriting styles of unknown test writers. The proposed methods are evaluated on the IFN/ENIT Arabic handwriting database.
International Conference on Image Processing | 2011
Philippe Dreuw; Patrick Doetsch; Christian Plahl; Hermann Ney
We use neural network based features extracted by a hierarchical multilayer-perceptron (MLP) network either in a hybrid MLP/HMM approach or to discriminatively retrain a Gaussian hidden Markov model (GHMM) system in a tandem approach. MLP networks have been successfully used to model long-term and non-linear feature dependencies in automatic speech and optical character recognition. In offline handwriting recognition, MLPs have been mostly used for isolated character and word recognition in hybrid approaches. Here we analyze MLPs within an LVCSR framework for continuous handwriting recognition using discriminative MMI/MPE training. In particular, hybrid MLP/HMM and discriminatively retrained MLP-GHMM tandem approaches are evaluated. Significant improvements and competitive results are reported for a closed-vocabulary task on the IfN/ENIT Arabic handwriting database and for a large-vocabulary task using the IAM English handwriting database.
Gesture-Based Human-Computer Interaction and Simulation | 2009
Philippe Dreuw; Daniel Stein; Hermann Ney
In automatic sign language translation, one of the main problems is the usage of spatial information in sign language and its proper representation and translation, e.g. the handling of spatial reference points in the signing space. Such locations are encoded at static points in signing space as spatial references for motion events. We present a new approach starting from a large vocabulary speech recognition system which is able to recognize sentences of continuous sign language speaker independently. The manual features obtained from the tracking are passed to the statistical machine translation system to improve its accuracy. On a publicly available benchmark database, we achieve a competitive recognition performance and can similarly improve the translation performance by integrating the tracking features.
International Conference on Document Analysis and Recognition | 2009
Philippe Dreuw; Georg Heigold; Hermann Ney
We present a novel confidence-based discriminative training approach for model adaptation of an HMM-based Arabic handwriting recognition system to handle different handwriting styles and their variations. Most current approaches are maximum-likelihood trained HMM systems and try to adapt their models to different writing styles using writer adaptive training, unsupervised clustering, or additional writer-specific data. Discriminative training based on the Maximum Mutual Information criterion is used to train writer-independent handwriting models. For model adaptation during decoding, an unsupervised confidence-based discriminative training on a word and frame level within a two-pass decoding process is proposed. Additionally, the training criterion is extended to incorporate a margin term. The proposed methods are evaluated on the IFN/ENIT Arabic handwriting database, where the proposed novel adaptation approach can decrease the word error rate by 33% relative.
International Journal on Document Analysis and Recognition | 2011
Philippe Dreuw; Georg Heigold; Hermann Ney
We present a novel confidence- and margin-based discriminative training approach for model adaptation of a hidden Markov model (HMM)-based handwriting recognition system to handle different handwriting styles and their variations. Most current approaches are maximum-likelihood (ML) trained HMM systems and try to adapt their models to different writing styles using writer adaptive training, unsupervised clustering, or additional writer-specific data. Here, discriminative training based on the maximum mutual information (MMI) and minimum phone error (MPE) criteria is used to train writer-independent handwriting models. For model adaptation during decoding, an unsupervised confidence-based discriminative training on a word and frame level within a two-pass decoding process is proposed. The proposed methods are evaluated for closed-vocabulary isolated handwritten word recognition on the IFN/ENIT Arabic handwriting database, where the word error rate is decreased by 33% relative compared to an ML-trained baseline system. On the large-vocabulary line recognition task of the IAM English handwriting database, the word error rate is decreased by 25% relative.
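A relative reduction is measured against the baseline error rate, not in absolute percentage points. The short sketch below shows the arithmetic; the function name and the error rates in the usage example are hypothetical and are not figures from the paper.

```python
def relative_reduction(baseline_wer, adapted_wer):
    """Relative word-error-rate reduction as a fraction of the baseline."""
    return (baseline_wer - adapted_wer) / baseline_wer


# Hypothetical numbers: a drop from 12.0% to 8.0% WER is 4 points
# absolute, which is one third of the baseline.
example = relative_reduction(12.0, 8.0)
```

Under this definition, the same absolute improvement counts as a larger relative reduction when the baseline error rate is lower.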