Jürgen Gall
ETH Zurich
Publications
Featured research published by Jürgen Gall.
Computer Vision and Pattern Recognition | 2010
Angela Yao; Jürgen Gall; Luc Van Gool
We present a method to classify and localize human actions in video using a Hough transform voting framework. Random trees are trained to learn a mapping between densely sampled feature patches and their corresponding votes in a spatio-temporal-action Hough space. The leaves of the trees form a discriminative multi-class codebook that shares features between the action classes and votes for action centers in a probabilistic manner. Using low-level features such as gradients and optical flow, we demonstrate that Hough voting can achieve state-of-the-art performance on several datasets covering a wide range of action-recognition scenarios.
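A minimal sketch of the voting step, assuming a trained codebook object whose `lookup` method returns (class, displacement, weight) votes for a patch descriptor; the names are illustrative, not the authors' API:

```python
import numpy as np

def hough_vote(patches, codebook, volume_shape, n_classes):
    """Accumulate probabilistic votes in a spatio-temporal Hough
    space, one (x, y, t) volume per action class.  `codebook.lookup`
    stands in for the mapping learned by the random trees and is
    assumed to return (class_id, displacement, weight) triples."""
    hough = np.zeros((n_classes,) + tuple(volume_shape))
    for pos, descriptor in patches:          # pos = (x, y, t) of a patch
        for class_id, disp, weight in codebook.lookup(descriptor):
            center = tuple(int(p + d) for p, d in zip(pos, disp))
            if all(0 <= c < s for c, s in zip(center, volume_shape)):
                hough[class_id][center] += weight
    return hough  # local maxima give action class, location, and time
```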
European Conference on Computer Vision | 2012
Luca Ballan; Aparna Taneja; Jürgen Gall; Luc Van Gool; Marc Pollefeys
Capturing the motion of two hands interacting with an object is a very challenging task due to the large number of degrees of freedom, self-occlusions, and the similarity between the fingers, even when multiple cameras observe the scene. In this paper, we propose to use discriminatively learned salient points on the fingers and to estimate the associations between fingers and salient points simultaneously with the hand pose. We introduce a differentiable objective function that also takes edges, optical flow, and collisions into account. Our qualitative and quantitative evaluations show that the proposed approach achieves very accurate results for several challenging sequences containing hands and objects in action.
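A minimal sketch of such a composite objective and a naive finite-difference minimizer, under the assumption that the individual cost terms are given as callables; this is an illustration, not the authors' implementation:

```python
import numpy as np

def objective(pose, terms, weights):
    """Weighted sum of differentiable cost terms (salient-point
    associations, edges, optical flow, collisions).  The individual
    term functions are assumed to be given."""
    return sum(weights[name] * term(pose) for name, term in terms.items())

def minimize(pose, terms, weights, lr=1e-3, steps=200, eps=1e-5):
    """Plain gradient descent with finite-difference gradients; a
    stand-in for whatever optimizer is actually used."""
    pose = np.asarray(pose, dtype=float).copy()
    for _ in range(steps):
        grad = np.zeros_like(pose)
        for i in range(pose.size):
            e = np.zeros_like(pose)
            e[i] = eps
            grad[i] = (objective(pose + e, terms, weights) -
                       objective(pose - e, terms, weights)) / (2 * eps)
        pose -= lr * grad
    return pose
```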
IEEE Transactions on Multimedia | 2010
Gabriele Fanelli; Jürgen Gall; Harald Romsdorfer; Thibaut Weise; Luc Van Gool
Communication between humans relies deeply on the ability to express and recognize feelings. For this reason, research on human-machine interaction needs to focus on the recognition and simulation of emotional states, a prerequisite of which is the collection of affective corpora. Currently available datasets remain a bottleneck because of the difficulties that arise during the acquisition and labeling of affective data. In this work, we present a new audio-visual corpus covering what are possibly the two most important modalities humans use to communicate their emotional states: speech and facial expression, the latter in the form of dense dynamic 3-D face geometries. We acquire high-quality data by working in a controlled environment and resort to video clips to induce affective states. The annotation of the speech signal includes transcription of the corpus text into a phonological representation, accurate phone segmentation, fundamental frequency extraction, and signal intensity estimation. We employ a real-time 3-D scanner to acquire dense dynamic facial geometries and track the faces throughout the sequences, achieving full spatial and temporal correspondences. The corpus is a valuable tool for applications like affective visual speech synthesis or view-independent facial expression recognition.
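As a rough illustration of one annotation step, a toy autocorrelation-based fundamental-frequency estimator; the corpus's actual extraction tool is not specified here, and `fmin`/`fmax` are assumed search bounds:

```python
import numpy as np

def estimate_f0(frame, sr, fmin=75.0, fmax=400.0):
    """Toy per-frame fundamental-frequency estimate via
    autocorrelation (the frame is assumed longer than sr / fmin
    samples).  It only illustrates the kind of annotation the
    corpus provides."""
    frame = np.asarray(frame, dtype=float)
    frame = frame - frame.mean()
    ac = np.correlate(frame, frame, mode="full")[len(frame) - 1:]
    lo, hi = int(sr / fmax), int(sr / fmin)
    lag = lo + int(np.argmax(ac[lo:hi]))
    return sr / lag if ac[lag] > 0 else 0.0  # 0.0 marks an unvoiced frame
```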
Computer Vision and Pattern Recognition | 2010
Henning Hamer; Jürgen Gall; Thibaut Weise; Luc Van Gool
In this paper, we propose a prior for hand pose estimation that integrates the direct relation between a manipulating hand and a 3D object. This is of particular interest for a variety of applications, since many tasks performed by humans require hand-object interaction. Inspired by the ability of humans to learn the handling of an object from a single example, our focus lies on very sparse training data. We express estimated hand poses in local object coordinates and extract, for each individual hand segment, the relative position and orientation as well as the contact points on the object. The prior is then modeled as a spatial distribution conditioned on the object. Given a new object of the same object class and new hand dimensions, we can transfer the prior by a procedure involving a geometric warp. In our experiments, we demonstrate that the prior can be used to improve the robustness of a 3D hand tracker and to synthesize a new hand grasping a new object. For this, we integrate the prior into a unified belief propagation framework for tracking and synthesis.
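A minimal sketch of the coordinate change that underlies such a prior, assuming the object's world pose is given as a rotation `R_obj` and translation `t_obj`; names and conventions are illustrative:

```python
import numpy as np

def to_object_coords(point_world, R_obj, t_obj):
    """Express a 3D point (e.g. a hand-segment position or contact
    point) in the local frame of an object whose world pose is the
    rigid transform (R_obj, t_obj).  Storing the prior in these
    coordinates is what makes it transferable to a new object of
    the same class after a geometric warp."""
    return R_obj.T @ (np.asarray(point_world, float) - np.asarray(t_obj, float))
```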
British Machine Vision Conference | 2009
Gabriele Fanelli; Jürgen Gall; Luc Van Gool
We present a novel method for mouth localization in the context of multimodal speech recognition, where audio and visual cues are fused to improve the speech recognition accuracy. While facial feature points like mouth corners or lip contours are commonly used to estimate at least the scale, position, and orientation of the mouth, we propose a method based on the Hough transform. Instead of relying on a predefined sparse subset of mouth features, it casts probabilistic votes for the mouth center from several patches in the neighborhood and accumulates the votes in a Hough image. This makes the localization more robust, as it does not rely on the detection of a single feature. In addition, we exploit the different shape properties of eyes and mouth to localize the mouth more efficiently. Using the rotation-invariant representation of the iris, scale and orientation can be efficiently inferred from the localized eye positions. The superior accuracy of our method and quantitative improvements in audio-visual speech recognition over monomodal approaches are demonstrated on two datasets.
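A minimal sketch of how scale and orientation might be inferred from the two localized eye centers; `ref_dist` is an assumed training-time inter-ocular distance, not a value from the paper:

```python
import numpy as np

def scale_and_orientation_from_eyes(left_eye, right_eye, ref_dist=100.0):
    """Infer in-plane scale and orientation of the face from the two
    localized eye centers, so that votes for the mouth center can be
    cast at the right scale and rotation.  `ref_dist` (pixels) is an
    illustrative constant."""
    d = np.asarray(right_eye, float) - np.asarray(left_eye, float)
    scale = np.linalg.norm(d) / ref_dist
    angle = np.arctan2(d[1], d[0])  # rotation of the inter-ocular axis
    return scale, angle
```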
Workshop on Applications of Computer Vision | 2009
Mohammed Shaheen; Jürgen Gall; Robert Strzodka; Luc Van Gool; Hans-Peter Seidel
This work addresses the problem of tracking humans with skeleton-based shape models when video footage is acquired by multiple cameras. Since the shape deformations are parameterized by the skeleton, the position, orientation, and configuration of the human skeleton are estimated such that the deformed shape model best explains the image data. Several algorithms have been proposed for this problem over the last few years, usually relying on filtering, local optimization, or global optimization. The global optimization algorithms can be further divided into single-hypothesis optimization (SHO) and multiple-hypothesis optimization (MHO). We briefly compare the underlying mathematical models and evaluate the performance of one representative algorithm for each class. Furthermore, we compare several likelihoods and parameter settings with respect to accuracy and computational cost. A thorough evaluation is performed on two sequences with uncontrolled lighting conditions and a non-static background. In addition, we demonstrate the impact of the likelihood on the HumanEva benchmark. Our results provide guidance on algorithm design for different applications related to human motion capture.
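A schematic contrast of the two global-optimization classes, with `optimize` and `likelihood` standing in for a concrete optimizer and image likelihood; this only illustrates the SHO/MHO distinction, not any of the evaluated algorithms:

```python
def sho(init_pose, likelihood, optimize):
    """Single-hypothesis optimization: refine one pose per frame."""
    return optimize(init_pose, likelihood)

def mho(hypotheses, likelihood, optimize):
    """Multiple-hypothesis optimization: refine several candidate
    poses and keep the one the image data explains best.  Both
    functions are schematic sketches."""
    refined = [optimize(h, likelihood) for h in hypotheses]
    return max(refined, key=likelihood)
```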
Untitled Event | 2009
Jürgen Gall; Victor S. Lempitsky
Untitled Event | 2009
Jürgen Gall; Carsten Stoll; Edilson de Aguiar; Christian Theobalt; Bodo Rosenhahn; Hans-Peter Seidel
Untitled Event | 2006
Jürgen Gall; Bodo Rosenhahn; Thomas Brox; Hans-Peter Seidel
International Conference on Computer Vision | 2012
Stefano Pellegrini; Jürgen Gall; Leonid Sigal; Luc Van Gool