Network


Latest external collaborations at the country level. Dive into details by clicking on the dots.

Hotspots


Dive into the research topics where Martin Böhme is active.

Publications


Featured research published by Martin Böhme.


International Symposium on Signals, Circuits and Systems | 2007

Geometric Invariants for Facial Feature Tracking with 3D TOF Cameras

Martin Haker; Martin Böhme; Thomas Martinetz; Erhardt Barth

This paper presents a very simple feature-based nose detector in combined range and amplitude data obtained by a 3D time-of-flight camera. The robust localization of image attributes, such as the nose, can be used for accurate object tracking. We use geometric features that are related to the intrinsic dimensionality of surfaces. To find a nose in the image, the features are computed per pixel; pixels whose feature values lie inside a certain bounding box in feature space are classified as nose pixels, and all other pixels are classified as non-nose pixels. The extent of the bounding box is learned on a labeled training set. Despite its simplicity, this procedure generalizes well; that is, a bounding box determined for one group of subjects accurately detects the noses of other subjects. The performance of the detector is demonstrated by robustly identifying the nose of a person in a wide range of head orientations. An important result is that the combination of both range and amplitude data dramatically improves the accuracy compared to using a single type of data. This is reflected in the equal error rates (EER) obtained on a database of head poses. Using only the range data, we detect noses with an EER of 0.66. Results on the amplitude data are slightly better, with an EER of 0.42. The combination of both types of data yields a substantially improved EER of 0.03.
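
The bounding-box classifier described above is simple enough to sketch. Below is a minimal, hypothetical NumPy version: `features` holds the per-pixel geometric feature vectors and `labels` marks the nose pixels of a labeled training set; both names are assumptions, and the geometric feature computation itself is not shown.

```python
import numpy as np

def fit_bounding_box(features, labels, margin=0.0):
    """Learn an axis-aligned bounding box in feature space from labeled
    training pixels. `features` is (N, D), `labels` is (N,) boolean with
    True for nose pixels. `margin` optionally widens the box."""
    nose = features[labels]
    lo = nose.min(axis=0) - margin
    hi = nose.max(axis=0) + margin
    return lo, hi

def classify(features, lo, hi):
    """A pixel is a nose pixel iff all of its feature values lie inside
    the learned box; everything else is 'non-nose'."""
    return np.all((features >= lo) & (features <= hi), axis=1)
```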


Neurocomputing | 2006

Eye Movement Predictions on Natural Videos

Martin Böhme; Michael Dorr; Christopher Krause; Thomas Martinetz; Erhardt Barth

We analyze the predictability of eye movements of observers viewing dynamic scenes. We first assess the effectiveness of model-based prediction. The model is divided into intersaccade prediction, which is based on a limited history of attended locations, and saccade prediction, which is based on a list of salient locations. The quality of the predictions and of the underlying saliency maps is tested on a large set of eye movement data recorded on high-resolution real-world video sequences. In addition, frequently fixated locations are used to predict individual eye movements to obtain a reference for model-based predictions.
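
As a toy illustration of the saccade-prediction half of such a model (not the authors' actual algorithm), one can take the k most salient locations of a saliency map as candidate targets and score a prediction by the distance from the actual saccade landing point to the nearest candidate. All names here are hypothetical.

```python
import numpy as np

def top_salient_locations(saliency, k=10, suppress=16):
    """Pick the k most salient locations with crude non-maximum
    suppression: after each pick, zero out a square neighbourhood."""
    s = saliency.copy()
    locs = []
    for _ in range(k):
        y, x = np.unravel_index(np.argmax(s), s.shape)
        locs.append((y, x))
        s[max(0, y - suppress):y + suppress,
          max(0, x - suppress):x + suppress] = 0
    return np.array(locs)

def saccade_prediction_error(saliency, landing_point, k=10):
    """Distance (in pixels) from the actual landing point (y, x) to the
    nearest of the k candidate locations."""
    cands = top_salient_locations(saliency, k)
    return np.min(np.linalg.norm(cands - np.asarray(landing_point), axis=1))
```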


Eye Tracking Research & Applications | 2008

A software framework for simulating eye trackers

Martin Böhme; Michael Dorr; Mathis Graw; Thomas Martinetz; Erhardt Barth

We describe an open-source software framework that simulates the measurements made using one or several cameras in a video-oculographic eye tracker. The framework can be used to compare objectively the performance of different eye tracking setups (number and placement of cameras and light sources) and gaze estimation algorithms. We demonstrate the utility of the framework by using it to compare two remote eye tracking methods, one using a single camera, the other using two cameras.
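
The core of any such simulator is a camera model. A minimal sketch of the measurement step might look like the following; the pinhole model and all numbers are illustrative assumptions, not the framework's actual API.

```python
import numpy as np

def project(points_3d, K, R, t):
    """Project 3D points (N, 3) into a camera with intrinsics K and
    pose (R, t); returns pixel coordinates (N, 2). This stands in for
    the simulated 'measurement' a video-oculographic tracker makes of
    features such as the pupil centre or corneal glints."""
    cam = points_3d @ R.T + t          # world -> camera coordinates
    uvw = cam @ K.T                    # apply intrinsics
    return uvw[:, :2] / uvw[:, 2:3]    # perspective divide

# Hypothetical example: one camera, pupil centre 60 cm in front of it.
K = np.array([[800.0, 0, 320], [0, 800.0, 240], [0, 0, 1]])
R, t = np.eye(3), np.zeros(3)
pupil = np.array([[0.01, -0.02, 0.60]])   # metres
print(project(pupil, K, R, t))
```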


Perception and Interactive Technologies | 2006

A single-camera remote eye tracker

André Meyer; Martin Böhme; Thomas Martinetz; Erhardt Barth

Many eye-tracking systems either require the user to keep their head still or involve cameras or other equipment mounted on the user's head. While acceptable for research applications, these limitations make the systems unsatisfactory for prolonged use in interactive applications. Since the goal of our work is to use eye trackers for improved visual communication through gaze guidance [1,2] and for Augmentative and Alternative Communication (AAC) [3], we are interested in less invasive eye tracking techniques.


NeuroImage | 2002

A Robust Transcortical Profile Scanner for Generating 2-D Traverses in Histological Sections of Richly Curved Cortical Courses

Oliver Schmitt; Martin Böhme

Quantitative analysis of the cerebral cortex has become more important since neuroimaging methods have revealed many subfunctions of cortical regions that were thought to be typical for only one specific function. Furthermore, it is often unknown whether a certain area can be subdivided observer-independently into subareas. These questions lead to an analytical problem: how can we analyze the cytoarchitecture of the human cerebral cortex in a quantitative manner in order to confirm classical transition regions between distinct areas and to detect new ones? Scanning the cerebral cortex is difficult because it follows a richly curved course, and sectioning always produces partially nonperpendicularly sectioned regions of the tissue. Therefore, different methods were tested to determine which of them are most reliable with respect to generating perpendicular test lines in the cerebral cortex. We introduce a new technique based on electrical field theory. The results of this technique are compared with those of conventional techniques. It was found that straight traverses generated by the electrodynamic model present significantly smaller intertraversal differences than the conventional approaches.
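
The electrodynamic idea can be sketched as follows: treat the inner and outer cortical boundaries as equipotential surfaces, solve Laplace's equation between them, and trace traverses along the field lines, which cross the laminae perpendicularly. Below is a minimal 2-D NumPy sketch; the boundary masks, grid size, and iteration counts are all assumptions.

```python
import numpy as np

def laplace_potential(mask_inner, mask_outer, shape, iters=5000):
    """Solve Laplace's equation on a grid with Dirichlet boundaries:
    potential 0 on the inner cortical boundary, 1 on the outer one.
    Plain Jacobi iteration; slow but minimal."""
    u = np.full(shape, 0.5)
    u[mask_inner], u[mask_outer] = 0.0, 1.0
    for _ in range(iters):
        u[1:-1, 1:-1] = 0.25 * (u[:-2, 1:-1] + u[2:, 1:-1] +
                                u[1:-1, :-2] + u[1:-1, 2:])
        u[mask_inner], u[mask_outer] = 0.0, 1.0  # re-impose boundaries
    return u

def traverse(u, start, step=0.5, n=400):
    """Follow the potential gradient from `start` (row, col) to trace
    one traverse that crosses the cortex perpendicular to the layers."""
    gy, gx = np.gradient(u)
    p = np.array(start, dtype=float)
    path = [p.copy()]
    for _ in range(n):
        iy, ix = int(p[0]), int(p[1])
        if not (0 <= iy < u.shape[0] and 0 <= ix < u.shape[1]):
            break
        g = np.array([gy[iy, ix], gx[iy, ix]])
        gnorm = np.linalg.norm(g)
        if gnorm < 1e-9:
            break
        p = p + step * g / gnorm
        path.append(p.copy())
    return np.array(path)
```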


Electronic Imaging | 2006

Guiding the mind's eye: improving communication and vision by external control of the scanpath

Erhardt Barth; Michael Dorr; Martin Böhme; Karl R. Gegenfurtner; Thomas Martinetz

Larry Stark has emphasised that what we visually perceive is very much determined by the scanpath, i.e. the pattern of eye movements. Inspired by his view, we have studied the implications of the scanpath for visual communication and came up with the idea to not only sense and analyse eye movements, but also guide them by using a special kind of gaze-contingent information display. Our goal is to integrate gaze into visual communication systems by measuring and guiding eye movements. For guidance, we first predict a set of about 10 salient locations. We then change the probability for one of these candidates to be attended: for one candidate the probability is increased, for the others it is decreased. To increase saliency, for example, we add red dots that are displayed very briefly such that they are hardly perceived consciously. To decrease the probability, for example, we locally reduce the temporal frequency content. Again, if performed in a gaze-contingent fashion with low latencies, these manipulations remain unnoticed. Overall, the goal is to find the real-time video transformation minimising the difference between the actual and the desired scanpath without being obtrusive. Applications are in the area of vision-based communication (better control of what information is conveyed) and augmented vision and learning (guide a persons gaze by the gaze of an expert or a computer-vision system). We believe that our research is very much in the spirit of Larry Starks views on visual perception and the close link between vision research and engineering.
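
To make the two manipulations concrete, here is an illustrative per-frame sketch in NumPy (RGB frames and all names are hypothetical; the real system operates gaze-contingently at low latency, which this toy code does not address):

```python
import numpy as np

def modulate_frame(frame, prev_smoothed, boost_xy, suppress_xy,
                   r=20, alpha=0.7):
    """One video frame of scanpath guidance (illustrative only).
    At `boost_xy` a brief red dot raises saliency; around `suppress_xy`
    the temporal frequency content is reduced by blending the frame
    toward a running temporal average."""
    out = frame.astype(float).copy()
    smoothed = alpha * prev_smoothed + (1 - alpha) * out  # temporal low-pass

    # Suppress: replace a disc around the distractor with low-passed pixels.
    ys, xs = np.ogrid[:frame.shape[0], :frame.shape[1]]
    mask = (ys - suppress_xy[1]) ** 2 + (xs - suppress_xy[0]) ** 2 < r * r
    out[mask] = smoothed[mask]

    # Boost: a small red dot at the candidate location, shown ~1 frame.
    bx, by = boost_xy
    out[by - 2:by + 3, bx - 2:bx + 3] = [255, 0, 0]
    return out.astype(np.uint8), smoothed
```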


Dyn3D '09 Proceedings of the DAGM 2009 Workshop on Dynamic 3D Imaging | 2009

Self-Organizing Maps for Pose Estimation with a Time-of-Flight Camera

Martin Haker; Martin Böhme; Thomas Martinetz; Erhardt Barth

We describe a technique for estimating human pose from an image sequence captured by a time-of-flight camera. The pose estimation is derived from a simple model of the human body that we fit to the data in 3D space. The model is represented by a graph consisting of 44 vertices for the upper torso, head, and arms. The anatomy of these body parts is encoded by the edges, i.e. an arm is represented by a chain of pairwise connected vertices whereas the torso consists of a 2-dimensional grid. The model can easily be extended to the representation of legs by adding further chains of pairwise connected vertices to the lower torso. The model is fit to the data in 3D space by employing an iterative update rule common to self-organizing maps. Despite the simplicity of the model, it captures the human pose robustly and can thus be used for tracking the major body parts, such as arms, hands, and head. The accuracy of the tracking is around 5-6 cm root mean square (RMS) for the head and shoulders and around 2 cm RMS for the head. The implementation of the procedure is straightforward and real-time capable.
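
The iterative update is the classic self-organizing-map step. A minimal sketch under assumed data structures (a (44, 3) vertex array and an edge list encoding the arm chains and torso grid; learning-rate schedule and neighbourhood weight are illustrative choices, not the paper's exact values):

```python
import numpy as np

def som_fit(vertices, edges, points, epochs=10, eta0=0.1, decay=0.95):
    """Fit a body-model graph to a 3D point cloud with the standard
    SOM update: for each observed point, pull the best-matching vertex
    (and, more weakly, its graph neighbours) toward the point.
    `edges` is a list of (i, j) vertex index pairs."""
    nbrs = [[] for _ in vertices]
    for i, j in edges:
        nbrs[i].append(j)
        nbrs[j].append(i)
    v = vertices.copy()
    eta = eta0
    for _ in range(epochs):
        for p in np.random.permutation(points):
            w = np.argmin(np.linalg.norm(v - p, axis=1))  # best match
            v[w] += eta * (p - v[w])
            for n in nbrs[w]:                             # neighbours
                v[n] += 0.5 * eta * (p - v[n])
        eta *= decay                                      # decay learning rate
    return v
```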


Computer Vision and Pattern Recognition | 2008

Scale-invariant range features for time-of-flight camera applications

Martin Haker; Martin Böhme; Thomas Martinetz; Erhardt Barth

We describe a technique for computing scale-invariant features on range maps produced by a range sensor, such as a time-of-flight camera. Scale invariance is achieved by computing the features on the reconstructed three-dimensional surface of the object. The technique is general and can be applied to a wide range of operators. Features are computed in the frequency domain; the transform from the irregularly sampled mesh to the frequency domain uses the Nonequispaced Fast Fourier Transform. We demonstrate the technique on a facial feature detection task. On a dataset containing faces at various distances from the camera, the equal error rate (EER) for the case of scale-invariant features is halved compared to features computed on the range map in the conventional way. When the scale-invariant range features are combined with intensity features, the error rate on the test set reduces to zero.
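
The NFFT-based feature computation is more than a short example can carry, but the first step, reconstructing the metric 3D surface from the range map so that operators see the object at its true physical size regardless of camera distance, can be sketched as follows (the pinhole parameters and names are assumptions):

```python
import numpy as np

def range_map_to_points(range_map, fx, fy, cx, cy):
    """Back-project a range map (radial distance in metres per pixel,
    as a TOF camera measures) to 3D points using a pinhole model.
    Features computed on this surface are scale-invariant in the sense
    that the object's physical extent does not change with distance."""
    h, w = range_map.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    # Direction of each pixel's viewing ray, normalised to unit length.
    rays = np.dstack(((u - cx) / fx, (v - cy) / fy, np.ones((h, w))))
    rays /= np.linalg.norm(rays, axis=2, keepdims=True)
    return rays * range_map[..., None]   # scale each ray by measured range
```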


International Symposium on Signals, Circuits and Systems | 2009

Head tracking with combined face and nose detection

Martin Böhme; Martin Haker; Thomas Martinetz; Erhardt Barth

We present a facial feature detector for time-of-flight (TOF) cameras that extends previous work by combining a nose detector based on geometric features with a face detector. The goal is to prevent false detections outside the area of the face. To detect the nose in the image, we first compute the geometric features per pixel. We then augment these geometric features with two additional features: the horizontal and vertical distance to the most likely face detected by a cascade-of-boosted-ensembles face detector. We use a very simple classifier based on an axis-aligned bounding box in feature space; pixels whose feature values fall within the box are classified as nose pixels, and all other pixels are classified as “non-nose”. The extent of the bounding box is learned on a labeled training set. Despite its simplicity, this detector already delivers satisfactory results on the geometric features alone; adding the face detector improves the equal error rate (EER) from 22.2% (without face detector) to 10.4% (with face detector). (Note when comparing with our previous results from [1] and [2] that, in contrast to this paper, the test data used there did not contain scale variations.)
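
The feature augmentation is straightforward to sketch: the signed horizontal and vertical offsets to the detected face centre are appended to each pixel's geometric feature vector (array names and the exact definition of "distance" are assumptions), after which the same bounding-box classifier as in the 2007 paper above applies.

```python
import numpy as np

def augment_with_face_distance(geom_features, face_box):
    """Append two features per pixel: horizontal and vertical offset to
    the centre of the most likely detected face. `geom_features` is
    (H, W, D); `face_box` is (x, y, w, h) from any face detector, e.g.
    a cascade of boosted ensembles."""
    h, w = geom_features.shape[:2]
    fx = face_box[0] + face_box[2] / 2.0
    fy = face_box[1] + face_box[3] / 2.0
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    dist = np.dstack((u - fx, v - fy))
    return np.concatenate((geom_features, dist), axis=2)  # (H, W, D + 2)
```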


GW'09 Proceedings of the 8th international conference on Gesture in Embodied Communication and Human-Computer Interaction | 2009

Deictic gestures with a time-of-flight camera

Martin Haker; Martin Böhme; Thomas Martinetz; Erhardt Barth

We present a robust detector for deictic gestures based on a time-of-flight (TOF) camera, a combined range and intensity image sensor. Pointing direction is used to determine whether the gesture is intended for the system at all and to assign different meanings to the same gesture depending on where the user points. We use the gestures to control a slideshow presentation: making a “thumbs-up” gesture while pointing to the left or right of the screen switches to the previous or next slide. Pointing at the screen causes a “virtual laser pointer” to appear. Since the pointing direction is estimated in 3D, the user can move freely within the field of view of the camera after the system has been calibrated. The pointing direction is measured with an absolute accuracy of 0.6 degrees and a measurement noise of 0.9 degrees near the center of the screen.
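
Once two 3D points on the pointing axis are known, finding where the user points on the screen reduces to a ray-plane intersection. A minimal sketch (the choice of two body points defining the axis, and all names, are assumptions, not necessarily the paper's method):

```python
import numpy as np

def pointing_target(origin, tip, screen_point, screen_normal):
    """Intersect the pointing ray (e.g. from a point on the arm through
    the hand tip, both in camera coordinates) with the screen plane,
    given any point on the screen and the screen's unit normal.
    Returns the 3D hit point, or None if the ray is parallel to the
    plane or points away from it."""
    d = tip - origin
    d = d / np.linalg.norm(d)
    denom = d @ screen_normal
    if abs(denom) < 1e-9:
        return None
    s = ((screen_point - origin) @ screen_normal) / denom
    return origin + s * d if s > 0 else None
```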

Collaboration


Dive into Martin Böhme's collaborations.
