Publication


Featured research published by Jörg Stückler.


Intelligent Robots and Systems | 2015

Large-scale direct SLAM with stereo cameras

Jakob Engel; Jörg Stückler; Daniel Cremers

We propose a novel Large-Scale Direct SLAM algorithm for stereo cameras (Stereo LSD-SLAM) that runs in real-time at a high frame rate on standard CPUs. In contrast to sparse interest-point based methods, our approach aligns images directly based on the photoconsistency of all high-contrast pixels, including corners, edges, and high-texture areas. It concurrently estimates the depth at these pixels from two types of stereo cues: static stereo through the fixed-baseline stereo camera setup as well as temporal multi-view stereo exploiting the camera motion. By incorporating both disparity sources, our algorithm can even estimate depth of pixels that are under-constrained when only using fixed-baseline stereo. Using a fixed baseline, on the other hand, avoids the scale-drift that typically occurs in pure monocular SLAM. We furthermore propose a robust approach to enforce illumination invariance, capable of handling aggressive brightness changes between frames, greatly improving performance in realistic settings. In experiments, we demonstrate state-of-the-art results on stereo SLAM benchmarks such as KITTI and challenging datasets from the EuRoC Challenge 3 for micro aerial vehicles.
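
A minimal sketch of the direct photometric alignment idea with an affine brightness model (a, b) for illumination invariance, assuming NumPy and a pinhole camera; this is an illustration of the general technique, not the authors' implementation:

    import numpy as np

    def bilinear_sample(img, x, y):
        # Bilinear interpolation at fractional pixel coordinates (clipped to bounds).
        h, w = img.shape
        x = np.clip(x, 0, w - 2)
        y = np.clip(y, 0, h - 2)
        x0 = np.floor(x).astype(int)
        y0 = np.floor(y).astype(int)
        wx, wy = x - x0, y - y0
        return ((1 - wx) * (1 - wy) * img[y0, x0] + wx * (1 - wy) * img[y0, x0 + 1]
                + (1 - wx) * wy * img[y0 + 1, x0] + wx * wy * img[y0 + 1, x0 + 1])

    def photometric_residuals(I_ref, I_cur, px, depth, K, T, a=1.0, b=0.0):
        # px: N x 2 integer pixel coordinates (x, y) of high-gradient reference
        # pixels; depth: their current depth estimates; T: 4x4 relative pose.
        rays = np.linalg.inv(K) @ np.vstack([px.T, np.ones(len(px))])
        P_cur = T[:3, :3] @ (rays * depth) + T[:3, 3:4]
        u = K[0, 0] * P_cur[0] / P_cur[2] + K[0, 2]
        v = K[1, 1] * P_cur[1] / P_cur[2] + K[1, 2]
        # The affine brightness transfer (a, b) absorbs global illumination
        # changes between frames before the intensities are compared.
        return bilinear_sample(I_cur, u, v) - (a * I_ref[px[:, 1], px[:, 0]] + b)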


Journal of Visual Communication and Image Representation | 2014

Multi-resolution surfel maps for efficient dense 3D modeling and tracking

Jörg Stückler; Sven Behnke

Highlights: multi-resolution surfel maps as a compact RGB-D image representation; support for rapid map extraction from images and fast registration on a CPU; object and scene reconstruction through on-line graph optimization of key view poses; real-time object tracking from a wide range of view angles and distances; state-of-the-art results in image registration, SLAM, and tracking on benchmark datasets.

Building consistent models of objects and scenes from moving sensors is an important prerequisite for many recognition, manipulation, and navigation tasks. Our approach integrates color and depth measurements seamlessly in a multi-resolution map representation. We process image sequences from RGB-D cameras and consider their typical noise properties. In order to align the images, we register view-based maps efficiently on a CPU using multi-resolution strategies. For simultaneous localization and mapping (SLAM), we determine the motion of the camera by registering maps of key views and optimize the trajectory in a probabilistic framework. We create object models and map indoor scenes using our SLAM approach, which includes randomized loop closing to avoid drift. Camera motion relative to the acquired models is then tracked in real-time based on our registration method. We benchmark our method on publicly available RGB-D datasets, demonstrating its accuracy, efficiency, and robustness, and compare it with state-of-the-art approaches. We also report on several successful public demonstrations where it was used in mobile manipulation tasks.
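
As a rough illustration of the surfel statistics such a map might store per octree cell, here is a minimal NumPy sketch; the 6D sample layout (3D position plus a 3D color value) and the field names are assumptions for illustration, not the paper's data structure:

    from dataclasses import dataclass, field
    import numpy as np

    @dataclass
    class Surfel:
        # Running statistics of 6D samples (x, y, z plus color) aggregated
        # in one octree cell at one resolution level.
        n: int = 0
        mean: np.ndarray = field(default_factory=lambda: np.zeros(6))
        scatter: np.ndarray = field(default_factory=lambda: np.zeros((6, 6)))

        def add(self, sample):
            # Incremental (Welford-style) update; covariance = scatter / (n - 1).
            self.n += 1
            delta = sample - self.mean
            self.mean += delta / self.n
            self.scatter += np.outer(delta, sample - self.mean)

Registering two such maps then amounts to associating surfels across resolutions and minimizing a distance between their statistics, which is what makes CPU real-time registration feasible.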


Human-Robot Interaction | 2011

Learning to interpret pointing gestures with a time-of-flight camera

David Droeschel; Jörg Stückler; Sven Behnke

Pointing gestures are a common and intuitive way to draw somebody's attention to a certain object. While humans can easily interpret robot gestures, the perception of human behavior using robot sensors is more difficult. In this work, we propose a method for perceiving pointing gestures using a Time-of-Flight (ToF) camera. To determine the intended pointing target, the line between a person's eyes and hand is frequently assumed to be the pointing direction. However, since people tend to keep the line of sight free while they are pointing, this simple approximation is inadequate. Moreover, depending on the distance and angle to the pointing target, the line between shoulder and hand or elbow and hand may yield better interpretations of the pointing direction. In order to achieve a better estimate, we extract a set of body features from the depth and amplitude images of a ToF camera and train a model of pointing directions using Gaussian Process Regression. We evaluate the accuracy of the estimated pointing direction in a quantitative study. The results show that our learned model achieves far better accuracy than simple criteria like the head-hand, shoulder-hand, or elbow-hand line.
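
A minimal sketch of training such a pointing-direction model with Gaussian Process Regression, using scikit-learn in place of the authors' implementation; the feature dimensionality and the angle parameterization are assumptions:

    import numpy as np
    from sklearn.gaussian_process import GaussianProcessRegressor
    from sklearn.gaussian_process.kernels import RBF, WhiteKernel

    # X: body features extracted from ToF depth/amplitude images, e.g. head,
    # shoulder, elbow, and hand positions stacked into one vector per frame.
    # y: ground-truth pointing directions, e.g. azimuth and elevation angles.
    rng = np.random.default_rng(0)
    X_train = rng.random((200, 12))   # placeholder features
    y_train = rng.random((200, 2))    # placeholder angles

    gpr = GaussianProcessRegressor(kernel=RBF() + WhiteKernel(), normalize_y=True)
    gpr.fit(X_train, y_train)

    direction, std = gpr.predict(rng.random((1, 12)), return_std=True)
    # The predictive standard deviation doubles as a confidence estimate.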


Computer Vision and Pattern Recognition | 2017

Semi-Supervised Deep Learning for Monocular Depth Map Prediction

Yevhen Kuznietsov; Jörg Stückler; Bastian Leibe

Supervised deep learning often suffers from a lack of sufficient training data. Specifically, in the context of monocular depth map prediction, it is barely possible to determine dense ground-truth depth images in realistic dynamic outdoor environments. When using LiDAR sensors, for instance, noise is present in the distance measurements, the calibration between sensors cannot be perfect, and the measurements are typically much sparser than the camera images. In this paper, we propose a novel approach to depth map prediction from monocular images that learns in a semi-supervised way. While we use sparse ground-truth depth for supervised learning, we also train our deep network to produce photoconsistent dense depth maps in a stereo setup using a direct image alignment loss. In experiments, we demonstrate superior performance in depth map prediction from single images compared to state-of-the-art methods.
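
A PyTorch sketch of a combined loss of this kind, assuming rectified stereo with known focal length fx and baseline; the warp sign convention and the weight w are assumptions, and this is not the paper's exact formulation:

    import torch
    import torch.nn.functional as F

    def semi_supervised_loss(pred_depth, sparse_gt, left, right, fx, baseline, w=0.5):
        # Supervised term: penalize error only where sparse LiDAR ground truth exists.
        mask = sparse_gt > 0
        sup = (pred_depth[mask] - sparse_gt[mask]).abs().mean()

        # Unsupervised term: warp the right image into the left view using the
        # predicted depth (disparity = fx * baseline / depth) and compare.
        B, _, H, W = left.shape
        disp = fx * baseline / pred_depth.clamp(min=1e-3)            # B x 1 x H x W
        xs = torch.linspace(-1, 1, W, device=left.device).view(1, 1, W).expand(B, H, W)
        ys = torch.linspace(-1, 1, H, device=left.device).view(1, H, 1).expand(B, H, W)
        # Shift normalized x-coordinates by the predicted disparity.
        grid = torch.stack([xs - 2.0 * disp.squeeze(1) / W, ys], dim=-1)
        warped = F.grid_sample(right, grid, align_corners=True)
        photo = (warped - left).abs().mean()

        return sup + w * photo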


International Conference on Multisensor Fusion and Integration for Intelligent Systems | 2012

Integrating depth and color cues for dense multi-resolution scene mapping using RGB-D cameras

Jörg Stückler; Sven Behnke

The mapping of environments is a prerequisite for many navigation and manipulation tasks. We propose a novel method for acquiring 3D maps of indoor scenes from a freely moving RGB-D camera. Our approach integrates color and depth cues seamlessly in a multi-resolution map representation. We consider measurement noise characteristics and exploit dense image neighborhoods to rapidly extract maps from RGB-D images. An efficient ICP variant allows maps to be registered in real-time at VGA resolution on a CPU. For simultaneous localization and mapping, we extract key views and optimize the trajectory in a probabilistic framework. Finally, we propose an efficient randomized loop-closure technique that is designed for on-line operation. We benchmark our method on a publicly available RGB-D dataset and compare it with a state-of-the-art approach that uses sparse image features.
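
A coarse-to-fine loop in the spirit of the multi-resolution registration strategy described here; at_resolution and icp_step are hypothetical placeholders for the map accessor and for one data-association-plus-solve iteration:

    import numpy as np

    def register_multires(map_src, map_tgt, cell_sizes=(0.4, 0.2, 0.1), iters=10):
        # Coarse levels give a large convergence basin; fine levels give accuracy.
        T = np.eye(4)  # initial relative pose estimate
        for size in cell_sizes:
            src = map_src.at_resolution(size)   # hypothetical accessor
            tgt = map_tgt.at_resolution(size)
            for _ in range(iters):
                T = icp_step(src, tgt, T)       # hypothetical ICP iteration
        return T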


International Conference on Robotics and Automation | 2013

Mobile bin picking with an anthropomorphic service robot

Matthias Nieuwenhuisen; David Droeschel; Dirk Holz; Jörg Stückler; Alexander Berner; Jun Li; Reinhard Klein; Sven Behnke

Grasping individual objects from an unordered pile in a box has so far been investigated only in static scenarios. In this paper, we demonstrate bin picking with an anthropomorphic mobile robot. To this end, we extend global navigation techniques by precise local alignment with a transport box. Objects are detected in range images using a shape-primitive-based approach. Our approach learns object models from single scans and employs active perception to cope with severe occlusions. Grasps and arm motions are planned in an efficient local multi-resolution height map. All components are integrated and evaluated in a bin picking and part delivery task.
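
Read as a pipeline, the abstract suggests roughly the following control flow; every function name below is a placeholder for illustration, not the authors' API:

    def mobile_bin_picking(robot, box_pose_estimate):
        # 1. Global navigation, then precise local alignment with the box.
        robot.navigate_near(box_pose_estimate)
        robot.align_to_box()
        # 2. Detect objects in range images via shape primitives.
        detections = detect_shape_primitives(robot.capture_range_image())
        while not detections:
            # 3. Active perception: change the viewpoint to resolve occlusions.
            robot.move_sensor_viewpoint()
            detections = detect_shape_primitives(robot.capture_range_image())
        # 4. Plan grasp and arm motion in a local multi-resolution height map.
        grasp = plan_grasp(robot.local_height_map(), detections[0])
        robot.execute(grasp)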


International Conference on Robotics and Automation | 2016

Direct visual-inertial odometry with stereo cameras

Vladyslav C. Usenko; Jakob Engel; Jörg Stückler; Daniel Cremers

We propose a novel direct visual-inertial odometry method for stereo cameras. Camera pose, velocity, and IMU biases are simultaneously estimated by minimizing a combined photometric and inertial energy functional. This allows us to exploit the complementary nature of vision and inertial data. At the same time, and in contrast to all existing visual-inertial methods, our approach is fully direct: geometry is estimated in the form of semi-dense depth maps instead of manually designed sparse keypoints. Depth information is obtained both from static stereo (relating the fixed-baseline images of the stereo camera) and temporal stereo (relating images from the same camera taken at different points in time). We show that our method not only outperforms vision-only and loosely coupled approaches, but can also achieve more accurate results than state-of-the-art keypoint-based methods on different datasets, including under rapid motion and significant illumination changes. In addition, our method provides high-fidelity semi-dense, metric reconstructions of the environment and runs in real-time on a CPU.
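
Schematically, the combined objective is a weighted sum of a photometric and an inertial energy term; the following is a generic formulation of tightly coupled direct visual-inertial optimization, not necessarily the paper's exact notation:

    E(\xi, v, b) = \sum_{p \in \Omega} \rho\big( I_{\mathrm{cur}}(\pi(T(\xi)\, p)) - I_{\mathrm{ref}}(p) \big) + \lambda \, \| r_{\mathrm{IMU}}(\xi, v, b) \|_{\Sigma^{-1}}^{2}

where \xi is the camera pose, v the velocity, b the IMU biases, \pi the pinhole projection, \rho a robust norm over the semi-dense pixel set \Omega, and r_IMU the inertial residual weighted by its covariance \Sigma. Minimizing over all variables jointly is what makes the coupling "tight".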


International Conference on Robotics and Automation | 2014

Local Multi-Resolution Representation for 6D Motion Estimation and Mapping with a Continuously Rotating 3D Laser Scanner

David Droeschel; Jörg Stückler; Sven Behnke

Micro aerial vehicles (MAVs) pose a challenge for the design of sensory systems and algorithms due to their size and weight constraints and limited computing power. We present an efficient 3D multi-resolution map that we use to aggregate measurements from a lightweight continuously rotating laser scanner. We estimate the robot's motion by means of visual odometry and scan registration, aligning consecutive 3D scans with an incrementally built map. By using local multi-resolution, we gain computational efficiency through a high resolution in the near vicinity of the robot and a lower resolution with increasing distance from the robot, which correlates with the sensor's characteristics in relative distance accuracy and measurement density. Compared to uniform grids, local multi-resolution leads to the use of fewer grid cells without losing information and consequently results in lower computational costs. We efficiently and accurately register new 3D scans with the map in order to estimate the motion of the MAV and update the map in-flight. In experiments, we demonstrate superior accuracy and efficiency of our registration approach compared to state-of-the-art methods such as GICP. Our approach builds an accurate 3D obstacle map and estimates the vehicle's trajectory in real-time.
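
A minimal illustration of the local multi-resolution idea, with cell size doubling as distance from the robot doubles, so far-away space costs exponentially fewer cells; the base resolution and inner radius are made-up numbers:

    import math

    def cell_size(distance, base=0.25, inner_radius=1.0):
        # Cell size doubles with each doubling of distance beyond the
        # innermost ring, mirroring the sensor's accuracy falloff.
        ring = max(0, math.ceil(math.log2(max(distance, inner_radius) / inner_radius)))
        return base * 2 ** ring

    # 0.25 m cells within 1 m, 0.5 m out to 2 m, 1.0 m out to 4 m, ...
    for d in (0.5, 1.5, 3.0, 6.0):
        print(d, cell_size(d))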


IEEE-RAS International Conference on Humanoid Robots | 2009

Integrating indoor mobility, object manipulation, and intuitive interaction for domestic service tasks

Jörg Stückler; Sven Behnke

Domestic service tasks require three main skills from autonomous robots: robust navigation, mobile manipulation, and intuitive communication with users. Most robot platforms, however, support only one or two of these skills. In this paper, we present Dynamaid, a new robot platform for research on domestic service applications. For robust navigation, Dynamaid has a base with four individually steerable differential wheel pairs, which allow omnidirectional motion. For mobile manipulation, Dynamaid is additionally equipped with two anthropomorphic arms, each including a gripper, and with a trunk that can be lifted as well as twisted. For intuitive multimodal communication, the robot has a microphone, stereo cameras, and a movable head; its humanoid upper body supports natural interaction. It can perceive persons in its environment and recognize and synthesize speech. We developed software for the tests of the RoboCup@Home competitions, which serve as benchmarks for domestic service robots. With Dynamaid and our communication robot Robotinho, our team NimbRo@Home took part in the RoboCup German Open 2009 and RoboCup 2009 competitions, in which we came in second and third, respectively. We also won the innovation award for innovative robot design, empathic behaviors, and robot-robot cooperation.


Journal of Real-Time Image Processing | 2015

Dense real-time mapping of object-class semantics from RGB-D video

Jörg Stückler; Benedikt Waldvogel; Hannes Schulz; Sven Behnke

We propose a real-time approach to learning semantic maps from moving RGB-D cameras. Our method models the geometry, appearance, and semantic labeling of surfaces. We recover the camera pose using simultaneous localization and mapping while concurrently recognizing and segmenting object classes in the images. Our object-class segmentation approach is based on random decision forests and yields a dense probabilistic labeling of each image; we implemented it on the GPU to achieve a high frame rate. The probabilistic segmentations are fused in octree-based 3D maps within a Bayesian framework. In this way, image segmentations from various viewpoints are integrated within a 3D map, which improves segmentation quality. We evaluate our system on a large benchmark dataset and demonstrate state-of-the-art recognition performance of our object-class segmentation and semantic mapping approaches.
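
A sketch of per-voxel Bayesian fusion of probabilistic segmentations, done in log-space for numerical stability; the class interface is hypothetical, illustrating the general technique rather than the paper's code:

    import numpy as np

    class VoxelLabelFusion:
        # Fuses per-class probabilities from many segmented images that
        # observe the same octree voxel.
        def __init__(self, num_classes):
            self.log_prob = np.zeros(num_classes)  # uniform prior

        def update(self, class_probs, eps=1e-6):
            # Multiplying independent likelihoods == adding log-likelihoods.
            self.log_prob += np.log(np.asarray(class_probs) + eps)

        def most_likely(self):
            return int(np.argmax(self.log_prob))

    voxel = VoxelLabelFusion(3)
    for obs in ([0.7, 0.2, 0.1], [0.6, 0.3, 0.1]):  # two viewpoints
        voxel.update(obs)
    print(voxel.most_likely())  # -> 0

Fusing several views this way is what lets weak single-image segmentations accumulate into a confident 3D labeling.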
