Bernd Kitt
Karlsruhe Institute of Technology
Publications
Featured research published by Bernd Kitt.
ieee intelligent vehicles symposium | 2010
Bernd Kitt; Andreas Geiger; Henning Lategahn
A common prerequisite for many vision-based driver assistance systems is knowledge of the vehicle's own movement. In this paper we propose a novel approach for estimating the egomotion of the vehicle from a sequence of stereo images. Our method is based directly on the trifocal geometry between image triples, so no time-consuming recovery of the 3-dimensional scene structure is needed. The only assumption we make is a known camera geometry, where the calibration may also vary over time. We employ an Iterated Sigma Point Kalman Filter in combination with a RANSAC-based outlier rejection scheme, which yields robust frame-to-frame motion estimation even in dynamic environments. A high-accuracy inertial navigation system is used to evaluate our results on challenging real-world video sequences. Experiments show that our approach is clearly superior to other filtering techniques in terms of both accuracy and run-time.
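As a rough illustration of the hypothesize-and-verify outlier rejection used in this line of work, the sketch below wraps a generic RANSAC loop around a rigid 3-D alignment (Kabsch). It is not the paper's trifocal/ISPKF formulation; the sample size, inlier threshold, and function names are assumptions for illustration only.

import numpy as np

def rigid_fit(src, dst):
    """Least-squares rotation R and translation t with dst ~ R @ src + t (Kabsch)."""
    cs, cd = src.mean(axis=0), dst.mean(axis=0)
    H = (src - cs).T @ (dst - cd)
    U, _, Vt = np.linalg.svd(H)
    R = Vt.T @ U.T
    if np.linalg.det(R) < 0:          # fix a possible reflection
        Vt[-1] *= -1
        R = Vt.T @ U.T
    return R, cd - R @ cs

def ransac_motion(src, dst, iters=200, thresh=0.05, rng=np.random.default_rng(0)):
    """src, dst: N x 3 corresponding points from consecutive frames (hypothetical input)."""
    best_inliers = np.zeros(len(src), dtype=bool)
    for _ in range(iters):
        idx = rng.choice(len(src), size=3, replace=False)     # minimal sample
        R, t = rigid_fit(src[idx], dst[idx])
        err = np.linalg.norm(dst - (src @ R.T + t), axis=1)
        inliers = err < thresh
        if inliers.sum() > best_inliers.sum():
            best_inliers = inliers
    return rigid_fit(src[best_inliers], dst[best_inliers]), best_inliers

The filtering step (here ISPKF) would then be fed only with the surviving inlier correspondences.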
international conference on robotics and automation | 2011
Henning Lategahn; Andreas Geiger; Bernd Kitt
Simultaneous Localization and Mapping (SLAM), and Visual SLAM (V-SLAM) in particular, has been an active area of research lately. In V-SLAM the main focus is most often placed on the localization part of the problem, allowing for a drift-free motion estimate. To this end, a sparse set of landmarks is tracked and their positions are estimated. However, this set of landmarks (rendering the map) is often too sparse for tasks in autonomous driving such as navigation, path planning, and obstacle avoidance. Some methods keep the raw measurements for past robot poses to address the sparsity problem, often resulting in a pose-only SLAM akin to laser scanner SLAM. For the stereo case this is however impractical due to the high noise of stereo-reconstructed point clouds. In this paper we propose a dense stereo V-SLAM algorithm that estimates a dense 3D map representation which is more accurate than raw stereo measurements. To this end, we run a sparse V-SLAM system and use the resulting pose estimates to compute a locally dense representation from dense stereo correspondences. This dense representation is expressed in local coordinate systems which are tracked as part of the SLAM estimate, which allows the dense part to be continuously updated. Our system is driven by visual odometry priors to achieve high robustness when tracking landmarks. Moreover, the sparse part of the SLAM system uses recently published submapping techniques to achieve constant runtime complexity most of the time. The improved accuracy over raw stereo measurements is shown in a Monte Carlo simulation. Finally, we demonstrate the feasibility of our method by presenting outdoor experiments with a car-like robot.
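A minimal sketch of the dense part, assuming a standard pinhole/stereo model: a dense disparity map is back-projected into a local 3-D point cloud and then placed in the world frame using a pose from the sparse SLAM estimate. The camera parameters and function names are assumptions, not the authors' code.

import numpy as np

def disparity_to_points(disp, f, cx, cy, baseline):
    """Stereo back-projection: Z = f*b/d, X = (u-cx)*Z/f, Y = (v-cy)*Z/f."""
    v, u = np.nonzero(disp > 0)                  # valid disparities only
    z = f * baseline / disp[v, u]
    x = (u - cx) * z / f
    y = (v - cy) * z / f
    return np.column_stack([x, y, z])            # N x 3, camera frame

def to_world(points_cam, R_wc, t_wc):
    """Express local (camera-frame) points in the world frame of the SLAM map."""
    return points_cam @ R_wc.T + t_wc

Because the dense points are stored relative to tracked local frames, re-optimizing those frames automatically updates the dense map without touching the raw stereo data.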
ieee intelligent vehicles symposium | 2013
Henning Lategahn; Johannes Beck; Bernd Kitt; Christoph Stiller
Place recognition for loop closure detection lies at the heart of every Simultaneous Localization and Mapping (SLAM) method. Recently, methods that use cameras and describe the entire image by one holistic feature vector have experienced a resurgence. Despite the success of these methods, it remains unclear how a descriptor should be constructed for this particular purpose. The problem of choosing the right descriptor becomes even more pronounced in the context of lifelong mapping: the appearance of a place may vary considerably under different illumination conditions and over the course of a day, and none of the handcrafted descriptors published in the literature is specifically designed for this purpose. Herein, we propose to use a set of elementary building blocks from which millions of different descriptors can be constructed automatically. Moreover, we present an evaluation function which assesses the performance of a given image descriptor for place recognition under severe lighting changes. Finally, we present an algorithm to efficiently search the space of descriptors and find the best-suited one. Evaluating the trained descriptor on a test set shows a clear superiority over handcrafted counterparts such as BRIEF and U-SURF. We also show how loop closures can be reliably detected using the automatically learned descriptor: two overlapping image sequences from two different days and times are merged into one pose graph, and the resulting merged pose graph is optimized and does not contain a single false link, while at the same time all true loop closures are detected correctly. The descriptor and place recognizer source code is published with datasets at http://www.mrt.kit.edu/libDird.php.
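The following is a hedged sketch of the "combine elementary blocks and search" idea, not the DIRD descriptor itself: the building blocks (patch means, gradient energies), the distance-gap score over labelled place pairs, and the greedy forward selection are simple stand-ins chosen for illustration.

import numpy as np

def block_mean(img, r):          # elementary block: patch intensity; r = (r0, r1, c0, c1)
    return np.array([img[r[0]:r[1], r[2]:r[3]].mean()])

def grad_energy(img, r):         # elementary block: patch gradient energy
    gy, gx = np.gradient(img[r[0]:r[1], r[2]:r[3]].astype(float))
    return np.array([np.mean(gx**2 + gy**2)])

def describe(img, blocks):
    """blocks: list of (block_fn, region) tuples; concatenated into one vector."""
    return np.concatenate([fn(img, r) for fn, r in blocks])

def score(blocks, pos_pairs, neg_pairs):
    """Higher is better: non-matching place distances should exceed matching ones."""
    d = lambda a, b: np.linalg.norm(describe(a, blocks) - describe(b, blocks))
    return (np.mean([d(a, b) for a, b in neg_pairs]) -
            np.mean([d(a, b) for a, b in pos_pairs]))

def greedy_search(candidates, pos_pairs, neg_pairs, k=8):
    """Greedily pick k blocks that maximize the evaluation score."""
    chosen, remaining = [], list(candidates)
    for _ in range(k):
        best = max(remaining, key=lambda c: score(chosen + [c], pos_pairs, neg_pairs))
        chosen.append(best)
        remaining.remove(best)
    return chosen

Pairs taken from the same place at different times of day serve as positives, so the learned combination is pushed toward illumination invariance.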
intelligent robots and systems | 2011
Andrew Chambers; Supreeth Achar; Stephen Nuske; Joern Rehder; Bernd Kitt; Lyle Chamberlain; Justin Haines; Sebastian Scherer; Sanjiv Singh
Rivers with heavy vegetation are hard to map from the air. Here we consider the task of mapping their course and the vegetation along the shores, with the specific intent of determining river width and canopy height. A complication in such riverine environments is that only intermittent GPS may be available, depending on the thickness of the surrounding canopy. We present a multimodal perception system to be used for the active exploration and mapping of a river from a small rotorcraft flying a few meters above the water. We describe three key components that use computer vision, laser scanning, and inertial sensing to follow the river without the use of a prior map, estimate the motion of the rotorcraft, ensure collision-free operation, and create a three-dimensional representation of the riverine environment. While the ability to fly simplifies the navigation problem, it also introduces an additional set of constraints in terms of size, weight, and power. Hence, our solutions are cognizant of the need to perform multi-kilometer missions with a small payload. We present experimental results along a 2 km loop of river using a surrogate system.
ieee intelligent vehicles symposium | 2010
Henning Lategahn; Wojciech Waclaw Derendarz; Thorsten Graf; Bernd Kitt; Jan Effertz
We present a complete processing chain for computing 2D occupancy grids from image sequences. A multi-layer grid is introduced which serves several purposes. First, the 3D points reconstructed from the images are distributed onto the underlying grid. Thereafter, a virtual measurement is computed for each cell, reducing computational complexity and rejecting potential outliers. Subsequently, a height profile is updated, from which the current measurement is partitioned into ground and obstacle pixels. Different height profile update strategies are tested and compared, yielding a stable height profile estimation. Lastly, the occupancy layer of the grid is updated. To assess the algorithm, we evaluate it quantitatively by comparing its output to ground-truth data, illustrating its accuracy. We show the applicability of the algorithm using both dense stereo-reconstructed points and sparse structure-from-motion points. The algorithm was implemented and run online in real time on one of our test vehicles.
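A minimal sketch of the grid update, assuming a metric grid in the vehicle frame: points are binned into cells, each cell forms one robust "virtual measurement" (here the median height), the height profile is updated with a simple exponential filter, and cells are marked as obstacles when they protrude above the profile. Grid size, resolution, and the update rule are assumptions, not the paper's exact strategies.

import numpy as np

def update_grid(points, height_profile, res=0.2, size=100, alpha=0.3, obst_thresh=0.3):
    """points: N x 3 (x forward, y left, z up); height_profile: size x size array."""
    ix = (points[:, 0] / res).astype(int)
    iy = (points[:, 1] / res + size / 2).astype(int)
    valid = (ix >= 0) & (ix < size) & (iy >= 0) & (iy < size)
    pts, gx, gy = points[valid], ix[valid], iy[valid]
    occupancy = np.zeros((size, size), dtype=bool)
    for cx, cy in set(zip(gx, gy)):
        cell_z = pts[(gx == cx) & (gy == cy), 2]
        virtual_z = np.median(cell_z)                       # robust per-cell virtual measurement
        height_profile[cx, cy] = (1 - alpha) * height_profile[cx, cy] + alpha * virtual_z
        occupancy[cx, cy] = cell_z.max() - height_profile[cx, cy] > obst_thresh
    return occupancy, height_profile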
international conference on intelligent transportation systems | 2010
Bernd Kitt; Benjamin Ranft; Henning Lategahn
In this paper we propose an approach for dynamic scene perception from a moving vehicle equipped with a stereo camera rig. The approach is based solely on visual information, hence it is applicable to a large class of autonomous robots working in indoor as well as outdoor environments. The proposed approach consists of an egomotion estimation based on disparity and optical flow using the Longuet-Higgins equations combined with an implicit extended Kalman filter. Based on this egomotion estimate, moving object detection and tracking are performed. Each tracked object is labeled with a unique ID while it is visible in the images. The proposed algorithm was evaluated on numerous challenging real-world image sequences.
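To make the detection step concrete, here is a hedged sketch: the Longuet-Higgins equations predict the image flow of a static point at depth Z given the camera motion, and pixels whose measured flow deviates strongly from this prediction are flagged as potentially moving. The estimation of the motion itself (the implicit EKF) is not shown; intrinsics, motion, and threshold are assumed inputs.

import numpy as np

def predicted_flow(u, v, Z, f, omega, vel):
    """Instantaneous image motion of a static point at depth Z (Longuet-Higgins form)."""
    wx, wy, wz = omega      # camera rotation rates
    vx, vy, vz = vel        # camera translation rates
    du = (u * vz - f * vx) / Z + (u * v * wx / f - (f + u**2 / f) * wy + v * wz)
    dv = (v * vz - f * vy) / Z + ((f + v**2 / f) * wx - u * v * wy / f - u * wz)
    return du, dv

def moving_mask(flow_u, flow_v, u, v, Z, f, omega, vel, thresh=2.0):
    """True wherever the measured flow disagrees with the egomotion prediction."""
    du, dv = predicted_flow(u, v, Z, f, omega, vel)
    return np.hypot(flow_u - du, flow_v - dv) > thresh

Connected regions of the resulting mask would then be grouped into object hypotheses and tracked over time under persistent IDs.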
intelligent robots and systems | 2010
Bernd Kitt; Frank Moosmann; Christoph Stiller
Visually estimating a robot's own motion has been an active field of research in recent years. Though impressive results have been reported, some application areas still exhibit huge challenges. Especially for car-like robots in urban environments, even the most robust estimation techniques fail due to the large portion of independently moving objects. Hence, we move one step further and propose a method that combines ego-motion estimation with low-level object detection. We specifically design the method to be general and applicable in real time. Pre-classifying interest points is a key step, which rejects matches on possibly moving objects and reduces the computational load of further steps. Employing an Iterated Sigma Point Kalman Filter in combination with a RANSAC-based outlier rejection scheme yields a robust frame-to-frame motion estimation even when many independently moving objects cover the image. Extensive experiments show the robustness of the proposed approach in highly dynamic environments at speeds of up to 20 m/s.
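A minimal sketch of the pre-classification step: matches whose current-frame point falls on a class likely to move (e.g. vehicle or pedestrian) are discarded before motion estimation. The per-pixel class map, the label ids, and the function names are hypothetical placeholders, standing in for whatever lightweight classifier is used.

import numpy as np

MOVABLE = {1, 2}       # hypothetical label ids for "vehicle" and "pedestrian"

def filter_matches(pts_prev, pts_curr, class_map):
    """Keep only matches whose current-frame point lies on a static class.
    pts_prev, pts_curr: N x 2 pixel coordinates (u, v); class_map: H x W label image."""
    cols = pts_curr[:, 0].astype(int)
    rows = pts_curr[:, 1].astype(int)
    static = ~np.isin(class_map[rows, cols], list(MOVABLE))
    return pts_prev[static], pts_curr[static]

Only the surviving matches are then passed to the RANSAC and ISPKF stages, which both shrinks the outlier fraction and cuts the per-frame computation.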
international conference on intelligent transportation systems | 2010
Bernd Kitt; Benjamin Ranft; Henning Lategahn
In this paper we propose a new block-matching-based approach for the estimation of nearly dense optical flow fields in image sequences. We focus on applications to autonomous vehicles where a dominant movement of the camera along its optical axis is present. The presented algorithm exploits the geometric relations between the two viewpoints induced by the epipolar geometry, hence it is applicable to the static parts of the scene. These relations are used to remap the images so that the resulting virtual images are similar to images captured by an axial stereo camera setup. This alignment dramatically reduces the computational complexity of the correspondence search and avoids false correspondences, e.g. those caused by repeated patterns. Experiments on challenging real-world sequences show the accuracy of the proposed approach.
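The remapping to a virtual axial-stereo pair is not reproduced here, but the underlying constraint can be illustrated directly: given a fundamental matrix F between consecutive frames, the correspondence of a pixel must lie on its epipolar line l' = F x, so block matching only needs to scan samples along that line instead of a full 2-D window. The block size, search range, and SAD cost below are assumptions; the query point is assumed to lie away from the image border.

import numpy as np

def epipolar_line(F, pt):
    a, b, c = F @ np.array([pt[0], pt[1], 1.0])
    n = np.hypot(a, b)
    return a / n, b / n, c / n                # normalized line a*u + b*v + c = 0

def match_along_line(img_prev, img_curr, pt, F, half=7, search=60):
    a, b, c = epipolar_line(F, pt)
    d = np.array([-b, a])                     # unit direction along the line
    p0 = np.array(pt, dtype=float)
    p0 -= (a * p0[0] + b * p0[1] + c) * np.array([a, b])   # project pt onto the line
    y, x = int(pt[1]), int(pt[0])
    ref = img_prev[y - half:y + half + 1, x - half:x + half + 1].astype(float)
    best, best_cost = None, np.inf
    for s in range(-search, search + 1):
        u, v = (p0 + s * d).round().astype(int)
        cand = img_curr[v - half:v + half + 1, u - half:u + half + 1].astype(float)
        if cand.shape != ref.shape:
            continue                          # candidate block fell outside the image
        cost = np.abs(cand - ref).sum()       # SAD block cost
        if cost < best_cost:
            best, best_cost = (u, v), cost
    return best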
ieee intelligent vehicles symposium | 2010
Andreas Geiger; Bernd Kitt
We present and evaluate a novel scene descriptor for classifying urban traffic by object motion. Atomic 3D flow vectors are extracted from stereo video sequences and compensated for the vehicle's egomotion. Votes cast by each flow vector are accumulated in a bird's-eye-view histogram grid. Since we directly use low-level object flow, no prior object detection or tracking is needed. We demonstrate the effectiveness of the proposed descriptor by comparing it to two simpler baselines on the task of classifying more than 100 challenging video sequences into intersection and non-intersection scenarios. Our experiments reveal good classification performance in busy traffic situations, making our method a valuable complement to traditional approaches based on lane markings.
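A minimal sketch of the accumulation step: egomotion-compensated 3-D flow vectors are binned into a bird's-eye-view grid, and each cell stores simple statistics (here vote count and mean planar flow magnitude). The grid extent, resolution, and the exact statistics are assumptions, not the paper's descriptor definition.

import numpy as np

def birdseye_flow_histogram(points, flows, res=0.5, x_range=(0, 50), y_range=(-25, 25)):
    """points, flows: N x 3 arrays in the vehicle frame (x forward, y left)."""
    nx = int((x_range[1] - x_range[0]) / res)
    ny = int((y_range[1] - y_range[0]) / res)
    counts = np.zeros((nx, ny))
    magnitude = np.zeros((nx, ny))
    ix = ((points[:, 0] - x_range[0]) / res).astype(int)
    iy = ((points[:, 1] - y_range[0]) / res).astype(int)
    ok = (ix >= 0) & (ix < nx) & (iy >= 0) & (iy < ny)
    np.add.at(counts, (ix[ok], iy[ok]), 1)
    np.add.at(magnitude, (ix[ok], iy[ok]), np.hypot(flows[ok, 0], flows[ok, 1]))
    mean_mag = np.divide(magnitude, counts, out=np.zeros_like(magnitude), where=counts > 0)
    return counts, mean_mag

The flattened grid statistics can then be fed to any standard classifier to separate intersection from non-intersection scenes.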
international conference on intelligent transportation systems | 2012
Bernd Kitt; Henning Lategahn
Motion is an important cue for many tasks in visual scene perception. In this paper, we present a new matching-based algorithm to estimate nearly dense optical flow fields for the static parts of the scene, i.e. those parts whose motion is induced by the moving observer only. Our algorithm is designed for applications in intelligent vehicles, which are usually equipped with stereo camera rigs. To address the computational effort of matching-based approaches, we use constraints arising from the geometry between multiple views. To this end, we compute both an approximate optical flow field and an approximate disparity field between the left and right images. Hence, we can predict the position of the corresponding candidate and limit the search space to a small neighborhood around the predicted position, leading to near real-time capabilities. Experiments on different challenging real-world images show the accuracy and efficiency of the proposed approach.
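A hedged sketch of the search-space reduction: the approximate flow (or disparity) field supplies a predicted position for each pixel's correspondence, and block matching only scans a small window around that prediction rather than the whole image. The window sizes and SAD cost are assumptions; the query point is assumed to lie away from the image border.

import numpy as np

def refine_match(img_a, img_b, pt, prediction, half=4, radius=3):
    """Search a (2*radius+1)^2 window around the predicted position only.
    pt, prediction: integer pixel coordinates (u, v)."""
    x, y = pt
    px, py = prediction
    ref = img_a[y - half:y + half + 1, x - half:x + half + 1].astype(float)
    best, best_cost = prediction, np.inf
    for dv in range(-radius, radius + 1):
        for du in range(-radius, radius + 1):
            u, v = px + du, py + dv
            cand = img_b[v - half:v + half + 1, u - half:u + half + 1].astype(float)
            if cand.shape != ref.shape:
                continue                       # candidate block outside the image
            cost = np.abs(cand - ref).sum()    # SAD over the block
            if cost < best_cost:
                best, best_cost = (u, v), cost
    return best

Compared with an unconstrained search, the cost per pixel drops from the full image area to a constant few dozen candidates, which is what makes a nearly dense field feasible at near real-time rates.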