Andreas Geiger | Researchain

Publication

Featured research published by Andreas Geiger.

Computer Vision and Pattern Recognition | 2012

Are we ready for autonomous driving? The KITTI vision benchmark suite

Andreas Geiger; Philip Lenz; Raquel Urtasun

Today, visual recognition systems are still rarely employed in robotics applications. Perhaps one of the main reasons for this is the lack of demanding benchmarks that mimic such scenarios. In this paper, we take advantage of our autonomous driving platform to develop novel challenging benchmarks for the tasks of stereo, optical flow, visual odometry/SLAM and 3D object detection. Our recording platform is equipped with four high resolution video cameras, a Velodyne laser scanner and a state-of-the-art localization system. Our benchmarks comprise 389 stereo and optical flow image pairs, stereo visual odometry sequences of 39.2 km length, and more than 200k 3D object annotations captured in cluttered scenarios (up to 15 cars and 30 pedestrians are visible per image). Results from state-of-the-art algorithms reveal that methods ranking high on established datasets such as Middlebury perform below average when being moved outside the laboratory to the real world. Our goal is to reduce this bias by providing challenging benchmarks with novel difficulties to the computer vision community. Our benchmarks are available online at: www.cvlibs.net/datasets/kitti.

The International Journal of Robotics Research | 2013

Vision meets robotics: The KITTI dataset

Andreas Geiger; Philip Lenz; Christoph Stiller; Raquel Urtasun

We present a novel dataset captured from a VW station wagon for use in mobile robotics and autonomous driving research. In total, we recorded 6 hours of traffic scenarios at 10–100 Hz using a variety of sensor modalities such as high-resolution color and grayscale stereo cameras, a Velodyne 3D laser scanner and a high-precision GPS/IMU inertial navigation system. The scenarios are diverse, capturing real-world traffic situations, and range from freeways over rural areas to inner-city scenes with many static and dynamic objects. Our data is calibrated, synchronized and timestamped, and we provide the rectified and raw image sequences. Our dataset also contains object labels in the form of 3D tracklets, and we provide online benchmarks for stereo, optical flow, object detection and other tasks. This paper describes our recording platform, the data format and the utilities that we provide.

Asian Conference on Computer Vision | 2010

Efficient large-scale stereo matching

Andreas Geiger; Martin Roser; Raquel Urtasun

In this paper we propose a novel approach to binocular stereo for fast matching of high-resolution images. Our approach builds a prior on the disparities by forming a triangulation on a set of support points which can be robustly matched, reducing the matching ambiguities of the remaining points. This allows for efficient exploitation of the disparity search space, yielding accurate dense reconstruction without the need for global optimization. Moreover, our method automatically determines the disparity range and can be easily parallelized. We demonstrate the effectiveness of our approach on the large-scale Middlebury benchmark, and show that state-of-the-art performance can be achieved with significant speedups. Computing the left and right disparity maps for a one Megapixel image pair takes about one second on a single CPU core.
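The idea of using support points as a disparity prior can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a single triangle whose three support points already have known disparities, fits a plane through them, and limits the disparity search for a pixel inside the triangle to a small band around the plane's prediction. The function names, the `radius` parameter, and the `d_max` bound are hypothetical.

```python
def plane_from_support_points(p0, p1, p2):
    """Fit d = a*u + b*v + c through three (u, v, d) support points."""
    (u0, v0, d0), (u1, v1, d1), (u2, v2, d2) = p0, p1, p2
    det = (u1 - u0) * (v2 - v0) - (u2 - u0) * (v1 - v0)
    if det == 0:
        raise ValueError("support points are collinear")
    # Solve the 2x2 linear system for the plane slopes (Cramer's rule).
    a = ((d1 - d0) * (v2 - v0) - (d2 - d0) * (v1 - v0)) / det
    b = ((u1 - u0) * (d2 - d0) - (u2 - u0) * (d1 - d0)) / det
    c = d0 - a * u0 - b * v0
    return a, b, c


def prior_disparity(plane, u, v):
    """Disparity predicted by the triangle's plane at pixel (u, v)."""
    a, b, c = plane
    return a * u + b * v + c


def candidate_disparities(prior, radius, d_max):
    """Integer disparities to test: a band around the prior (hypothetical rule)."""
    center = round(prior)
    low = max(0, center - radius)
    high = min(d_max, center + radius)
    return list(range(low, high + 1))


# Three support points (u, v, disparity) forming one triangle.
plane = plane_from_support_points((0, 0, 10), (10, 0, 20), (0, 10, 10))
prior = prior_disparity(plane, 5, 5)
candidates = candidate_disparities(prior, radius=2, d_max=64)
```

With a band of a few disparities per pixel instead of the full range, the per-pixel matching cost drops accordingly, which is the intuition behind the speedup the abstract reports.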

IEEE Intelligent Vehicles Symposium | 2010

Visual odometry based on stereo image sequences with RANSAC-based outlier rejection scheme

Bernd Kitt; Andreas Geiger; Henning Lategahn

A common prerequisite for many vision-based driver assistance systems is the knowledge of the vehicle's own movement. In this paper we propose a novel approach for estimating the egomotion of the vehicle from a sequence of stereo images. Our method is directly based on the trifocal geometry between image triples, thus no time-expensive recovery of the 3-dimensional scene structure is needed. The only assumption we make is a known camera geometry, where the calibration may also vary over time. We employ an Iterated Sigma Point Kalman Filter in combination with a RANSAC-based outlier rejection scheme which yields robust frame-to-frame motion estimation even in dynamic environments. A high-accuracy inertial navigation system is used to evaluate our results on challenging real-world video sequences. Experiments show that our approach is clearly superior to other filtering techniques in terms of both accuracy and run-time.
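The general RANSAC pattern behind such an outlier rejection scheme can be sketched on a much simpler problem. The example below fits a 2D line instead of the trifocal egomotion model described in the abstract, and it does not include the Kalman filter; the function name, threshold, iteration count, and data are all assumptions chosen for illustration.

```python
import random


def ransac_line(points, threshold=0.5, iterations=100, seed=0):
    """Fit y = m*x + b with RANSAC; return ((m, b), inlier indices)."""
    rng = random.Random(seed)
    best_model, best_inliers = None, []
    for _ in range(iterations):
        # 1. Draw a minimal sample: two points define a line.
        (x1, y1), (x2, y2) = rng.sample(points, 2)
        if x1 == x2:
            continue  # vertical line, not representable as y = m*x + b
        m = (y2 - y1) / (x2 - x1)
        b = y1 - m * x1
        # 2. Collect points whose vertical residual is within the threshold.
        inliers = [
            i for i, (x, y) in enumerate(points)
            if abs(y - (m * x + b)) <= threshold
        ]
        # 3. Keep the hypothesis with the largest consensus set.
        if len(inliers) > len(best_inliers):
            best_model, best_inliers = (m, b), inliers
    return best_model, best_inliers


# Eight points on y = 2x + 1 followed by three gross outliers.
points = [(x, 2 * x + 1) for x in range(8)] + [(1, 30), (4, -20), (6, 50)]
model, inliers = ransac_line(points)
```

In a visual odometry setting the minimal sample would be a small set of feature correspondences and the residual a reprojection error, but the sample, score, and keep-the-best loop has the same shape.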

Computer Vision and Pattern Recognition | 2015

Object scene flow for autonomous vehicles

Moritz Menze; Andreas Geiger

This paper proposes a novel model and dataset for 3D scene flow estimation with an application to autonomous driving. Taking advantage of the fact that outdoor scenes often decompose into a small number of independently moving objects, we represent each element in the scene by its rigid motion parameters and each superpixel by a 3D plane as well as an index to the corresponding object. This minimal representation increases robustness and leads to a discrete-continuous CRF where the data term decomposes into pairwise potentials between superpixels and objects. Moreover, our model intrinsically segments the scene into its constituting dynamic components. We demonstrate the performance of our model on existing benchmarks as well as a novel realistic dataset with scene flow ground truth. We obtain this dataset by annotating 400 dynamic scenes from the KITTI raw data collection using detailed 3D CAD models for all vehicles in motion. Our experiments also reveal novel challenges which cannot be handled by existing methods.

IEEE Transactions on Pattern Analysis and Machine Intelligence | 2014

3D Traffic Scene Understanding From Movable Platforms

Andreas Geiger; Martin Lauer; Christian Wojek; Christoph Stiller; Raquel Urtasun

In this paper, we present a novel probabilistic generative model for multi-object traffic scene understanding from movable platforms which reasons jointly about the 3D scene layout as well as the location and orientation of objects in the scene. In particular, the scene topology, geometry, and traffic activities are inferred from short video sequences. Inspired by the impressive driving capabilities of humans, our model does not rely on GPS, lidar, or map knowledge. Instead, it takes advantage of a diverse set of visual cues in the form of vehicle tracklets, vanishing points, semantic scene labels, scene flow, and occupancy grids. For each of these cues, we propose likelihood functions that are integrated into a probabilistic generative model. We learn all model parameters from training data using contrastive divergence. Experiments conducted on videos of 113 representative intersections show that our approach successfully infers the correct layout in a variety of very challenging scenarios. To evaluate the importance of each feature cue, experiments using different feature combinations are conducted. Furthermore, we show how by employing context derived from the proposed method we are able to improve over the state-of-the-art in terms of object detection and object orientation estimation in challenging and cluttered urban environments.

International Conference on Intelligent Transportation Systems | 2013

A new performance measure and evaluation benchmark for road detection algorithms

Jannik Fritsch; Tobias Kühnl; Andreas Geiger

Detecting the road area and ego-lane ahead of a vehicle is central to modern driver assistance systems. While lane detection on well-marked roads is already available in modern vehicles, finding the boundaries of unmarked or weakly marked roads and lanes as they appear in inner-city and rural environments remains an unsolved problem due to the high variability in scene layout and illumination conditions, amongst others. While recent years have witnessed great interest in this subject, to date no commonly agreed upon benchmark exists, rendering a fair comparison amongst methods difficult. In this paper, we introduce a novel open-access dataset and benchmark for road area and ego-lane detection. Our dataset comprises 600 annotated training and test images of high variability from the KITTI autonomous driving project, capturing a broad spectrum of urban road scenes. For evaluation, we propose to use the 2D Bird's Eye View (BEV) space as vehicle control usually happens in this 2D world, requiring detection results to be represented in this very same space. Furthermore, we propose a novel, behavior-based metric which judges the utility of the extracted ego-lane area for driver assistance applications by fitting a driving corridor to the road detection results in the BEV. We believe this to be important for a meaningful evaluation as pixel-level performance is of limited value for vehicle control. State-of-the-art road detection algorithms are used to demonstrate results using classical pixel-level metrics in perspective and BEV space as well as the novel behavior-based performance measure. All data and annotations are made publicly available on the KITTI online evaluation website in order to serve as a common benchmark for road terrain detection algorithms.
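The abstract does not list which classical pixel-level metrics are used, so the following is only a sketch of one common choice: precision, recall, and F1 computed over binary road masks. The masks could be in perspective or BEV space; the function name and the toy masks are hypothetical.

```python
def pixel_metrics(pred, truth):
    """Precision, recall and F1 for two equally sized binary masks."""
    tp = fp = fn = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            if p and t:
                tp += 1  # road predicted, road in ground truth
            elif p and not t:
                fp += 1  # road predicted, not road in ground truth
            elif t and not p:
                fn += 1  # road missed by the prediction
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if precision + recall
        else 0.0
    )
    return precision, recall, f1


# Toy 2x3 masks: 1 marks a road pixel, 0 marks everything else.
pred = [[1, 1, 0],
        [0, 1, 0]]
truth = [[1, 0, 0],
         [0, 1, 1]]
precision, recall, f1 = pixel_metrics(pred, truth)
```

A score of this kind weights every pixel equally, which is the limitation the abstract's behavior-based corridor metric is meant to address.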

International Conference on Robotics and Automation | 2012

Automatic camera and range sensor calibration using a single shot

Andreas Geiger; Frank Moosmann; Omer Car; Bernhard Schuster

As a core robotic and vision problem, camera and range sensor calibration has been researched intensely over the last decades. However, robotic research efforts still often get heavily delayed by the requirement of setting up a calibrated system consisting of multiple cameras and range measurement units. With regard to removing this burden, we present a toolbox with web interface for fully automatic camera-to-camera and camera-to-range calibration. Our system is easy to set up and recovers intrinsic and extrinsic camera parameters as well as the transformation between cameras and range sensors within one minute. In contrast to existing calibration approaches, which often require user intervention, the proposed method is robust to varying imaging conditions, fully automatic, and easy to use since a single image and range scan proves sufficient for most calibration scenarios. Experimentally, we demonstrate that the proposed checkerboard corner detector significantly outperforms the current state of the art. Furthermore, the proposed camera-to-range registration method is able to discover multiple solutions in the case of ambiguities. Experiments using a variety of sensors such as grayscale and color cameras, the Kinect 3D sensor and the Velodyne HDL-64 laser scanner show the robustness of our method in different indoor and outdoor settings and under various lighting conditions.

IEEE Transactions on Intelligent Transportation Systems | 2012

Team AnnieWAY's Entry to the 2011 Grand Cooperative Driving Challenge

Andreas Geiger; Martin Lauer; Frank Moosmann; Benjamin Ranft; Holger H. Rapp; Christoph Stiller; Julius Ziegler

In this paper, we present the concepts and methods developed for the autonomous vehicle known as AnnieWAY, which is our winning entry to the 2011 Grand Cooperative Driving Challenge. We describe algorithms for sensor fusion, vehicle-to-vehicle communication, and cooperative control. Furthermore, we analyze the performance of the proposed methods and compare them with those of competing teams. We close with our results from the competition and lessons learned.

Computer Vision and Pattern Recognition | 2013

Lost! Leveraging the Crowd for Probabilistic Visual Self-Localization

Marcus A. Brubaker; Andreas Geiger; Raquel Urtasun

In this paper we propose an affordable solution to self-localization, which utilizes visual odometry and road maps as the only inputs. To this end, we present a probabilistic model as well as an efficient approximate inference algorithm, which is able to utilize distributed computation to meet the real-time requirements of autonomous systems. Because of the probabilistic nature of the model we are able to cope with uncertainty due to noisy visual odometry and inherent ambiguities in the map (e.g., in a Manhattan world). By exploiting freely available, community-developed maps and visual odometry measurements, we are able to localize a vehicle to within 3 m after only a few seconds of driving on maps which contain more than 2,150 km of drivable roads.