Network


Latest external collaborations at the country level. Dive into the details by clicking on the dots.

Hotspot


Dive into the research topics where Davide Scaramuzza is active.

Publication


Featured research published by Davide Scaramuzza.


IEEE Robotics & Automation Magazine | 2011

Visual Odometry [Tutorial]

Davide Scaramuzza; Friedrich Fraundorfer

Visual odometry (VO) is the process of estimating the egomotion of an agent (e.g., vehicle, human, or robot) using only the input of a single camera or multiple cameras attached to it. Application domains include robotics, wearable computing, augmented reality, and automotive. The term VO was coined in 2004 by Nister in his landmark paper. The term was chosen for its similarity to wheel odometry, which incrementally estimates the motion of a vehicle by integrating the number of turns of its wheels over time. Likewise, VO operates by incrementally estimating the pose of the vehicle through examination of the changes that motion induces on the images of its onboard cameras. For VO to work effectively, there should be sufficient illumination in the environment and a static scene with enough texture to allow apparent motion to be extracted. Furthermore, consecutive frames should be captured by ensuring that they have sufficient scene overlap.
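To make the incremental-estimation idea concrete, below is a minimal monocular VO sketch in Python. It assumes OpenCV's standard feature and five-point essential-matrix routines; the intrinsics are placeholders, and the frame-to-frame translation is recoverable only up to scale (the classic monocular ambiguity).

```python
# Minimal frame-to-frame monocular VO loop (sketch; assumes OpenCV and
# a calibrated camera -- the intrinsics below are hypothetical).
import cv2
import numpy as np

K = np.array([[700.0,   0.0, 320.0],
              [  0.0, 700.0, 240.0],
              [  0.0,   0.0,   1.0]])

orb = cv2.ORB_create(2000)
bf = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)

def relative_pose(prev_img, curr_img):
    """Estimate R, t between consecutive frames; t is known only up to scale."""
    kp1, d1 = orb.detectAndCompute(prev_img, None)
    kp2, d2 = orb.detectAndCompute(curr_img, None)
    matches = bf.match(d1, d2)
    p1 = np.float32([kp1[m.queryIdx].pt for m in matches])
    p2 = np.float32([kp2[m.trainIdx].pt for m in matches])
    E, inliers = cv2.findEssentialMat(p1, p2, K, cv2.RANSAC, 0.999, 1.0)
    _, R, t, _ = cv2.recoverPose(E, p1, p2, K, mask=inliers)
    return R, t

# Incremental concatenation: the world pose of frame k is the previous
# pose composed with the inverse of the frame-to-frame transform.
T_world = np.eye(4)

def integrate(R, t, scale=1.0):
    global T_world
    T = np.eye(4)
    T[:3, :3] = R
    T[:3, 3] = scale * t.ravel()
    T_world = T_world @ np.linalg.inv(T)
```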


International Conference on Robotics and Automation | 2014

SVO: Fast semi-direct monocular visual odometry

Christian Forster; Matia Pizzoli; Davide Scaramuzza

We propose a semi-direct monocular visual odometry algorithm that is precise, robust, and faster than current state-of-the-art methods. The semi-direct approach eliminates the need for costly feature extraction and robust matching techniques for motion estimation. Our algorithm operates directly on pixel intensities, which results in subpixel precision at high frame-rates. A probabilistic mapping method that explicitly models outlier measurements is used to estimate 3D points, which results in fewer outliers and more reliable points. Precise and high frame-rate motion estimation brings increased robustness in scenes of little, repetitive, and high-frequency texture. The algorithm is applied to micro-aerial-vehicle state estimation in GPS-denied environments and runs at 55 frames per second on the onboard embedded computer and at more than 300 frames per second on a consumer laptop. We call our approach SVO (Semi-direct Visual Odometry) and release our implementation as open-source software.
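A hedged sketch of the "direct" ingredient: rather than extracting and matching features, the pose is refined by minimizing the photometric error of a few sparse patches, with bilinear interpolation supplying the subpixel precision mentioned above. The use of SciPy's generic least-squares solver and all names here are illustrative, not the released SVO implementation.

```python
# Pose refinement by photometric error over sparse patches (sketch).
import numpy as np
from scipy.optimize import least_squares
from scipy.spatial.transform import Rotation

K = np.array([[700.0, 0.0, 320.0],
              [0.0, 700.0, 240.0],
              [0.0, 0.0, 1.0]])  # hypothetical intrinsics

def project(pose6, pts3d):
    """Project map points with pose = [rotation vector (3), translation (3)]."""
    Rm = Rotation.from_rotvec(pose6[:3]).as_matrix()
    pc = pts3d @ Rm.T + pose6[3:]
    uv = (pc / pc[:, 2:3]) @ K.T
    return uv[:, :2]

def bilinear(img, uv):
    """Sub-pixel intensity lookup (no bounds checking; sketch only)."""
    x, y = uv[:, 0], uv[:, 1]
    x0, y0 = np.floor(x).astype(int), np.floor(y).astype(int)
    wx, wy = x - x0, y - y0
    return ((1 - wx) * (1 - wy) * img[y0, x0] + wx * (1 - wy) * img[y0, x0 + 1]
            + (1 - wx) * wy * img[y0 + 1, x0] + wx * wy * img[y0 + 1, x0 + 1])

def refine_pose(img, pts3d, ref_intensities, pose_init):
    """Minimize intensity residuals directly -- no descriptor matching."""
    def residual(pose6):
        return bilinear(img, project(pose6, pts3d)) - ref_intensities
    return least_squares(residual, pose_init, method="lm").x
```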


International Conference on Robotics and Automation | 2010

Vision based MAV navigation in unknown and unstructured environments

Michael Blösch; Stephan Weiss; Davide Scaramuzza; Roland Siegwart

Within the research on Micro Aerial Vehicles (MAVs), the field of flight control and autonomous mission execution is one of the most active. A crucial point is the localization of the vehicle, which is especially difficult in unknown, GPS-denied environments. This paper presents a novel vision-based approach in which the vehicle is localized using a downward-looking monocular camera. A state-of-the-art visual SLAM algorithm tracks the pose of the camera while simultaneously building an incremental map of the surrounding region. Based on this pose estimate, an LQG/LTR-based controller stabilizes the vehicle at a desired setpoint, making simple maneuvers such as take-off, hovering, setpoint following, and landing possible. Experimental data show that this approach efficiently controls a helicopter while navigating through an unknown and unstructured environment. To the best of our knowledge, this is the first work describing a micro aerial vehicle able to navigate through an unexplored environment (independently of any external aid like GPS or artificial beacons) using a single camera as its only exteroceptive sensor.
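The paper's controller is an LQG/LTR design; as a simplified, hedged illustration, the sketch below computes the related LQR feedback gain for a single translational axis modeled as a double integrator, stabilizing a setpoint from the visual pose estimate. All weights are placeholders.

```python
# Per-axis LQR setpoint controller on a double-integrator model (sketch).
import numpy as np
from scipy.linalg import solve_continuous_are

A = np.array([[0.0, 1.0],
              [0.0, 0.0]])            # x' = v, v' = u (mass-normalized)
B = np.array([[0.0], [1.0]])
Q = np.diag([10.0, 1.0])              # penalize position error most
R = np.array([[0.1]])                 # control effort weight

P = solve_continuous_are(A, B, Q, R)  # solve the Riccati equation
Kgain = np.linalg.solve(R, B.T @ P)   # u = -Kgain @ (x - x_ref)

def control(pos, vel, pos_ref):
    """Acceleration command toward the setpoint (hover once at rest)."""
    err = np.array([pos - pos_ref, vel])
    return -(Kgain @ err).item()
```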


Journal of Field Robotics | 2011

Monocular-SLAM–based navigation for autonomous micro helicopters in GPS-denied environments

Stephan Weiss; Davide Scaramuzza; Roland Siegwart

Autonomous micro aerial vehicles (MAVs) will soon play a major role in tasks such as search and rescue, environment monitoring, surveillance, and inspection. They allow us to easily access environments that no humans or other vehicles can reach. This reduces the risk for both the people and the environment. For the above applications, it is, however, a requirement that the vehicle be able to navigate without using GPS, without relying on a preexisting map, and without specific assumptions about the environment. This allows operation in unstructured, unknown, and GPS-denied environments. We present a novel solution for the task of autonomous navigation of a micro helicopter through a completely unknown environment using solely a single camera and inertial sensors onboard. Many existing solutions suffer from drift in the xy plane or from dependency on a clean GPS signal. The novelty of the approach presented here is to use a monocular simultaneous localization and mapping (SLAM) framework to stabilize the vehicle in six degrees of freedom. In this way, we overcome both the drift and the GPS dependency. The pose estimated by the visual SLAM algorithm is used in a linear optimal controller that allows us to perform all basic maneuvers such as hovering, setpoint and trajectory following, vertical takeoff, and landing. All calculations, including SLAM and the controller, run in real time and online while the helicopter is flying. No offline processing or preprocessing is done. We show real experiments demonstrating that the vehicle can fly autonomously in an unknown and unstructured environment. To the best of our knowledge, the work presented here describes the first aerial vehicle that uses onboard monocular vision as its main sensor to navigate through an unknown GPS-denied environment, independently of any external artificial aids.
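A toy, hedged illustration of why absolute pose feedback removes the xy drift: integrating a noisy velocity estimate (odometry-style) accumulates unbounded error, whereas a noisy but absolute SLAM position estimate keeps the error bounded. All numbers are arbitrary.

```python
# Drift of integrated velocity vs. bounded error of absolute position.
import numpy as np

rng = np.random.default_rng(0)
T, dt = 2000, 0.01
true_pos = np.sin(np.linspace(0.0, 4.0 * np.pi, T))      # toy trajectory

vel_meas = np.gradient(true_pos, dt) + rng.normal(0, 0.05, T)
odom_est = np.cumsum(vel_meas) * dt                      # drifts over time
slam_est = true_pos + rng.normal(0, 0.01, T)             # bounded error

print("final odometry error:", abs(odom_est[-1] - true_pos[-1]))
print("final SLAM error:   ", abs(slam_est[-1] - true_pos[-1]))
```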


IEEE Robotics & Automation Magazine | 2012

Visual Odometry: Part II: Matching, Robustness, Optimization, and Applications

Friedrich Fraundorfer; Davide Scaramuzza

Part II of the tutorial summarizes the remaining building blocks of the VO pipeline: specifically, how to detect and match salient and repeatable features across frames, robust estimation in the presence of outliers, and bundle adjustment. In addition, error propagation, applications, and links to publicly available code are included. VO is a well-understood and established part of robotics. VO has reached a maturity that has allowed us to successfully use it for certain classes of applications: space, ground, aerial, and underwater. In the presence of loop closures, VO can be used as a building block for a complete SLAM algorithm to reduce motion drift. Challenges that still remain are to develop and demonstrate large-scale and long-term implementations, such as driving autonomous cars for hundreds of miles. Such systems have recently been demonstrated using lidar and radar sensors [86]. However, for VO to be used in such systems, technical issues regarding robustness and, especially, long-term stability have to be resolved. Eventually, VO has the potential to replace lidar-based systems for egomotion estimation, which are currently leading the state of the art in accuracy, robustness, and reliability. VO offers a cheaper and mechanically easier-to-manufacture solution for egomotion estimation while additionally being fully passive. Furthermore, the ongoing miniaturization of digital cameras offers the possibility to develop smaller and smaller robotic systems capable of egomotion estimation.
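Since robust estimation in the presence of outliers is a core building block discussed here, below is a generic RANSAC loop as a hedged sketch; for VO, the fit callback would typically wrap a minimal solver such as the five-point essential-matrix algorithm.

```python
# Generic RANSAC loop (sketch): fit on a minimal sample, count inliers,
# keep the best hypothesis, then refit on all inliers.
import numpy as np

def ransac(data, fit, err, n_min, thresh, iters=1000, rng=None):
    rng = rng if rng is not None else np.random.default_rng()
    best_inliers = np.zeros(len(data), dtype=bool)
    for _ in range(iters):
        sample = data[rng.choice(len(data), n_min, replace=False)]
        model = fit(sample)
        inliers = err(model, data) < thresh
        if inliers.sum() > best_inliers.sum():
            best_inliers = inliers
    # Final refit on the consensus set.
    return fit(data[best_inliers]), best_inliers
```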


International Conference on Computer Vision Systems | 2006

A Flexible Technique for Accurate Omnidirectional Camera Calibration and Structure from Motion

Davide Scaramuzza; Agostino Martinelli; Roland Siegwart

In this paper, we present a flexible new technique for single viewpoint omnidirectional camera calibration. The proposed method only requires the camera to observe a planar pattern shown at a few different orientations. Either the camera or the planar pattern can be freely moved. No a priori knowledge of the motion is required, nor a specific model of the omnidirectional sensor. The only assumption is that the image projection function can be described by a Taylor series expansion whose coefficients are estimated by solving a two-step least-squares linear minimization problem. To test the proposed technique, we calibrated a panoramic camera having a field of view greater than 200° in the vertical direction, and we obtained very good results. To investigate the accuracy of the calibration, we also used the estimated omni-camera model in a structure from motion experiment. We obtained a 3D metric reconstruction of a scene from two highly distorted omnidirectional images by using image correspondences only. Compared with classical techniques, which rely on a specific parametric model of the omnidirectional camera, the proposed procedure is independent of the sensor, easy to use, and flexible.
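A hedged sketch of the projection model described above: an idealized, center-subtracted pixel maps to the viewing ray g(u, v) = (u, v, f(rho)), where f is a polynomial in rho = sqrt(u^2 + v^2) whose coefficients are what the calibration estimates. The coefficient values below are placeholders, not calibrated values.

```python
# Back-projection under the Taylor-series model (placeholder coefficients).
import numpy as np

coeffs = [-142.0, 0.0, 6.6e-4, -1.8e-7, 5.0e-11]  # hypothetical a0..a4

def pixel_to_ray(u, v, coeffs):
    """Map a center-subtracted pixel (u, v) to a unit viewing ray."""
    rho = np.hypot(u, v)
    z = np.polyval(coeffs[::-1], rho)  # f(rho); polyval wants highest order first
    ray = np.array([u, v, z])
    return ray / np.linalg.norm(ray)
```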


IEEE Transactions on Robotics | 2008

Appearance-Guided Monocular Omnidirectional Visual Odometry for Outdoor Ground Vehicles

Davide Scaramuzza; Roland Siegwart

In this paper, we describe a real-time algorithm for computing the ego-motion of a vehicle relative to the road. The algorithm uses as input only those images provided by a single omnidirectional camera mounted on the roof of the vehicle. The front ends of the system are two different trackers. The first one is a homography-based tracker that detects and matches robust scale-invariant features that most likely belong to the ground plane. The second one uses an appearance-based approach and gives high-resolution estimates of the rotation of the vehicle. This planar pose estimation method has been successfully applied to videos from an automotive platform. We give an example of camera trajectory estimated purely from omnidirectional images over a distance of 400 m. For performance evaluation, the estimated path is superimposed onto a satellite image. In the end, we use image mosaicing to obtain a textured 2-D reconstruction of the estimated path.
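A hedged sketch of the appearance-based rotation idea: in an unwrapped (panoramic) omnidirectional image, a yaw rotation appears as a circular horizontal shift of the columns, so the rotation can be read off the shift that best correlates two consecutive panoramas. The sign convention depends on the unwrapping direction and is an assumption here.

```python
# Yaw from the circular column shift between two unwrapped panoramas.
import numpy as np

def yaw_from_panoramas(pano_prev, pano_curr):
    """Panoramas are H x W grayscale arrays; returns yaw in radians."""
    a = pano_prev.mean(axis=0) - pano_prev.mean()   # 1-D column signatures
    b = pano_curr.mean(axis=0) - pano_curr.mean()
    corr = np.fft.ifft(np.fft.fft(a) * np.conj(np.fft.fft(b))).real
    shift = int(np.argmax(corr))                    # best circular shift
    W = a.size
    if shift > W // 2:
        shift -= W                                  # signed shift
    return 2.0 * np.pi * shift / W                  # columns -> radians
```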


Computer Vision and Pattern Recognition | 2011

A novel parametrization of the perspective-three-point problem for a direct computation of absolute camera position and orientation

Laurent Kneip; Davide Scaramuzza; Roland Siegwart

The Perspective-Three-Point (P3P) problem aims at determining the position and orientation of the camera in the world reference frame from three 2D-3D point correspondences. This problem is known to provide up to four solutions that can then be disambiguated using a fourth point. All existing solutions attempt to first solve for the position of the points in the camera reference frame, and then compute the position and orientation of the camera in the world frame, which aligns the two point sets. In contrast, in this paper we propose a novel closed-form solution to the P3P problem, which computes the aligning transformation directly in a single stage, without the intermediate derivation of the points in the camera frame. This is made possible by introducing intermediate camera and world reference frames, and expressing their relative position and orientation using only two parameters. The projection of a world point into the parametrized camera pose then leads to two conditions and finally a quartic equation for finding up to four solutions for the parameter pair. A subsequent back-substitution directly leads to the corresponding camera poses with respect to the world reference frame. We show that the proposed algorithm offers accuracy and precision comparable to a popular, standard, state-of-the-art approach, but at much lower computational cost (15 times faster). Furthermore, it provides improved numerical stability and is less affected by degenerate configurations of the selected world points. The superior computational efficiency is particularly suitable for any RANSAC outlier-rejection step, which is always recommended before applying PnP or non-linear optimization of the final solution.
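As a hedged usage sketch of the RANSAC recommendation in the last sentence: OpenCV exposes P3P solvers through cv2.solveP3P (its flags name other solvers, not necessarily this paper's parametrization), and a fourth point disambiguates the up-to-four solutions.

```python
# Solve P3P on three correspondences, disambiguate with a fourth (sketch).
import cv2
import numpy as np

K = np.array([[700.0, 0.0, 320.0],
              [0.0, 700.0, 240.0],
              [0.0, 0.0, 1.0]])  # hypothetical intrinsics

def p3p_best_pose(pts3d, pts2d):
    """pts3d: 4x3 world points, pts2d: 4x2 pixels; returns (rvec, tvec)."""
    n, rvecs, tvecs = cv2.solveP3P(pts3d[:3].astype(np.float32),
                                   pts2d[:3].astype(np.float32),
                                   K, None, flags=cv2.SOLVEPNP_P3P)
    best, best_err = None, np.inf
    for rvec, tvec in zip(rvecs, tvecs):
        proj, _ = cv2.projectPoints(pts3d[3:4], rvec, tvec, K, None)
        err = np.linalg.norm(proj.ravel() - pts2d[3])   # 4th-point residual
        if err < best_err:
            best, best_err = (rvec, tvec), err
    return best
```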


International Conference on Robotics and Automation | 2009

Real-time monocular visual odometry for on-road vehicles with 1-point RANSAC

Davide Scaramuzza; Friedrich Fraundorfer; Roland Siegwart

This paper presents a system capable of recovering the trajectory of a vehicle from the video input of a single camera at a very high frame-rate. The overall frame-rate is limited only by the feature extraction process, as the outlier removal and the motion estimation steps take less than 1 millisecond on a normal laptop computer. The algorithm relies on a novel way of removing the outliers of the feature matching process. We show that by exploiting the nonholonomic constraints of wheeled vehicles it is possible to use a restrictive motion model which allows us to parameterize the motion with only 1 feature correspondence. Using a single feature correspondence for motion estimation is the lowest model parameterization possible and results in the most efficient algorithms for removing outliers. Here we present two methods for outlier removal: one based on RANSAC and the other based on histogram voting. We demonstrate the approach using an omnidirectional camera placed on a vehicle during a peak-time tour in the city of Zurich. We show that the proposed algorithm is able to cope with the large amount of clutter of the city (other moving cars, buses, trams, pedestrians, sudden stops of the vehicle, etc.). Using the proposed approach, we cover one of the longest trajectories ever reported in real-time from a single omnidirectional camera in cluttered urban scenes: up to 3 kilometers.
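A hedged sketch of the histogram-voting variant: under the paper's restrictive (circular) motion model, each correspondence yields one rotation-angle hypothesis, and the histogram mode selects the inliers. The closed form in theta_hypotheses is our restatement of the planar circular-motion relation on normalized image coordinates; treat it as an assumption and verify against the paper.

```python
# Histogram voting over per-correspondence rotation hypotheses (sketch).
import numpy as np

def theta_hypotheses(p1, p2):
    """One rotation-angle hypothesis per correspondence (N x 2 normalized
    image coordinates); formula restated from the planar circular-motion
    model -- verify against the paper before relying on it."""
    return -2.0 * np.arctan2(p2[:, 1] - p1[:, 1], p2[:, 0] + p1[:, 0])

def histogram_vote(p1, p2, bins=360, tol=np.deg2rad(0.5)):
    thetas = theta_hypotheses(p1, p2)
    hist, edges = np.histogram(thetas, bins=bins, range=(-np.pi, np.pi))
    k = int(np.argmax(hist))
    theta = 0.5 * (edges[k] + edges[k + 1])         # mode of the histogram
    inliers = np.abs(thetas - theta) < tol          # outlier rejection
    return theta, inliers
```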


Journal of Intelligent and Robotic Systems | 2011

Fusion of IMU and Vision for Absolute Scale Estimation in Monocular SLAM

Gabriel Nützi; Stephan Weiss; Davide Scaramuzza; Roland Siegwart

The fusion of inertial and visual data is widely used to improve an object's pose estimation. However, this type of fusion is rarely used to estimate further unknowns in the visual framework. In this paper we present and compare two different approaches to estimating the unknown scale parameter in a monocular SLAM framework. Directly linked to the scale is the estimation of the object's absolute velocity and position in 3D. The first approach is a spline-fitting task adapted from Jung and Taylor, and the second is an extended Kalman filter. Both methods have been simulated offline on arbitrary camera paths to analyze their behavior and the quality of the resulting scale estimation. We then embedded an online multi-rate extended Kalman filter in the Parallel Tracking and Mapping (PTAM) algorithm of Klein and Murray together with an inertial sensor. In this inertial/monocular SLAM framework, we show real-time, robust, and fast-converging scale estimation. Our approach depends neither on known patterns in the vision part nor on a complex temporal synchronization between the visual and inertial sensors.
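A hedged 1-D sketch of the second approach: an EKF whose state is metric position, velocity, and the unknown visual scale. The IMU drives the prediction with metric acceleration, and the SLAM position is modeled as z = p / lam, which makes the scale observable once the vehicle accelerates. All noise values are placeholders, and this is not the paper's filter formulation.

```python
# 1-D EKF with state [p, v, lam]: metric position, velocity, visual scale.
import numpy as np

x = np.array([0.0, 0.0, 2.0])        # initial guess (placeholders)
P = np.diag([1.0, 1.0, 4.0])
Q = np.diag([1e-4, 1e-2, 1e-6])      # process noise (placeholders)
Rm = 1e-3                            # vision measurement noise

def predict(x, P, acc, dt):
    """IMU-driven prediction; the scale lam is modeled as constant."""
    F = np.array([[1.0, dt, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]])
    x = F @ x + np.array([0.0, acc * dt, 0.0])
    return x, F @ P @ F.T + Q

def update(x, P, z_slam):
    """Vision update with nonlinear measurement z = p / lam."""
    p, _, lam = x
    H = np.array([1.0 / lam, 0.0, -p / lam**2])   # Jacobian of p / lam
    S = H @ P @ H + Rm
    Kg = (P @ H) / S
    x = x + Kg * (z_slam - p / lam)
    return x, (np.eye(3) - np.outer(Kg, H)) @ P
```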

Collaboration


Dive into Davide Scaramuzza's collaboration.

Top Co-Authors

Friedrich Fraundorfer

Graz University of Technology
