Stefan Leutenegger
Imperial College London
Publications
Featured research published by Stefan Leutenegger.
International Conference on Computer Vision | 2011
Stefan Leutenegger; Margarita Chli; Roland Siegwart
Effective and efficient generation of keypoints from an image is a well-studied problem in the literature and forms the basis of numerous Computer Vision applications. Established leaders in the field are the SIFT and SURF algorithms, which exhibit great performance under a variety of image transformations, with SURF in particular considered the most computationally efficient amongst the high-performance methods to date. In this paper we propose BRISK, a novel method for keypoint detection, description and matching. A comprehensive evaluation on benchmark datasets reveals BRISK's adaptive, high-quality performance on par with state-of-the-art algorithms, albeit at a dramatically lower computational cost (an order of magnitude faster than SURF in some cases). The key to speed lies in the application of a novel scale-space FAST-based detector in combination with the assembly of a bit-string descriptor from intensity comparisons retrieved by dedicated sampling of each keypoint neighborhood.
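As an illustration of how such a binary keypoint pipeline is typically used, here is a minimal sketch built on OpenCV's BRISK implementation (cv2.BRISK_create); the image file names are placeholders, and matching uses the Hamming distance because the descriptors are bit-strings.

```python
# Minimal BRISK usage sketch via OpenCV; file names are placeholders.
import cv2

img1 = cv2.imread("frame1.png", cv2.IMREAD_GRAYSCALE)
img2 = cv2.imread("frame2.png", cv2.IMREAD_GRAYSCALE)

brisk = cv2.BRISK_create()                       # scale-space FAST-based detector
kp1, desc1 = brisk.detectAndCompute(img1, None)  # keypoints + binary descriptors
kp2, desc2 = brisk.detectAndCompute(img2, None)

# Binary descriptors are compared with the Hamming distance.
matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
matches = sorted(matcher.match(desc1, desc2), key=lambda m: m.distance)
print(f"{len(matches)} cross-checked matches")
```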
The International Journal of Robotics Research | 2015
Stefan Leutenegger; Simon Lynen; Michael Bosse; Roland Siegwart; Paul Timothy Furgale
Combining visual and inertial measurements has become popular in mobile robotics, since the two sensing modalities offer complementary characteristics that make them the ideal choice for accurate visual–inertial odometry or simultaneous localization and mapping (SLAM). While historically the problem has been addressed with filtering, advancements in visual estimation suggest that nonlinear optimization offers superior accuracy, while still being tractable in complexity thanks to the sparsity of the underlying problem. Taking inspiration from these findings, we formulate a rigorously probabilistic cost function that combines reprojection errors of landmarks and inertial terms. The problem is kept tractable, and real-time operation is ensured, by limiting the optimization to a bounded window of keyframes through marginalization. Keyframes may be spaced in time by arbitrary intervals, while still being related by linearized inertial terms. We present evaluation results on complementary datasets recorded with our custom-built stereo visual–inertial hardware that accurately synchronizes accelerometer and gyroscope measurements with imagery. A comparison of both a stereo and a monocular version of our algorithm, with and without online extrinsics estimation, is shown with respect to ground truth. Furthermore, we compare the performance to an implementation of a state-of-the-art stochastic cloning sliding-window filter. This competitive reference implementation performs tightly coupled filtering-based visual–inertial odometry. While our approach admittedly demands more computation, we show its superior performance in terms of accuracy.
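The structure of such a joint cost — reprojection residuals of landmarks plus inertial terms between consecutive keyframes, minimised together — can be sketched on a toy problem. Everything below (2D poses, bearing-only "reprojections", a simple relative-motion surrogate for the inertial terms, the weights) is an illustrative assumption, not the paper's actual formulation.

```python
# Schematic visual-inertial least-squares cost on a toy 2D problem.
import numpy as np
from scipy.optimize import least_squares

# Toy problem: 3 keyframe poses (x, y, heading) and 4 landmarks (x, y).
n_kf, n_lm = 3, 4
true_poses = np.array([[0.0, 0.0, 0.0], [1.0, 0.1, 0.05], [2.0, 0.0, 0.10]])
true_lms = np.array([[3.0, 1.0], [3.5, -1.0], [4.0, 0.5], [2.5, 2.0]])

def bearing(pose, lm):
    """Bearing angle of a landmark seen from a pose (a 1D 'reprojection')."""
    dx, dy = lm - pose[:2]
    return np.arctan2(dy, dx) - pose[2]

# Simulated measurements: bearings (visual) and relative poses (inertial-like).
bearing_meas = np.array([[bearing(p, l) for l in true_lms] for p in true_poses])
rel_meas = np.diff(true_poses, axis=0)          # surrogate for inertial terms
w_vis, w_imu = 1.0 / 0.01, 1.0 / 0.05           # inverse std-dev weights

def residuals(x):
    poses = x[: n_kf * 3].reshape(n_kf, 3)
    lms = x[n_kf * 3 :].reshape(n_lm, 2)
    r_vis = [w_vis * (bearing(p, l) - bearing_meas[i, j])
             for i, p in enumerate(poses) for j, l in enumerate(lms)]
    r_imu = (w_imu * (np.diff(poses, axis=0) - rel_meas)).ravel()
    return np.concatenate([r_vis, r_imu])

x0 = np.concatenate([true_poses.ravel() + 0.05, true_lms.ravel() + 0.1])
sol = least_squares(residuals, x0)              # sparse structure exploitable
print("converged:", sol.success)
```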
Robotics: Science and Systems | 2015
Thomas Whelan; Stefan Leutenegger; Renato F. Salas-Moreno; Ben Glocker; Andrew J. Davison
We present a novel approach to real-time dense visual SLAM. Our system is capable of capturing comprehensive dense globally consistent surfel-based maps of room scale environments explored using an RGB-D camera in an incremental online fashion, without pose graph optimisation or any post-processing steps. This is accomplished by using dense frame-to-model camera tracking and windowed surfel-based fusion coupled with frequent model refinement through non-rigid surface deformations. Our approach applies local model-to-model surface loop closure optimisations as often as possible to stay close to the mode of the map distribution, while utilising global loop closure to recover from arbitrary drift and maintain global consistency.
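The windowed surfel-based fusion step can be pictured as a confidence-weighted running average of each surfel's attributes. The following sketch is illustrative only and is not the system's code.

```python
# Illustrative confidence-weighted surfel fusion (not the paper's implementation).
import numpy as np

def fuse_surfel(pos, normal, conf, meas_pos, meas_normal, meas_conf):
    """Weighted running average of surfel attributes; returns the updated surfel."""
    total = conf + meas_conf
    new_pos = (conf * pos + meas_conf * meas_pos) / total
    n = conf * normal + meas_conf * meas_normal
    new_normal = n / np.linalg.norm(n)           # renormalise the averaged normal
    return new_pos, new_normal, total

p, n, c = np.array([1.0, 0.5, 2.0]), np.array([0.0, 0.0, 1.0]), 3.0
p2 = np.array([1.02, 0.49, 2.01])
n2 = np.array([0.05, 0.0, 0.999])
print(fuse_surfel(p, n, c, p2, n2 / np.linalg.norm(n2), 1.0))
```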
Robotics: Science and Systems | 2012
Michael Bloesch; Marco Hutter; Mark A. Hoepflinger; Stefan Leutenegger; Christian Gehring; C. D. Remy; Roland Siegwart
This paper introduces a state estimation framework for legged robots that allows estimating the full pose of the robot without making any assumptions about the geometrical structure of its environment. This is achieved by means of an Observability Constrained Extended Kalman Filter that fuses kinematic encoder data with on-board IMU measurements. By including the absolute position of all footholds into the filter state, simple model equations can be formulated which accurately capture the uncertainties associated with the intermittent ground contacts. The resulting filter simultaneously estimates the position of all footholds and the pose of the main body. In the algorithmic formulation, special attention is paid to the consistency of the linearized filter: it maintains the same observability properties as the nonlinear system, which is a prerequisite for accurate state estimation. The presented approach is implemented in simulation and validated experimentally on an actual quadrupedal robot.
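A heavily simplified sketch of the underlying idea — an EKF whose state stacks the body pose with absolute foothold positions, predicted from inertial motion and corrected with leg-kinematics measurements — is given below. It uses a 1D linear model purely for illustration; the paper's filter operates on the full pose with an observability-constrained linearisation.

```python
# Toy 1D EKF: body position plus two static footholds in the state,
# IMU-style prediction and kinematic (foothold relative to body) updates.
import numpy as np

# State: [body position, foothold_1, foothold_2].
x = np.array([0.0, 0.3, -0.3])
P = np.eye(3) * 0.01
F = np.eye(3)                                    # footholds do not move
Q = np.diag([0.02, 1e-6, 1e-6])                  # process noise (body moves)
H = np.array([[-1.0, 1.0, 0.0],                  # kinematics: foothold - body
              [-1.0, 0.0, 1.0]])
R = np.eye(2) * 0.001                            # encoder measurement noise

def ekf_step(x, P, body_vel_dt, z):
    # Predict: body position advances by integrated velocity (IMU surrogate).
    x = x + np.array([body_vel_dt, 0.0, 0.0])
    P = F @ P @ F.T + Q
    # Update with kinematic measurements of both footholds relative to the body.
    y = z - H @ x
    S = H @ P @ H.T + R
    K = P @ H.T @ np.linalg.inv(S)
    return x + K @ y, (np.eye(3) - K @ H) @ P

x, P = ekf_step(x, P, body_vel_dt=0.1, z=np.array([0.2, -0.4]))
print(x)
```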
International Conference on Robotics and Automation | 2007
Dominik J. Bell; Stefan Leutenegger; K. M. Hammar; Lixin Dong; Bradley J. Nelson
A propulsion system similar in size and motion to the helical bacterial flagella motor is presented. The system consists of a magnetic nanocoil as a propeller (27 nm thick ribbon, 3 μm in diameter, 30–40 μm long) driven by an arrangement of macro coils. The macro coils generate a rotating field that induces rotational motion in the nanocoil. Viscous forces during rotation result in a net axial propulsion force on the nanocoil. Modeling of fluid mechanics and magnetics was used to estimate the requirements for such a system. The fabrication of the magnetic nanocoils and the system setup are explained. Experimental results from electromagnetic actuation of nanocoils as well as from their propulsion in both paraffin oil and water are presented. This is the first time a propulsion system of this size and motion-type has been fabricated and experimentally verified.
IEEE Aerospace Conference | 2013
Janosch Nikolic; Michael Burri; Joern Rehder; Stefan Leutenegger; Christoph Huerzeler; Roland Siegwart
This work presents a small-scale Unmanned Aerial System (UAS) capable of performing inspection tasks in enclosed industrial environments. Vehicles with such capabilities have the potential to reduce human involvement in hazardous tasks and can minimize facility outage periods. The results presented generalize to UAS exploration tasks in almost any GPS-denied indoor environment. The contribution of this work is twofold. First, results from autonomous flights inside an industrial boiler of a power plant are presented. A lightweight, vision-aided inertial navigation system provides reliable state estimates under difficult environmental conditions typical for such sites. It relies solely on measurements from an on-board MEMS inertial measurement unit and a pair of cameras arranged in a classical stereo configuration. A model-predictive controller allows for efficient trajectory following and enables flight in close proximity to the boiler surface. As a second contribution, we highlight ongoing developments by displaying state estimation and structure recovery results acquired with an integrated visual/inertial sensor that will be employed on future aerial service robotic platforms. A tight integration in hardware facilitates spatial and temporal calibration of the different sensors and thus enables more accurate and robust ego-motion estimates. Comparison with ground truth obtained from a laser tracker shows that such a sensor can provide motion estimates with drift rates of only a few centimeters over the period of a typical flight.
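A model-predictive controller of the kind mentioned can be sketched as a receding-horizon quadratic program. The double-integrator model, horizon length, and actuator limits below are illustrative assumptions, not the controller used on the platform (requires cvxpy).

```python
# Receding-horizon (MPC) trajectory-following sketch for a 1D double integrator.
import numpy as np
import cvxpy as cp

dt, N = 0.1, 20                                   # timestep, horizon length
A = np.array([[1, dt], [0, 1]])                   # state: [position, velocity]
B = np.array([[0.5 * dt**2], [dt]])
x0 = np.array([0.0, 0.0])
ref = np.linspace(0.0, 1.0, N + 1)                # desired positions over horizon

x = cp.Variable((2, N + 1))
u = cp.Variable((1, N))
cost = cp.sum_squares(x[0, :] - ref) + 0.1 * cp.sum_squares(u)
constr = [x[:, 0] == x0, cp.abs(u) <= 2.0]        # initial state, actuator limits
constr += [x[:, k + 1] == A @ x[:, k] + B @ u[:, k] for k in range(N)]
cp.Problem(cp.Minimize(cost), constr).solve()
print("first control to apply:", float(u.value[0, 0]))  # receding-horizon step
```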
The International Journal of Robotics Research | 2016
Thomas Whelan; Renato F. Salas-Moreno; Ben Glocker; Andrew J. Davison; Stefan Leutenegger
We present a novel approach to real-time dense visual simultaneous localisation and mapping. Our system is capable of capturing comprehensive dense globally consistent surfel-based maps of room scale environments and beyond explored using an RGB-D camera in an incremental online fashion, without pose graph optimization or any post-processing steps. This is accomplished by using dense frame-to-model camera tracking and windowed surfel-based fusion coupled with frequent model refinement through non-rigid surface deformations. Our approach applies local model-to-model surface loop closure optimizations as often as possible to stay close to the mode of the map distribution, while utilizing global loop closure to recover from arbitrary drift and maintain global consistency. In the spirit of improving map quality as well as tracking accuracy and robustness, we furthermore explore a novel approach to real-time discrete light source detection. This technique is capable of detecting numerous light sources in indoor environments in real-time as a handheld camera explores the scene. Absolutely no prior information about the scene or number of light sources is required. By making a small set of simple assumptions about the appearance properties of the scene, our method can incrementally estimate both the quantity and location of multiple light sources in the environment in an online fashion. Our results demonstrate that our technique functions well in many different environments and lighting configurations. We show that this enables (a) more realistic augmented reality rendering; (b) a richer understanding of the scene beyond pure geometry; and (c) more accurate and robust photometric tracking.
Computer Vision and Pattern Recognition | 2016
Edward Johns; Stefan Leutenegger; Andrew J. Davison
A multi-view image sequence provides a much richer capacity for object recognition than a single image. However, most existing solutions to multi-view recognition typically adopt hand-crafted, model-based geometric methods, which do not readily embrace recent trends in deep learning. We propose to bring Convolutional Neural Networks to generic multi-view recognition by decomposing an image sequence into a set of image pairs, classifying each pair independently, and then learning an object classifier by weighting the contribution of each pair. This allows for recognition over arbitrary camera trajectories, without requiring explicit training over the potentially infinite number of camera paths and lengths. Building these pairwise relationships then naturally extends to the next-best-view problem in an active recognition framework. To achieve this, we train a second Convolutional Neural Network to map directly from an observed image to the next viewpoint. Finally, we incorporate this into a trajectory optimisation task, whereby the best recognition confidence is sought for a given trajectory length. We present state-of-the-art results in both guided and unguided multi-view recognition on the ModelNet dataset, and show how our method can be used with depth images, greyscale images, or both.
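The pairwise decomposition can be sketched as follows: a small CNN (a hypothetical architecture, not the paper's network) classifies each pair of views stacked along the channel axis, and the per-pair class scores are combined — here with uniform weights, where the paper learns the weighting.

```python
# Pairwise multi-view classification sketch with an illustrative toy CNN.
import itertools
import torch
import torch.nn as nn

class PairNet(nn.Module):
    def __init__(self, n_classes=10):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(2, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.AdaptiveAvgPool2d(1))
        self.classifier = nn.Linear(32, n_classes)

    def forward(self, pair):                      # pair: (B, 2, H, W), greyscale views
        return self.classifier(self.features(pair).flatten(1))

views = torch.randn(5, 1, 64, 64)                 # a 5-view greyscale sequence
net = PairNet()
pair_scores = []
for i, j in itertools.combinations(range(len(views)), 2):
    pair = torch.cat([views[i], views[j]], dim=0).unsqueeze(0)  # (1, 2, H, W)
    pair_scores.append(net(pair).softmax(dim=1))
# Combine per-pair predictions (uniform weights here; learned in the paper).
scores = torch.cat(pair_scores).mean(dim=0)
print("predicted class:", int(scores.argmax()))
```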
International Symposium on Safety, Security, and Rescue Robotics | 2012
Lorenzo Marconi; Claudio Melchiorri; Michael Beetz; Dejan Pangercic; Roland Siegwart; Stefan Leutenegger; Raffaella Carloni; Stefano Stramigioli; Herman Bruyninckx; Patrick Doherty; Alexander Kleiner; Vincenzo Lippiello; Alberto Finzi; Bruno Siciliano; A. Sala; Nicola Tomatis
The goal of the paper is to present the foreseen research activity of the European project “SHERPA”, whose activities will start officially on February 1st, 2013. The goal of SHERPA is to develop a mixed ground and aerial robotic platform to support search and rescue activities in a real-world hostile environment, like the alpine scenario that is specifically targeted in the project. Looking into the technological platform and the alpine rescue scenario, we plan to address a number of research topics in cognition and control. What makes the project potentially very rich from a scientific viewpoint is the heterogeneity and the capabilities of the different actors of the SHERPA system: the human rescuer is the “busy genius”, working in a team with the ground vehicle, as the “intelligent donkey”, and with the aerial platforms, i.e. the “trained wasps” and “patrolling hawks”. Indeed, the research activity focuses on how the “busy genius” and the “SHERPA animals” interact and collaborate with each other, with their own features and capabilities, toward the achievement of a common goal.
International Conference on Robotics and Automation | 2017
John McCormac; Ankur Handa; Andrew J. Davison; Stefan Leutenegger
Ever more robust, accurate and detailed mapping using visual sensing has proven to be an enabling factor for mobile robots across a wide variety of applications. For the next level of robot intelligence and intuitive user interaction, maps need to extend beyond geometry and appearance — they need to contain semantics. We address this challenge by combining Convolutional Neural Networks (CNNs) and a state-of-the-art dense Simultaneous Localization and Mapping (SLAM) system, ElasticFusion, which provides long-term dense correspondences between frames of indoor RGB-D video even during loopy scanning trajectories. These correspondences allow the CNN's semantic predictions from multiple viewpoints to be probabilistically fused into a map. This not only produces a useful semantic 3D map, but we also show on the NYUv2 dataset that fusing multiple predictions leads to an improvement even in the 2D semantic labelling over baseline single-frame predictions. We also show that for a smaller reconstruction dataset with larger variation in prediction viewpoint, the improvement over single-frame segmentation increases. Our system is efficient enough to allow real-time interactive use at frame-rates of ≈25 Hz.
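The probabilistic fusion of CNN predictions into the map can be sketched per surfel as a naive Bayes-style update: the stored class distribution is multiplied by each new softmax that the correspondences associate with the same surfel, then renormalised. The class list and numbers below are illustrative only.

```python
# Schematic per-surfel label fusion: multiply class distributions and renormalise.
import numpy as np

def fuse_prediction(surfel_probs, cnn_probs):
    """Multiply the stored class distribution by a new CNN softmax and renormalise."""
    p = surfel_probs * cnn_probs
    return p / p.sum()

classes = ["wall", "chair", "floor"]
surfel = np.array([1 / 3, 1 / 3, 1 / 3])          # uninformative prior
for frame_pred in [np.array([0.5, 0.3, 0.2]),     # CNN predictions from several
                   np.array([0.6, 0.2, 0.2]),     # viewpoints mapped onto the
                   np.array([0.7, 0.2, 0.1])]:    # same surfel via correspondences
    surfel = fuse_prediction(surfel, frame_pred)
print(dict(zip(classes, surfel.round(3))))        # confidence sharpens over frames
```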