Publication


Featured research published by Javier Civera.


Robotics: Science and Systems | 2006

Unified Inverse Depth Parametrization for Monocular SLAM

J. M. M. Montiel; Javier Civera; Andrew J. Davison

Recent work has shown that the probabilistic SLAM approach of explicit uncertainty propagation can succeed in permitting repeatable 3D real-time localization and mapping even in the ‘pure vision’ domain of a single agile camera with no extra sensing. An issue which has caused difficulty in monocular SLAM however is the initialization of features, since information from multiple images acquired during motion must be combined to achieve accurate depth estimates. This has led algorithms to deviate from the desirable Gaussian uncertainty representation of the EKF and related probabilistic filters during special initialization steps. In this paper we present a new unified parametrization for point features within monocular SLAM which permits efficient and accurate representation of uncertainty during undelayed initialization and beyond, all within the standard EKF (Extended Kalman Filter). The key concept is direct parametrization of inverse depth, where there is a high degree of linearity. Importantly, our parametrization can cope with features which are so far from the camera that they present little parallax during motion, maintaining sufficient representative uncertainty that these points retain the opportunity to ‘come in’ from infinity if the camera makes larger movements. We demonstrate the parametrization using real image sequences of large-scale indoor and outdoor scenes.
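A minimal numpy sketch of the idea described above: a point is stored with six parameters (the camera centre at first observation, a viewing direction as two angles, and the inverse depth along that ray), and back-projected to a Euclidean 3D point. This is not the authors' code; the angle convention follows the one commonly used in the inverse depth literature.

```python
import numpy as np

def unit_ray(theta, phi):
    """Directional vector encoded by azimuth theta and elevation phi,
    the angle convention common in inverse depth parametrizations."""
    return np.array([np.cos(phi) * np.sin(theta),
                     -np.sin(phi),
                     np.cos(phi) * np.cos(theta)])

def inverse_depth_to_xyz(feature):
    """Convert a six-parameter inverse depth feature
    (x0, y0, z0, theta, phi, rho) to a Euclidean XYZ point.
    (x0, y0, z0) is the camera centre at first observation and
    rho is the inverse of the depth along the viewing ray.
    As rho -> 0 the point recedes toward infinity, which is why
    distant, low-parallax features remain representable."""
    x0, y0, z0, theta, phi, rho = feature
    return np.array([x0, y0, z0]) + (1.0 / rho) * unit_ray(theta, phi)

# A point first seen from the origin, straight down the z-axis, 5 m away:
p = inverse_depth_to_xyz(np.array([0.0, 0.0, 0.0, 0.0, 0.0, 0.2]))
# p is approximately [0, 0, 5]
```

The high degree of linearity the abstract mentions comes from measurements depending on rho nearly linearly, even when the depth 1/rho is very uncertain.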


Computer Vision and Pattern Recognition | 2012

Learning object class detectors from weakly annotated video

Alessandro Prest; Christian Leistner; Javier Civera; Cordelia Schmid; Vittorio Ferrari

Object detectors are typically trained on a large set of still images annotated by bounding-boxes. This paper introduces an approach for learning object detectors from real-world web videos known only to contain objects of a target class. We propose a fully automatic pipeline that localizes objects in a set of videos of the class and learns a detector for it. The approach extracts candidate spatio-temporal tubes based on motion segmentation and then selects one tube per video jointly over all videos. To compare to the state of the art, we test our detector on still images, i.e., Pascal VOC 2007. We observe that frames extracted from web videos can differ significantly in terms of quality to still images taken by a good camera. Thus, we formulate the learning from videos as a domain adaptation task. We show that training from a combination of weakly annotated videos and fully annotated still images using domain adaptation improves the performance of a detector trained from still images alone.


Robotics and Autonomous Systems | 2014

C2TAM: A Cloud framework for cooperative tracking and mapping

Luis Riazuelo; Javier Civera; J. M. M. Montiel

Simultaneous Localization And Mapping by an autonomous mobile robot, known by its acronym SLAM, is a computationally demanding process for medium and large-scale scenarios, in spite of progress on both the algorithmic and hardware sides. As a consequence, a robot with SLAM capabilities has to be equipped with the latest computers, whose weight and power consumption might limit its autonomy. This paper describes a visual SLAM system based on a distributed framework where the expensive map optimization and storage are allocated as a service in the Cloud, while a light camera tracking client runs on a local computer. The robot's onboard computers are freed from most of the computation, the only extra requirement being an internet connection. The data flow from and to the Cloud is low enough to be supported by a standard wireless connection. The experimental section is focused on showing real-time performance for single-robot and cooperative SLAM using an RGBD camera. The system provides the interface to a map database where: (1) a map can be built and stored, (2) stored maps can be reused by other robots, (3) a robot can fuse its map online with a map already in the database, and (4) several robots can estimate individual maps and fuse them together if an overlap is detected.
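The tracking/mapping split described above can be sketched as two asynchronous processes joined by low-bandwidth queues. This is only a toy illustration of the architecture, not C2TAM's actual code: the "optimization" is a stand-in, and in the real system the mapping side runs as a Cloud service rather than a local thread.

```python
import queue
import threading

keyframes = queue.Queue()        # client -> cloud: only keyframes travel
optimized_maps = queue.Queue()   # cloud -> client: refined map copies

def mapping_service():
    """Cloud side: consume keyframes and run the expensive map work."""
    global_map = []
    while True:
        kf = keyframes.get()
        if kf is None:            # shutdown sentinel
            break
        global_map.append(kf)     # stand-in for map optimization/storage
        optimized_maps.put(list(global_map))

server = threading.Thread(target=mapping_service)
server.start()

# Client side: per-frame tracking stays local; only keyframes are uploaded.
for frame_id in range(5):
    if frame_id % 2 == 0:         # pretend every other frame is a keyframe
        keyframes.put(frame_id)
keyframes.put(None)
server.join()

latest = None
while not optimized_maps.empty():
    latest = optimized_maps.get()  # client pulls the newest refined map
print(latest)  # [0, 2, 4]
```

The design point is that the tracking client never blocks on the mapping service, which is what keeps the onboard computation light.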


International Conference on Robotics and Automation | 2007

Inverse Depth to Depth Conversion for Monocular SLAM

Javier Civera; Andrew J. Davison; J. M. M. Montiel

Recently it has been shown that an inverse depth parametrization can improve the performance of real-time monocular EKF SLAM, permitting undelayed initialization of features at all depths. However, the inverse depth parametrization requires the storage of 6 parameters in the state vector for each map point. This implies a noticeable computing overhead when compared with the standard 3 parameter XYZ Euclidean encoding of a 3D point, since the computational complexity of the EKF scales poorly with state vector size. In this work we propose to restrict the inverse depth parametrization only to cases where the standard Euclidean encoding implies a departure from linearity in the measurement equations. Every new map feature is still initialized using the 6 parameter inverse depth method. However, as the estimation evolves, if according to a linearity index the alternative XYZ coding can be considered linear, we show that feature parametrization can be transformed from inverse depth to XYZ for increased computational efficiency with little reduction in accuracy. We present a theoretical development of the necessary linearity indices, along with simulations to analyze the influence of the conversion threshold. Experiments performed with a 30 frames per second real-time system are reported. An analysis of the increase in the map size that can be successfully managed is included.
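The 6-to-3 parameter conversion above also has to transform the feature's covariance, which in an EKF is done by standard first-order propagation through the conversion's Jacobian. The sketch below uses a numerical Jacobian for brevity and omits the paper's linearity index and conversion threshold, which are specific to that work.

```python
import numpy as np

def unit_ray(theta, phi):
    # Viewing direction from azimuth/elevation angles.
    return np.array([np.cos(phi) * np.sin(theta),
                     -np.sin(phi),
                     np.cos(phi) * np.cos(theta)])

def id_to_xyz(y):
    # Inverse depth feature (x0, y0, z0, theta, phi, rho) -> XYZ point.
    x0, y0, z0, theta, phi, rho = y
    return np.array([x0, y0, z0]) + unit_ray(theta, phi) / rho

def convert_feature(y, P):
    """Convert a feature and its 6x6 covariance P to the XYZ encoding,
    propagating uncertainty as J P J^T (first-order EKF propagation),
    with J = d(xyz)/dy computed by central differences."""
    J = np.zeros((3, 6))
    eps = 1e-6
    for i in range(6):
        dy = np.zeros(6)
        dy[i] = eps
        J[:, i] = (id_to_xyz(y + dy) - id_to_xyz(y - dy)) / (2 * eps)
    return id_to_xyz(y), J @ P @ J.T

x, P_xyz = convert_feature(np.array([0., 0., 0., 0., 0., 0.5]),
                           np.eye(6) * 0.01)
# x is approximately [0, 0, 2]; P_xyz is the reduced 3x3 covariance
```

Shrinking each converted feature from 6 to 3 state entries is where the computational saving comes from, since EKF cost grows quadratically with state size.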


International Journal of Computer Vision | 2012

Impact of Landmark Parametrization on Monocular EKF-SLAM with Points and Lines

Joan Sola; Teresa A. Vidal-Calleja; Javier Civera; J. M. M. Montiel

This paper explores the impact that landmark parametrization has on the performance of monocular, EKF-based, 6-DOF simultaneous localization and mapping (SLAM) in the context of undelayed landmark initialization. Undelayed initialization in monocular SLAM challenges the EKF because of the combination of non-linearity with the large uncertainty associated with the unmeasured degrees of freedom. In the EKF context, the goal of a good landmark parametrization is to improve the model's linearity as much as possible, improving filter consistency and achieving more robust and accurate localization and mapping. This work compares the performances of eight different landmark parametrizations: three for points and five for straight lines. It highlights and justifies the keys for satisfactory operation: the use of parameters behaving proportionally to inverse-distance, and landmark anchoring. A unified EKF-SLAM framework is formulated as a benchmark for points and lines that is independent of the parametrization used. The paper also defines a generalized linearity index suited for the EKF, and uses it to compute and compare the degrees of linearity of each parametrization. Finally, all eight parametrizations are benchmarked employing analytical tools (the linearity index) and statistical tools (based on Monte Carlo error and consistency analyses), with simulations and real imagery data, using the standard and the robocentric EKF-SLAM formulations.


Intelligent Robots and Systems | 2009

1-point RANSAC for EKF-based Structure from Motion

Javier Civera; Oscar G. Grasa; Andrew J. Davison; J. M. M. Montiel

Recently, classical pairwise Structure From Motion (SfM) techniques have been combined with non-linear global optimization (Bundle Adjustment, BA) over a sliding window to recursively provide camera pose and feature location estimation from long image sequences. Normally called Visual Odometry, these algorithms are nowadays able to estimate with impressive accuracy trajectories of hundreds of meters; either from an image sequence (usually stereo) as the only input, or combining visual and proprioceptive information from inertial sensors or wheel odometry.
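The paper's title idea, 1-point RANSAC, exploits the fact that an EKF prior is informative enough for a single match to generate a state hypothesis, instead of the usual minimal sample. The toy below illustrates only that hypothesize-and-vote structure on a scalar state with an identity measurement model; it is not the paper's algorithm, and all constants are made up for the demonstration.

```python
import numpy as np

rng = np.random.default_rng(0)
true_x = 10.0
inlier_meas = true_x + 0.1 * rng.standard_normal(20)   # good matches
outlier_meas = rng.uniform(-50, 50, 5)                 # spurious matches
matches = np.concatenate([inlier_meas, outlier_meas])

prior_mean, prior_var, meas_var = 9.0, 4.0, 0.01
threshold = 0.5                                        # inlier gate

best_support, best_x = -1, prior_mean
for z in matches:
    # One match + the EKF prior is enough to produce a hypothesis:
    k = prior_var / (prior_var + meas_var)      # Kalman gain for h(x) = x
    x_hyp = prior_mean + k * (z - prior_mean)   # partial EKF update
    # Vote: count matches consistent with this hypothesis.
    support = int(np.sum(np.abs(matches - x_hyp) < threshold))
    if support > best_support:
        best_support, best_x = support, x_hyp

# best_x ends up close to the true value, since only hypotheses seeded
# by inlier matches gather large support.
```

Because each hypothesis costs one sample rather than a minimal set of five or more, far fewer RANSAC iterations are needed for the same confidence.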


Intelligent Robots and Systems | 2011

Towards semantic SLAM using a monocular camera

Javier Civera; Dorian Gálvez-López; Luis Riazuelo; Juan D. Tardós; J. M. M. Montiel

Monocular SLAM systems have been mainly focused on producing geometric maps composed only of points or edges, without any associated meaning or semantic content. In this paper, we propose a semantic SLAM algorithm that merges in the estimated map traditional meaningless points with known objects. The non-annotated map is built using only the information extracted from a monocular image sequence. The known object models are automatically computed from a sparse set of images gathered by cameras that may be different from the SLAM camera. The models include both visual appearance and three-dimensional information. The semantic or annotated part of the map (the objects) is estimated using the information in the image sequence and the precomputed object models.


International Conference on Robotics and Automation | 2011

EKF monocular SLAM with relocalization for laparoscopic sequences

Oscar G. Grasa; Javier Civera; J. M. M. Montiel

In recent years, research on visual SLAM has produced robust algorithms providing, in real time at 30 Hz, both the 3D model of the observed rigid scene and the 3D camera motion using as only input the gathered image sequence. These algorithms have been extensively validated in rigid human-made environments (indoor and outdoor), showing robust performance in dealing with clutter, occlusions or sudden motions. Medical endoscopic sequences naturally pose a monocular SLAM problem: an unknown camera motion in an unknown environment. The corresponding map would be useful in providing 3D information to assist surgeons, to support augmented reality insertions or to be exploited by medical robots. In this paper we propose the combination of EKF Monocular SLAM + 1-Point RANSAC + Randomised List Relocalization to process laparoscopic sequences (abdominal cavity images). The sequences are challenging due to: 1) clutter produced by tools; 2) sudden motions of the camera; 3) the laparoscope frequently going in and out of the abdominal cavity; 4) tissue deformation caused by respiration, heartbeats and/or surgical tools. Real medical image sequences provide experimental validation.


Intelligent Robots and Systems | 2015

Stereo parallel tracking and mapping for robot localization

Taihú Pire; Thomas Fischer; Javier Civera; Pablo De Cristóforis; Julio Jacobo Berllés

This paper describes a visual SLAM system based on stereo cameras and focused on real-time localization for mobile robots. To achieve this, it heavily exploits the parallel nature of the SLAM problem, separating the time-constrained pose estimation from less pressing matters such as map building and refinement tasks. On the other hand, the stereo setting makes it possible to reconstruct a metric 3D map for each frame of stereo images, improving the accuracy of the mapping process with respect to monocular SLAM and avoiding the well-known bootstrapping problem. Also, the real scale of the environment is an essential feature for robots that have to interact with their surrounding workspace. A series of experiments, on-line on a robot as well as off-line with public datasets, are performed to validate the accuracy and real-time performance of the developed method.
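The metric scale the abstract credits to stereo comes directly from the known baseline: for a rectified pair, depth follows from disparity alone, with no bootstrapping sequence needed. A minimal sketch of that standard relation (example numbers are illustrative, not from the paper):

```python
def depth_from_disparity(disparity_px, focal_px, baseline_m):
    """Depth of a point from a rectified stereo pair: z = f * b / d,
    where f is the focal length in pixels, b the baseline in meters,
    and d the disparity in pixels."""
    return focal_px * baseline_m / disparity_px

# e.g. 700 px focal length, 12 cm baseline, 14 px disparity -> 6 m depth
z = depth_from_disparity(14.0, 700.0, 0.12)
```

Monocular SLAM, by contrast, can only recover such depths up to an unknown global scale, which is why metric scale matters for robots acting in their workspace.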


Intelligent Robots and Systems | 2015

DPPTAM: Dense piecewise planar tracking and mapping from a monocular sequence

Alejo Concha; Javier Civera

This paper proposes a direct monocular SLAM algorithm that estimates a dense reconstruction of a scene in real-time on a CPU. Highly textured image areas are mapped using standard direct mapping techniques [1], that minimize the photometric error across different views. We make the assumption that homogeneous-color regions belong to approximately planar areas. Our contribution is a new algorithm for the estimation of such planar areas, based on the information of a superpixel segmentation and the semidense map from highly textured areas. We compare our approach against several alternatives using the public TUM dataset [2] and additional live experiments with a hand-held camera. We demonstrate that our proposal for piecewise planar monocular SLAM is faster, more accurate and more robust than the piecewise planar baseline [3]. In addition, our experimental results show how the depth regularization of monocular maps can damage their accuracy, making the piecewise planar assumption a reasonable option in indoor scenarios.
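A core ingredient of any piecewise planar approach like the one above is fitting a plane to the sparse 3D points that fall inside a region (here, a superpixel). The sketch below shows the standard least-squares plane fit via SVD of the centred points; it illustrates the geometric primitive only, not DPPTAM's actual estimation algorithm.

```python
import numpy as np

def fit_plane(points):
    """Least-squares plane through an (N, 3) array of 3D points.
    Returns (normal, d) with normal . p + d = 0 for points p on the
    plane, using SVD: the right singular vector with the smallest
    singular value is the direction of least variance, i.e. the normal."""
    centroid = points.mean(axis=0)
    _, _, vt = np.linalg.svd(points - centroid)
    normal = vt[-1]
    return normal, -normal @ centroid

# Semidense points from a (hypothetical) superpixel, all on the plane z = 1:
pts = np.array([[0, 0, 1], [1, 0, 1], [0, 1, 1],
                [1, 1, 1], [0.5, 0.5, 1]], dtype=float)
n, d = fit_plane(pts)
# n is approximately ±(0, 0, 1) and d approximately ∓1
```

Once a plane is estimated, every pixel of the homogeneous-color region can be assigned a depth by intersecting its viewing ray with the plane, which is what densifies the otherwise semidense map.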
