Publication


Featured research published by Alejo Concha.


Intelligent Robots and Systems (IROS) | 2015

DPPTAM: Dense piecewise planar tracking and mapping from a monocular sequence

Alejo Concha; Javier Civera

This paper proposes a direct monocular SLAM algorithm that estimates a dense reconstruction of a scene in real-time on a CPU. Highly textured image areas are mapped using standard direct mapping techniques [1], which minimize the photometric error across different views. We make the assumption that homogeneous-color regions belong to approximately planar areas. Our contribution is a new algorithm for the estimation of such planar areas, based on the information of a superpixel segmentation and the semidense map from highly textured areas. We compare our approach against several alternatives using the public TUM dataset [2] and additional live experiments with a hand-held camera. We demonstrate that our proposal for piecewise planar monocular SLAM is faster, more accurate and more robust than the piecewise planar baseline [3]. In addition, our experimental results show how the depth regularization of monocular maps can degrade their accuracy, making the piecewise planar assumption a reasonable option in indoor scenarios.
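The core of such direct methods is the photometric error: a pixel's intensity in one view should match the intensity at its reprojection in another view, given a depth hypothesis. A minimal sketch follows (hypothetical camera intrinsics and pose; nearest-neighbour intensity lookup instead of the sub-pixel interpolation a real system would use):

```python
# Sketch of the photometric residual used by direct mapping methods.
# K, T, and the lookup scheme are illustrative, not DPPTAM's actual code.
import numpy as np

def project(K, T, p_ref, inv_depth):
    """Back-project pixel p_ref = (u, v) at a given inverse depth,
    move it by the relative pose T (4x4), and project into the
    second camera with intrinsics K (3x3)."""
    u, v = p_ref
    ray = np.linalg.inv(K) @ np.array([u, v, 1.0])   # viewing ray through pixel
    X = np.append(ray / inv_depth, 1.0)              # 3D point, homogeneous
    Xc = T @ X                                       # point in target camera frame
    uvw = K @ Xc[:3]
    return uvw[:2] / uvw[2]                          # pixel in target view

def photometric_error(I_ref, I_tgt, K, T, p_ref, inv_depth):
    """Intensity difference between a reference pixel and its
    reprojection in the target image."""
    u, v = np.round(project(K, T, p_ref, inv_depth)).astype(int)
    return float(I_ref[p_ref[1], p_ref[0]] - I_tgt[v, u])
```

Direct mapping minimizes this residual jointly over the inverse depths (and camera poses) of many pixels at once.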


International Conference on Robotics and Automation (ICRA) | 2014

Using superpixels in monocular SLAM

Alejo Concha; Javier Civera

Monocular SLAM and Structure from Motion have traditionally been based on finding point correspondences in highly-textured image areas. Large textureless regions, usually found in indoor and urban environments, are difficult to reconstruct with these systems. In this paper we augment, for the first time, traditional point-based monocular SLAM maps with superpixels. Superpixels are mid-level features consisting of image regions of homogeneous texture. We propose a novel scheme for superpixel matching, 3D initialization and optimization that overcomes the difficulties of salient-point-based approaches in these areas of homogeneous texture. Our experimental results show the validity of our approach. First, we compare our proposal with a state-of-the-art multiview stereo system, reconstructing the textureless regions that the latter cannot. Second, we present experimental results of our algorithm integrated with the point-based PTAM [1], now estimating the textureless superpixel areas in real time. Finally, we show the accuracy of the presented algorithm with a quantitative analysis of the estimation error.
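The 3D initialization of a superpixel can be illustrated by fitting a plane to whatever sparse map points fall inside its image region. The least-squares fit via SVD below is a generic sketch, not the paper's actual initialization scheme:

```python
# Illustrative plane fit for a superpixel's 3D initialization.
import numpy as np

def fit_plane(points):
    """Least-squares plane through an (N, 3) array of 3D points:
    returns the centroid and the unit normal, taken as the singular
    vector of the centred point cloud with the smallest singular value."""
    centroid = points.mean(axis=0)
    _, _, vt = np.linalg.svd(points - centroid)
    return centroid, vt[-1]
```

Once a superpixel has a plane hypothesis, its depth at every pixel follows from intersecting each viewing ray with that plane, which is what lets homogeneous-texture regions be reconstructed densely.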


International Conference on Robotics and Automation (ICRA) | 2016

Visual-inertial direct SLAM

Alejo Concha; Giuseppe Loianno; Vijay Kumar; Javier Civera

The so-called direct visual SLAM methods have shown great potential in estimating a semidense or fully dense reconstruction of the scene, in contrast to the sparse reconstructions of traditional feature-based algorithms. In this paper, we propose for the first time a direct, tightly-coupled formulation for the combination of visual and inertial data. Our algorithm runs in real-time on a standard CPU. The processing is split into three threads. The first thread runs at frame rate and estimates the camera motion by a joint non-linear optimization from visual and inertial data, given a semidense map. The second one creates a semidense map of high-gradient areas, used only for camera tracking. Finally, the third thread estimates a fully dense reconstruction of the scene at a lower frame rate. We have evaluated our algorithm on several real sequences with ground-truth trajectory data, showing state-of-the-art performance.
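The tightly-coupled formulation can be caricatured as one objective combining both sensor modalities, minimized jointly over the camera state. The weights and residual shapes below are illustrative assumptions, not the paper's actual cost:

```python
# Toy tightly-coupled objective: photometric and inertial residuals
# share a single cost, so one optimization updates the camera state
# using both sensors at once (weights are hypothetical).
import numpy as np

def joint_cost(photo_residuals, inertial_residual, w_photo=1.0, w_imu=1.0):
    """Weighted sum of squared photometric residuals (one per tracked
    pixel) and a squared inertial residual vector."""
    visual = w_photo * sum(r ** 2 for r in photo_residuals)
    inertial = w_imu * float(inertial_residual @ inertial_residual)
    return visual + inertial
```

In a loosely-coupled system the two terms would be minimized separately and merged afterwards; the tight coupling above is what lets inertial data constrain the photometric alignment directly.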


Autonomous Robots | 2015

Incorporating scene priors to dense monocular mapping

Alejo Concha; Wajahat Hussain; Luis Montano; Javier Civera

This paper presents a dense monocular mapping algorithm that improves the accuracy of the state-of-the-art variational and multiview stereo methods by incorporating scene priors into its formulation. Most of the improvement of our proposal is in low-textured image regions and for low-parallax camera motions; two typical failure cases of multiview mapping. The specific priors we model are the planarity of homogeneous color regions, the repeating geometric primitives of the scene—that can be learned from data—and the Manhattan structure of indoor rooms. We evaluate the performance of our method in our own sequences and in the publicly available NYU dataset, emphasizing its strengths and weaknesses in different cases.
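One way to picture the Manhattan prior mentioned above is snapping estimated surface normals to the three dominant, mutually orthogonal directions of an indoor room. The axis-aligned toy below ignores the rotation to the true Manhattan frame, which a real system would have to estimate:

```python
# Toy Manhattan-world prior: snap a surface normal to the nearest
# coordinate axis (assumes the Manhattan frame is already axis-aligned).
import numpy as np

def manhattan_snap(normal):
    """Return the signed coordinate axis closest to a unit normal."""
    axes = np.eye(3)
    i = np.argmax(np.abs(normal))          # dominant axis of the normal
    return np.sign(normal[i]) * axes[i]    # keep its orientation
```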


Robotics: Science and Systems (RSS) | 2014

Manhattan and Piecewise-Planar Constraints for Dense Monocular Mapping

Alejo Concha; Muhammad Wajahat Hussain; Luis Montano; Javier Civera

This paper presents a variational formulation for real-time dense 3D mapping from a monocular RGB sequence that incorporates Manhattan and piecewise-planar constraints in indoor and outdoor man-made scenes. State-of-the-art variational approaches are based on the minimization of an energy functional composed of two terms, the first accounting for the photometric compatibility across multiple views and the second favoring smooth solutions. We show that the addition of a third energy term modelling Manhattan and piecewise-planar structures greatly improves the accuracy of the dense visual maps, particularly for low-textured man-made environments where the data term can be ambiguous. We evaluate two different methods to provide such Manhattan and piecewise-planar constraints, based on 1) multiview superpixel geometry and 2) multiview layout estimation and scene understanding. Our experiments include the largest map produced by variational methods from an RGB sequence and demonstrate a reduction in the median depth error by up to a factor of 5.
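The three-term energy can be sketched on a 1D depth profile. Here `data_cost`, `lam` and `mu` are illustrative stand-ins for the paper's photometric term and weighting, and the planar prior is reduced to a squared distance from a reference plane's depth:

```python
# Toy three-term variational energy on a 1D depth profile:
# data term + smoothness (total variation) + piecewise-planar prior.
import numpy as np

def energy(depth, data_cost, plane_depth, lam=1.0, mu=1.0):
    """data_cost: callable returning a per-pixel cost array;
    plane_depth: depth of a fitted plane at each pixel;
    lam, mu: hypothetical weights of the smoothness and planar terms."""
    smooth = np.abs(np.diff(depth)).sum()          # penalize depth jumps
    planar = ((depth - plane_depth) ** 2).sum()    # pull toward the plane
    return float(data_cost(depth).sum() + lam * smooth + mu * planar)
```

When the photometric data term is ambiguous (low texture), the smoothness and planar terms dominate, which is exactly the regime where the paper reports the largest gains.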


Intelligent Robots and Systems (IROS) | 2015

Layout aware visual tracking and mapping

Marta Salas; Wajahat Hussain; Alejo Concha; Luis Montano; Javier Civera; J. M. M. Montiel

Real-time visual Simultaneous Localization and Mapping (SLAM) algorithms rely on consistent measurements across multiple views. In indoor environments, where the majority of robot activity takes place, severe occlusions can occur, e.g., when turning around a corner or moving from one room to another. In these situations, SLAM algorithms cannot establish correspondences across views, which leads to failures in camera localization or map construction. This work takes advantage of the recent scene box layout descriptor to make the above-mentioned SLAM systems occlusion-aware. This room box reasoning helps the sequential tracker reason about possible occlusions and therefore look for matches only among potentially visible features instead of the entire map. This increases the life of the tracker, as it does not consider itself lost during occlusions. Additionally, by focusing on the potentially visible portion of the map, i.e., the current room's features, it improves computational efficiency without compromising accuracy. Finally, this room-level reasoning helps select better images for bundle adjustment: an image bundle coming from the same room has little occlusion, which leads to better dense reconstruction. We demonstrate the superior performance of layout-aware SLAM on several long monocular sequences acquired in difficult indoor situations, specifically room-to-room transitions and turning around a corner.
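The occlusion-aware matching step amounts to restricting the search to map points inside the current room's box layout. A minimal sketch with an axis-aligned box (a real layout box would be oriented with the room):

```python
# Restrict feature matching to map points inside the current room's
# box layout; box corners here are hypothetical axis-aligned bounds.
import numpy as np

def visible_in_room(points, box_min, box_max):
    """Boolean mask over an (N, 3) array of map points, True for points
    inside the room box; only these are considered for matching."""
    return np.all((points >= box_min) & (points <= box_max), axis=1)
```

Filtering this way shrinks the matching set to the current room, which is what yields both the longer tracker life and the efficiency gain described above.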


European Conference on Mobile Robots (ECMR) | 2015

An evaluation of robust cost functions for RGB direct mapping

Alejo Concha; Javier Civera

The so-called direct SLAM methods have shown impressive performance in estimating a dense 3D reconstruction from RGB sequences in real-time [1], [2], [3]. They are based on the minimization of an error function composed of several terms that account for the photometric consistency of corresponding pixels and for smoothness and planarity priors on the reconstructed surfaces. In this paper we evaluate several robust error functions that reduce the influence of large individual contributions (which most likely correspond to outliers) on the total error. Our experimental results show that the differences between the robust functions are considerable, with the best of them reducing the estimation error by up to 25%.
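Two classic robust cost functions of the kind evaluated in such comparisons are Huber and the Tukey biweight (the specific functions compared in the paper may differ; the thresholds below are the conventional 95%-efficiency constants):

```python
# Robust cost functions: both down-weight large residuals relative
# to plain squared error, limiting the influence of outliers.
import numpy as np

def huber(r, k=1.345):
    """Huber cost: quadratic near zero, linear in the tails."""
    a = np.abs(r)
    return np.where(a <= k, 0.5 * r ** 2, k * (a - 0.5 * k))

def tukey(r, c=4.685):
    """Tukey biweight: saturates, so gross outliers add only a
    constant cost c^2 / 6 regardless of magnitude."""
    a = np.abs(r)
    inlier = (c ** 2 / 6) * (1 - (1 - (r / c) ** 2) ** 3)
    return np.where(a <= c, inlier, c ** 2 / 6)
```

The qualitative difference matters: Huber still grows with the residual, while Tukey fully caps an outlier's contribution, which can change the estimate substantially when outliers are frequent.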


International Conference on Robotics and Automation (ICRA) | 2017

Single-View and Multi-View Depth Fusion

José M. Fácil; Alejo Concha; Luis Montesano; Javier Civera

Dense and accurate 3-D mapping from a monocular sequence is a key technology for several applications and still an open research area. This letter leverages recent results on single-view convolutional neural network (CNN)-based depth estimation and fuses them with multiview depth estimation. Both approaches present complementary strengths. Multiview depth is highly accurate, but only in high-texture areas and high-parallax cases. Single-view depth captures the local structure of mid-level regions, including textureless areas, but the estimated depth lacks global coherence. The single- and multiview fusion we propose is challenging in several aspects. First, both depths are related by a non-rigid deformation that depends on the image content. Second, the selection of multiview points of high accuracy might be difficult for low-parallax configurations. We present contributions for both problems. Our results in the public NYUv2 and TUM datasets show that our algorithm outperforms the individual single- and multiview approaches. A video showing the key aspects of mapping in our single- and multiview depth proposal is available at https://youtu.be/ipc5HukTb4k
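A heavily simplified view of such a fusion is a per-pixel blend driven by the multiview confidence; the paper's method additionally handles the non-rigid deformation between the two depth maps, which this sketch omits:

```python
# Toy per-pixel depth fusion: trust the multiview depth where its
# confidence (e.g. a parallax/texture score in [0, 1]) is high,
# fall back to the single-view CNN depth elsewhere.
import numpy as np

def fuse_depths(d_single, d_multi, conf_multi):
    """Confidence-weighted blend of two depth maps of the same shape."""
    return conf_multi * d_multi + (1.0 - conf_multi) * d_single
```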


OCEANS Conference | 2015

Real-time localization and dense mapping in underwater environments from a monocular sequence

Alejo Concha; Paulo Drews-Jr; Mario Fernando Montenegro Campos; Javier Civera


arXiv: Computer Vision and Pattern Recognition | 2016

Deep Single and Direct Multi-View Depth Fusion

José M. Fácil; Alejo Concha; Luis Montesano; Javier Civera

Collaboration


Dive into Alejo Concha's collaborations.

Top Co-Authors


Marta Salas

University of Zaragoza


Giuseppe Loianno

University of Pennsylvania


Vijay Kumar

University of Pennsylvania


Mario Fernando Montenegro Campos

Universidade Federal de Minas Gerais


Paulo Drews-Jr

Universidade Federal do Rio Grande do Sul
