Carolina Redondo-Cabrera
University of Alcalá
Publication
Featured research published by Carolina Redondo-Cabrera.
eurographics | 2012
Bo Li; Afzal Godil; Masaki Aono; X. Bai; Takahiko Furuya; L. Li; Roberto Javier López-Sastre; Henry Johan; Ryutarou Ohbuchi; Carolina Redondo-Cabrera; Atsushi Tatsuma; Tomohiro Yanagimachi; S. Zhang
Generic 3D shape retrieval is a fundamental research area in the field of content-based 3D model retrieval. The aim of this track is to measure and compare the performance of generic 3D shape retrieval methods implemented by different participants around the world. The track is based on a new generic 3D shape benchmark, which contains 1200 triangle meshes evenly divided into 60 categories. In this track, 16 runs were submitted by 5 groups, and their retrieval accuracies were evaluated using 7 commonly used performance metrics.
computer vision and pattern recognition | 2012
Carolina Redondo-Cabrera; Roberto Javier López-Sastre; Javier Acevedo-Rodríguez; Saturnino Maldonado-Bascón
This paper proposes a novel approach to recognize object categories in point clouds. By quantizing 3D SURF local descriptors, computed on partial 3D shapes extracted from the point clouds, a vocabulary of 3D visual words is generated. Using this codebook, we build a Bag-of-Words representation in 3D, which is used in conjunction with an SVM classifier. We also introduce the 3D Spatial Pyramid Matching Kernel, which works by partitioning a working volume into fine sub-volumes, and computing a hierarchical weighted sum of histogram intersections at each level of the pyramid structure. With the aim of increasing both the classification accuracy and the computational efficiency of the kernel, we propose selective hierarchical volume decomposition strategies, based on representative and discriminative (sub-)volume selection processes, which drastically reduce the pyramid to consider. Results on the challenging large-scale RGB-D object dataset show that our kernels significantly outperform the state-of-the-art results by using a single 3D shape feature type extracted from individual depth images.
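The pyramid kernel described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the points are already normalised to the unit cube, that each local descriptor has already been assigned a visual-word index, and it borrows the level weights of the classic 2D spatial pyramid match kernel, which may differ from the weights used in the paper. The function names `pyramid_histograms` and `pyramid_match` are hypothetical.

```python
import numpy as np

def pyramid_histograms(points, words, vocab_size, levels):
    """Return one (cells, vocab_size) histogram array per pyramid level.

    points: (n, 3) coordinates, assumed normalised to the unit cube [0, 1).
    words:  (n,) integer visual-word index of each local descriptor.
    """
    points = np.asarray(points, dtype=float)
    words = np.asarray(words, dtype=int)
    hists = []
    for level in range(levels + 1):
        side = 2 ** level  # number of sub-volumes per axis at this level
        # Integer cell coordinates of every point, clipped to stay in the grid.
        cell = np.minimum((points * side).astype(int), side - 1)
        flat = (cell[:, 0] * side + cell[:, 1]) * side + cell[:, 2]
        hist = np.zeros((side ** 3, vocab_size))
        # Count one occurrence per (sub-volume, visual word) pair.
        np.add.at(hist, (flat, words), 1.0)
        hists.append(hist)
    return hists

def pyramid_match(hists_a, hists_b):
    """Weighted sum of histogram intersections over all pyramid levels.

    Weights follow the classic spatial pyramid scheme (an assumption):
    1 / 2^L for level 0 and 1 / 2^(L - l + 1) for level l > 0.
    """
    levels = len(hists_a) - 1
    total = 0.0
    for level, (ha, hb) in enumerate(zip(hists_a, hists_b)):
        if level == 0:
            weight = 1.0 / 2 ** levels
        else:
            weight = 1.0 / 2 ** (levels - level + 1)
        total += weight * np.minimum(ha, hb).sum()
    return float(total)
```

Two shapes that contain the same visual words in different sub-volumes match fully at level 0 but lose the contribution of the finer levels, which is how the pyramid encodes spatial layout.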
Computers & Graphics | 2013
Roberto Javier López-Sastre; A. García-Fuertes; Carolina Redondo-Cabrera; Francisco Javier Acevedo-Rodríguez; Saturnino Maldonado-Bascón
This paper focuses on the problem of 3D shape categorization. For a given set of training 3D shapes, a 3D shape recognition system must be able to predict the class label for a test 3D shape. We introduce a novel discriminative approach for recognizing 3D shape categories which is based on a 3D Spatial Pyramid (3DSP) decomposition. 3D local descriptors are extracted from the 3D shapes and then quantized in order to build a 3D visual vocabulary for characterizing the shapes. Our approach repeatedly subdivides a cube inscribed in the 3D shape, and computes a weighted sum of histograms of visual word occurrences at increasingly fine sub-volumes. Additionally, we integrate this pyramidal representation with different types of kernels, such as the Histogram Intersection Kernel and the extended Gaussian Kernel with χ² distance. Finally, we perform a thorough evaluation on different publicly available datasets, defining an elaborate experimental setup to be used for establishing further comparisons among different 3D shape categorization methods. Highlights: We introduce the 3D spatial pyramid representation for 3D shape categorization. This pyramid representation repeatedly subdivides a cube inscribed in the 3D shape. Then, a weighted sum of histograms of visual word occurrences is computed. Different types of kernels are integrated into the approach for training an SVM. Results on publicly available benchmarks have been reported.
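The two kernels named in this abstract can be written down in a few lines. The sketch below uses the common textbook definitions, not code from the paper: the chi-squared distance is taken as half the sum of squared differences divided by sums, and the `scale` parameter of the extended Gaussian kernel is an assumed free parameter (it is often set to the mean distance between training samples). The small `eps` term is only there to avoid division by zero on empty bins.

```python
import numpy as np

def histogram_intersection(x, y):
    """Histogram Intersection Kernel: sum of bin-wise minima."""
    return float(np.minimum(x, y).sum())

def chi2_distance(x, y, eps=1e-12):
    """Chi-squared distance between two non-negative histograms."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    return 0.5 * float(np.sum((x - y) ** 2 / (x + y + eps)))

def extended_gaussian_kernel(x, y, scale=1.0):
    """Extended Gaussian kernel: exp(-D(x, y) / scale) with D the chi-squared distance."""
    return float(np.exp(-chi2_distance(x, y) / scale))
```

Either function can be evaluated on every pair of training histograms to fill a precomputed Gram matrix for an SVM.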
british machine vision conference | 2014
Carolina Redondo-Cabrera; Roberto Javier López-Sastre; Tinne Tuytelaars
Simultaneous object detection and pose estimation is a challenging task in computer vision. In this paper, we tackle the problem using Hough Forests. Unlike most methods in the literature, we focus on the problem of continuous pose estimation. Moreover, we aim for a probabilistic output. We first introduce a new pose purity criterion for splitting a node during the forest training. Second, we propose the concept of Probabilistic Locally Enhanced Voting (PLEV), a novel regression strategy which consists in modulating the regression with a kernel density estimation to consolidate the votes in a local region near the maxima detected in the Hough space. And third, we propose a pose-based backprojection strategy to improve the bounding box estimation. With these three additions, we show that our Hough Forest can achieve state-of-the-art results without needing 3D CAD models. We present a quite versatile method, showing results for different categories (cars as well as faces) and for different modalities (RGB as well as depth images).
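The idea of consolidating votes with a kernel density estimate can be sketched as follows. This is an illustration of KDE-based mode seeking over pose votes, not the PLEV procedure itself: it assumes a single continuous pose angle in degrees, a Gaussian kernel, and that the votes passed in are those collected in a local region around one detected Hough maximum. The function name `kde_pose_estimate` and the `bandwidth` and `step` parameters are hypothetical.

```python
import numpy as np

def kde_pose_estimate(votes, weights=None, bandwidth=10.0, step=1.0):
    """Return the pose (degrees) with the highest kernel density, plus the density.

    votes:   pose angles voted by the patches near a Hough maximum.
    weights: optional per-vote weights (defaults to uniform).
    """
    votes = np.asarray(votes, dtype=float) % 360.0
    if weights is None:
        weights = np.ones_like(votes)
    weights = np.asarray(weights, dtype=float)
    grid = np.arange(0.0, 360.0, step)  # candidate poses
    # Smallest signed angular difference between each candidate and each vote,
    # so that 358 degrees and 2 degrees are treated as close neighbours.
    diff = (grid[:, None] - votes[None, :] + 180.0) % 360.0 - 180.0
    density = (weights * np.exp(-0.5 * (diff / bandwidth) ** 2)).sum(axis=1)
    density /= density.sum()  # normalise to a discrete probability distribution
    return float(grid[np.argmax(density)]), density
```

Because the output is a normalised density over poses rather than a single number, it also gives the kind of probabilistic output the abstract aims for; an isolated outlier vote barely shifts the estimated mode.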
Image and Vision Computing | 2014
Carolina Redondo-Cabrera; Roberto Javier López-Sastre; Javier Acevedo-Rodríguez; Saturnino Maldonado-Bascón
This paper proposes a novel approach to recognize object and scene categories in depth images. We introduce a Bag of Words (BoW) representation in 3D, the Selective 3D Spatial Pyramid Matching Kernel (3DSPMK). It starts by quantizing 3D local descriptors, computed from point clouds, to build a vocabulary of 3D visual words. This codebook is used to build the 3DSPMK, which partitions a working volume into fine sub-volumes and computes a hierarchical weighted sum of histogram intersections of visual words at each level of the 3D pyramid structure. With the aim of increasing both the classification accuracy and the computational efficiency of the kernel, we propose two selective hierarchical volume decomposition strategies, based on representative and discriminative sub-volume selection processes, which drastically reduce the pyramid to consider. Results on different RGBD datasets show that our approaches obtain state-of-the-art results for both object recognition and scene categorization. Highlights: We introduce the 3DSPMK for object and scene recognition in depth images. Our model repeatedly subdivides a cube inscribed in the point cloud. Then, a weighted sum of histograms of visual word occurrences is computed. Results on publicly available benchmarks have been reported.
iberian conference on pattern recognition and image analysis | 2015
Mario García-Montero; Carolina Redondo-Cabrera; Roberto Javier López-Sastre; Tinne Tuytelaars
This paper describes a Hough Forest based approach for fast head pose estimation in RGB images. The system has been designed for Human-Computer Interaction (HCI), so that, with just a simple webcam, our solution is able to detect the head and simultaneously estimate its pose. We leverage the Hough Forest with Probabilistic Locally Enhanced Voting model, and integrate it into a system with a skin detection step and a tracking filter for the head orientation. Our implementation drastically speeds up the head pose estimations and improves their accuracy with respect to the original model. We present extensive experiments on a publicly available and challenging dataset, where our approach outperforms the state of the art.
british machine vision conference | 2015
Carolina Redondo-Cabrera; Roberto Javier López-Sastre
In this work, we deconstruct the Hough Forest (HF) learning model to investigate whether considerably better performance can be obtained when detecting multi-aspect object categories. We introduce the novel Boosted Hough Forest (BHF): an HF where all the decision trees of the forest are trained in a stage-wise fashion, by optimizing a global differentiable loss function with Gradient Boosting, and using the concept of intermediate Hough voting spaces. This is in contrast to the local optimization performed in each tree node during the training of a standard HF. We also show how the multiple aspects of the object categories can be incorporated into the learning model by simply augmenting the dimensionality of the Hough voting spaces of the BHF. This allows our approach, for example, to naturally infer the pose of an object simultaneously with the detection. The experimental validation, considering four different datasets, confirms that the performance of the HF is improved by the new BHF.
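Stage-wise training against a global differentiable loss can be illustrated with a generic gradient boosting sketch. It is not the Boosted Hough Forest: it assumes a squared-error loss, a single scalar feature with at least two distinct values, and one-split regression stumps in place of the forest's decision trees, and it has no Hough voting spaces. The names `fit_stump` and `boost` are hypothetical.

```python
import numpy as np

def fit_stump(x, residual):
    """Best single-threshold regressor on a 1D feature under squared error."""
    best = None
    for threshold in np.unique(x)[:-1]:  # every split that leaves both sides non-empty
        left = x <= threshold
        left_value = residual[left].mean()
        right_value = residual[~left].mean()
        fitted = np.where(left, left_value, right_value)
        error = ((residual - fitted) ** 2).sum()
        if best is None or error < best[0]:
            best = (error, threshold, left_value, right_value)
    return best[1:]

def boost(x, y, stages=20, learning_rate=0.5):
    """Train stumps one stage at a time on the gradient of a global loss."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    base = y.mean()
    prediction = np.full_like(y, base)
    stumps = []
    for _ in range(stages):
        # Negative gradient of the global loss 0.5 * (y - f)^2 w.r.t. f.
        residual = y - prediction
        threshold, left_value, right_value = fit_stump(x, residual)
        prediction += learning_rate * np.where(x <= threshold, left_value, right_value)
        stumps.append((threshold, left_value, right_value))
    return base, stumps, prediction
```

Each new stump is fitted to what the current ensemble still gets wrong, so every stage depends on all previous ones; a standard forest, by contrast, trains its trees independently with a local criterion at each node.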
european conference on computer vision | 2016
Carolina Redondo-Cabrera; Roberto Javier López-Sastre; Yu Xiang; Tinne Tuytelaars; Silvio Savarese
This paper proposes a thorough diagnosis for the problem of object detection and pose estimation. We provide a diagnostic tool to examine the impact of the different types of false positives on performance, and the effects of the main object characteristics. We focus our study on the PASCAL 3D+ dataset, developing a complete diagnosis of four different state-of-the-art approaches, which range from hand-crafted models to deep learning solutions. We show that gaining a clear understanding of typical failure cases and the effects of object characteristics on the performance of the models is fundamental in order to facilitate further progress towards more accurate solutions for this challenging task.
Lecture Notes in Computer Science | 2016
Carolina Redondo-Cabrera; Roberto Javier López-Sastre; Yu Xiang; Tinne Tuytelaars; Silvio Savarese
Approaches inspired by Newtonian mechanics have been successfully applied to detecting abnormal behaviors in crowd scenarios, the most notable example being the Social Force Model (SFM). This class of approaches describes the movements and local interactions among individuals in crowds by means of repulsive and attractive forces. Despite their promising performance, recent socio-psychology studies have shown that current SFM-based methods may not be capable of explaining behaviors in complex crowd scenarios. An alternative approach consists in using heuristics to describe the cognitive processes that give rise to the behavioral patterns observed in crowds. Inspired by these studies, we propose a new hybrid framework to detect violent events in crowd videos. More specifically, (i) we define a set of simple behavioral heuristics to describe people's behaviors in crowds, and (ii) we implement these heuristics as physical equations, which allows us to model and classify such behaviors in the videos. The resulting heuristic maps are used to extract video features to distinguish violence from normal events. Our violence detection results set the new state of the art on several standard benchmarks and demonstrate the superiority of our method compared to standard motion descriptors, previous physics-inspired models used for crowd analysis, and pre-trained ConvNets for crowd behavior analysis.
Image and Vision Computing | 2018
Daniel Oñoro-Rubio; Roberto Javier López-Sastre; Carolina Redondo-Cabrera; Pedro Gil-Jiménez
Detecting objects and estimating their pose remains one of the major challenges of the computer vision research community. There is a compromise between localizing the objects and estimating their viewpoints. The detector ideally needs to be view-invariant, while the pose estimation process should be able to generalize towards the category level. This work is an exploration of using deep learning models for solving both problems simultaneously. To do so, we propose three novel deep learning architectures, which are able to perform joint detection and pose estimation, and in which we gradually decouple the two tasks. We also investigate whether the pose estimation problem should be solved as a classification or a regression problem, which is still an open question in the computer vision community. We detail a comparative analysis of all our solutions and the methods that currently define the state of the art for this problem. We use the PASCAL3D+ and ObjectNet3D datasets to present a thorough experimental evaluation and the main results. With the proposed models we achieve state-of-the-art performance on both datasets.