Ricardo Toledo
Autonomous University of Barcelona
Publication
Featured research published by Ricardo Toledo.
International Conference on Document Analysis and Recognition | 2011
Marçal Rusiñol; David Aldavert; Ricardo Toledo; Josep Lladós
In this paper, we present a segmentation-free word spotting method that is able to deal with heterogeneous document image collections. We propose a patch-based framework where patches are represented by a bag-of-visual-words model powered by SIFT descriptors. A later refinement of the feature vectors is performed by applying the latent semantic indexing technique. The proposed method performs well on both handwritten and typewritten historical document images. We have also tested our method on documents written in non-Latin scripts.
Pattern Recognition | 2015
Marçal Rusiñol; David Aldavert; Ricardo Toledo; Josep Lladós
In this paper we present an efficient segmentation-free word spotting method, applied in the context of historical document collections, that follows the query-by-example paradigm. We use a patch-based framework where local patches are described by a bag-of-visual-words model powered by SIFT descriptors. By projecting the patch descriptors to a topic space with the latent semantic analysis technique and compressing the descriptors with the product quantization method, we are able to efficiently index the document information both in terms of memory and time. The proposed method is evaluated using four different collections of historical documents, achieving good performance in both handwritten and typewritten scenarios. The results outperform recent state-of-the-art keyword spotting approaches.
Highlights: We present a query-by-example keyword spotting method for historical collections. The method is segmentation-free and avoids any pre-processing step. We use a compact and efficient vectorial representation to index large collections. We outperform the recent state-of-the-art keyword spotting approaches.
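The compression step mentioned in the abstract can be illustrated with a toy product quantizer. This is a minimal sketch, not the authors' implementation: the function names `pq_encode` and `pq_decode` and the tiny hand-written codebooks are assumptions made for illustration (in practice the sub-codebooks would be learned, typically with k-means).

```python
def pq_encode(vector, codebooks):
    """Split `vector` into len(codebooks) equal sub-vectors and return,
    for each one, the index of its nearest centroid (squared Euclidean)."""
    if len(vector) % len(codebooks) != 0:
        raise ValueError("vector length must be divisible by the number of codebooks")
    sub_len = len(vector) // len(codebooks)
    codes = []
    for k, centroids in enumerate(codebooks):
        sub = vector[k * sub_len:(k + 1) * sub_len]
        best = min(
            range(len(centroids)),
            key=lambda c: sum((s - v) ** 2 for s, v in zip(sub, centroids[c])),
        )
        codes.append(best)
    return codes


def pq_decode(codes, codebooks):
    """Rebuild an approximate vector by concatenating the selected centroids."""
    approx = []
    for code, centroids in zip(codes, codebooks):
        approx.extend(centroids[code])
    return approx


# Hypothetical codebooks: two sub-spaces, two 2-D centroids each.
codebooks = [
    [[0.0, 0.0], [1.0, 1.0]],
    [[0.0, 1.0], [1.0, 0.0]],
]
codes = pq_encode([0.9, 0.8, 0.1, 0.7], codebooks)
approx = pq_decode(codes, codebooks)
```

Each descriptor is stored as a short list of centroid indices instead of its full floating-point values, which is where the memory saving comes from.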
Sensors | 2012
Cristhian A. Aguilera; Fernando Barrera; Felipe Lumbreras; Angel Domingo Sappa; Ricardo Toledo
This paper presents a novel feature point descriptor for the multispectral image case: Far-Infrared and Visible Spectrum images. It allows matching interest points on images of the same scene but acquired in different spectral bands. Initially, points of interest are detected on both images through a SIFT-like scale space representation. Then, these points are characterized using an Edge Oriented Histogram (EOH) descriptor. Finally, points of interest from multispectral images are matched by finding nearest couples using the information from the descriptor. The provided experimental results and comparisons with similar methods show both the validity of the proposed approach and the improvements it offers with respect to the current state-of-the-art.
International Conference on Pattern Recognition | 2000
C. Canero; Petia Radeva; Ricardo Toledo; Juan José Villanueva; Josepa Mauri
Stent implantation for coronary disease treatment is a highly important minimally invasive technique that avoids surgical interventions. In order to assure the success of such an intervention, it is very important to determine the real length of the lesion as exactly as possible. Currently, lesion measures are performed directly from the angiography without considering the system projective parameters or, alternatively, from the 3D reconstruction obtained from a correspondence of points defined by the physicians. In this paper, we present a method for 3D vessel reconstruction from biplane images by means of deformable models. In particular, we study the known shortcoming of point-based 3D vessel reconstruction (no intersection of projective beams) and illustrate that by using snakes the reconstruction error is minimal. We validate our method using a computer-generated phantom, a real phantom and coronary vessels.
Computer Vision and Pattern Recognition | 2010
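The point-based shortcoming named in the abstract (two projective beams that do not intersect) can be made concrete with a small geometric sketch. This is not the snake-based method of the paper; it shows a common workaround, assumed here for illustration, that takes the midpoint of the shortest segment between the two rays. The function name and the toy rays are hypothetical.

```python
import math


def closest_points_midpoint(p1, d1, p2, d2):
    """Given rays p1 + s*d1 and p2 + t*d2 in 3-D, return (midpoint, gap):
    the midpoint of the shortest segment joining the two lines and its length."""
    dot = lambda u, v: sum(a * b for a, b in zip(u, v))
    w0 = [a - b for a, b in zip(p1, p2)]
    a, b, c = dot(d1, d1), dot(d1, d2), dot(d2, d2)
    d, e = dot(d1, w0), dot(d2, w0)
    denom = a * c - b * b
    if abs(denom) < 1e-12:
        raise ValueError("rays are parallel; closest points are not unique")
    s = (b * e - c * d) / denom
    t = (a * e - b * d) / denom
    q1 = [p + s * v for p, v in zip(p1, d1)]  # closest point on the first line
    q2 = [p + t * v for p, v in zip(p2, d2)]  # closest point on the second line
    midpoint = [(u + v) / 2.0 for u, v in zip(q1, q2)]
    gap = math.sqrt(sum((u - v) ** 2 for u, v in zip(q1, q2)))
    return midpoint, gap


# Two skew beams: the x-axis, and a line parallel to the y-axis through (1, -1, 1).
midpoint, gap = closest_points_midpoint(
    [0.0, 0.0, 0.0], [1.0, 0.0, 0.0],
    [1.0, -1.0, 1.0], [0.0, 1.0, 0.0],
)
```

A non-zero `gap` is exactly the situation the abstract describes: the beams miss each other, so any single reconstructed point is a compromise.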
David Aldavert; Ramon López de Mántaras; Arnau Ramisa; Ricardo Toledo
We propose an efficient method, built on the popular Bag of Features approach, that obtains robust multiclass pixel-level object segmentation of an image in less than 500 ms, with results comparable to or better than most state-of-the-art methods. We introduce the Integral Linear Classifier (ILC), which can readily obtain the classification score for any image sub-window with only 6 additions and 1 product by fusing the accumulation and classification steps in a single operation. In order to design a method as efficient as possible, our building blocks are carefully selected from the quickest in the state of the art. More precisely, we evaluate the performance of three popular local descriptors, which can be very efficiently computed using integral images, and two fast quantization methods: the Hierarchical K-Means and the Extremely Randomized Forest. Finally, we explore the utility of adding spatial bins to the Bag of Features histograms and that of cascade classifiers to improve the obtained segmentation. Our method is compared to the state of the art on the difficult Graz-02 and PASCAL 2007 Segmentation Challenge datasets.
International Conference on Document Analysis and Recognition | 2013
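The idea of scoring any sub-window in constant time can be sketched with a plain summed-area table. This is a simplified illustration, not the ILC as published: it assumes each pixel already carries a scalar linear-classifier contribution, and the names `integral_image` and `window_score` are hypothetical.

```python
def integral_image(scores):
    """Return an (H+1) x (W+1) summed-area table for a 2-D list of per-pixel scores."""
    h, w = len(scores), len(scores[0])
    table = [[0.0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0.0
        for x in range(w):
            row_sum += scores[y][x]
            table[y + 1][x + 1] = table[y][x + 1] + row_sum
    return table


def window_score(table, top, left, bottom, right, bias=0.0):
    """Score of the window covering rows [top, bottom) and columns [left, right):
    four table lookups plus an optional classifier bias."""
    return (table[bottom][right] - table[top][right]
            - table[bottom][left] + table[top][left] + bias)


# Toy per-pixel contributions (assumed values, for illustration only).
pixel_scores = [
    [1.0, 2.0, 3.0],
    [4.0, 5.0, 6.0],
    [7.0, 8.0, 9.0],
]
table = integral_image(pixel_scores)
lower_right = window_score(table, 1, 1, 3, 3)  # covers 5, 6, 8 and 9
whole_image = window_score(table, 0, 0, 3, 3)
```

Once the table is built, the cost of a window score no longer depends on the window size, which is what makes dense pixel-level classification affordable.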
David Aldavert; Marçal Rusiñol; Ricardo Toledo; Josep Lladós
In this paper, we present a word spotting framework that follows the query-by-string paradigm where word images are represented both by textual and visual representations. The textual representation is formulated in terms of character n-grams while the visual one is based on the bag-of-visual-words scheme. These two representations are merged together and projected to a sub-vector space. Given a textual query, this transform makes it possible to retrieve word instances that were only represented by the visual modality. Moreover, this statistical representation can be used together with state-of-the-art indexation structures in order to deal with large-scale scenarios. The proposed method is evaluated using a collection of historical documents, outperforming state-of-the-art methods.
Computer Vision and Pattern Recognition | 2000
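A textual representation built from character n-grams can be sketched as a simple count histogram. The abstract does not specify the exact formulation, so the n-gram orders (unigrams and bigrams), the lower-casing, and the function name `char_ngrams` are assumptions made for illustration.

```python
from collections import Counter


def char_ngrams(word, n_values=(1, 2)):
    """Count the character n-grams of `word` for every order in `n_values`."""
    word = word.lower()
    counts = Counter()
    for n in n_values:
        for i in range(len(word) - n + 1):
            counts[word[i:i + n]] += 1
    return counts


query = char_ngrams("Letter")
```

Such a histogram gives a query string a fixed vocabulary of features, which is what allows it to be merged with a visual histogram in a common space.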
Ricardo Toledo; Xavier Orriols; Xavier Binefa; Petia Radeva; Jordi Vitrià; Juan José Villanueva
In this paper we introduce a statistic snake that learns and tracks image features by means of statistic learning techniques. Using probabilistic principal component analysis, a feature description is obtained from a training set of object profiles. In our approach a sound statistical model is introduced to define a likelihood estimate of the grey-level local image profiles together with their local orientation. This likelihood estimate allows us to define a probabilistic potential field of the snake where the elastic curve deforms to maximise the overall probability of detecting learned image features. To improve the convergence of snake deformation, we enhance the likelihood map by a physics-based model simulating a dipole-dipole interaction. A new extended local coherent interaction is introduced, defined in terms of the extended structure tensor of the image, to give priority to parallel coherence vectors.
Autonomous Robots | 2009
Arnau Ramisa; Adriana Tapus; David Aldavert; Ricardo Toledo; Ramon López de Mántaras
This paper presents a vision-based approach for mobile robot localization. The model of the environment is topological. The new approach characterizes a place using a signature. This signature consists of a constellation of descriptors computed over different types of local affine covariant regions extracted from an omnidirectional image acquired by rotating a standard camera with a pan-tilt unit. This type of representation permits reliable and distinctive environment modelling. Our objectives were to validate the proposed method in indoor environments and also to find out whether the combination of complementary local feature region detectors improves the localization versus using a single region detector. Our experimental results show that if false matches are effectively rejected, the combination of different covariant affine region detectors notably increases the performance of the approach by combining the different strengths of the individual detectors. In order to reduce the localization time, two strategies are evaluated: re-ranking the map nodes using a global similarity measure and using a standard perspective view field of 45°. In order to systematically test topological localization methods, another contribution proposed in this work is a novel method to measure the degradation in localization performance as the robot moves away from the point where the original signature was acquired. This makes it possible to assess the robustness of the proposed signature. For this to be effective, it must be done in several varied environments that test all the possible situations in which the robot may have to perform localization.
IEEE Transactions on Intelligent Transportation Systems | 2015
Tarek Mouats; Nabil Aouf; Angel Domingo Sappa; Cristhian A. Aguilera; Ricardo Toledo
In this paper, we investigate the problem of visual odometry for ground vehicles based on the simultaneous utilization of multispectral cameras. It encompasses a stereo rig composed of an optical (visible) sensor and a thermal sensor. The novelty resides in the localization of the cameras as a stereo setup rather than two monocular cameras of different spectrums. To the best of our knowledge, this is the first time such a task has been attempted. Log-Gabor wavelets at different orientations and scales are used to extract interest points from both images. These are then described using a combination of frequency and spatial information within the local neighborhood. Matches between the pairs of multimodal images are computed using the cosine similarity function based on the descriptors. A pyramidal Lucas-Kanade tracker is also introduced to tackle temporal feature matching within challenging sequences of the data sets. The vehicle egomotion is computed from the triangulated 3-D points corresponding to the matched features. A windowed version of bundle adjustment incorporating Gauss-Newton optimization is utilized for motion estimation. An outlier removal scheme is also included within the framework. Multispectral data sets were generated and used as a test bed. They correspond to real outdoor scenarios captured using our multimodal setup. Finally, detailed results validating the proposed strategy are illustrated.
International Conference on Pattern Recognition | 2000
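The descriptor-matching step based on cosine similarity can be sketched as a brute-force nearest-neighbour search. This is an illustrative sketch only: the descriptors below are toy vectors rather than Log-Gabor responses, and the function names and the `min_similarity` threshold of 0.9 are assumptions, not values from the paper.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def match_descriptors(desc_a, desc_b, min_similarity=0.9):
    """For each descriptor in desc_a, keep its most similar descriptor in desc_b
    as (index_a, index_b, similarity) when the similarity reaches the threshold."""
    matches = []
    for i, a in enumerate(desc_a):
        best_j, best_sim = -1, -1.0
        for j, b in enumerate(desc_b):
            sim = cosine_similarity(a, b)
            if sim > best_sim:
                best_j, best_sim = j, sim
        if best_j >= 0 and best_sim >= min_similarity:
            matches.append((i, best_j, best_sim))
    return matches


# Toy descriptors standing in for the visible and thermal images.
visible = [[1.0, 0.0, 0.0], [0.0, 1.0, 1.0]]
thermal = [[0.0, 2.0, 2.0], [3.0, 0.1, 0.0], [-1.0, 0.0, 0.0]]
matches = match_descriptors(visible, thermal)
```

Cosine similarity ignores vector magnitude, so `[0, 1, 1]` and `[0, 2, 2]` match perfectly; this can be useful when the two modalities differ in overall response strength.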
Ricardo Toledo; Xavier Orriols; Petia Radeva; Xavier Binefa; Jordi Vitrià; C. Canero; J.J. Villanuev
We introduce a new deformable model, called eigensnake, for segmentation of elongated structures in a probabilistic framework. Instead of snake attraction by specific image features extracted independently of the snake, our eigensnake learns an optimal object description and searches for such image features in the target image. This is achieved by applying principal component analysis to the image responses of a bank of Gaussian derivative filters. Therefore, attraction by eigensnakes is defined in terms of classification of image features. The potential energy for the snake is defined in terms of likelihood in the feature space and incorporated into a new energy minimising scheme. Hence, the snake deforms to minimise the Mahalanobis distance in the feature space. A real application of segmenting and tracking coronary vessels in angiography is considered and the results are very encouraging.
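The Mahalanobis distance in a PCA feature space has a simple form, because the principal axes decorrelate the data: each projected coefficient is divided by the variance (eigenvalue) of its axis. The sketch below assumes the mean, the orthonormal principal axes and their variances have already been estimated; the function name and the toy values are hypothetical and do not come from the paper.

```python
import math


def mahalanobis_in_pca_space(x, mean, components, variances):
    """Mahalanobis distance of `x` from `mean`, measured along orthonormal
    principal axes `components` with per-axis variances `variances`."""
    centered = [xi - mi for xi, mi in zip(x, mean)]
    total = 0.0
    for axis, var in zip(components, variances):
        coeff = sum(c * a for c, a in zip(centered, axis))  # projection on the axis
        total += coeff * coeff / var
    return math.sqrt(total)


# Toy model: axis-aligned principal directions with variances 4 and 1.
mean = [1.0, 1.0]
components = [[1.0, 0.0], [0.0, 1.0]]
variances = [4.0, 1.0]
along_wide_axis = mahalanobis_in_pca_space([3.0, 1.0], mean, components, variances)
along_narrow_axis = mahalanobis_in_pca_space([1.0, 3.0], mean, components, variances)
```

The same Euclidean offset of 2 costs less along the high-variance axis than along the low-variance one, which is the behaviour that lets a learned feature model attract the snake.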
Collaboration
Dive into Ricardo Toledo's collaborations.
French Institute for Research in Computer Science and Automation