Network


Latest external collaborations at the country level.

Hotspot


Dive into the research topics where David Aldavert is active.

Publication


Featured research published by David Aldavert.


International Conference on Document Analysis and Recognition | 2011

Browsing Heterogeneous Document Collections by a Segmentation-Free Word Spotting Method

Marçal Rusiñol; David Aldavert; Ricardo Toledo; Josep Lladós

In this paper, we present a segmentation-free word spotting method that is able to deal with heterogeneous document image collections. We propose a patch-based framework where patches are represented by a bag-of-visual-words model powered by SIFT descriptors. The feature vectors are then refined by applying the latent semantic indexing technique. The proposed method performs well on both handwritten and typewritten historical document images. We have also tested our method on documents written in non-Latin scripts.
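To make the pipeline concrete, here is a minimal sketch (not the authors' code) of describing a patch with a SIFT-based bag-of-visual-words histogram and refining it with latent semantic indexing. It assumes OpenCV and scikit-learn are available; `training_patches` is a hypothetical list of grayscale patches sampled from the collection.

```python
# Hedged sketch: BoVW patch descriptors with an LSI refinement.
import cv2
import numpy as np
from sklearn.cluster import MiniBatchKMeans
from sklearn.decomposition import TruncatedSVD

def sift_descriptors(gray):
    """Extract SIFT descriptors from a grayscale patch."""
    _, desc = cv2.SIFT_create().detectAndCompute(gray, None)
    return desc if desc is not None else np.empty((0, 128), np.float32)

# 1. Build a visual vocabulary from descriptors pooled over the collection.
all_desc = np.vstack([sift_descriptors(p) for p in training_patches])
vocab = MiniBatchKMeans(n_clusters=512, random_state=0).fit(all_desc)

# 2. Represent each patch as an L1-normalized histogram of visual words.
def bovw_histogram(gray):
    desc = sift_descriptors(gray)
    if len(desc) == 0:
        return np.zeros(vocab.n_clusters, np.float32)
    hist = np.bincount(vocab.predict(desc),
                       minlength=vocab.n_clusters).astype(np.float32)
    return hist / max(hist.sum(), 1.0)

# 3. Refine the histograms with latent semantic indexing (truncated SVD).
hists = np.vstack([bovw_histogram(p) for p in training_patches])
lsi = TruncatedSVD(n_components=64).fit(hists)
refined = lsi.transform(hists)  # topic-space patch signatures
```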


Pattern Recognition | 2015

Efficient segmentation-free keyword spotting in historical document collections

Marçal Rusiñol; David Aldavert; Ricardo Toledo; Josep Lladós

In this paper we present an efficient segmentation-free word spotting method, applied in the context of historical document collections, that follows the query-by-example paradigm. We use a patch-based framework where local patches are described by a bag-of-visual-words model powered by SIFT descriptors. By projecting the patch descriptors to a topic space with the latent semantic analysis technique and compressing the descriptors with the product quantization method, we are able to efficiently index the document information both in terms of memory and time. The proposed method is evaluated using four different collections of historical documents, achieving good performance in both handwritten and typewritten scenarios and outperforming recent state-of-the-art keyword spotting approaches.

Highlights:
We present a query-by-example keyword spotting method for historical collections.
The method is segmentation-free and avoids any pre-processing step.
We use a compact and efficient vectorial representation to index large collections.
We outperform the recent state-of-the-art keyword spotting approaches.
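The memory and time savings rest on product quantization, which compresses each topic-space descriptor into a few one-byte codes. The sketch below illustrates the general PQ technique (an assumption about the details, not the paper's implementation); it assumes scikit-learn and a descriptor dimensionality divisible by the number of sub-vectors.

```python
# Hedged product-quantization sketch for compact descriptor indexing.
import numpy as np
from sklearn.cluster import KMeans

M, K = 8, 256  # sub-vectors per descriptor, centroids per sub-quantizer

def pq_train(X):
    """Train one k-means codebook per sub-vector block."""
    D = X.shape[1] // M
    return [KMeans(n_clusters=K, n_init=4, random_state=0)
            .fit(X[:, m * D:(m + 1) * D]) for m in range(M)]

def pq_encode(codebooks, X):
    """Encode each vector as M one-byte centroid indices."""
    D = X.shape[1] // M
    return np.stack([cb.predict(X[:, m * D:(m + 1) * D])
                     for m, cb in enumerate(codebooks)], axis=1).astype(np.uint8)

def pq_search(codebooks, codes, q):
    """Asymmetric distance: exact query against quantized database vectors."""
    D = q.shape[0] // M
    # Per-block lookup tables of squared distances from q to every centroid.
    tables = [((cb.cluster_centers_ - q[m * D:(m + 1) * D]) ** 2).sum(axis=1)
              for m, cb in enumerate(codebooks)]
    dist = sum(tables[m][codes[:, m]] for m in range(M))
    return np.argsort(dist)  # database items ranked by estimated distance
```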


Computer Vision and Pattern Recognition | 2010

Fast and robust object segmentation with the Integral Linear Classifier

David Aldavert; Ramon López de Mántaras; Arnau Ramisa; Ricardo Toledo

We propose an efficient method, built on the popular Bag of Features approach, that obtains robust multiclass pixel-level object segmentation of an image in less than 500 ms, with results comparable to or better than most state-of-the-art methods. We introduce the Integral Linear Classifier (ILC), which can readily obtain the classification score for any image sub-window with only 6 additions and 1 product by fusing the accumulation and classification steps in a single operation. In order to design a method as efficient as possible, our building blocks are carefully selected from the quickest in the state of the art. More precisely, we evaluate the performance of three popular local descriptors, which can be computed very efficiently using integral images, and two fast quantization methods: Hierarchical K-Means and the Extremely Randomized Forest. Finally, we explore the utility of adding spatial bins to the Bag of Features histograms, and of cascade classifiers, to improve the obtained segmentation. Our method is compared to the state of the art on the difficult Graz-02 and PASCAL 2007 Segmentation Challenge datasets.
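The key observation behind the ILC is that, once every pixel carries the linear-classifier weight of its visual word, a single integral image makes any window's score a constant number of table lookups. A minimal NumPy illustration of that idea (a sketch, not the authors' implementation):

```python
# Hedged sketch of the integral-image trick behind the Integral Linear Classifier.
import numpy as np

def integral_score_map(word_map, svm_weights):
    """word_map: H x W visual-word indices; svm_weights: weight per word.
    The double cumulative sum lets any window's score be read in O(1)."""
    per_pixel = svm_weights[word_map]            # H x W per-pixel scores
    ii = per_pixel.cumsum(axis=0).cumsum(axis=1)
    return np.pad(ii, ((1, 0), (1, 0)))          # zero top/left border

def window_score(ii, top, left, bottom, right, bias):
    """Linear classification score of a sub-window from four lookups."""
    s = (ii[bottom, right] - ii[top, right]
         - ii[bottom, left] + ii[top, left])
    s /= (bottom - top) * (right - left)         # normalize by window area
    return s + bias
```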


International Conference on Document Analysis and Recognition | 2013

Integrating Visual and Textual Cues for Query-by-String Word Spotting

David Aldavert; Marçal Rusiñol; Ricardo Toledo; Josep Lladós

In this paper, we present a word spotting framework that follows the query-by-string paradigm, where word images are represented by both textual and visual representations. The textual representation is formulated in terms of character n-grams, while the visual one is based on the bag-of-visual-words scheme. These two representations are merged together and projected to a sub-vector space. This transform makes it possible, given a textual query, to retrieve word instances that were only represented by the visual modality. Moreover, this statistical representation can be used together with state-of-the-art indexing structures in order to deal with large-scale scenarios. The proposed method is evaluated on a collection of historical documents, outperforming state-of-the-art approaches.
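As an illustration of the textual representation and the joint projection, the hedged sketch below builds character n-gram vectors with scikit-learn and learns a common sub-space with a truncated SVD as a stand-in for the paper's projection; `lexicon_words` (training transcriptions) and `V` (the row-aligned BoVW vectors of the same word images) are hypothetical inputs.

```python
# Hedged sketch: character n-grams merged with visual vectors, then projected.
import numpy as np
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import TruncatedSVD

# Character bigrams and trigrams over the training transcriptions.
ngrams = CountVectorizer(analyzer="char_wb", ngram_range=(2, 3))
T = ngrams.fit_transform(lexicon_words).toarray().astype(float)
T /= np.maximum(T.sum(axis=1, keepdims=True), 1.0)   # L1 normalization

# Merge textual and visual vectors, then learn a common sub-space.
merged = np.hstack([T, V])
common = TruncatedSVD(n_components=128).fit(merged)

def embed_text_query(word):
    """Embed a text-only query by zero-padding the missing visual part."""
    t = ngrams.transform([word]).toarray().astype(float)
    t /= max(t.sum(), 1.0)
    return common.transform(np.hstack([t, np.zeros((1, V.shape[1]))]))
```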


Autonomous Robots | 2009

Robust vision-based robot localization using combinations of local feature region detectors

Arnau Ramisa; Adriana Tapus; David Aldavert; Ricardo Toledo; Ramon López de Mántaras

This paper presents a vision-based approach for mobile robot localization. The model of the environment is topological. The new approach characterizes a place using a signature consisting of a constellation of descriptors computed over different types of local affine covariant regions, extracted from an omnidirectional image acquired by rotating a standard camera with a pan-tilt unit. This type of representation permits reliable and distinctive environment modelling. Our objectives were to validate the proposed method in indoor environments and to find out whether combining complementary local feature region detectors improves localization compared with using a single region detector. Our experimental results show that, if false matches are effectively rejected, combining different affine covariant region detectors notably increases the performance of the approach by exploiting the strengths of the individual detectors. In order to reduce the localization time, two strategies are evaluated: re-ranking the map nodes using a global similarity measure, and using a standard perspective view with a field of view of 45°. In order to systematically test topological localization methods, a further contribution of this work is a novel method to measure how localization performance degrades as the robot moves away from the point where the original signature was acquired. This makes it possible to assess the robustness of the proposed signature. For this to be effective, it must be done in several varied environments that cover the situations in which the robot may have to perform localization.
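False-match rejection is central to making detector combinations pay off. The sketch below shows one standard approach, Lowe's ratio test, pooling matches from two complementary OpenCV detectors (SIFT and ORB here stand in for the affine covariant region detectors used in the paper):

```python
# Hedged sketch: pooling ratio-test matches from two complementary detectors.
import cv2

def ratio_matches(desc_q, desc_db, norm, ratio=0.8):
    """Keep a match only if it is clearly better than the second-best one."""
    if desc_q is None or desc_db is None:
        return []
    pairs = cv2.BFMatcher(norm).knnMatch(desc_q, desc_db, k=2)
    return [p[0] for p in pairs
            if len(p) == 2 and p[0].distance < ratio * p[1].distance]

def place_similarity(img_q, img_db):
    """Score two panoramas by pooling surviving matches from both detectors."""
    score = 0
    for det, norm in ((cv2.SIFT_create(), cv2.NORM_L2),
                      (cv2.ORB_create(), cv2.NORM_HAMMING)):
        _, dq = det.detectAndCompute(img_q, None)
        _, db = det.detectAndCompute(img_db, None)
        score += len(ratio_matches(dq, db, norm))
    return score
```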


Journal of Intelligent and Robotic Systems | 2011

Combining Invariant Features and the ALV Homing Method for Autonomous Robot Navigation Based on Panoramas

Arnau Ramisa; Alex Goldhoorn; David Aldavert; Ricardo Toledo; Ramon López de Mántaras

Biologically inspired homing methods, such as the Average Landmark Vector, are an interesting solution for local navigation due to their simplicity. However, they usually require modifying the environment by placing artificial landmarks in order to work reliably. In this paper we combine the Average Landmark Vector with invariant feature points automatically detected in panoramic images to overcome this limitation. The proposed approach was evaluated first in simulation and, given the promising results, also on two datasets of panoramas from real-world environments.
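For reference, the ALV computation itself is very compact. A minimal sketch, assuming landmark bearings have already been extracted from the panoramas and are expressed in a shared compass frame:

```python
# Hedged sketch of Average Landmark Vector (ALV) homing.
import numpy as np

def alv(bearings_rad):
    """Average of the unit vectors pointing at the detected landmarks."""
    return np.stack([np.cos(bearings_rad),
                     np.sin(bearings_rad)], axis=1).mean(axis=0)

def homing_vector(bearings_current, bearings_home):
    """Moving along ALV(current) - ALV(home) drives the robot toward home
    (assuming both ALVs are expressed in a common compass frame)."""
    return alv(bearings_current) - alv(bearings_home)
```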


International Journal on Document Analysis and Recognition | 2015

A study of Bag-of-Visual-Words representations for handwritten keyword spotting

David Aldavert; Marçal Rusiñol; Ricardo Toledo; Josep Lladós

The Bag-of-Visual-Words (BoVW) framework has gained popularity in the document image analysis community, specifically as a representation of handwritten words for recognition or spotting purposes. Although the BoVW method has been greatly improved in the computer vision field, most approaches in the document image analysis domain still rely on the basic implementation of BoVW, disregarding such refinements. In this paper, we present a review of those improvements and their application to the keyword spotting task. We thoroughly evaluate their impact against a baseline system on the well-known George Washington dataset and compare the obtained results against nine state-of-the-art keyword spotting methods. In addition, we compare both the baseline and improved systems with the methods presented at the Handwritten Keyword Spotting Competition 2014.
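Among the refinements studied in this line of work, normalization is the simplest to illustrate. A hedged sketch of power ("square-root") normalization followed by L2 normalization, a widely used BoVW improvement (an illustration, not necessarily the exact variant evaluated in the paper):

```python
# Hedged sketch: power + L2 normalization of a BoVW histogram.
import numpy as np

def normalize_bovw(hist, alpha=0.5):
    """Power normalization damps bursty visual words; L2 normalization
    makes dot-product similarities comparable across word images."""
    h = np.sign(hist) * np.abs(hist) ** alpha
    n = np.linalg.norm(h)
    return h / n if n > 0 else h
```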


Journal of Intelligent and Robotic Systems | 2012

Evaluation of Three Vision Based Object Perception Methods for a Mobile Robot

Arnau Ramisa; David Aldavert; Shrihari Vasudevan; Ricardo Toledo; Ramon López de Mántaras

This paper addresses visual object perception applied to mobile robotics. Being able to perceive household objects in unstructured environments is a key capability for robots performing complex tasks in home environments. However, finding a solution for this task is daunting: it requires the ability to handle the variability of image formation from a moving camera under tight time constraints. The paper brings to attention some of the issues with applying three state-of-the-art object recognition and detection methods in a mobile robotics scenario, and proposes methods to deal with windowing/segmentation. This work thus aims at evaluating the state of the art in object perception in an attempt to develop a lightweight solution for mobile robotics use and research in typical indoor settings.


International Conference on Document Analysis and Recognition | 2015

Towards query-by-speech handwritten keyword spotting

Marçal Rusiñol; David Aldavert; Ricardo Toledo; Josep Lladós

In this paper, we present a new querying paradigm for handwritten keyword spotting. We propose to represent handwritten word images by both visual and audio representations, enabling a query-by-speech keyword spotting system. The two representations are merged together and projected to a common sub-space in the training phase. This transform makes it possible, given a spoken query, to retrieve word instances that were only represented by the visual modality. In addition, the same method can be used in reverse, at no additional cost, to produce a handwritten text-to-speech system. We present our first results on this new querying mechanism using synthetic voices over the George Washington dataset.
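One standard way to learn such a common sub-space is canonical correlation analysis. The sketch below uses scikit-learn's CCA as a stand-in for the paper's projection; `X_vis` (BoVW vectors of training word images) and `X_aud` (audio vectors, e.g. pooled MFCCs, of the same words) are hypothetical inputs.

```python
# Hedged cross-modal sketch: CCA as a stand-in common sub-space.
import numpy as np
from sklearn.cross_decomposition import CCA

cca = CCA(n_components=64).fit(X_vis, X_aud)

def embed_visual(V):
    return cca.transform(V)                       # x-scores only

def embed_audio(A):
    # transform() requires an X block; pass zeros and keep the y-scores.
    _, scores = cca.transform(np.zeros((len(A), X_vis.shape[1])), A)
    return scores

def spot(audio_query, database_visual):
    """Rank word images by cosine similarity to a spoken query."""
    q = embed_audio(audio_query[None, :]).ravel()
    D = embed_visual(database_visual)
    D = D / np.linalg.norm(D, axis=1, keepdims=True)
    return np.argsort(-(D @ (q / np.linalg.norm(q))))
```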


International Symposium on Visual Computing | 2009

Efficient Object Pixel-Level Categorization Using Bag of Features

David Aldavert; Arnau Ramisa; Ricardo Toledo; Ramon López de Mántaras

In this paper we present a pixel-level object categorization method suitable for application under real-time constraints. Since pixels are categorized using a bag-of-features scheme, the major bottleneck of such an approach would be the feature pooling in local histograms of visual words. Therefore, we propose to bypass this time-consuming step and directly obtain the score from a linear Support Vector Machine classifier. This is achieved by creating an integral image of the components of the SVM, which can readily yield the classification score for any image sub-window with only 10 additions and 2 products, regardless of its size. In addition, we evaluate the performance of two efficient feature quantization methods: Hierarchical K-Means and the Extremely Randomized Forest. All experiments were carried out on the Graz-02 database, showing results comparable to, or even better than, related work at a lower computational cost.
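Of the two quantizers, Hierarchical K-Means is easy to sketch: a vocabulary tree trades one distance test per leaf for a handful per level. An illustrative Python version (not the paper's implementation), assuming scikit-learn:

```python
# Hedged sketch of hierarchical k-means ("vocabulary tree") quantization.
import numpy as np
from sklearn.cluster import KMeans

def build_hkm(X, branch=10, depth=3, _ids=None):
    """Recursively split descriptors into `branch` clusters per level;
    leaves receive consecutive integer visual-word ids."""
    if _ids is None:
        _ids = iter(range(10**9))
    if depth == 0 or len(X) < branch:
        return {"leaf": next(_ids)}
    km = KMeans(n_clusters=branch, n_init=4, random_state=0).fit(X)
    kids = [build_hkm(X[km.labels_ == c], branch, depth - 1, _ids)
            for c in range(branch)]
    return {"kmeans": km, "children": kids}

def quantize(tree, d):
    """Descend the tree: branch * depth distance tests instead of one per leaf."""
    while "leaf" not in tree:
        c = tree["kmeans"].predict(d[None, :])[0]
        tree = tree["children"][c]
    return tree["leaf"]
```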

Collaboration


Dive into David Aldavert's collaborations.

Top Co-Authors

Ricardo Toledo (Autonomous University of Barcelona)
Ramon López de Mántaras (Spanish National Research Council)
Arnau Ramisa (French Institute for Research in Computer Science and Automation)
Marçal Rusiñol (Autonomous University of Barcelona)
Josep Lladós (Autonomous University of Barcelona)
Alex Goldhoorn (Spanish National Research Council)
Dimosthenis Karatzas (Autonomous University of Barcelona)
Adriana Tapus (University of Southern California)