Publication


Featured research published by Marçal Rusiñol.


International Conference on Document Analysis and Recognition | 2011

Browsing Heterogeneous Document Collections by a Segmentation-Free Word Spotting Method

Marçal Rusiñol; David Aldavert; Ricardo Toledo; Josep Lladós

In this paper, we present a segmentation-free word spotting method that is able to deal with heterogeneous document image collections. We propose a patch-based framework where patches are represented by a bag-of-visual-words model powered by SIFT descriptors. A later refinement of the feature vectors is performed by applying the latent semantic indexing technique. The proposed method performs well on both handwritten and typewritten historical document images. We have also tested our method on documents written in non-Latin scripts.


Pattern Recognition | 2015

Efficient segmentation-free keyword spotting in historical document collections

Marçal Rusiñol; David Aldavert; Ricardo Toledo; Josep Lladós

In this paper we present an efficient segmentation-free word spotting method, applied in the context of historical document collections, that follows the query-by-example paradigm. We use a patch-based framework where local patches are described by a bag-of-visual-words model powered by SIFT descriptors. By projecting the patch descriptors to a topic space with the latent semantic analysis technique and compressing the descriptors with the product quantization method, we are able to efficiently index the document information both in terms of memory and time. The proposed method is evaluated using four different collections of historical documents achieving good performances on both handwritten and typewritten scenarios. The yielded performances outperform the recent state-of-the-art keyword spotting approaches.

Highlights:
We present a query-by-example keyword spotting method for historical collections.
The method is segmentation-free and avoids any pre-processing step.
We use a compact and efficient vectorial representation to index large collections.
We outperform the recent state-of-the-art keyword spotting approaches.


International Conference on Document Analysis and Recognition | 2009

Logo Spotting by a Bag-of-words Approach for Document Categorization

Marçal Rusiñol; Josep Lladós

In this paper we present a method for document categorization which processes incoming document images such as invoices or receipts. The categorization of these document images is done in terms of the presence of a certain graphical logo detected without segmentation. The graphical logos are described by a set of local features and the categorization of the documents is performed by the use of a bag-of-words model. Spatial coherence rules are added to reinforce the correct category hypothesis, aiming also to spot the logo inside the document image. Experiments which demonstrate the effectiveness of this system on a large set of real data are presented.


International Conference on Document Analysis and Recognition | 2013

Bag-of-Features HMMs for Segmentation-Free Word Spotting in Handwritten Documents

Leonard Rothacker; Marçal Rusiñol; Gernot A. Fink

Recent HMM-based approaches to handwritten word spotting require large amounts of learning samples and mostly rely on a prior segmentation of the document. We propose to use Bag-of-Features HMMs, estimated from a single sample, in a patch-based segmentation-free framework. Bag-of-Features HMMs use statistics of local image feature representatives. They can therefore be considered a variant of discrete HMMs that can model the observation of a number of features at a point in time. The discrete nature enables us to estimate a query model with only a single example of the query provided by the user. This makes our method very flexible with respect to the availability of training data. Furthermore, we are able to outperform state-of-the-art results on the George Washington dataset.


International Journal of Pattern Recognition and Artificial Intelligence | 2012

On the Influence of Word Representations for Handwritten Word Spotting in Historical Documents

Josep Lladós; Marçal Rusiñol; Alicia Fornés; David Pradas Fernandez; Anjan Dutta

Word spotting is the process of retrieving all instances of a queried keyword from a digital library of document images. In this paper we evaluate the performance of different word descriptors to assess the advantages and disadvantages of statistical and structural models in a framework of query-by-example word spotting in historical documents. We compare four word representation models, namely sequence alignment using DTW as a baseline reference, a bag-of-visual-words approach as a statistical model, a pseudo-structural model based on a Loci features representation, and a structural approach where words are represented by graphs. The four approaches have been tested with two collections of historical data: the George Washington database and the marriage records from the Barcelona Cathedral. We experimentally demonstrate that statistical representations generally give better performance; however, large descriptors are difficult to implement in a retrieval scenario where word spotting requires indexing data comprising millions of word images.


International Conference on Document Analysis and Recognition | 2013

Integrating Visual and Textual Cues for Query-by-String Word Spotting

David Aldavert; Marçal Rusiñol; Ricardo Toledo; Josep Lladós

In this paper, we present a word spotting framework that follows the query-by-string paradigm where word images are represented both by textual and visual representations. The textual representation is formulated in terms of character n-grams while the visual one is based on the bag-of-visual-words scheme. These two representations are merged together and projected to a sub-vector space. Given a textual query, this transform makes it possible to retrieve word instances that were only represented by the visual modality. Moreover, this statistical representation can be used together with state-of-the-art indexation structures in order to deal with large-scale scenarios. The proposed method is evaluated using a collection of historical documents, outperforming the state of the art.


Document Analysis Systems | 2010

Efficient logo retrieval through hashing shape context descriptors

Marçal Rusiñol; Josep Lladós

In this paper we present a method for organizing and indexing digital libraries of logos, such as those of patent and trademark offices. We propose an efficient query-by-example retrieval system which is able to retrieve logos by similarity from large databases of logo images. Logos are compactly described by a variant of the shape context descriptor. These descriptors are then indexed by a locality-sensitive hashing data structure aiming to perform approximate k-NN search in high dimensional spaces in sub-linear time. The experiments demonstrate the effectiveness and efficiency of this system on realistic datasets such as the Tobacco-800 logo database.


International Journal on Document Analysis and Recognition | 2009

A performance evaluation protocol for symbol spotting systems in terms of recognition and location indices

Marçal Rusiñol; Josep Lladós

Symbol spotting systems are intended to retrieve regions of interest from a document image database where the queried symbol is likely to be found. They should be able to recognize and locate graphical symbols in a single step. In this paper, we present a set of measures to evaluate the performance of a symbol spotting system in terms of recognition abilities, location accuracy and scalability. We show that the proposed measures make it possible to determine the weaknesses and strengths of different methods. In particular, we have tested a symbol spotting method based on a set of four different off-the-shelf shape descriptors.


Pattern Recognition Letters | 2010

Relational indexing of vectorial primitives for symbol spotting in line-drawing images

Marçal Rusiñol; Agnès Borràs; Josep Lladós

This paper presents a symbol spotting approach for indexing by content a database of line-drawing images. As line-drawings are digital-born documents created with vector graphics software, instead of using a pixel-based approach, we present a spotting method based on vector primitives. Graphical symbols are represented by a set of vectorial primitives which are described by an off-the-shelf shape descriptor. A relational indexing strategy aims to retrieve symbol locations in the target documents by using a combined numerical-relational description of 2D structures. The zones which are likely to contain the queried symbol are validated by a Hough-like voting scheme. In addition, a performance evaluation framework for symbol spotting in graphical documents is proposed. The presented methodology has been evaluated with a benchmarking set of architectural documents, achieving good performance results.


Document Analysis Systems | 2008

Word and Symbol Spotting Using Spatial Organization of Local Descriptors

Marçal Rusiñol; Josep Lladós

In this paper we present a method to spot both text and graphical symbols in a collection of images of wiring diagrams. Word spotting and symbol spotting methods tend to use the most discriminative features to describe the objects to be located. As a result, textual and symbolic information cannot be handled at the same time. We propose a spotting architecture able to index both words and symbols, inspired by off-the-shelf object recognition architectures. Keypoints are extracted from a document image and a local descriptor is computed at each of these points of interest. The spatial organization of these descriptors validates the hypothesis of finding an object (text or symbol) at a certain location and under a certain pose.

Collaboration

Top Co-Authors

Josep Lladós
Autonomous University of Barcelona

Dimosthenis Karatzas
Autonomous University of Barcelona

Jean-Marc Ogier
University of La Rochelle

Joseph Chazalon
University of La Rochelle

David Aldavert
Autonomous University of Barcelona

Ricardo Toledo
Autonomous University of Barcelona

Lluis Gomez
Autonomous University of Barcelona

Nibal Nayef
University of La Rochelle