Francisco Álvaro | Researchain

Publication

Featured research published by Francisco Álvaro.

Pattern Recognition Letters | 2014

Recognition of on-line handwritten mathematical expressions using 2D stochastic context-free grammars and hidden Markov models

Francisco Álvaro; Joan-Andreu Sánchez; José-Miguel Benedí

This paper describes a formal model for the recognition of on-line handwritten mathematical expressions using 2D stochastic context-free grammars and hidden Markov models. Hidden Markov models are used to recognize mathematical symbols, and a stochastic context-free grammar is used to model the relations between these symbols. This formal model makes it possible to use classic algorithms for parsing and stochastic estimation. As a result, first, the model is able to capture many of the variability phenomena that appear in on-line handwritten mathematical expressions during the training process. Second, the parsing process can make decisions based only on stochastic information, avoiding heuristic decisions. The proposed model participated in a mathematical expression recognition contest, where it obtained the best results at different levels.

Document Engineering | 2013

A shape-based layout descriptor for classifying spatial relationships in handwritten math

Francisco Álvaro; Richard Zanibbi

We consider the difficult problem of classifying spatial relationships between symbols and subexpressions in handwritten mathematical expressions. We first improve existing geometric features based on bounding boxes and center points, normalizing them using the distance between the centers of the two symbols or subexpressions in question. We then propose a novel feature set for layout classification, using polar histograms computed over points in handwritten strokes. A series of experiments is presented in which a Support Vector Machine is used with these new features to classify spatial relationships of five types in the MathBrush corpus (horizontal, superscript, subscript, below, and inside, e.g. in a square root). The normalized geometric features provide an improvement over previously published results, while the shape-based features provide a natural representation with results comparable to those for the geometric features. Combining the features produced a very small improvement in accuracy.
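The two feature families described in this abstract can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the function names, the choice of edge differences as geometric features, the histogram center, and the bin counts are all assumptions made for the example.

```python
import math

def normalized_box_features(box_a, box_b):
    """Edge differences between two bounding boxes, divided by the distance
    between their centers. Boxes are (xmin, ymin, xmax, ymax).
    The exact feature set here is hypothetical."""
    ax = (box_a[0] + box_a[2]) / 2.0
    ay = (box_a[1] + box_a[3]) / 2.0
    bx = (box_b[0] + box_b[2]) / 2.0
    by = (box_b[1] + box_b[3]) / 2.0
    dist = math.hypot(bx - ax, by - ay) or 1.0  # avoid division by zero
    return [(b - a) / dist for a, b in zip(box_a, box_b)]

def polar_histogram(points, center, n_dist=3, n_angle=8):
    """Fraction of stroke points falling in each (distance ring, angle sector)
    bin around `center`. Assumes `points` is non-empty. Bin counts are
    illustrative defaults, not values from the paper."""
    cx, cy = center
    radius = max(math.hypot(x - cx, y - cy) for x, y in points) or 1.0
    hist = [[0.0] * n_angle for _ in range(n_dist)]
    for x, y in points:
        d = math.hypot(x - cx, y - cy)
        ring = min(int(d / radius * n_dist), n_dist - 1)
        theta = math.atan2(y - cy, x - cx) % (2 * math.pi)
        sector = min(int(theta / (2 * math.pi) * n_angle), n_angle - 1)
        hist[ring][sector] += 1.0
    total = float(len(points))
    return [[count / total for count in row] for row in hist]
```

Flattening the histogram and concatenating it with the normalized box features would give one feature vector per symbol pair, which could then be passed to a classifier such as a Support Vector Machine.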

International Conference on Document Analysis and Recognition | 2011

Recognition of Printed Mathematical Expressions Using Two-Dimensional Stochastic Context-Free Grammars

Francisco Álvaro; Joan-Andreu Sánchez; José-Miguel Benedí

In this work, a system for the recognition of printed mathematical expressions has been developed. To this end, a statistical framework based on two-dimensional stochastic context-free grammars has been defined. This formal framework makes it possible to jointly tackle the segmentation, symbol recognition and structural analysis of a mathematical expression by computing its most probable parse. To test this approach, a reproducible and comparable experiment has been carried out on a large publicly available database (InftyCDB-1). Results are reported using a well-defined global dissimilarity measure. Experimental results show that this technique is able to properly recognize mathematical expressions, and that the structural information improves the symbol recognition step.
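To make "computing its most probable parse" concrete, the sketch below shows Viterbi-style CYK parsing for an ordinary one-dimensional stochastic context-free grammar in Chomsky normal form. It is a simplified stand-in under stated assumptions: the paper's grammars are two-dimensional and also account for spatial relations and segmentation, which this example omits. The function name, rule encoding, and toy grammar are hypothetical.

```python
import math

def viterbi_cyk(tokens, lexical, binary, start="S"):
    """Log-probability of the most likely parse, or -inf if none exists.
    lexical: {(A, token): prob} for rules A -> token
    binary:  {(A, B, C): prob} for rules A -> B C"""
    n = len(tokens)
    # best[i][j] maps a nonterminal to its best log-probability over tokens[i:j]
    best = [[{} for _ in range(n + 1)] for _ in range(n + 1)]
    for i, tok in enumerate(tokens):
        for (a, t), p in lexical.items():
            if t == tok:
                cell = best[i][i + 1]
                cell[a] = max(cell.get(a, -math.inf), math.log(p))
    for length in range(2, n + 1):
        for i in range(n - length + 1):
            j = i + length
            for k in range(i + 1, j):  # split point between the two children
                for (a, b, c), p in binary.items():
                    if b in best[i][k] and c in best[k][j]:
                        score = math.log(p) + best[i][k][b] + best[k][j][c]
                        if score > best[i][j].get(a, -math.inf):
                            best[i][j][a] = score
    return best[0][n].get(start, -math.inf)

# Toy grammar (hypothetical): Exp -> Term OpTerm, OpTerm -> Op Term
LEXICAL = {("Term", "x"): 0.5, ("Term", "y"): 0.5, ("Op", "+"): 1.0}
BINARY = {("Exp", "Term", "OpTerm"): 1.0, ("OpTerm", "Op", "Term"): 1.0}
```

Calling `viterbi_cyk(["x", "+", "y"], LEXICAL, BINARY, start="Exp")` scores the single parse of the toy input. A two-dimensional variant would, roughly speaking, attach a spatial relation to each binary rule and search over regions rather than contiguous token spans.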

International Conference on Pattern Recognition | 2010

Comparing Several Techniques for Offline Recognition of Printed Mathematical Symbols

Francisco Álvaro; Joan-Andreu Sánchez

Automatic recognition of printed mathematical symbols is a fundamental problem for the recognition of mathematical expressions. Several classification techniques have been used previously, but very few works compare different classification techniques on the same database and under the same experimental conditions. In this work we have tested classical and novel classification techniques for mathematical symbol recognition on two databases.

International Conference on Pattern Recognition | 2014

Offline Features for Classifying Handwritten Math Symbols with Recurrent Neural Networks

Francisco Álvaro; Joan-Andreu Sánchez; José-Miguel Benedí

In mathematical expression recognition, symbol classification is a crucial step. Numerous approaches for recognizing handwritten math symbols have been published, but most of them are either online or hybrid approaches. No study has focused on offline features for handwritten math symbol recognition. Furthermore, many papers provide results that are difficult to compare. In this paper we assess the performance of several well-known offline features for this task. We also test a novel set of features based on polar histograms and the vertical repositioning method for feature extraction. Finally, we report and analyze the results of several experiments using recurrent neural networks on a large public database of online handwritten math expressions. The combination of online and offline features significantly improved the recognition rate.

Pattern Recognition | 2016

An integrated grammar-based approach for mathematical expression recognition

Francisco Álvaro; Joan-Andreu Sánchez; José-Miguel Benedí

Automatic recognition of mathematical expressions is a challenging pattern recognition problem since there are many ambiguities at different levels: on the one hand, in the recognition of the symbols of the mathematical expression; on the other hand, in the detection of the two-dimensional structure that relates the symbols and represents the math expression. These problems are closely related since symbol recognition is influenced by the structure of the expression, while the structure strongly depends on the symbols that are recognized. For these reasons, we present an integrated approach that combines several stochastic sources of information and is able to globally determine the most likely expression. This way, symbol segmentation, symbol recognition and structural analysis are simultaneously optimized. In this paper we define the statistical framework of a model based on two-dimensional grammars and its associated parsing algorithm. Since the search space is too large, restrictions are introduced to make the search feasible. We have developed a system that implements this approach and we report results on the large public dataset of the CROHME international competition. This approach significantly outperforms other proposals and was awarded best system using only the training dataset of the competition. Highlights: An integrated mathematical expression recognition system is proposed. This system integrates several knowledge sources. The learning of the system is described. Experiments with public databases are reported.

International Conference on Document Analysis and Recognition | 2013

Classification of On-Line Mathematical Symbols with Hybrid Features and Recurrent Neural Networks

Francisco Álvaro; Joan-Andreu Sánchez; José-Miguel Benedí

Recognition of on-line handwritten mathematical symbols has been tackled using different methods, but the recognition rates achieved so far still leave room for improvement. Many of the published approaches are based on hidden Markov models, and some of them use off-line information extracted from the on-line data. In this paper, we present a set of hybrid features that combine both on-line and off-line information. Recently, recurrent neural networks have been shown to obtain good results and have outperformed hidden Markov models in several sequence learning tasks, including handwritten text recognition. Hence, we also studied a state-of-the-art recurrent neural network classifier and compared its performance with a classifier based on hidden Markov models. Experiments using a large public database showed that both the newly proposed features and the recurrent neural network classifier significantly improved the classification results.

Iberian Conference on Pattern Recognition and Image Analysis | 2013

Page Segmentation of Structured Documents Using 2D Stochastic Context-Free Grammars

Francisco Álvaro; Francisco Cruz; Joan-Andreu Sánchez; Oriol Ramos Terrades; José-Miguel Benedí

In this paper we define a bidimensional extension of Stochastic Context-Free Grammars for page segmentation of structured documents. Two sets of text classification features are used to perform an initial classification of each zone of the page. Then, the page segmentation is obtained as the most likely hypothesis according to a grammar. This approach is compared to Conditional Random Fields and results show significant improvements in several cases. Furthermore, grammars provide a detailed segmentation that allowed a semantic evaluation which also validates this model.

International Conference on Frontiers in Handwriting Recognition | 2012

Unbiased Evaluation of Handwritten Mathematical Expression Recognition

Francisco Álvaro; Joan-Andreu Sánchez; José-Miguel Benedí

Several approaches have been proposed to tackle the problem of mathematical expression recognition, and automatic methods for performance evaluation are required. Mathematical expressions are usually encoded as a LaTeX string or a tree (MathML) for evaluation purposes, but these formats do not enforce uniqueness. Consequently, given that there can be several representations that are syntactically different but semantically equivalent, the automatic performance evaluation of mathematical expressions can be biased. Given a recognized mathematical expression tree and its ground-truth tree, the error is usually computed by comparing them. In this paper we propose to obtain a new tree, equivalent to the ground-truth tree, according to the model representation criteria. Then, we can compute an error by comparing the recognized tree with the one obtained by using the model, both with the same bias. Several experiments were carried out in order to evaluate this approach, and the results showed that the representation criteria had a significant effect on the evaluation results.
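The non-uniqueness problem mentioned in this abstract can be illustrated with a small example: `x^2_i` and `x_i^2` render identically but yield different trees if children are stored in writing order. The sketch below is only a simplified illustration of that bias, not the method proposed in the paper (which derives a ground-truth-equivalent tree according to the model's representation criteria). The tree encoding and the function names `canonical` and `same_expression` are assumptions.

```python
def canonical(tree):
    """Return a canonical form of an expression tree.
    A tree is (symbol, [(relation, subtree), ...]); children are re-ordered
    by relation label so that writing order no longer matters."""
    symbol, children = tree
    ordered = sorted(
        ((rel, canonical(sub)) for rel, sub in children),
        key=lambda pair: pair[0],
    )
    return (symbol, tuple(ordered))

def same_expression(tree_a, tree_b):
    """Compare two trees after mapping both to the same canonical form."""
    return canonical(tree_a) == canonical(tree_b)

# x^2_i and x_i^2 encoded in writing order (hypothetical encoding)
SUP_FIRST = ("x", [("sup", ("2", [])), ("sub", ("i", []))])
SUB_FIRST = ("x", [("sub", ("i", [])), ("sup", ("2", []))])
```

A direct comparison `SUP_FIRST == SUB_FIRST` reports a mismatch, whereas `same_expression(SUP_FIRST, SUB_FIRST)` treats the two encodings as the same expression, which is the kind of evaluation bias the abstract describes.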

Neurocomputing | 2015

Structure detection and segmentation of documents using 2D stochastic context-free grammars

Francisco Álvaro; Francisco Cruz; Joan-Andreu Sánchez; Oriol Ramos Terrades; José-Miguel Benedí

In this paper we define a bidimensional extension of stochastic context-free grammars for structure detection and segmentation of images of documents. Two sets of text classification features are used to perform an initial classification of each zone of the page. Then, the document segmentation is obtained as the most likely hypothesis according to a stochastic grammar. We used a dataset of historical marriage license books to validate this approach. We also tested several inference algorithms for probabilistic graphical models and the results showed that the proposed grammatical model outperformed the other methods. Furthermore, grammars also provide the document structure along with its segmentation.
