Simone Marinai
University of Florence
Publication
Featured research published by Simone Marinai.
IEEE Transactions on Pattern Analysis and Machine Intelligence | 2005
Simone Marinai; Marco Gori; Giovanni Soda
Artificial neural networks have been extensively applied to document analysis and recognition. Most efforts have been devoted to the recognition of isolated handwritten and printed characters with widely recognized successful results. However, many other document processing tasks, like preprocessing, layout analysis, character segmentation, word recognition, and signature verification, have been effectively addressed with very promising results. This paper surveys the most significant problems in the area of offline document image processing, where connectionist-based approaches have been applied. Similarities and differences between approaches belonging to different categories are discussed. Particular emphasis is given to the crucial role of prior knowledge for the conception of both appropriate architectures and learning algorithms. Finally, the paper provides a critical analysis of the reviewed approaches and depicts the most promising research guidelines in the field. In particular, a second generation of connectionist-based models is foreseen, based on appropriate graphical representations of the learning environment.
IEEE Transactions on Pattern Analysis and Machine Intelligence | 1998
Francesca Cesarini; Marco Gori; Simone Marinai; Giovanni Soda
We describe a flexible form-reader system capable of extracting textual information from accounting documents, like invoices and bills of service companies. In this kind of document, the extraction of some information fields cannot take place without having detected the corresponding instruction fields, which are only constrained to range in given domains. We propose modeling the document layout by means of attributed relational graphs, which turn out to be very effective for form registration, as well as for performing a focused search for instruction fields. This search is carried out by means of a hybrid model, where proper algorithms, based on morphological operations and connected components, are integrated with connectionist models. Experimental results are given in order to assess the actual performance of the system.
Archive | 2004
Simone Marinai; Andreas Dengel
Implications of technical demands made within digital libraries (DL’s) for document image analysis systems are discussed. The state-of-the-art is summarized, including a digest of themes that emerged during the recent International Workshop on Document Image Analysis for Libraries. We attempt to specify, in considerable detail, the essential features of document analysis systems that can assist in: (a) the creation of DL’s; (b) automatic indexing and retrieval of doc-images within DL’s; (c) the presentation of doc-images to DL users; (d) navigation within and among doc-images in DL’s; and (e) effective use of personal and
graphics recognition | 1997
Enrico Francesconi; Paolo Frasconi; Marco Gori; Simone Marinai; Jianqing Sheng; Giovanni Soda; Alessandro Sperduti
In this paper we propose recognizing logo images by using an adaptive model referred to as a recursive artificial neural network. First, logo images are converted into a structured representation based on contour trees. Recursive neural networks are then trained using the contour trees as inputs. The contour tree is constructed by associating a node with each exterior or interior contour extracted from the logo instance. Nodes in the tree are labeled by a feature vector, which describes the contour by means of its perimeter, surrounded area, and a synthetic representation of its curvature plot. The contour-tree representation contains the topological structure of the logo and continuous values pertaining to each contour node. Hence symbolic and sub-symbolic information coexist in the contour-tree representation of a logo image. Experimental results are reported on 40 real logos distorted with artificial noise, and the performance of the recursive neural network is compared with that of two other types of neural approaches.
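The contour-tree labeling described in the abstract above can be illustrated with a minimal sketch. It assumes each contour is available as a closed polygon given by its vertices; the class name `ContourNode` and the helper `labeled` are hypothetical, and the curvature-plot component of the feature vector is omitted because the abstract does not specify it.

```python
from dataclasses import dataclass, field
from math import hypot
from typing import List, Tuple


@dataclass
class ContourNode:
    """One contour of a logo; children are the contours nested inside it."""
    points: List[Tuple[float, float]]  # polygon vertices (assumed input format)
    children: List["ContourNode"] = field(default_factory=list)

    def features(self) -> Tuple[float, float]:
        """Return (perimeter, enclosed area); curvature is left out of this sketch."""
        n = len(self.points)
        perimeter = 0.0
        twice_area = 0.0
        for i in range(n):
            x0, y0 = self.points[i]
            x1, y1 = self.points[(i + 1) % n]
            perimeter += hypot(x1 - x0, y1 - y0)
            twice_area += x0 * y1 - x1 * y0  # shoelace formula term
        return perimeter, abs(twice_area) / 2


def labeled(node: ContourNode):
    """Depth-first view of the tree: (feature vector, list of labeled children)."""
    return node.features(), [labeled(child) for child in node.children]
```

For a 4x4 square containing a 2x2 hole, `labeled` yields a root labeled with perimeter 16 and area 16 and a single child labeled with perimeter 8 and area 4, which is the kind of structured input a recursive network would consume.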
IEEE Transactions on Pattern Analysis and Machine Intelligence | 2006
Simone Marinai; Emanuele Marino; Giovanni Soda
We propose an approach for the word-level indexing of modern printed documents which are difficult to recognize using current OCR engines. By means of word-level indexing, it is possible to retrieve the position of words in a document, enabling queries involving proximity of terms. Web search engines implement this kind of indexing, allowing users to retrieve Web pages on the basis of their textual content. Nowadays, digital libraries hold collections of digitized documents that can be retrieved either by browsing the document images or relying on appropriate metadata assembled by domain experts. Word indexing tools would therefore increase the access to these collections. The proposed system is designed to index homogeneous document collections by automatically adapting to different languages and font styles without relying on OCR engines for character recognition. The approach is based on three main ideas: the use of self-organizing maps (SOM) to perform unsupervised character clustering, the definition of one suitable vector-based word representation whose size depends on the word aspect-ratio, and the run-time alignment of the query word with indexed words to deal with broken and touching characters. The most appropriate applications are for processing modern printed documents (17th to 19th centuries) where current OCR engines are less accurate. Our experimental analysis addresses six data sets containing documents ranging from books of the 17th century to contemporary journals.
International Journal on Document Analysis and Recognition | 2001
Enrico Appiani; Francesca Cesarini; Anna Maria Colla; Michelangelo Diligenti; Marco Gori; Simone Marinai; Giovanni Soda
In this paper a system for analysis and automatic indexing of imaged documents for high-volume applications is described. This system, named STRETCH (STorage and RETrieval by Content of imaged documents), is based on an Archiving and Retrieval Engine, which overcomes the bottleneck of document profiling by bypassing some limitations of existing pre-defined indexing schemes. The engine exploits a structured document representation and can activate appropriate methods to characterise and automatically index heterogeneous documents with variable layout. The originality of STRETCH lies principally in the possibility for unskilled users to define the indexes relevant to the document domains of their interest by simply presenting visual examples and applying reliable automatic information extraction methods (document classification, flexible reading strategies) to index the documents automatically, thus creating archives as desired. STRETCH offers ease of use and application programming and the ability to dynamically adapt to new types of documents. The system has been tested in two applications in particular, one concerning passive invoices and the other bank documents. In these applications, several classes of documents are involved. The indexing strategy first automatically classifies the document, thus avoiding pre-sorting, then locates and reads the information pertaining to the specific document class. Experimental results are encouraging overall; in particular, document classification results fulfill the requirements of high-volume application. Integration into production lines is under way.
Archive | 2008
Simone Marinai; Hiromichi Fujisawa
The objective of Document Analysis and Recognition (DAR) is to recognize the text and graphical components of a document and to extract information. This book is a collection of research papers and state-of-the-art reviews by leading researchers all over the world including pointers to challenges and opportunities for future research directions. The main goals of the book are identification of good practices for the use of learning strategies in DAR, identification of DAR tasks more appropriate for these techniques, and highlighting new learning algorithms that may be successfully applied to DAR.
international conference on document analysis and recognition | 1999
Francesca Cesarini; Marco Gori; Simone Marinai; Giovanni Soda
We describe a top-down approach to the segmentation and representation of documents containing tabular structures. Examples of these documents are invoices and technical papers with tables. The segmentation is based on an extension of X-Y trees, where the regions are split by means of cuts along separators (e.g. lines), in addition to cuts along white spaces. The leaves describe regions containing homogeneous information and cutting separators. Adjacency links among leaves of the tree describe local relationships between corresponding regions.
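As a rough illustration of the top-down splitting mentioned in the abstract above, here is a minimal sketch of a plain X-Y cut over a binary page image. It covers only cuts along white spaces; the cuts along separators and the adjacency links that the paper adds are not modeled. The names `xy_cut` and `leaves`, the `min_gap` parameter, and the list-of-lists image format are assumptions made for this example.

```python
from typing import List, Optional, Tuple

Box = Tuple[int, int, int, int]  # (top, left, bottom, right), bottom/right exclusive


def _widest_gap(profile: List[int], min_gap: int) -> Optional[Tuple[int, int]]:
    """Widest run of zeros that has ink after it and does not start at index 0."""
    best = None
    run_start = None
    for i, value in enumerate(profile):
        if value == 0:
            if run_start is None:
                run_start = i
            continue
        if run_start is not None and run_start > 0 and i - run_start >= min_gap:
            if best is None or i - run_start > best[1] - best[0]:
                best = (run_start, i)
        run_start = None
    return best


def xy_cut(img: List[List[int]], box: Optional[Box] = None, min_gap: int = 2) -> dict:
    """Recursively split `box` along its widest white gap; img holds 1 for ink."""
    if box is None:
        box = (0, 0, len(img), len(img[0]))
    top, left, bottom, right = box
    rows = [sum(img[r][left:right]) for r in range(top, bottom)]
    cols = [sum(img[r][c] for r in range(top, bottom)) for c in range(left, right)]
    h = _widest_gap(rows, min_gap)  # gap between rows: splits top from bottom
    v = _widest_gap(cols, min_gap)  # gap between columns: splits left from right
    if h is None and v is None:
        return {"box": box, "children": []}  # homogeneous region: a leaf
    if v is None or (h is not None and h[1] - h[0] >= v[1] - v[0]):
        parts = [(top, left, top + h[0], right), (top + h[1], left, bottom, right)]
    else:
        parts = [(top, left, bottom, left + v[0]), (top, left + v[1], bottom, right)]
    return {"box": box, "children": [xy_cut(img, p, min_gap) for p in parts]}


def leaves(node: dict) -> List[Box]:
    """Collect the boxes of the leaf regions, in depth-first order."""
    if not node["children"]:
        return [node["box"]]
    return [b for child in node["children"] for b in leaves(child)]
```

On a page with a full-width header above two side-by-side blocks, `leaves(xy_cut(page))` returns one box for the header and one for each of the two blocks.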
international conference on document analysis and recognition | 1997
Francesca Cesarini; Enrico Francesconi; Marco Gori; Simone Marinai; Jianqing Sheng; Giovanni Soda
Much attention has recently been paid to the recognition of graphical objects, such as company logos and trademarks. Recognizing these objects facilitates the recognition of document classes. Some promising results have been achieved by using autoassociator-based artificial neural networks (AANN) in the presence of homogeneously distributed noise. However, the performance drops significantly when dealing with spot-noisy logos, where strips or blobs produce a partial obstruction of the pictures. We propose a new approach for training AANNs especially conceived for dealing with spot noise. The basic idea is to introduce new metrics for assessing the reproduction error in AANNs. The proposed algorithm, referred to as spot-backpropagation (S-BP), is significantly more robust with respect to spot-noise than classical Euclidean norm-based backpropagation (BP). Our experimental results are based on a database of 88 real logos that are artificially corrupted by spot-noise.
international conference on pattern recognition | 2002
Francesca Cesarini; Simone Marinai; L. Sarti; Giovanni Soda
We describe an approach for table location in document images. The documents are described by means of a hierarchical representation that is based on the MXY tree. The presence of a table is hypothesized by searching for parallel lines in the MXY tree of the page. This hypothesis is afterwards verified by locating perpendicular lines or white spaces in the region included between the parallel lines. Lastly, located tables can be merged on the basis of proximity and similarity criteria. The use of an optimization method that relies on the definition of an appropriate table location index allows us to identify the optimal values of the thresholds involved in the algorithm. In this way the algorithm can be adapted to recognize tables with different features by maximizing the performance on an appropriate training set. The algorithm has been evaluated on two data sets containing more than 1500 pages, and its results have been compared with the tables identified by two commercial OCRs.