Publication


Featured research published by Claudie Faure.


International Conference on Document Analysis and Recognition | 2003

Detection, extraction and representation of tables

Jean-Yves Ramel; Michel Crucianu; Nicole Vincent; Claudie Faure

We are concerned with the extraction of tables from exchange format representations of very diverse composite documents. We put forward a flexible representation scheme for complex tables, based on a clear distinction between the physical layout of a table and its logical structure. Relying on this scheme, we develop a new method for the detection and the extraction of tables by an analysis of the graphic lines. To deal with tables that lack all or most of the graphic marks, one must focus on the regularities of the text elements alone. We propose such a method, based on a multi-level analysis of the layout of text components on a page. A general graph representation of the relative positions of blocks of text is exploited.


International Conference on Document Analysis and Recognition | 1995

Pattern recognition and beautification for a pen based interface

Luc Julia; Claudie Faure

The paper presents the algorithms for recognition and beautification which are used in incremental graphic design applications. These applications propose multimodal interfaces integrating handwriting, gesture, and speech on a pen-computer. User and computer collaborate to perform the task of incrementally designing a drawing. Processing and data representation take into account the variable quality of handwritten data, the man-machine interaction context and the cooperation between the user and the interpretation system. Both recognition algorithms may be used in combination in order to increase the speed and the set of recognized figures. Local recognition is followed by the beautification of the global structure in order to detect alignments and logical structures. The beautification enables the user to display a clean version of the original draft. The applications which the authors developed are used to recognize tables, gestures, geometrical figures or diagram networks.


Pattern Recognition | 2012

Word spotting in historical printed documents using shape and sequence comparisons

Khurram Khurshid; Claudie Faure; Nicole Vincent

Information spotting in scanned historical document images is a very challenging task. The joint use of the mechanical press and of human-controlled inking introduced great variability in ink level within a book or even within a page. Consequently, characters are often broken or merged together and thus become difficult to segment and recognize. The limitations of commercial OCR engines for information retrieval in historical document images have inspired alternative means of identification of given words in such documents. We present a word spotting method for scanned documents in order to find the word images that are similar to a query word, without assuming a correct segmentation of the words into characters. The connected components are first processed to transform a word pattern into a sequence of sub-patterns. Each sub-pattern is represented by a sequence of feature vectors. A modified Edit distance is proposed to perform a segmentation-driven string matching and to compute the Segmentation Driven Edit (SDE) distance between the words to be compared. The set of SDE operations is defined to obtain the word segmentations that are the most appropriate to evaluate their similarity. These operations cope efficiently with broken and touching characters in words. The distortion of character shapes is handled by coupling the string matching process with local shape comparisons that are achieved by Dynamic Time Warping (DTW). The costs of the SDE operations are provided by the DTW distances. A sub-optimal version of the SDE string matching is also proposed to reduce the computation time; it did not lead to a great decrease in performance. A query can be given by example or entered as text with the keyboard. Textual queries can be used to directly spot the word without the need to synthesize its image, provided that character prototype images are available. Results are presented for different documents and compared with other methods, showing the efficiency of our method.
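
The local shape comparison named in this abstract, Dynamic Time Warping, can be illustrated with the textbook recurrence. The following is a minimal sketch and not the authors' implementation: the function name `dtw_distance`, the Euclidean local cost and the unconstrained warping path are all assumptions made for illustration.

```python
import math


def dtw_distance(a, b):
    """Textbook DTW cost between two sequences of feature vectors.

    `a` and `b` are sequences of equal-dimension numeric tuples.
    The Euclidean local cost is an illustrative assumption.
    """
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] holds the minimal cumulative cost of aligning a[:i] with b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = math.dist(a[i - 1], b[j - 1])
            # Extend the cheapest of the three admissible predecessor alignments.
            cost[i][j] = local + min(
                cost[i - 1][j],      # a[i-1] is matched to an already used b element
                cost[i][j - 1],      # b[j-1] is matched to an already used a element
                cost[i - 1][j - 1],  # both sequences advance together
            )
    return cost[n][m]
```

A sequence and a time-stretched copy of it, such as `[(0,), (1,), (2,)]` and `[(0,), (1,), (1,), (2,)]`, align with zero cost, which is the property that makes DTW tolerant to local shape distortion.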


International Conference on Document Analysis and Recognition | 1997

Perceptually-based representation of network diagrams

Diana Galindo; Claudie Faure

A pen-based interactive editor for network diagrams is proposed. The network diagrams are composed of geometrical figures and connecting lines. The user may hand-sketch a first diagram and then modify it by adding new components or by erasing, replacing or moving the existing ones. The machine beautifies the draft and updates the whole structure of the diagram when the user makes a local modification. Graphic communication implies that the layout of the diagram follows principles which are grounded in visual perception. The machine must be able to detect perceptual constraints (alignments, equality of sizes, etc.) for beautification and updating. A perceptually structured representation (PSR) of the diagram is built automatically. A formal model of diagram perception is defined as a set of rules which are applied during the global analysis, which follows the local analysis where the figures are detected and recognised.
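
One of the perceptual constraints listed in this abstract, alignment, can be illustrated with a small sketch. This is not the rule set of the paper: the function name `snap_alignments`, the single `tolerance` parameter and the choice of the group mean as the beautified value are hypothetical simplifications.

```python
def snap_alignments(values, tolerance):
    """Replace nearly equal coordinates by the mean of their group.

    `values` could be, for example, the x-coordinates of figure centres.
    Coordinates are chained into one group while consecutive sorted values
    differ by at most `tolerance` (an illustrative grouping rule).
    """
    order = sorted(range(len(values)), key=lambda k: values[k])
    snapped = list(values)

    def flush(group):
        # Assign the group mean to every member of the group.
        if group:
            mean = sum(values[k] for k in group) / len(group)
            for k in group:
                snapped[k] = mean

    group = []
    for k in order:
        if group and values[k] - values[group[-1]] > tolerance:
            flush(group)
            group = []
        group.append(k)
    flush(group)
    return snapped
```

With a tolerance of 5, the coordinates 100, 103 and 98 would be treated as one intended alignment and drawn at a common position, while 200 and 201 form a second one.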


International Conference on Document Analysis and Recognition | 2009

Fusion of Word Spotting and Spatial Information for Figure Caption Retrieval in Historical Document Images

Khurram Khurshid; Claudie Faure; Nicole Vincent

We present a method for figure caption detection by employing a fusion of several information sources. The evaluation is performed on documents gathered from the collection of the historical medical digital library Medic@. A method based on perceptual grouping simultaneously segments the vertical and horizontal text lines in a page. Spatial relationships between the text lines and the graphics are considered to select a set of caption line candidates. A feature-based word-spotting method is proposed to retrieve the occurrences of word images similar to a given query. Word spotting is applied to detect the label of the captions, a word such as ‘Fig’, ‘FIG’, ‘Figure’, ... followed by the figure number. Combining spatial information and word recognition greatly improves the detection of caption lines. Our initial experiments process more than 300 pages from three different books.


Document Recognition and Retrieval | 2009

Simultaneous detection of vertical and horizontal text lines based on perceptual organization

Claudie Faure; Nicole Vincent

A page of a document is a set of small components which are grouped by a human reader into higher level components, such as lines and text blocks. Document image analysis aims at detecting these components in document images. We propose the encoding of local information by considering the properties that determine perceptual grouping. Each connected component is labelled according to the location of its nearest neighbour connected component. These labelled components constitute the input of a rule-based incremental process. Vertical and horizontal text lines are detected without prior assumption on their direction. Touching characters belonging to different lines are detected early and discarded from the grouping process to avoid line merging. The tolerance for grouping components increases in the course of the process until the final decision. After each step of the grouping process, conflict resolution rules are activated. This work was motivated by the automatic detection of Figure&Caption pairs in the documents of the historical collection of the BIUM digital library (Bibliothèque InterUniversitaire Médicale). The images that were used in this study belong to this collection.
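
The labelling step described in this abstract (each connected component labelled by the location of its nearest neighbour) can be sketched as follows. The abstract does not specify the label set or the distance used, so the four direction labels, the centroid-to-centroid Euclidean distance and the function name `label_by_nearest_neighbour` are assumptions for illustration only.

```python
import math


def label_by_nearest_neighbour(centroids):
    """Label each component by the direction of its nearest neighbour.

    `centroids` is a list of (x, y) points in image coordinates, where y
    grows downward. Returns one of "left", "right", "above", "below" per
    component, or "none" when a component has no neighbour.
    """
    labels = []
    for i, (x, y) in enumerate(centroids):
        offset, best = None, math.inf
        for j, (u, v) in enumerate(centroids):
            if i == j:
                continue
            dist = math.hypot(u - x, v - y)
            if dist < best:
                offset, best = (u - x, v - y), dist
        if offset is None:
            labels.append("none")
            continue
        dx, dy = offset
        # A dominant horizontal offset suggests a horizontal line, and vice versa.
        if abs(dx) >= abs(dy):
            labels.append("right" if dx > 0 else "left")
        else:
            labels.append("below" if dy > 0 else "above")
    return labels
```

Components whose labels are mostly "left" or "right" would then be candidates for horizontal lines, and those labelled "above" or "below" for vertical lines, without fixing the text direction in advance.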


Computer Analysis of Images and Patterns | 2009

A Novel Approach for Word Spotting Using Merge-Split Edit Distance

Khurram Khurshid; Claudie Faure; Nicole Vincent

Edit distance matching has been used in the literature for word spotting with characters taken as primitives. The recognition rate, however, is limited by the segmentation inconsistencies of characters (broken or merged) caused by noisy images or distorted characters. In this paper, we propose a Merge-Split Edit distance which overcomes these segmentation problems by incorporating a multi-purpose merge cost function. The system is based on the extraction of words and characters in the text, followed by the attribution of a set of features to each character. Characters are matched by comparing their extracted feature sets using Dynamic Time Warping (DTW), while the words are matched by comparing the strings of characters using the proposed Merge-Split Edit distance algorithm. Evaluation of the method on 19th century historical document images exhibits extremely promising results.
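
The idea of adding merge and split operations to an edit distance can be shown on a toy example. The sketch below is not the cost function of the paper: it assumes that each component is described only by its pixel width, that merging two components adds their widths, and that all costs are absolute width differences. The name `merge_split_distance` is hypothetical.

```python
def merge_split_distance(a, b):
    """Toy edit distance over component widths with 2-to-1 and 1-to-2 matches.

    `a` and `b` are lists of component widths for two word images.
    Costs are illustrative: absolute width difference for a match,
    and the full width for an inserted or deleted component.
    """
    n, m = len(a), len(b)
    inf = float("inf")
    d = [[inf] * (m + 1) for _ in range(n + 1)]
    d[0][0] = 0.0
    for i in range(n + 1):
        for j in range(m + 1):
            if i == 0 and j == 0:
                continue
            best = inf
            if i >= 1:  # delete a component of a
                best = min(best, d[i - 1][j] + a[i - 1])
            if j >= 1:  # insert a component of b
                best = min(best, d[i][j - 1] + b[j - 1])
            if i >= 1 and j >= 1:  # one-to-one match
                best = min(best, d[i - 1][j - 1] + abs(a[i - 1] - b[j - 1]))
            if i >= 2 and j >= 1:  # merge two components of a against one of b
                merged = a[i - 2] + a[i - 1]
                best = min(best, d[i - 2][j - 1] + abs(merged - b[j - 1]))
            if i >= 1 and j >= 2:  # split: one component of a against two of b
                merged = b[j - 2] + b[j - 1]
                best = min(best, d[i - 1][j - 2] + abs(a[i - 1] - merged))
            d[i][j] = best
    return d[n][m]
```

Under these assumptions, a word whose second character is broken into components of widths 4 and 6 still matches an intact word with widths 10, 10, 12 at zero cost, because the merge operation absorbs the segmentation error.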


Document Analysis Systems | 2012

Web Document Analysis Based on Visual Segmentation and Page Rendering

Cong Kinh Nguyen; Laurence Likforman-Sulem; Jean-Claude Moissinac; Claudie Faure; Jeremy Lardon

This paper proposes an approach for segmenting a Web page into its semantic parts. Such analysis may be useful for adapting blogs or other pages to small devices. In this approach, we take advantage of both the dynamic layout after rendering and textual information. Our method segments the page into blocks and then classifies the blocks. The classification into semantic parts is performed by an SVM-based machine learning approach using a set of 30 textual and visual features. Evaluation is conducted on a Web blog database. Results are provided for both block classification and blog segmentation into articles.


Document Recognition and Retrieval | 2010

Detection of figure and caption pairs based on disorder measurements

Claudie Faure; Nicole Vincent

Figures inserted in documents mediate a kind of information for which the visual modality is more appropriate than the text. A complete understanding of a figure often requires reading its caption or relating it to the main text through a numbered figure identifier which is replicated in the caption and in the main text. A figure and its caption are closely related; they constitute a single multimodal component (FC-pair) that Document Image Analysis cannot extract with text and graphics segmentation. We propose a method to go further than the graphics and text segmentation in order to extract FC-pairs without performing a full labelling of the page components. Horizontal and vertical text lines are detected in the pages. The graphics are associated with selected text lines to initiate the detector of FC-pairs. Spatial and visual disorders are introduced to define a layout model in terms of properties. This model copes with most of the numerous spatial arrangements of graphics and text lines. The detector of FC-pairs performs operations in order to eliminate the layout disorder and assigns a quality value to each FC-pair. The processed documents were collected in Medic@, the digital historical collection of the BIUM (Bibliothèque InterUniversitaire Médicale). A first set of 98 pages constitutes the design set. Then 298 pages were collected to evaluate the system. The reported performance is the result of a full process, from the binarisation of the digital images to the detection of FC-pairs.


Document Recognition and Retrieval | 2009

Comparison of Niblack inspired binarization methods for ancient documents

Khurram Khurshid; Imran Siddiqi; Claudie Faure; Nicole Vincent

Collaboration


Top co-authors of Claudie Faure.

Nicole Vincent
Paris Descartes University

Khurram Khurshid
Institute of Space Technology

Luc Julia
Centre national de la recherche scientifique

Jean-Yves Ramel
François Rabelais University

Michel Crucianu
Conservatoire national des arts et métiers