Stéphane Nicolas
University of Rouen
Publication
Featured research published by Stéphane Nicolas.
international conference on document analysis and recognition | 2007
Stéphane Nicolas; Julien Dardenne; Thierry Paquet; Laurent Heutte
This work concerns the implementation of a 2D conditional random field model in the context of document image analysis. Our model makes it possible to take variability into account and to integrate contextual knowledge, while benefiting from machine learning techniques. Experiments on handwritten drafts of Flaubert show that these models provide interesting solutions.
international conference on document analysis and recognition | 2013
Vladislavs Dovgalecs; Alexandre Burnett; Pierrick Tranouez; Stéphane Nicolas; Laurent Heutte
We propose a system designed to spot either words or patterns, based on a user-made query. Using a two-stage approach, it takes advantage of the descriptive power of the Bag of Visual Words (BOVW) representation and the discriminative power of the proposed Longest Weighted Profile (LWP) algorithm. First, we try to identify the zones of the images that share common characteristics with the query, as summed up in a BOVW. Then, we filter these zones using the LWP, which introduces spatial constraints extracted from the query. We have validated our system on the George Washington handwritten document database for word spotting, and on medieval manuscripts from the DocExplore project for pattern spotting.
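The first stage described in this abstract, comparing zones to the query through their BOVW summaries, can be sketched as follows. This is a minimal illustration and not the authors' implementation: it assumes the visual-word ids have already been computed for each zone, and the cosine similarity, the `threshold` value and the function names are all assumptions made for the example. The LWP filtering stage is not reproduced.

```python
from collections import Counter
from math import sqrt


def bovw_histogram(visual_words):
    """Count how often each visual-word id occurs in a zone."""
    return Counter(visual_words)


def cosine_similarity(h1, h2):
    """Cosine similarity between two sparse histograms (assumed measure)."""
    dot = sum(h1[w] * h2[w] for w in h1.keys() & h2.keys())
    n1 = sqrt(sum(v * v for v in h1.values()))
    n2 = sqrt(sum(v * v for v in h2.values()))
    if n1 == 0 or n2 == 0:
        return 0.0
    return dot / (n1 * n2)


def candidate_zones(query_words, zones, threshold=0.5):
    """Keep the zones whose histogram is close enough to the query's.

    `zones` maps a hypothetical zone name to its list of visual-word ids.
    """
    query_hist = bovw_histogram(query_words)
    return [
        name
        for name, words in zones.items()
        if cosine_similarity(query_hist, bovw_histogram(words)) >= threshold
    ]
```

Because a histogram discards the positions of the visual words, two zones with the same words in a different arrangement score identically here, which is why a second stage with spatial constraints is needed.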
document analysis systems | 2004
Stéphane Nicolas; Thierry Paquet; Laurent Heutte
In this paper we describe the Bovary Project, a project to digitize the manuscripts of the first great work of the famous French writer Gustave Flaubert, which should end in 2006 by providing online access to a hypertext edition of the set of “Madame Bovary” drafts. We first present the global context of this project and its main objectives, and then focus on the document analysis problem. Finally, we propose a new approach for the segmentation of handwritten documents.
international conference on document analysis and recognition | 2009
Florent Montreuil; Emmanuèle Grosicki; Laurent Heutte; Stéphane Nicolas
The paper describes a new approach using Conditional Random Fields (CRFs) to extract physical and logical layouts in unconstrained handwritten letters such as those sent by individuals to companies. In this approach, layout extraction is treated as a labeling task that assigns a label to each pixel of the document image. This label is chosen from a set of labels depicting the layout elements. The CRF-based method models two stochastic processes: the first corresponds to the association between pixels and labels, the second to the relationship between a label and its neighboring labels. The CRF model gives access to the global conditional probability of a given labeling of the image according to image features and some prior knowledge about the structure of the document. This global probability is computed by means of local conditional probabilities at each pixel. To find the best label field, a key point of our model is the implementation of an optimal inference method, 2D Dynamic Programming. Experiments have been performed on 1250 handwritten letters of the RIMES database. Good results are reported, showing the capacity of our approach to extract the physical and logical layouts simultaneously.
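The interplay between the two processes named in this abstract (per-site label scores and label-to-neighbor scores) can be illustrated on a much simpler case. The sketch below is a standard Viterbi decoder on a one-dimensional chain, a deliberate simplification and not the 2D Dynamic Programming method of the paper. The score tables `unary` and `pairwise` are hypothetical inputs chosen for the example.

```python
def viterbi(unary, pairwise):
    """Return the label sequence with the highest total score on a chain.

    unary[t][k]    : score of label k at position t (site/label association)
    pairwise[j][k] : score of label j being followed by label k (neighbors)
    """
    n_labels = len(unary[0])
    score = list(unary[0])  # best score of any path ending in each label
    back = []               # back-pointers, one list per position after 0
    for t in range(1, len(unary)):
        new_score, pointers = [], []
        for k in range(n_labels):
            best_j = max(range(n_labels), key=lambda j: score[j] + pairwise[j][k])
            new_score.append(score[best_j] + pairwise[best_j][k] + unary[t][k])
            pointers.append(best_j)
        score = new_score
        back.append(pointers)
    # Trace the best path backwards from the best final label.
    path = [max(range(n_labels), key=lambda k: score[k])]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    path.reverse()
    return path


# The middle site weakly prefers label 1, its neighbors strongly prefer 0.
unary = [[2.0, 0.0], [0.4, 0.6], [2.0, 0.0]]
print(viterbi(unary, [[0.0, 0.0], [0.0, 0.0]]))  # no context: [0, 1, 0]
print(viterbi(unary, [[1.0, 0.0], [0.0, 1.0]]))  # context favors agreement: [0, 0, 0]
```

With the neighbor scores switched on, the weak local preference of the middle site is overridden by its context, which is the behavior a contextual labeling model is meant to provide.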
international conference on pattern recognition | 2006
Stéphane Nicolas; Thierry Paquet; Laurent Heutte
We address in this paper the problem of segmenting complex handwritten pages such as novelist drafts or authorial manuscripts. We propose to use stochastic and contextual models in order to cope with local spatial variability, and to take into account some prior knowledge about the global structure of the document image. The models we propose to use are Markov random field models
international conference on document analysis and recognition | 2013
Fattah Zirari; Abdellatif Ennaji; Stéphane Nicolas; Driss Mammass
Page segmentation into text and non-text elements is an essential preprocessing step before optical character recognition (OCR). When segmentation is poor, an OCR classification engine produces garbage characters due to the presence of non-text elements. This paper presents a method to separate the textual and non-textual components in document images using graph-based modeling and structural analysis. It is a fast and efficient method for adequately separating the graphical and textual parts of a document. We have evaluated our method on two well-known datasets: the UW-III dataset and the ICDAR 2009 page segmentation competition dataset. Comparisons with two state-of-the-art methods show that our method performs better on this task.
international conference on document analysis and recognition | 2011
David Hebert; Thierry Paquet; Stéphane Nicolas
We introduce quantization feature functions to represent continuous or large-range discrete data in the symbolic CRF data representation. We show that doing this conversion in a simple way allows the CRF to automatically select discriminative features to achieve the best performance. This system is evaluated on a segmentation task on degraded newspaper archives. The results obtained show the ability of the CRF model to deal with numerical features as it does with symbolic representations, thanks to the use of quantization feature functions. The segmentation task is achieved by defining a horizontal CRF model dedicated to pixel labelling.
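One simple way to turn a numeric measurement into symbolic features of the kind a CRF expects is sketched below. The abstract does not give the exact definition of the quantization feature functions, so this is only one plausible reading: fixed bin edges and one binary indicator per bin. The feature name `ink_density` and the bin edges are hypothetical.

```python
from bisect import bisect_right


def quantize(value, edges):
    """Map a numeric value to a symbolic bin label such as 'bin_2'.

    `edges` must be sorted; a value equal to an edge falls in the upper bin.
    """
    return f"bin_{bisect_right(edges, value)}"


def quantization_features(name, value, edges):
    """Return one binary indicator feature per bin for a single observation."""
    active = bisect_right(edges, value)
    return {f"{name}=bin_{i}": int(i == active) for i in range(len(edges) + 1)}


edges = [0.25, 0.5, 0.75]
print(quantize(0.6, edges))  # bin_2
print(quantization_features("ink_density", 0.6, edges))
```

Since each bin becomes its own indicator, a learner can assign an independent weight to each range of values, which is one way a model could end up selecting the discriminative ranges on its own.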
international conference on document analysis and recognition | 2005
Stéphane Nicolas; Yousri Kessentini; Thierry Paquet; Laurent Heutte
In this paper we present a method based on hidden Markov random fields and 2D dynamic programming image decoding, for segmenting pages of complex handwritten manuscripts such as novelist drafts. After a formal description of the theoretical framework and the principles of the decoding method, we describe the implementation of the model and the decoding method. Then we discuss the results obtained with this approach on the drafts of the French novelist Gustave Flaubert.
Second International Conference on Document Image Analysis for Libraries (DIAL'06) | 2006
Stéphane Nicolas; Thierry Paquet; Laurent Heutte
In this paper we address the problem of segmenting complex handwritten pages such as novelists' drafts or authorial manuscripts. We propose to use stochastic and contextual models in order to cope with local spatial variability, and to take into account some prior knowledge about the global structure of the document image. The models we propose to use are Markov Random Field models. After a formal description of the theoretical framework of Markov Random Fields and the principles of image segmentation using such models, we describe the implementation of our model and the proposed segmentation method. Then we discuss the results obtained with this approach on the drafts of the French novelist Gustave Flaubert, for different segmentation tasks. In conclusion, an extension of this work towards the use of discriminative models is discussed.
international conference on frontiers in handwriting recognition | 2010
Florent Montreuil; Stéphane Nicolas; Emmanuèle Grosicki; Laurent Heutte
In this study we describe a new approach to extract the layout of unconstrained handwritten letters such as those sent by individuals to companies. The proposed model uses a hierarchical combination of Conditional Random Fields (CRFs), which gives access to various levels of layout interpretation. The analysis proceeds by decreasing the resolution and increasing the abstraction of the document, starting from a high-resolution analysis (pixel level) and moving to a low-resolution view of the layout structure. High-resolution information is used to provide specific prior knowledge about the layout, such as the presence of textual information. Experiments have been performed on the RIMES database, composed of more than 5000 handwritten letters. Good results are reported, showing the capacity of our approach to extract the physical and logical layouts simultaneously.