Christopher Kermorvant
Intelligence and National Security Alliance
Publication
Featured research published by Christopher Kermorvant.
IEEE Transactions on Pattern Analysis and Machine Intelligence | 2011
Anne-Laure Bianne-Bernard; Farès Menasri; R. Al-Hajj Mohamad; Chafic Mokbel; Christopher Kermorvant; Laurence Likforman-Sulem
This study aims to build an efficient word recognition system from the combination of three handwriting recognizers. The main component of this combined system is an HMM-based recognizer which considers dynamic and contextual information for a better modeling of writing units. For modeling the contextual units, a state-tying process based on decision tree clustering is introduced. Decision trees are built according to a set of expert-based questions on how characters are written. Questions are divided into global questions, yielding larger clusters, and precise questions, yielding smaller ones. Such clustering enables us to reduce the total number of models and Gaussian densities by 10. We then apply this modeling to the recognition of handwritten words. Experiments are conducted on three publicly available databases based on Latin or Arabic languages: Rimes, IAM, and OpenHart. The results obtained show that contextual information embedded with dynamic modeling significantly improves recognition.
International Conference on Document Analysis and Recognition | 2013
Théodore Bluche; Hermann Ney; Christopher Kermorvant
In this paper, we show that learning features with convolutional neural networks is better than using hand-crafted features for handwritten word recognition. We consider two kinds of systems: a grapheme-based segmentation and a sliding-window segmentation. In both cases, the combination of a convolutional neural network with an HMM outperforms a state-of-the-art HMM system based on explicit feature extraction. The experiments are conducted on the Rimes database. The systems obtained with the two kinds of segmentation are complementary: when they are combined, they outperform the systems in isolation. The system based on grapheme segmentation yields a lower recognition rate but is very fast, which makes it suitable for specific applications such as document classification.
International Conference on Acoustics, Speech, and Signal Processing | 2013
Théodore Bluche; Hermann Ney; Christopher Kermorvant
In this paper, we investigate the combination of hidden Markov models and convolutional neural networks for handwritten word recognition. Convolutional neural networks have been successfully applied to various computer vision tasks, including handwritten character recognition. In this work, we show that they can replace Gaussian mixtures to compute emission probabilities in hidden Markov models (hybrid combination), or serve as a feature extractor for a standard Gaussian HMM system (tandem combination). The proposed systems outperform a basic HMM based on either decorrelated pixels or handcrafted features. We validated the approach on two publicly available databases, and we report up to 60% (Rimes) and 35% (IAM) relative improvement compared to a Gaussian HMM based on pixel values. The final systems give results comparable to those of recurrent neural networks, which have been the best systems since 2009.
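The hybrid combination mentioned in this abstract can be illustrated with a minimal sketch. It assumes the common hybrid recipe in which network posteriors are divided by state priors to obtain scaled likelihoods, which then act as HMM emission scores; the abstract does not spell out this step, so it is an assumption here. The function names, the strict left-to-right topology, and the toy numbers are all hypothetical.

```python
import math


def scaled_log_likelihoods(log_posteriors, log_priors):
    """Turn per-frame state log-posteriors log p(s | x_t) into scaled
    log-likelihoods (log p(x_t | s) up to a constant) by subtracting
    the state log-priors log p(s)."""
    return [
        [lp - prior for lp, prior in zip(frame, log_priors)]
        for frame in log_posteriors
    ]


def viterbi_left_to_right(emissions, log_self, log_next):
    """Best-path log score through a strict left-to-right HMM in which
    each state either loops on itself or moves to the next state."""
    n_states = len(emissions[0])
    neg_inf = float("-inf")
    # The path must start in the first state.
    score = [emissions[0][0]] + [neg_inf] * (n_states - 1)
    for frame in emissions[1:]:
        new = []
        for s in range(n_states):
            stay = score[s] + log_self
            move = score[s - 1] + log_next if s > 0 else neg_inf
            new.append(max(stay, move) + frame[s])
        score = new
    # The path must end in the last state.
    return score[-1]


# Toy example (made-up numbers): 3 frames, 2 HMM states.
posteriors = [[0.9, 0.1], [0.5, 0.5], [0.2, 0.8]]
priors = [0.5, 0.5]
log_post = [[math.log(p) for p in frame] for frame in posteriors]
log_prior = [math.log(p) for p in priors]

emissions = scaled_log_likelihoods(log_post, log_prior)
best = viterbi_left_to_right(emissions, math.log(0.5), math.log(0.5))
```

In a tandem combination, by contrast, the network outputs would be fed as features to a Gaussian HMM rather than replacing its emission model.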
International Conference on Frontiers in Handwriting Recognition | 2014
Théodore Bluche; Bastien Moysset; Christopher Kermorvant
In this paper, we present a method for the automatic segmentation and transcript alignment of documents for which we only have the transcript at the document level. We consider several line segmentation hypotheses, and recognition hypotheses for each segmented line. The recognition is highly constrained by the document transcript. We formalize the problem in a weighted finite-state transducer framework. We evaluate how the constraints help achieve a reasonable result. In particular, we assess the performance of the system both in terms of segmentation quality and transcript mapping. The main contribution of this paper is that we jointly find the best segmentation and transcript mapping that align the image with the whole ground-truth text. The evaluation is carried out on fully annotated public databases. Furthermore, we retrieved training material with this system for the Maurdor evaluation, where the data was only annotated at the paragraph level. With the automatically segmented and annotated lines, we record a relative improvement in Word Error Rate of 35.6%.
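For readers unfamiliar with the term, a relative improvement in Word Error Rate is the error reduction expressed as a fraction of the baseline error. The sketch below uses a hypothetical helper name and purely illustrative numbers; they are not taken from the paper.

```python
def relative_improvement(baseline_wer, new_wer):
    """Relative error reduction: (baseline - new) / baseline."""
    if baseline_wer <= 0:
        raise ValueError("baseline WER must be positive")
    return (baseline_wer - new_wer) / baseline_wer


# Illustrative values only: a drop from 20.0% to 15.0% WER is a
# 5-point absolute gain but a 25% relative improvement.
gain = relative_improvement(20.0, 15.0)
```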
International Conference on Document Analysis and Recognition | 2015
Bastien Moysset; Christopher Kermorvant; Christian Wolf; Jérôme Louradour
The detection of text lines, as a first processing step, is critical in all text recognition systems. State-of-the-art methods to locate lines of text are based on handcrafted heuristics fine-tuned by the image processing community's experience. They succeed under certain constraints; for instance, the background has to be roughly uniform. We propose to use more “agnostic” Machine Learning-based approaches to address text line location. The main motivation is to be able to process either damaged documents, or flows of documents with a high variety of layouts and other characteristics. A new method is presented in this work, inspired by the latest generation of optical models used for text recognition, namely Recurrent Neural Networks. As these models are sequential, a column of text lines in our application plays the same role as a line of characters in more traditional text recognition settings. A key advantage of the proposed method over other data-driven approaches is that compiling a training dataset does not require labeling line boundaries: only the number of lines is required for each paragraph. Experimental results show that our approach gives similar or better results than traditional handcrafted approaches, with little engineering effort and less hyper-parameter tuning.
International Conference on Document Analysis and Recognition | 2015
Théodore Bluche; Hermann Ney; Jérôme Louradour; Christopher Kermorvant
In recent years, Long Short-Term Memory Recurrent Neural Networks (LSTM-RNNs) trained with the Connectionist Temporal Classification (CTC) objective have won many international handwriting recognition evaluations. The CTC algorithm is based on a forward-backward procedure, avoiding the need to segment the input before training. The network outputs are character labels and a special non-character label. On the other hand, in the hybrid Neural Network / Hidden Markov Model (NN/HMM) framework, networks are trained with framewise criteria to predict state labels. In this paper, we show that CTC training is close to forward-backward training of NN/HMMs, and can be extended to more standard HMM topologies. We apply this method to Multi-Layer Perceptrons (MLPs), and investigate the properties of CTC, namely the modeling of characters by single labels and the role of the special label.
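The forward half of the forward-backward procedure mentioned in this abstract can be sketched as follows. This is the textbook CTC forward recursion with one label per character plus a special blank label, not the paper's extension to other HMM topologies; the function name and the toy inputs are hypothetical.

```python
def ctc_probability(probs, labels, blank=0):
    """Total probability of `labels` under CTC, summed over all alignments.

    probs[t][k] is the network output for label k at frame t; each frame
    is assumed to be a normalized distribution over labels.
    """
    # Interleave blanks: [a, b] -> [blank, a, blank, b, blank].
    ext = [blank]
    for lab in labels:
        ext += [lab, blank]
    n = len(ext)

    # alpha[s]: probability of all partial alignments ending in ext[s].
    alpha = [0.0] * n
    alpha[0] = probs[0][ext[0]]
    if n > 1:
        alpha[1] = probs[0][ext[1]]

    for frame in probs[1:]:
        new = [0.0] * n
        for s in range(n):
            total = alpha[s]
            if s >= 1:
                total += alpha[s - 1]
            # Skipping a blank is allowed only between different labels.
            if s >= 2 and ext[s] != blank and ext[s] != ext[s - 2]:
                total += alpha[s - 2]
            new[s] = total * frame[ext[s]]
        alpha = new

    # Valid alignments end in the last label or in the trailing blank.
    return alpha[-1] + (alpha[-2] if n > 1 else 0.0)


# Toy example: 2 frames, labels {0: blank, 1: "a"}, uniform outputs.
uniform = [[0.5, 0.5], [0.5, 0.5]]
p_a = ctc_probability(uniform, [1])      # alignments "a-", "-a", "aa"
p_empty = ctc_probability(uniform, [])   # alignment "--"
```

The blank label is what lets the sum run over unsegmented alignments: repeated characters need a blank between them, which is why a two-frame input cannot emit the same character twice.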
Document Recognition and Retrieval | 2010
Anne-Laure Bianne; Christopher Kermorvant; Laurence Likforman-Sulem
This paper presents an HMM-based recognizer for the off-line recognition of handwritten words. Word models are the concatenation of context-dependent character models (trigraphs). The trigraph models we consider are similar to triphone models in speech recognition, where a character adapts its shape according to its adjacent characters. Due to the large number of possible context-dependent models to compute, a top-down clustering is applied on each state position of all models associated with a particular character. This clustering uses decision trees, based on rhetorical questions we designed. Decision trees have the advantage of being able to model untrained trigraphs. Our system is shown to perform better than a baseline context-independent system, and reaches an accuracy higher than 74% on the publicly available Rimes database.
International Conference on Document Analysis and Recognition | 2015
Théodore Bluche; Hermann Ney; Christopher Kermorvant
In this paper, we present the handwriting recognition systems submitted by LIMSI to the HTRtS 2014 contest. The systems for both the restricted and unrestricted tracks consisted of a combination of several optical models. We extracted handcrafted features as well as pixel values with a sliding window. We trained Deep Neural Networks (DNNs) and Bidirectional Long Short-Term Memory Recurrent Neural Networks (BLSTM-RNNs), which were plugged in as the optical model in Hidden Markov Models (HMMs). We propose a novel method to build language models that can cope with hyphenation in the text. The combination was performed from lattices generated by the different systems. We were the only team participating in both tracks and ranked second in each. The final Word Error Rates were 15.0% and 11.0% for the restricted and unrestricted tracks, respectively. We studied the impact of adding data for optical and language modeling. After the evaluation, we also used the same corpus for the language model as the winning team and obtained comparable results.
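The Word Error Rate reported in this abstract is conventionally the word-level edit distance between the recognizer output and the reference, divided by the number of reference words. The sketch below assumes that usual definition with whitespace tokenization; the contest's exact scoring rules may differ, and the function name and sentences are hypothetical.

```python
def word_error_rate(reference, hypothesis):
    """Word-level Levenshtein distance divided by the reference length."""
    ref, hyp = reference.split(), hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j]: edit distance between the reference words seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        cur = [i]
        for j, h in enumerate(hyp, start=1):
            cur.append(min(
                prev[j] + 1,              # deletion of a reference word
                cur[j - 1] + 1,           # insertion of a hypothesis word
                prev[j - 1] + (r != h),   # substitution, or match at cost 0
            ))
        prev = cur
    return prev[-1] / len(ref)


# One substitution ("sat" -> "sit") and one deletion ("the"): 2 errors / 6 words.
wer = word_error_rate("the cat sat on the mat", "the cat sit on mat")
```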
International Conference on Frontiers in Handwriting Recognition | 2010
Christopher Kermorvant; F. Menasri; A.-L. Bianne; R. Al-Hajj; Chafic Mokbel; Laurence Likforman-Sulem
This article describes the isolated word recognizer presented by the authors to the ICDAR 2009 French handwriting recognition competition. The system is a combination of three isolated word recognizers based on different features and models. A novel n-best combination method is proposed and compared to standard combination methods. New results on the ICDAR 2009 test database are reported.
International Conference on Document Analysis and Recognition | 2009
Christopher Kermorvant; Anne-Laure Bianne; Patrick Marty; Farès Menasri
Recognition of handwritten characters has been a popular task for the evaluation of classification algorithms for many years. Looking at the latest results on databases such as USPS or MNIST, one could think that character recognition is a solved problem. In this paper, we claim that this is not the case for two reasons: first, because the classical databases for digit recognition are realistic but too simple, and second, because digit recognition is not a real-world task but only a part of one. We contribute to a better understanding of these two aspects with new results. In the first part, we compare three state-of-the-art recognizers on a digit recognition task extracted from a real-world application and show that the error rates on this database cannot be extrapolated from MNIST. In the second part, we present and evaluate a system designed for an industrial application based on character recognition: document identification with floating field recognition.