Yann Leydier | Researchain

Publication

Featured research published by Yann Leydier.

Pattern Recognition | 2007

Text search for medieval manuscript images

Yann Leydier; Frank Lebourgeois; Hubert Emptoz

In this article we introduce a text search algorithm designed for ancient manuscripts. Word-spotting is the best alternative to word recognition on this type of document. Our method is based on differential features that are compared using a cohesive elastic matching method, based on zones of interest in order to match only the informative parts of the words. Thus we improved both the accuracy and the runtime of the word-spotting process. The proposed method is tested on medieval manuscripts of Latin and Semitic alphabets as well as on more recent manuscripts.
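The abstract does not detail the cohesive elastic matching method or its zones of interest, so the sketch below does not reproduce it. As a rough illustration of what "elastic" comparison means, it uses classic dynamic time warping (DTW), a different and well-known technique, on two 1-D feature sequences. The function name and the toy sequences are assumptions made for illustration only.

```python
def dtw_distance(a, b):
    """Dynamic time warping cost between two 1-D feature sequences.

    Illustrative stand-in only: this is plain DTW, not the paper's
    cohesive elastic matching.
    """
    inf = float("inf")
    n, m = len(a), len(b)
    # cost[i][j] = best cumulative cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of: stretch a, stretch b, or step both.
            cost[i][j] = local + min(
                cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1]
            )
    return cost[n][m]


# A stretched copy of the same profile still matches at zero cost.
query = [1, 2, 3]
candidate = [1, 2, 2, 3]
stretched_cost = dtw_distance(query, candidate)
```

Because the alignment may repeat elements, a word written slightly wider or narrower than the query can still yield a low cost, which is the general property that makes elastic matching attractive for handwriting.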

Pattern Recognition | 2009

Towards an omnilingual word retrieval system for ancient manuscripts

Yann Leydier; Asma Ouji; Frank Lebourgeois; Hubert Emptoz

In this article, we introduce the first method that allows the indexation of ancient manuscripts of any language and alphabet. We describe a word retrieval engine inspired by recent word-spotting advances on ancient manuscripts. Our approach does not need any layout segmentation and makes use of features fitted to any type of alphabet (Latin, Arabic, Chinese, etc.) and writing. The engine is tested on numerous documents and in several use-cases.

International Conference on Frontiers in Handwriting Recognition | 2002

A hybrid large vocabulary handwritten word recognition system using neural networks with hidden Markov models

Alessandro L. Koerich; Yann Leydier; Robert Sabourin; Ching Y. Suen

We present a hybrid recognition system that integrates hidden Markov models (HMM) with neural networks (NN) in a probabilistic framework. The input data is first processed by a lexicon-driven word recognizer based on HMMs, which generates a list of the N best-scoring candidate word hypotheses together with the segmentation of each hypothesis into characters. An NN classifier then generates a score for each segmented character, and the scores from the HMM and NN classifiers are finally combined to optimize performance. Experimental results show that, for an 80,000-word vocabulary, the hybrid HMM/NN system improves the word recognition rate by about 10% over the HMM system alone.
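The abstract does not state how the HMM and NN scores are combined. One common way to rescore an N-best list is a weighted sum of log-scores, sketched below under that assumption. The function name `rescore`, the `weight` parameter, and the toy scores are hypothetical and are not taken from the paper.

```python
import math


def rescore(hypotheses, weight=0.5):
    """Re-rank N-best word hypotheses by mixing two log-scores.

    hypotheses: list of (word, hmm_log_score, char_probs), where
    char_probs holds one NN probability in (0, 1] per segmented character.
    The weighted log-linear mix is an assumed combination rule.
    """
    rescored = []
    for word, hmm_log_score, char_probs in hypotheses:
        # Sum of per-character log-probabilities from the NN classifier.
        nn_log_score = sum(math.log(p) for p in char_probs)
        combined = weight * hmm_log_score + (1.0 - weight) * nn_log_score
        rescored.append((combined, word))
    rescored.sort(reverse=True)  # highest combined score first
    return [word for _, word in rescored]


# Toy N-best list: the HMM slightly prefers "cat", the NN prefers "cot".
n_best = [
    ("cat", -2.0, [0.5, 0.5, 0.5]),
    ("cot", -2.1, [0.9, 0.9, 0.9]),
]
ranking = rescore(n_best)
```

With `weight=1.0` the ranking falls back to the HMM order, so the weight controls how much the character-level classifier is allowed to overturn the first-pass result.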

International Conference on Pattern Recognition | 2004

Serialized unsupervised classifier for adaptative color image segmentation: application to digitized ancient manuscripts

Yann Leydier; F. Le Bourgeois; Hubert Emptoz

This paper presents an adaptive algorithm for the segmentation of color images suited for document image analysis. The algorithm is based on a serialization of the k-means algorithm that is applied sequentially by using a sliding window over the image. The algorithm reuses information about the clusters computed by the previous classification and automatically adjusts the clusters during the window's displacement in order to better adapt the classifier to any new local modification of the colors. For digitized documents, we propose to define several different clusters in the color feature space for the same logical class. We also reintroduce the user into the initialization step: the user must define the number of classes and the color samples for each class. This algorithm has been tested successfully on ancient color manuscripts having heavy defects, showing lighting variation and transparency. Nevertheless, the proposed algorithm is generic enough to be applied to a large variety of images using other features for different purposes, such as color image segmentation and image binarization.
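The idea of reusing the previous window's clusters can be sketched as a warm-started k-means. The version below is a minimal illustration under strong simplifying assumptions: it works on a 1-D list of gray values rather than color pixels, uses one cluster per class, and the function names, window size, and seed values are all hypothetical rather than taken from the paper.

```python
def nearest(value, centroids):
    """Index of the centroid closest to value."""
    return min(range(len(centroids)), key=lambda k: abs(value - centroids[k]))


def kmeans_1d(values, centroids, iterations=10):
    """Plain k-means on scalars, started from the given centroids."""
    centroids = list(centroids)
    for _ in range(iterations):
        groups = [[] for _ in centroids]
        for v in values:
            groups[nearest(v, centroids)].append(v)
        # An empty cluster keeps its previous centroid.
        centroids = [
            sum(g) / len(g) if g else c for g, c in zip(groups, centroids)
        ]
    return centroids


def serialized_kmeans(values, seeds, window=4):
    """Slide a window over values, warm-starting each k-means run
    from the centroids found in the previous window."""
    centroids = list(seeds)  # user-supplied samples, one per class
    labels = []
    for start in range(0, len(values), window):
        chunk = values[start:start + window]
        centroids = kmeans_1d(chunk, centroids)
        labels.extend(nearest(v, centroids) for v in chunk)
    return labels, centroids


# Ink and paper both get lighter in the second half of the strip.
strip = [10, 12, 200, 210, 40, 42, 230, 240]
labels, final_centroids = serialized_kmeans(strip, seeds=[0, 255])
```

In this toy run the "ink" centroid drifts from about 11 to 41 as the window moves, so the lighter ink in the second half is still assigned to the ink class instead of being judged against a single global threshold.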

International Conference on Document Analysis and Recognition | 2005

Omnilingual segmentation-free word spotting for ancient manuscripts indexation

Yann Leydier; F. Le Bourgeois; Hubert Emptoz

This article introduces a new word spotting method designed for ancient manuscripts. We take advantage of the robustness of the gradient feature and propose a new segmentation-free matching algorithm that tolerates spatial variations. We test our algorithm on ancient Latin manuscripts and on George Washington's manuscripts.

International Conference on Document Analysis and Recognition | 2011

Chromatic / Achromatic Separation in Noisy Document Images

Asma Ouji; Yann Leydier; Frank Lebourgeois

This paper presents a new method to split an image into chromatic and achromatic zones. The proposed algorithm is dedicated to document images. It is robust to the color noise introduced by scanners and image compression. It is also parameter-free since it automatically adapts to the image content.

Document Analysis Systems | 2004

Serialized k-Means for Adaptative Color Image Segmentation

Yann Leydier; Frank Le Bourgeois; Hubert Emptoz

This paper introduces an adaptive segmentation system that was designed for color document image analysis. The method is based on the serialization of a k-means algorithm that is applied sequentially by using a sliding window over the image. During the window’s displacement, the algorithm reuses information from the clusters computed in the previous window and automatically adjusts them in order to adapt the classifier to any new local variation of the colors. To improve the results, we propose to define several different clusters in the color feature space for each logical class. We also reintroduce the user into the initialization step to define the number of classes and the different samples for each class. This method has been tested successfully on ancient color manuscripts, video images and multiple natural and non-natural images having heavy defects and showing illumination variation and transparency. The proposed algorithm is generic enough to be applied to a large variety of images for different purposes, such as color image segmentation and binarization.

Pattern Analysis and Applications | 2013

A hierarchical and scalable model for contemporary document image segmentation

Asma Ouji; Yann Leydier; Frank Lebourgeois

In this paper, we introduce a novel color segmentation approach that is robust against digitization noise and adapted to contemporary document images. This system is scalable, hierarchical, versatile and completely automated, i.e. user independent. It proposes an adaptive binarization/quantization without any penalizing information loss. This model may be used for many purposes. For instance, we rely on it to carry out the first steps leading to advertisement recognition in document images. Furthermore, the color segmentation output is used to localize text areas and enhance optical character recognition (OCR) performance. We ran tests on a variety of magazine images to highlight our contribution to the well-known OCR product ABBYY FineReader. We also get promising results with our ad detection system on a large set of test images with complex layouts.

International Conference on Multimedia and Expo | 2011

Advertisement detection in digitized press images

Asma Ouji; Yann Leydier; Frank Lebourgeois

This paper presents the first method for detecting advertisements in digitized press. The system aims at locating and recognizing ads. A color segmentation approach which is robust against digitization noise is introduced. The color separation output is used to carry out layout segmentation in document pages and to compute visual features. Block classification results, given with a variety of magazine and newspaper pages, are presented and discussed.

International Conference on Frontiers in Handwriting Recognition | 2016

libcrn, an Open-Source Document Image Processing Library

Yann Leydier; Jean Duong; Stéphane Bres; Véronique Eglin; Frank Lebourgeois; Martial Tola

In this paper we introduce libcrn, a multiplatform open-source document image processing library aimed at researchers and companies. It is written in C++11 and has a non-contaminating license that makes it available for use in any project without legal constraints. The features include low-level image processing (color format conversion, binarization, convolution, PDE…), tools specific to document images (connected components extraction, recursive block description, PDF export…), maths (matrix arithmetic, linear algebra, GMMs, equation solvers…), classification and clustering (kNN, k-means, HMMs…). The API is comprehensively documented and libcrn's architecture follows modern C++ guidelines to facilitate the handling of the library and enforce its safe usage. A sample OCR, which is only 30 lines long, is described to illustrate libcrn's scope of possibilities.
