Publication


Featured research published by Hubert Emptoz.


Pattern Recognition | 2007

Text search for medieval manuscript images

Yann Leydier; Frank Lebourgeois; Hubert Emptoz

In this article we introduce a text search algorithm designed for ancient manuscripts. Word-spotting is the best alternative to word recognition on this type of document. Our method is based on differential features that are compared using a cohesive elastic matching method, based on zones of interest in order to match only the informative parts of the words. Thus we improved both the accuracy and the runtime of the word-spotting process. The proposed method is tested on medieval manuscripts of Latin and Semitic alphabets as well as on more recent manuscripts.


Pattern Recognition | 2009

Towards an omnilingual word retrieval system for ancient manuscripts

Yann Leydier; Asma Ouji; Frank Lebourgeois; Hubert Emptoz

In this article, we introduce the first method that allows the indexation of ancient manuscripts of any language and alphabet. We describe a word retrieval engine inspired by recent word-spotting advances on ancient manuscripts. Our approach does not need any layout segmentation and makes use of features fitted to any type of alphabet (Latin, Arabic, Chinese, etc.) and writing. The engine is tested on numerous documents and in several use-cases.


Document Analysis Systems | 2006

Restoring ink bleed-through degraded document images using a recursive unsupervised classification technique

Fadoua Drira; Frank Le Bourgeois; Hubert Emptoz

This paper presents a new method to restore a particular type of degradation found in ancient document images. This degradation, referred to as “bleed-through”, is due to the porosity of the paper, the chemical quality of the ink, or the conditions of digitization. It appears as marks that degrade the readability of the document image. Our aim is to remove these marks to improve readability. The proposed method is based on a recursive unsupervised segmentation approach applied to the data space decorrelated by principal component analysis. It generates a binary tree in which only the leaf images satisfying a certain condition on their logarithmic histogram are processed. Experiments on real ancient document images provided by the archives of “Chatillon-Chalaronne” illustrate the effectiveness of the suggested method.


International Journal on Document Analysis and Recognition | 2007

DEBORA: Digital AccEss to BOoks of the RenAissance

Frank Le Bourgeois; Hubert Emptoz

DEBORA (Digital AccEss to BOoks of the RenAissance) is a multidisciplinary European project aiming at digitizing and thus making rare sixteenth-century books more accessible. End-users, librarians, historians, researchers in book history and computer scientists participated in the development of remote and collaborative access to digitized Renaissance books, necessary because of the reduced accessibility to digital libraries in image mode through the Internet. The size of files for the storage of images, the lack of a standard file format exchange suitable for progressive transmission, and limited querying possibilities currently limit remote access to digital libraries. To improve accessibility, historical documents must be digitized and retro-converted to extract a detailed description of the image contents suited to users’ needs. Specialists of the Renaissance have described the metadata generally required by end-users and the ideal functionalities of the digital library. The retro-conversion of historical documents is a complex process that includes image capture, metadata extraction, image storage and indexing, automatic conversion in a reusable electronic form, publication on the Internet, and data compression for faster remote access. The steps of this process cannot be developed independently. DEBORA proposes a global approach to retro-conversion from the digitization to the final functionalities of the digital library centered on users’ needs. The retro-conversion process is mainly based on a document image analysis system that simultaneously extracts the metadata and compresses the images. We also propose a file format to describe compressed books as heterogeneous data (images/text/links/annotation/physical layout and logical structure) suitable for progressive transmission, editing, and annotation. DEBORA is an exploratory project that aims at demonstrating the feasibility of the concepts by developing prototypes tested by end-users.


International Conference on Pattern Recognition | 1998

Handwriting and signature: one or two personality identifiers?

Viviane Bouletreau; Nicole Vincent; Robert Sabourin; Hubert Emptoz

Handwriting and signature are often studied without any connection. In this paper, we present a method, applied to both handwriting and signature classification, that is based on their fractal behavior. We first present the method we have developed for computing the fractal dimension and the secondary dimension of writing. We describe how these parameters allow us to define a pertinent representation space. We also show how this approach has allowed us to extract classes related to writing and signature styles. Lastly, this method has allowed us to give evidence of the independence between the behaviors of the writer when he signs and when he writes. Such independence will be a source of valuable information in the context of signature authentication.


International Conference on Pattern Recognition | 2004

Serialized unsupervised classifier for adaptative color image segmentation: application to digitized ancient manuscripts

Yann Leydier; F. Le Bourgeois; Hubert Emptoz

This paper presents an adaptive algorithm for the segmentation of color images suited to document image analysis. The algorithm is based on a serialization of the k-means algorithm, which is applied sequentially using a sliding window over the image. The algorithm reuses information about the clusters computed by the previous classification and automatically adjusts the clusters as the window moves, in order to better adapt the classifier to any new local change in the colors. For digitized documents, we propose to define several different clusters in the color feature space for the same logical class. We also reintroduce the user into the initialization step: the user must define the color samples for each class and the number of classes. This algorithm has been tested successfully on ancient color manuscripts with heavy defects, lighting variation, and transparency. Nevertheless, the proposed algorithm is generic enough to be applied to a large variety of images, using other features, for purposes such as color image segmentation and image binarization.


International Journal on Document Analysis and Recognition | 2006

Automatic accurate broken character restoration for patrimonial documents

Bénédicte Allier; Nadia Bali; Hubert Emptoz

In this article, we are interested in the restoration of character shapes in antique document images. This particular class of documents generally carries a great deal of involuntary historical information that has to be taken into account to build quality digital libraries. Many document processing methods have already been proposed to cope with degraded character images, but those techniques often consist in replacing the degraded shapes with a corresponding prototype, which many specialists find unsatisfactory. We therefore developed our own method for accurate character restoration, based on generic image processing tools (namely Gabor filtering and the active contour model) complemented with specific, automatically extracted structural information. The principle of our method is to make an active contour recover the lost information using an external energy term based on an automatically built and selected reference character image. Results are presented for real-case examples taken from printed and handwritten documents.


International Journal on Document Analysis and Recognition | 2000

A structural representation for understanding line-drawing images

Jean-Yves Ramel; Nicole Vincent; Hubert Emptoz

In this paper, we are concerned with the problem of finding a good and homogeneous representation to encode line-drawing documents (which may be handwritten). We propose a method in which the problems induced by a first-step skeletonization have been avoided. First, we vectorize the image, to get a fine description of the drawing, using only vectors and quadrilateral primitives. A structural graph is built with the primitives extracted from the initial line-drawing image. The objective is to manage attributes relative to elementary objects so as to provide a description of the spatial relationships (inclusion, junction, intersection, etc.) that exist between the graphics in the images. This is done with a representation that provides a global vision of the drawings. The capacity of the representation to evolve and to carry highly semantic information is also highlighted. Finally, we show how an architecture using this structural representation and a mechanism of perceptive cycles can lead to a high-quality interpretation of line drawings.


International Conference on Document Analysis and Recognition | 2009

Document Images Restoration by a New Tensor Based Diffusion Process: Application to the Recognition of Old Printed Documents

Fadoua Drira; Frank Lebourgeois; Hubert Emptoz

A modification of the Weickert coherence-enhancing diffusion filter is proposed, to which new constraints formulated from the Perona-Malik equation are added. The new diffusion filter, driven by local tensor fields, benefits from both of these approaches and avoids problems known to affect them. This filter reinforces character discontinuity and eliminates the inherent problem of corner rounding while smoothing. Experiments conducted on degraded document images illustrate the effectiveness of the proposed method compared to other anisotropic diffusion approaches. A visual quality improvement is thus achieved on these images. This improvement leads to a noticeable gain in the accuracy of OCR systems, shown by comparing OCR recognition rates before and after the diffusion process.


International Conference on Document Analysis and Recognition | 2005

Omnilingual segmentation-free word spotting for ancient manuscripts indexation

Yann Leydier; F. Le Bourgeois; Hubert Emptoz

This article introduces a new word spotting method designed for ancient manuscripts. We take advantage of the robustness of the gradient feature and propose a new segmentation-free matching algorithm that tolerates spatial variations. We test our algorithm on ancient Latin manuscripts and on George Washington's manuscripts.

Collaboration


Dive into Hubert Emptoz's collaboration.

Top Co-Authors

Véronique Eglin

Institut national des sciences Appliquées de Lyon

Stéphane Bres

Institut national des sciences Appliquées de Lyon

Djamel Gaceb

Institut national des sciences Appliquées de Lyon

Frank Le Bourgeois

Institut national des sciences Appliquées de Lyon

Nicole Vincent

Paris Descartes University

F. Le Bourgeois

Institut national des sciences Appliquées de Lyon

Guillaume Joutel

Institut national des sciences Appliquées de Lyon

Vincent Malleron

Institut national des sciences Appliquées de Lyon
