Djamel Gaceb
Institut national des sciences Appliquées de Lyon
Publication
Featured research published by Djamel Gaceb.
International Journal on Document Analysis and Recognition | 2008
Djamel Gaceb; Véronique Eglin; Frank Lebourgeois; Hubert Emptoz
An efficient mail sorting system relies mainly on accurate optical recognition of the addresses on envelopes. However, address block localization (ABL) must be performed before OCR. This localization step is crucial because it strongly affects the overall performance of the system: a good localization leads to a better recognition rate. The limits of current methods stem mainly from the modular linear architectures used for ABL and from the lack of cooperation between modules: their performance depends heavily on the performance of each independent module. In this paper we present a new approach to ABL based on a pyramidal data organization and on hierarchical graph coloring for the classification process. This approach ensures good coherence between the different modules and reduces both the computation time and the rejection rate. The proposed method correctly locates the address block in 98% of cases on a set of 750 envelope images.
International Conference on Document Analysis and Recognition | 2013
Frank Lebourgeois; Fadoua Drira; Djamel Gaceb; Jean Duong
The global Mean Shift algorithm is an unsupervised clustering technique that has already been applied to color document image segmentation. However, its high computational cost limits its use on document images. The complexity of the global approach comes from the intensive search for color samples in the Parzen window needed to compute the vector oriented toward the mean. To make it more practical, several attempts have been made to reduce the complexity of the algorithm, mainly by adding spatial information, by reducing the number of colors to shift, or by selecting a reduced number of colors to estimate the means of the density function. This paper presents a fast, optimized Mean Shift with a much lower computational cost. The algorithm uses both discretization of the shift and the integral image, which allows the means within the Parzen windows to be computed with a small, fixed number of operations. Thanks to the discretization of the color space, the fast optimized Mean Shift also memorizes all existing paths to avoid shifting colors again along similar paths. Despite the square shape of the Parzen windows and the uniform kernel used, the results are very similar to those obtained by the global Mean Shift algorithm. The proposed algorithm is compared with existing implementations of similar algorithms found in the literature.
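The three ingredients named in this abstract (a uniform kernel, discrete shifts, and memorized paths) can be illustrated on a much simpler problem. The sketch below is not the authors' algorithm: it assumes a one-dimensional gray-level histogram instead of a color space, and it uses prefix sums as the 1D analogue of an integral image. The function name `mean_shift_modes` and its parameters are hypothetical.

```python
def mean_shift_modes(histogram, radius):
    """Map every visited histogram bin to the mode it converges to.

    Assumptions (illustrative only): 1D gray levels, uniform kernel of
    half-width `radius`, shifts rounded to the nearest bin.
    """
    n = len(histogram)
    # Prefix sums of counts and of count * value give any window mean
    # with a fixed number of operations (1D analogue of an integral image).
    cum_count = [0] * (n + 1)
    cum_weighted = [0] * (n + 1)
    for value, count in enumerate(histogram):
        cum_count[value + 1] = cum_count[value] + count
        cum_weighted[value + 1] = cum_weighted[value] + count * value

    def window_mean(v):
        lo, hi = max(0, v - radius), min(n, v + radius + 1)
        count = cum_count[hi] - cum_count[lo]
        return (cum_weighted[hi] - cum_weighted[lo]) / count

    mode_of = {}  # memorized result for every bin already shifted
    for start in range(n):
        if histogram[start] == 0:
            continue
        path, v = [], start
        while v not in mode_of:
            path.append(v)
            nxt = round(window_mean(v))  # discrete shift toward the mean
            if nxt == v or nxt in path:  # converged (or stuck in a cycle)
                mode_of[v] = v
                break
            v = nxt
        # Every bin on the path shares the mode reached at the end,
        # so later starts that hit this path stop immediately.
        for p in path:
            mode_of[p] = mode_of[v]
    return mode_of
```

With two well-separated bumps in the histogram, every occupied bin is mapped to the center of its own bump, and each bin is shifted at most once because of the memorized paths.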
International Conference on Document Analysis and Recognition | 2013
Djamel Gaceb; Frank Lebourgeois; Jean Duong
Automatic reading systems for business documents require fast and accurate reading of zones of interest using OCR technology. The quality of the binarization has a major impact on the quality of the binary characters. In this paper we propose a smart binarization method for images of business documents. Our work takes into account the various degradations of document images, real-time constraints, and the high spatial resolution of the images. The quality of each pixel is estimated using a hierarchical local thresholding in order to classify it as a foreground, background, or ambiguous pixel. The ambiguous pixels, which represent the degraded zones, cannot be binarized with the same local thresholding. The global quality of the image is thus estimated from the density of these degraded pixels. If the image is considered degraded, we apply a second separation to the ambiguous pixels to assign them to the background or the foreground. This second process uses our improved relaxation method, which we have accelerated for the first time so that it can be integrated into an automatic document reading system. Compared with existing binarization approaches (local or global), our approach allows the OCR to read characters better. Through the use of integral images, the computation time remains constant as the local window size varies. The method was developed in the context of the DOD (Documents On Demand) project at the request of the ITESOFT company.
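To make the idea of a three-way pixel classification with window-size-independent cost more concrete, here is a minimal single-scale sketch. It is not the method of the paper: the hierarchical scheme and the relaxation step are not reproduced, and the decision rule (comparing each pixel with fixed fractions `low_ratio` and `high_ratio` of its local mean) is an assumption chosen only for illustration. What it does show accurately is how an integral image yields each local mean from four lookups, whatever the window radius.

```python
def integral_image(img):
    """Summed-area table with an extra leading row and column of zeros."""
    h, w = len(img), len(img[0])
    sat = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0
        for x in range(w):
            row_sum += img[y][x]
            sat[y + 1][x + 1] = sat[y][x + 1] + row_sum
    return sat

def local_mean(sat, y, x, radius):
    """Mean of the (clipped) square window around (y, x): four lookups."""
    h, w = len(sat) - 1, len(sat[0]) - 1
    y0, y1 = max(0, y - radius), min(h, y + radius + 1)
    x0, x1 = max(0, x - radius), min(w, x + radius + 1)
    total = sat[y1][x1] - sat[y0][x1] - sat[y1][x0] + sat[y0][x0]
    return total / ((y1 - y0) * (x1 - x0))

def classify_pixels(img, radius=1, low_ratio=0.6, high_ratio=0.85):
    """Label pixels 'fg' (dark ink), 'bg' or 'ambiguous'.

    The two ratios are hypothetical thresholds, not values from the paper.
    """
    sat = integral_image(img)
    labels = []
    for y, row in enumerate(img):
        out = []
        for x, value in enumerate(row):
            mean = local_mean(sat, y, x, radius)
            if value < low_ratio * mean:
                out.append("fg")
            elif value >= high_ratio * mean:
                out.append("bg")
            else:
                out.append("ambiguous")
        labels.append(out)
    return labels

def ambiguous_density(labels):
    """Fraction of ambiguous pixels, a crude stand-in for image quality."""
    flat = [label for row in labels for label in row]
    return flat.count("ambiguous") / len(flat)
```

In a sketch like this, a high `ambiguous_density` would be the signal to run a second, more expensive pass on the ambiguous pixels only.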
International Conference on Document Analysis and Recognition | 2009
Djamel Gaceb; Véronique Eglin; Frank Lebourgeois; Hubert Emptoz
In order to reduce the rejection rate of our automatic reading system, we propose to pre-classify business documents by introducing an Automatic Recognition of Documents (ARD) stage as a pre-processing step. This important step guides the other stages involved in recognizing the document contents. Once the document class is identified, the reading system uses the appropriate information from the ARD stage to improve the segmentation of the layout, the recognition of the document structure, the parameterization of the OCR, and the final rejection decision. In this paper we propose an original method for classifying business documents that is suited to complex layouts with great variability. We introduce the graph coloring approach for both layout analysis and document classification. The proposed method is reliable, robust to various constraints, and guarantees a real-time response for the sorting of business documents.
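The abstract does not say how the graph is built or colored. One common way to use graph coloring for grouping, shown here purely as an assumed illustration, is to link items that are too dissimilar to share a class and then color the graph so that linked items never receive the same color; each color then acts as a class. The sketch uses a plain greedy coloring with a largest-degree-first order, and the names `threshold_graph` and `greedy_coloring` are hypothetical.

```python
def threshold_graph(dissimilarity, threshold):
    """Link items i and j when they are too dissimilar to share a class."""
    n = len(dissimilarity)
    return {
        i: {j for j in range(n) if j != i and dissimilarity[i][j] > threshold}
        for i in range(n)
    }

def greedy_coloring(graph):
    """Give each vertex the smallest color unused by its colored neighbors."""
    colors = {}
    # Largest degree first; ties broken by vertex id for determinism.
    for v in sorted(graph, key=lambda u: (-len(graph[u]), u)):
        used = {colors[u] for u in graph[v] if u in colors}
        color = 0
        while color in used:
            color += 1
        colors[v] = color
    return colors

# Toy layout descriptors (one number per document, an assumption).
features = [0, 1, 2, 10, 11]
dissimilarity = [[abs(a - b) for b in features] for a in features]
classes = greedy_coloring(threshold_graph(dissimilarity, threshold=3))
```

Here the first three documents end up with one color and the last two with another, because only pairs from different groups are linked.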
International Conference on Pattern Recognition | 2008
Djamel Gaceb; Véronique Eglin; Frank Lebourgeois; Hubert Emptoz
Every day, postal sorting systems distribute several tons of mail. The main cause of mail rejection is the failure of the address block localization task, and in particular of the physical layout segmentation stage. Bottom-up and top-down segmentation methods bring different knowledge that should not be ignored when robustness must be increased. Hybrid methods combine the two strategies in order to take advantage of one strategy to the detriment of the other. Starting from these remarks, our proposal uses a hybrid segmentation strategy better adapted to postal mail. The high-level stages are based on hierarchical graph coloring. To date, no other work in this context has made use of the power of this tool. Our approach was evaluated on a corpus of 10000 envelope images. The processing times and the rejection rate were considerably reduced.
International Conference on Frontiers in Handwriting Recognition | 2010
Hani Daher; Djamel Gaceb; Véronique Eglin; Stéphane Bres; Nicole Vincent
In this paper we present a new method for analyzing and decomposing handwritten documents into glyphs (graphemes) and their associated codebook. The techniques involved are inspired by image processing methods in a broad sense and by mathematical models involving graph coloring. Our approaches provide, first, a rapid and detailed characterization of handwritten shapes based on dynamic tracking of the handwriting (curvature, thickness, direction, etc.) and, second, a very efficient analysis method for categorizing basic shapes (graphemes). The tools we have produced enable paleographers to study a large volume of manuscripts more quickly and accurately and to extract a large number of characteristics specific to an individual writer or to a specific era.
Document Recognition and Retrieval | 2008
Djamel Gaceb; Véronique Eglin; Frank Lebourgeois; Hubert Emptoz
An efficient mail sorting system relies mainly on accurate optical recognition of the addresses on envelopes. However, address block localization (ABL) must be performed before OCR. This localization step is crucial because it strongly affects the overall performance of the system. Indeed, a good localization leads to a better recognition rate. The limits of current methods stem mainly from the modular linear architectures used for ABL: their performance depends heavily on the performance of each independent module. In this paper we present a new approach to ABL based on a pyramidal data organization and on hierarchical graph coloring for the classification process. This approach ensures good coherence between the different modules and reduces both the computation time and the rejection rate. The proposed method correctly locates the address block in 98% of cases on a set of 750 envelope images.
Document Analysis Systems | 2008
Djamel Gaceb; Véronique Eglin; Frank Lebourgeois; Hubert Emptoz
Every day, postal sorting systems distribute several tons of mail. The main cause of mail rejection is the failure of the address block localization task, and in particular of the physical layout segmentation stage. Bottom-up and top-down segmentation methods bring different knowledge that should not be ignored when robustness must be increased. Hybrid methods combine the two strategies in order to take advantage of one strategy to the detriment of the other. Starting from these remarks, our proposal uses a hybrid segmentation strategy better adapted to postal mail. The high-level stages are based on hierarchical graph coloring, which makes it possible to manage, through a pyramidal data organization, the complex rules that guide the interpretation of the connected-component decomposition of the zones of interest. To date, no other work in this context has made use of the power of this tool. Our approach was evaluated on a corpus of 10000 envelope images. The processing times and the rejection rate were considerably reduced.
Journal of Real-time Image Processing | 2014
Djamel Gaceb; Véronique Eglin; Frank Lebourgeois
In this paper, we present a new document classification method based on physical layout features and graph b-coloring modeling. In order to reduce the computing time and to increase the performance of our automatic reading system, we propose to pre-classify business documents by introducing an Automatic Recognition of Documents stage as a pre-analysis phase. This phase guides the other phases involved in recognizing the document contents. Once the document type is identified, the reading system uses the corresponding information source to improve the recognition of its logical layout, the selection and parameterization of the OCR, and the final sorting decision. The graph coloring model is introduced for both layout analysis and document classification. The proposed method is reliable, robust to various constraints, and guarantees a real-time response for the sorting of business documents.
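For readers unfamiliar with the term, a b-coloring is a proper vertex coloring in which every color class contains at least one "dominating" vertex, that is, a vertex with a neighbor in each of the other color classes. The sketch below only checks this property on a small graph given as an adjacency dictionary; it does not construct a b-coloring and says nothing about how the paper builds its graph from layout features. The function names are hypothetical.

```python
def is_proper(graph, colors):
    """True when no edge joins two vertices of the same color."""
    return all(colors[u] != colors[v] for u in graph for v in graph[u])

def dominating_vertices(graph, colors):
    """For each color, list the vertices adjacent to all other colors."""
    all_colors = set(colors.values())
    dominating = {}
    for v in graph:
        neighbor_colors = {colors[u] for u in graph[v]}
        if neighbor_colors >= all_colors - {colors[v]}:
            dominating.setdefault(colors[v], []).append(v)
    return dominating

def is_b_coloring(graph, colors):
    """Proper coloring in which every color has a dominating vertex."""
    covered = set(dominating_vertices(graph, colors))
    return is_proper(graph, colors) and covered == set(colors.values())

# Path graph 0 - 1 - 2 - 3.
path = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2}}
two_colors = {0: 0, 1: 1, 2: 0, 3: 1}
three_colors = {0: 0, 1: 1, 2: 2, 3: 0}
```

On this path, the alternating two-color assignment is a b-coloring, while the three-color assignment is proper but not a b-coloring, because neither vertex of color 0 touches both other colors. In clustering terms, a dominating vertex can be read as a representative that is clearly separated from every other class.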
International Conference on Image Analysis and Recognition | 2007
Djamel Gaceb; Véronique Eglin; Frank Lebourgeois; Hubert Emptoz
An efficient mail sorting system relies mainly on accurate optical recognition of the addresses on envelopes. However, address block location (ABL) must be performed before OCR. This location step is crucial because it strongly affects the overall performance of the system. Indeed, a good location step leads to a better recognition rate. The limits of current methods stem from the modular linear architectures used for ABL: their performance depends on the performance of each independent module. In this paper we present a new approach to ABL based on hierarchical graph coloring and on a pyramidal organization of the data, which ensures good coherence between the different modules and reduces both the computation time and the rejection rate. The proposed method correctly locates the address block in 98% of cases on a set of 750 envelope images.