Slim Kanoun | Researchain

Publication

Featured research published by Slim Kanoun.

international conference on document analysis and recognition | 2009

A New Arabic Printed Text Image Database and Evaluation Protocols

Fouad Slimane; Rolf Ingold; Slim Kanoun; Adel M. Alimi; Jean Hennebert

We report on the creation of a database composed of images of Arabic printed words. The purpose of this database is the large-scale benchmarking of open-vocabulary, multi-font, multi-size and multi-style text recognition systems in Arabic. The challenges addressed by the database lie in the variability of the sizes, fonts and styles used to generate the images. A focus is also placed on low-resolution images, where anti-aliasing generates noise on the characters to be recognized. The database is synthetically generated using a lexicon of 113’284 words, 10 Arabic fonts, 10 font sizes and 4 font styles. The database contains 45’313’600 single-word images totaling more than 250 million characters. Ground truth annotation is provided for each image. The database is called APTI, for Arabic Printed Text Images.
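The image count quoted in the abstract follows from rendering every lexicon word once per combination of font, size and style. A minimal check of that arithmetic is sketched below; the function and constant names are illustrative and do not come from the paper.

```python
# Generation parameters quoted in the abstract.
LEXICON_SIZE = 113_284
NUM_FONTS = 10
NUM_SIZES = 10
NUM_STYLES = 4


def total_images(words, fonts, sizes, styles):
    """Each word is rendered once per (font, size, style) combination."""
    return words * fonts * sizes * styles


images = total_images(LEXICON_SIZE, NUM_FONTS, NUM_SIZES, NUM_STYLES)
print(f"{images:,}")  # 45,313,600
```

The product matches the 45’313’600 single-word images reported for APTI.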

Pattern Recognition Letters | 2013

A study on font-family and font-size recognition applied to Arabic word images at ultra-low resolution

Fouad Slimane; Slim Kanoun; Jean Hennebert; Adel M. Alimi; Rolf Ingold

In this paper, we propose a new font and size identification method for ultra-low resolution Arabic word images using a stochastic approach. The literature has shown how difficult it is for Arabic text recognition systems to handle multi-font and multi-size word images. This is due to the variability induced by certain font families, in addition to the inherent difficulties of Arabic writing, including cursive representation, overlaps and ligatures. This research work proposes an efficient stochastic approach to tackle the problem of font and size recognition. Our method processes a word image with a fixed-length, overlapping sliding window. Each window is represented by 102 features whose distribution is captured by Gaussian Mixture Models (GMMs). We present three systems: (1) a font recognition system, (2) a size recognition system and (3) a font and size recognition system. We demonstrate the importance of font identification before recognizing the word images with two multi-font Arabic OCRs (cascading and global). The cascading system is about 23% better than the global multi-font system in terms of word recognition rate on the Arabic Printed Text Image (APTI) database, which is freely available to the scientific community.
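The pipeline described in this abstract (overlapping fixed-length windows, one feature vector per window, one GMM per class) can be sketched roughly as follows. This is an illustration under stated assumptions, not the authors' implementation: the two toy features stand in for the 102 features of the paper, the GMM parameters are hand-set placeholders rather than trained values, and all function names are hypothetical.

```python
import math


def sliding_windows(columns, width, step):
    """Yield fixed-length windows over the pixel columns of a word image."""
    if step >= width:
        raise ValueError("step must be smaller than width for windows to overlap")
    for start in range(0, len(columns) - width + 1, step):
        yield columns[start:start + width]


def window_features(window):
    """Toy 2-D stand-in for the paper's 102 features (an assumption):
    ink density and the share of ink lying in the upper half."""
    height = len(window[0])
    ink = sum(sum(col) for col in window)
    upper = sum(sum(col[: height // 2]) for col in window)
    density = ink / (len(window) * height)
    upper_ratio = upper / ink if ink else 0.0
    return [density, upper_ratio]


def gmm_log_likelihood(x, gmm):
    """Log-density of x under a diagonal-covariance GMM.
    gmm is a list of (weight, means, variances) components."""
    logs = []
    for weight, means, variances in gmm:
        log_p = math.log(weight)
        for xi, mu, var in zip(x, means, variances):
            log_p -= 0.5 * (math.log(2 * math.pi * var) + (xi - mu) ** 2 / var)
        logs.append(log_p)
    peak = max(logs)  # log-sum-exp for numerical stability
    return peak + math.log(sum(math.exp(v - peak) for v in logs))


def classify(columns, models, width=4, step=1):
    """Return the class whose GMM gives the highest summed log-likelihood."""
    feats = [window_features(w) for w in sliding_windows(columns, width, step)]
    scores = {
        name: sum(gmm_log_likelihood(f, gmm) for f in feats)
        for name, gmm in models.items()
    }
    return max(scores, key=scores.get)


# Hand-set placeholder models; real parameters would be estimated from data.
models = {
    "dense": [(1.0, [0.8, 0.5], [0.01, 0.05])],
    "sparse": [(1.0, [0.2, 0.5], [0.01, 0.05])],
}
image = [[1, 1, 1, 1] for _ in range(8)]  # 8 columns of 4 fully inked pixels
print(classify(image, models))  # dense
```

Summing per-window log-likelihoods treats the windows as independent, which is a simplification chosen here for brevity; the same scoring loop would serve a font, a size, or a joint font-and-size classifier, depending on what each model name denotes.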

international conference on frontiers in handwriting recognition | 2012

A Database for Arabic Handwritten Text Image Recognition and Writer Identification

Anis Mezghani; Slim Kanoun; Maher Khemakhem; Haikal El Abed

Standard databases play an essential role in evaluating and comparing results obtained by different groups of researchers. In this paper, an Arabic Handwritten Text Images Database written by Multiple Writers (AHTID/MW) is introduced. This database can be used for research on open-vocabulary Arabic handwritten text recognition, word segmentation and writer identification. The AHTID/MW contains 3710 text lines and 22896 words written by 53 native writers of Arabic. In addition, ground truth annotation is provided for each text image. The database is freely available to researchers worldwide.

international conference on document analysis and recognition | 2007

Three decision levels strategy for Arabic and Latin texts differentiation in printed and handwritten natures

M. Ben Jlaiel; Slim Kanoun; Adel M. Alimi; Rémy Mullot

Identifying Arabic and Latin scripts in printed and handwritten form presents several difficulties because Arabic (printed or handwritten) and handwritten Latin are cursive by nature. To avoid the confusions that can arise, we propose in this paper a strategy based on three decision levels, where each level has its own feature vector and identifies only one of the scripts to be identified.
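One way to read the three-level strategy is as a cascade in which each level computes its own features and either accepts its single target class or passes the block on. The sketch below illustrates only that control flow. The order of the levels, the feature names and the thresholds are all placeholders invented for the example and are not taken from the paper.

```python
def identify(block, levels, fallback):
    """Run the levels in order; each level accepts or rejects only its own label."""
    for label, extract_features, accepts in levels:
        if accepts(extract_features(block)):
            return label
    return fallback  # whatever no level claimed


# Hypothetical levels: labels' order, features and thresholds are assumptions.
LEVELS = [
    ("printed Latin", lambda b: b["components_per_word"], lambda f: f >= 4.0),
    ("printed Arabic", lambda b: b["baseline_regularity"], lambda f: f >= 0.9),
    ("handwritten Arabic", lambda b: b["diacritic_ratio"], lambda f: f >= 0.2),
]
FALLBACK = "handwritten Latin"

block = {
    "components_per_word": 1.5,
    "baseline_regularity": 0.95,
    "diacritic_ratio": 0.3,
}
print(identify(block, LEVELS, FALLBACK))  # printed Arabic
```

With four classes and three levels, the class left over after the third level needs no detector of its own, which is why the sketch carries a fallback label.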

international conference on frontiers in handwriting recognition | 2002

Script and nature differentiation for Arabic and Latin text images

Slim Kanoun; Abdellatif Ennaji; Yves Lecourtier; Adel M. Alimi

A method for Arabic and Latin text block differentiation for printed and handwritten scripts is proposed. This method is based on a morphological analysis for each script at the text block level and a geometrical analysis at the line and the connected component level. In this paper, we present a brief survey of existing methods used for script differentiation as well as the general characteristics of Arabic and Latin scripts. We then describe our method for differentiating these two scripts. Finally, we show experimental results on two different data sets: the first consists of 400 text blocks and the second of 335 text blocks.

international conference on document analysis and recognition | 2005

Affixal approach for Arabic decomposable vocabulary recognition a validation on printed word in only one font

Slim Kanoun; Adel M. Alimi; Yves Lecourtier

We propose a new approach for Arabic word recognition, called the affixal approach. This approach is founded on the morphological structure of Arabic vocabulary. It uses a decomposition-recognition mechanism that leads to a set of reliable solutions for each word. This mechanism tries to recognize the basic morphemes of a word (prefix, infix, suffix and root), in contrast to existing approaches, which are usually based on recognizing the word entity (holistic approach), the pseudo-word entity (pseudo-analytical approach) or the letter entity (analytical approach). In this paper, we first present the limits of existing approaches for Arabic word recognition. We then describe the structure of Arabic vocabulary. Next, we detail the affixal approach for Arabic decomposable vocabulary recognition with a word example. Lastly, we present experimental results obtained on a data set of 1000 words.
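The idea of decomposing a word into morphemes and keeping every consistent solution can be illustrated on plain strings. The sketch below is a simplified assumption: it works on Latin transliterations rather than word images, handles only prefixes and suffixes (infix and root extraction are omitted), and the affix lists and stem lexicon are tiny placeholders, not the vocabulary used in the paper.

```python
def decompose(word, prefixes, suffixes, stems):
    """Return every (prefix, stem, suffix) split whose stem is in the lexicon."""
    candidates = []
    for prefix in prefixes:
        if not word.startswith(prefix):
            continue
        rest = word[len(prefix):]
        for suffix in suffixes:
            if not rest.endswith(suffix):
                continue
            stem = rest[: len(rest) - len(suffix)]
            if stem in stems:
                candidates.append((prefix, stem, suffix))
    return candidates


# Placeholder transliterated morphemes; the empty string means "no affix".
PREFIXES = ["", "al", "wa", "bi"]
SUFFIXES = ["", "at", "un", "ha"]
STEMS = {"kitab", "qalam"}

print(decompose("alkitabun", PREFIXES, SUFFIXES, STEMS))
# [('al', 'kitab', 'un')]
```

Returning a list rather than a single split mirrors the abstract's "set of reliable solutions for each word": an ambiguous word simply yields several candidates.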

international conference on document analysis and recognition | 2011

ICDAR 2011 - Arabic Recognition Competition: Multi-font Multi-size Digitally Represented Text

Fouad Slimane; Slim Kanoun; Haikal El Abed; Adel M. Alimi; Rolf Ingold; Jean Hennebert

This paper describes the Arabic Recognition Competition: Multi-font Multi-size Digitally Represented Text held in the context of the 11th International Conference on Document Analysis and Recognition (ICDAR2011), during September 18-21, 2011, Beijing, China. This first competition used the freely available Arabic Printed Text Image (APTI) database. Several research groups have started using the APTI database and this year, 2 groups with 3 systems are participating in the competition. The systems are compared using the recognition rates at the character and word levels. The systems were tested on one test dataset which is unknown to all participants (set 6 of APTI database). The systems are compared on the most important characteristic of classification systems, the recognition rate. A short description of the participating groups, their systems, the experimental setup, and the observed results are presented.

international conference on pattern recognition | 2010

Gaussian Mixture Models for Arabic Font Recognition

Fouad Slimane; Slim Kanoun; Adel M. Alimi; Rolf Ingold; Jean Hennebert

We present in this paper a new approach for Arabic font recognition. Our proposal is to use a fixed-length sliding window for the feature extraction and to model feature distributions with Gaussian Mixture Models (GMMs). This approach presents a double advantage. First, we do not need to perform a priori segmentation into characters, which is a difficult task for Arabic text. Second, we use versatile and powerful GMMs able to finely model the distributions of features in large multi-dimensional input spaces. We report on the evaluation of our system on the APTI (Arabic Printed Text Image) database using 10 different fonts and 10 font sizes. Considering the variability of the different font shapes and the fact that our system is independent of the font size, the obtained results are convincing and compare well with competing systems.

international conference on image analysis and processing | 2013

Database for Arabic Printed Text Recognition Research

Faten Kallel Jaiem; Slim Kanoun; Maher Khemakhem; Haikal El Abed; Jihain Kardoun

systems man and cybernetics | 2011

Natural Language Morphology Integration in Off-Line Arabic Optical Text Recognition

Slim Kanoun; Adel M. Alimi; Yves Lecourtier