Slim Kanoun
University of Sfax
Publication
Featured research published by Slim Kanoun.
international conference on document analysis and recognition | 2009
Fouad Slimane; Rolf Ingold; Slim Kanoun; Adel M. Alimi; Jean Hennebert
We report on the creation of a database composed of images of printed Arabic words. The purpose of this database is the large-scale benchmarking of open-vocabulary, multi-font, multi-size and multi-style text recognition systems in Arabic. The challenges addressed by the database lie in the variability of the sizes, fonts and styles used to generate the images. Particular attention is given to low-resolution images, where anti-aliasing generates noise on the characters to be recognized. The database is synthetically generated using a lexicon of 113,284 words, 10 Arabic fonts, 10 font sizes and 4 font styles. The database contains 45,313,600 single-word images totaling more than 250 million characters. Ground truth annotation is provided for each image. The database is called APTI, for Arabic Printed Text Images.
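The image count quoted above follows from rendering every lexicon word in every combination of font, size and style. The short check below is only a sketch of that arithmetic; the function name is hypothetical and is not taken from the APTI tooling.

```python
# Figures as stated in the abstract.
LEXICON_SIZE = 113_284  # words in the lexicon
NUM_FONTS = 10
NUM_SIZES = 10
NUM_STYLES = 4


def count_word_images(lexicon_size, num_fonts, num_sizes, num_styles):
    """Images produced if each word is rendered once per (font, size, style)."""
    return lexicon_size * num_fonts * num_sizes * num_styles


total_images = count_word_images(LEXICON_SIZE, NUM_FONTS, NUM_SIZES, NUM_STYLES)
print(total_images)  # 45313600, matching the abstract
```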
Pattern Recognition Letters | 2013
Fouad Slimane; Slim Kanoun; Jean Hennebert; Adel M. Alimi; Rolf Ingold
In this paper, we propose a new font and size identification method for ultra-low resolution Arabic word images using a stochastic approach. The literature has shown that Arabic text recognition systems struggle with multi-font and multi-size word images. This is due to the variability induced by some font families, in addition to the inherent difficulties of Arabic writing, including cursive representation, overlaps and ligatures. This research work proposes an efficient stochastic approach to tackle the problem of font and size recognition. Our method processes a word image with a fixed-length, overlapping sliding window. Each window is represented by 102 features whose distribution is captured by Gaussian Mixture Models (GMMs). We present three systems: (1) a font recognition system, (2) a size recognition system and (3) a font and size recognition system. We demonstrate the importance of font identification before recognizing the word images with two multi-font Arabic OCRs (cascading and global). The cascading system is about 23% better than the global multi-font system in terms of word recognition rate on the Arabic Printed Text Image (APTI) database, which is freely available to the scientific community.
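One way such a GMM-based font identifier could score a word image is sketched below. It rests on assumptions the abstract does not state: diagonal covariances, windows treated as independent so their log-likelihoods are summed, and a maximum-likelihood decision. All names and the toy 2-dimensional models are hypothetical; the paper uses 102 features per window.

```python
import math


def log_gaussian_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at feature vector x."""
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )


def log_gmm(x, components):
    """Log density of a GMM given as a list of (weight, mean, var) tuples."""
    terms = [math.log(w) + log_gaussian_diag(x, m, v) for w, m, v in components]
    peak = max(terms)
    # Log-sum-exp keeps the mixture sum numerically stable.
    return peak + math.log(sum(math.exp(t - peak) for t in terms))


def identify_font(windows, font_models):
    """Return the font whose GMM gives the highest summed log-likelihood."""
    scores = {
        font: sum(log_gmm(x, components) for x in windows)
        for font, components in font_models.items()
    }
    return max(scores, key=scores.get), scores


# Toy models, one GMM per font (assumed, for illustration only).
font_models = {
    "font_a": [(0.5, [0.0, 0.0], [1.0, 1.0]), (0.5, [1.0, 1.0], [1.0, 1.0])],
    "font_b": [(1.0, [5.0, 5.0], [1.0, 1.0])],
}
# One feature vector per sliding window of a single word image.
windows = [[0.2, -0.1], [0.9, 1.1], [0.4, 0.6]]
best_font, scores = identify_font(windows, font_models)
print(best_font)
```

The same scoring could serve a size recognizer or a joint font-and-size recognizer by training one model per size or per (font, size) pair.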
international conference on frontiers in handwriting recognition | 2012
Anis Mezghani; Slim Kanoun; Maher Khemakhem; Haikal El Abed
Standard databases play an essential role in evaluating and comparing results obtained by different groups of researchers. In this paper, an Arabic Handwritten Text Images Database written by Multiple Writers (AHTID/MW) is introduced. This database can be used for research in the recognition of Arabic handwritten text with open vocabulary, word segmentation and writer identification. The AHTID/MW contains 3,710 text lines and 22,896 words written by 53 native writers of Arabic. In addition, ground truth annotation is provided for each text image. The database is freely available to researchers worldwide.
international conference on document analysis and recognition | 2007
M. Ben Jlaiel; Slim Kanoun; Adel M. Alimi; Rémy Mullot
Identifying Arabic and Latin scripts in printed and handwritten form presents several difficulties because Arabic script (printed or handwritten) and handwritten Latin script are cursive by nature. To avoid the confusions that can arise, we propose in this paper a strategy based on three decision levels, where each level has its own feature vector and identifies only one of the candidate scripts.
international conference on frontiers in handwriting recognition | 2002
Slim Kanoun; Abdellatif Ennaji; Yves Lecourtier; Adel M. Alimi
A method for Arabic and Latin text block differentiation for printed and handwritten scripts is proposed. This method is based on a morphological analysis of each script at the text block level and a geometrical analysis at the line and connected component levels. In this paper, we present a brief survey of existing methods for script differentiation as well as the general characteristics of Arabic and Latin scripts. We then describe our method for differentiating these two scripts. Finally, we report experimental results on two different data sets, the first comprising 400 text blocks and the second 335 text blocks.
international conference on document analysis and recognition | 2005
Slim Kanoun; Adel M. Alimi; Yves Lecourtier
We propose a new approach for Arabic word recognition called the affixal approach. This approach is founded on the morphological structure of Arabic vocabulary. It uses a decomposition-recognition mechanism that yields a set of reliable solutions for each word. This mechanism tries to recognize the basic morphemes of a word (prefix, infix, suffix and root), in contrast to existing approaches, which are usually based on recognition of the word entity (holistic approach), the pseudo-word entity (pseudo-analytical approach) or the letter entity (analytical approach). In this paper, we first present the limits of existing approaches for Arabic word recognition. We then describe the structure of Arabic vocabulary. Next, we detail the affixal approach for the recognition of decomposable Arabic vocabulary, with a word example. Lastly, we report experimental results obtained on a data set of 1000 words.
international conference on document analysis and recognition | 2011
Fouad Slimane; Slim Kanoun; Haikal El Abed; Adel M. Alimi; Rolf Ingold; Jean Hennebert
This paper describes the Arabic Recognition Competition: Multi-font Multi-size Digitally Represented Text held in the context of the 11th International Conference on Document Analysis and Recognition (ICDAR2011), during September 18-21, 2011, Beijing, China. This first competition used the freely available Arabic Printed Text Image (APTI) database. Several research groups have started using the APTI database and this year, 2 groups with 3 systems are participating in the competition. The systems are compared using the recognition rates at the character and word levels. The systems were tested on one test dataset which is unknown to all participants (set 6 of APTI database). The systems are compared on the most important characteristic of classification systems, the recognition rate. A short description of the participating groups, their systems, the experimental setup, and the observed results are presented.
international conference on pattern recognition | 2010
Fouad Slimane; Slim Kanoun; Adel M. Alimi; Rolf Ingold; Jean Hennebert
international conference on image analysis and processing | 2013
Faten Kallel Jaiem; Slim Kanoun; Maher Khemakhem; Haikal El Abed; Jihain Kardoun
systems man and cybernetics | 2011
Slim Kanoun; Adel M. Alimi; Yves Lecourtier
We present in this paper a new approach for Arabic font recognition. Our proposal is to use a fixed-length sliding window for feature extraction and to model feature distributions with Gaussian Mixture Models (GMMs). This approach has two advantages. First, we do not need to perform a priori segmentation into characters, which is a difficult task for Arabic text. Second, we use versatile and powerful GMMs that can finely model feature distributions in large multi-dimensional input spaces. We report on the evaluation of our system on the APTI (Arabic Printed Text Image) database using 10 different fonts and 10 font sizes. Considering the variability of the different font shapes and the fact that our system is independent of the font size, the obtained results are convincing and compare well with competing systems.
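The segmentation-free feature extraction described above can be illustrated with a minimal sliding-window sketch. The window width, the shift and the single ink-density feature are assumptions chosen for illustration; the abstract does not specify the actual feature set, and the function names are hypothetical.

```python
def sliding_windows(image, width, shift):
    """Yield fixed-width column slices of a binary image given as a list of rows."""
    n_cols = len(image[0])
    for start in range(0, n_cols - width + 1, shift):
        yield [row[start:start + width] for row in image]


def ink_density(window):
    """Fraction of ink pixels (value 1) in a window; a stand-in feature."""
    pixels = [p for row in window for p in row]
    return sum(pixels) / len(pixels)


# Tiny 2 x 6 binary "word image" (1 = ink), for illustration only.
image = [
    [0, 1, 1, 0, 0, 1],
    [1, 1, 1, 0, 0, 0],
]
# A shift smaller than the width makes consecutive windows overlap.
features = [ink_density(w) for w in sliding_windows(image, width=4, shift=2)]
print(features)  # [0.625, 0.375]
```

Because the windows advance by a fixed step regardless of where characters begin or end, no character boundaries are needed before the features are passed to a statistical model.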