
Abedelkadir Asi

Ben-Gurion University of the Negev



Publication

Featured research published by Abedelkadir Asi.

Pattern Recognition Letters | 2014

Text line extraction for historical document images

Raid Saabni; Abedelkadir Asi; Jihad El-Sana

In this paper we present a language-independent global method for automatic text line extraction. The proposed approach computes an energy map of a text image and determines the seams that pass across and between text lines. We have developed two algorithms along this novel idea, one for binary images and the other for grayscale images. The first algorithm works on binary document images and assumes it is possible to extract the components along text lines. The seam passes through the middle of, and along, the text line l and marks the components that make up the letters and words of l. The algorithm then assigns each unmarked component to the closest text line. The second algorithm works directly on grayscale document images. It computes the distance transform directly from the grayscale images and generates two types of seams: medial seams and separating seams. The medial seams determine the text lines and the separating seams define the upper and lower boundaries of these text lines. Moreover, we present a new benchmark dataset of historical document images with various types of challenges. The dataset contains ground truth for text line extraction and includes samples in different languages, such as Arabic, English and Spanish. A binary dataset is used to test the binary algorithm. We ran experiments with our two algorithms on these datasets and report segmentation accuracy. We also compare our algorithms with state-of-the-art text line segmentation methods.
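The final step of the binary algorithm, assigning leftover components to the closest text line, can be illustrated with a minimal sketch. It assumes each text line is summarized by a single vertical position and each component by the y-coordinate of its center; both are simplifications made for illustration, and the function name and sample values are hypothetical rather than taken from the paper.

```python
def assign_to_closest_line(component_centers, line_positions):
    """Map each component's vertical center to the index of the nearest text line.

    component_centers: y-coordinates of unmarked components (hypothetical input).
    line_positions: one y-coordinate per detected text line; this treats each
    seam as a horizontal line, which is an assumption made for brevity.
    """
    assignment = []
    for y in component_centers:
        nearest = min(
            range(len(line_positions)),
            key=lambda i: abs(line_positions[i] - y),
        )
        assignment.append(nearest)
    return assignment


lines = [10, 30, 50]
unmarked = [12, 27, 44, 52]
print(assign_to_closest_line(unmarked, lines))  # [0, 1, 2, 2]
```

A real seam is a curve, so a fuller version would measure the distance from the component to the seam pixels in the same columns instead of to one fixed height.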

Proceedings of the 2nd International Workshop on Historical Document Imaging and Processing | 2013

Robust text and drawing segmentation algorithm for historical documents

Rafi Cohen; Abedelkadir Asi; Klara Kedem; Jihad El-Sana; Its'hak Dinstein

We present a method to segment historical document images into regions of different content. First, we segment text elements from non-text elements using a binarized version of the document. Then, we refine the segmentation of the non-text regions into drawings, background and noise. At this stage, spatial and color features are exploited to guarantee coherent regions in the final segmentation. Experiments show that the suggested approach achieves better segmentation quality with respect to other methods. We examine the segmentation quality on 252 pages of a historical manuscript, for which the suggested method achieves about 92% and 90% segmentation accuracy of drawings and text elements, respectively.

International Conference on Frontiers in Handwriting Recognition | 2012

Layout Analysis for Arabic Historical Document Images Using Machine Learning

Syed Saqib Bukhari; Thomas M. Breuel; Abedelkadir Asi; Jihad El-Sana

Page layout analysis is a fundamental step of any document image understanding system. We introduce an approach that segments text appearing in page margins (a.k.a. side-note text) from manuscripts with complex layout formats. Simple and discriminative features are extracted at the connected-component level and robust feature vectors are subsequently generated. A multilayer perceptron classifier is used to classify connected components into the relevant class of text. A voting scheme is then applied to refine the resulting segmentation and produce the final classification. In contrast to state-of-the-art segmentation approaches, this method is independent of block segmentation, as well as pixel-level analysis. The proposed method has been trained and tested on a dataset that contains a variety of complex side-note layout formats, achieving a segmentation accuracy of about 95%.
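The abstract does not say how the voting scheme works, so the following is only one plausible reading: each connected component takes the majority label among itself and its k nearest neighbors. The function name, the use of component centers, the label strings and the value of k are all assumptions made for illustration.

```python
from collections import Counter


def refine_by_voting(centers, labels, k=2):
    """Relabel each component by majority vote over itself and its k nearest neighbors.

    centers: list of (x, y) component centers (hypothetical representation).
    labels: initial class label per component, e.g. from a classifier.
    Votes are taken over the initial labels, not the partially refined ones.
    """
    refined = []
    for i, (xi, yi) in enumerate(centers):
        others = sorted(
            (j for j in range(len(centers)) if j != i),
            key=lambda j: (centers[j][0] - xi) ** 2 + (centers[j][1] - yi) ** 2,
        )
        voters = [labels[i]] + [labels[j] for j in others[:k]]
        refined.append(Counter(voters).most_common(1)[0][0])
    return refined


centers = [(0, 0), (1, 0), (2, 0), (10, 0), (11, 0), (12, 0)]
labels = ["main", "side", "main", "side", "side", "side"]
print(refine_by_voting(centers, labels))
```

In this toy input the isolated "side" label inside the left cluster is outvoted by its two "main" neighbors, while the right cluster stays unchanged.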

Proceedings of the 2011 Workshop on Historical Document Imaging and Processing | 2011

Text line segmentation for gray scale historical document images

Abedelkadir Asi; Raid Saabni; Jihad El-Sana

In this paper we present a new approach for text line segmentation that works directly on gray-scale document images. Our algorithm constructs a distance transform directly on the gray-scale images, which is used to compute two types of seams: medial seams and separating seams. A medial seam is a chain of pixels that crosses the text area of a text line and a separating seam is a path that passes between two consecutive rows. The medial seam determines a text line and the separating seams define the upper and lower boundaries of the text line. The medial and separating seams propagate according to energy maps, which are defined based on the constructed distance transform. We ran experiments on several datasets and obtained encouraging results.
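How a seam can propagate across an energy map is sketched below with the standard dynamic-programming seam search. This is a generic illustration, not the authors' implementation: the energy values are invented, the seam runs left to right, and it is assumed that low energy marks where the seam should pass and that the seam moves at most one row between adjacent columns.

```python
def min_energy_seam(energy):
    """Return, for each column, the row index of a left-to-right minimum-energy seam.

    energy: list of rows, each a list of non-negative numbers of equal length.
    """
    rows, cols = len(energy), len(energy[0])
    # cost[r][c] is the cheapest total energy of any seam ending at (r, c).
    cost = [[energy[r][0]] + [0] * (cols - 1) for r in range(rows)]
    back = [[0] * cols for _ in range(rows)]
    for c in range(1, cols):
        for r in range(rows):
            candidates = [p for p in (r - 1, r, r + 1) if 0 <= p < rows]
            best = min(candidates, key=lambda p: cost[p][c - 1])
            cost[r][c] = energy[r][c] + cost[best][c - 1]
            back[r][c] = best
    # Trace back from the cheapest end point in the last column.
    r = min(range(rows), key=lambda i: cost[i][cols - 1])
    seam = [r]
    for c in range(cols - 1, 0, -1):
        r = back[r][c]
        seam.append(r)
    seam.reverse()
    return seam


energy = [
    [9, 9, 9, 9],
    [1, 9, 1, 1],
    [9, 1, 9, 9],
    [9, 9, 9, 9],
]
print(min_energy_seam(energy))  # [1, 2, 1, 1]
```

The seam dips to row 2 in the second column because that is where the low-energy path continues, which mirrors how a seam can follow a slightly curved text line.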

International Conference on Document Analysis and Recognition | 2013

WebGT: An Interactive Web-Based System for Historical Document Ground Truth Generation

Ofer Biller; Abedelkadir Asi; Klara Kedem; Jihad El-Sana; Its'hak Dinstein

We present WebGT, the first web-based system to help users produce ground truth data for document images. This user-friendly software system helps historians and computer scientists collectively annotate historical documents. It supports real time collaboration among remote sites independent of the local operating system and also provides several novel semi-automatic tools that have proven effective for annotating degraded documents.

International Conference on Frontiers in Handwriting Recognition | 2014

A Coarse-to-Fine Approach for Layout Analysis of Ancient Manuscripts

Abedelkadir Asi; Rafi Cohen; Klara Kedem; Jihad El-Sana; Its'hak Dinstein

Many applications along the manuscript analysis pipeline rely on the accuracy of pre-processing steps. Perfectly detecting the main text area in ancient historical documents is of great importance for these applications. We propose a learning-free approach to detect the main text area in ancient manuscripts. First, we coarsely segment the main text area by using a texture-based filter. Then, we refine the segmentation by formulating the problem as an energy minimization task and achieving the minimum using graph cuts. The energy function is derived from properties of the text components. Spatial coherence of the segmented text regions is explicitly encouraged by the energy function. We evaluate the suggested method on a publicly available dataset of 38 historical document images. Experiments show that the suggested approach outperforms another state-of-the-art page segmentation method in terms of segmentation quality and time performance.
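The abstract does not give the energy function itself. As a hedged illustration of the general shape such an energy usually takes, the sketch below evaluates a data cost plus a Potts-style smoothness penalty for a binary labeling on a grid. The function name, the cost values and the penalty weight are assumptions, and the code only scores a given labeling; it does not perform the graph-cut minimization.

```python
def labeling_energy(unary, labels, smoothness=1.0):
    """Energy of a binary labeling: per-pixel data cost plus a penalty per label change.

    unary[r][c][l] is the (hypothetical) cost of giving pixel (r, c) the label l.
    labels[r][c] is 0 or 1. Neighbors are the right and lower pixels (4-connectivity).
    """
    rows, cols = len(labels), len(labels[0])
    energy = 0.0
    for r in range(rows):
        for c in range(cols):
            energy += unary[r][c][labels[r][c]]
            if c + 1 < cols and labels[r][c] != labels[r][c + 1]:
                energy += smoothness
            if r + 1 < rows and labels[r][c] != labels[r + 1][c]:
                energy += smoothness
    return energy


# One row of three pixels; the middle pixel weakly prefers label 1.
unary = [[(0, 2), (1.5, 1), (0, 2)]]
print(labeling_energy(unary, [[0, 1, 0]]))  # 3.0
print(labeling_energy(unary, [[0, 0, 0]]))  # 1.5
```

The spatially coherent labeling has the lower energy even though the middle pixel's data cost is slightly higher, which is the effect a smoothness term is meant to produce.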

Proceedings of the 2011 Workshop on Historical Document Imaging and Processing | 2011

User-assisted alignment of Arabic historical manuscripts

Abedelkadir Asi; Irina Rabaev; Klara Kedem; Jihad El-Sana

This work aims to simplify the tiresome manual comparison of two similar Arabic historical manuscripts. We developed a system that determines the difference between two manuscripts by comparing their components, while ignoring page breaks and different warping among consecutive rows; i.e., we treat each manuscript as one long row of components. We compare two components (blocks of pixels) by extracting features from the columns of their bounding rectangles. We adopted the edit distance, which is computed using dynamic time warping (DTW) on the feature domain, to measure similarity between components. The user selects the region to align in two manuscripts and the system returns its alignment with visual cues that indicate the distance between the aligned components. In our current implementation, our system provides good results and requires less interaction for manuscripts of good quality that do not include touching components. We tested our system on different Arabic manuscripts of various qualities and obtained encouraging results.
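A minimal sketch of the textbook DTW recurrence may help make the component comparison concrete. It assumes each component is reduced to one number per column of its bounding rectangle (for example an ink-pixel count, a hypothetical feature) and uses the absolute difference as the local cost; the paper's actual features and cost are not specified here.

```python
def dtw_distance(a, b):
    """Dynamic time warping distance between two sequences of numbers."""
    inf = float("inf")
    n, m = len(a), len(b)
    # d[i][j] is the cheapest cost of aligning a[:i] with b[:j].
    d = [[inf] * (m + 1) for _ in range(n + 1)]
    d[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            d[i][j] = local + min(d[i - 1][j], d[i][j - 1], d[i - 1][j - 1])
    return d[n][m]


# The second profile is the first one stretched by one column.
print(dtw_distance([1, 2, 3], [1, 2, 2, 3]))  # 0.0
print(dtw_distance([0, 0], [1, 1]))           # 2.0
```

Because the warping absorbs the repeated column, a stretched copy of a component scores zero, which is why DTW suits handwriting whose width varies between copies.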

International Journal on Document Analysis and Recognition | 2017

On writer identification for Arabic historical manuscripts

Abedelkadir Asi; Alaa Abdalhaleem; Daniel Fecker; Volker Märgner; Jihad El-Sana

This paper introduces new methodologies for reliably identifying writers of Arabic historical manuscripts. We propose an approach that transforms key point-based features, such as SIFT, into a global form that captures high-level characteristics of writing styles. We suggest a modification for a common local feature, the contour direction feature, and show the contribution of combining local and global features for writer identification. Our work also presents a novel algorithm that determines the number of writers involved in writing a given manuscript. The experimental study confirms that this algorithm significantly improves writer identification when applied to historical manuscripts. Comprehensive experiments using different features and classification schemes demonstrate the vitality of the suggested methodologies for reliable writer identification. The presented techniques were evaluated on both historical and modern documents where the suggested features yielded very promising results with respect to state-of-the-art features.

International Conference on Document Analysis and Recognition | 2015

Simplifying the reading of historical manuscripts

Abedelkadir Asi; Rafi Cohen; Klara Kedem; Jihad El-Sana

Complex document layouts pose prominent challenges for document image understanding algorithms. These layouts impose irregularities on the location of text paragraphs which consequently induces difficulties in reading the text. In this paper we present a robust framework for analyzing historical manuscripts with complex layouts. This framework aims to provide a convenient reading experience for historians through topnotch algorithms for text localization, classification and dewarping. We segment text into spatially coherent regions and text-lines using texture-based filters and refine this segmentation by exploiting Markov Random Fields (MRFs). A principled technique is presented for dewarping curvy text regions using a non-linear geometric transformation. The framework has been validated using a subset of a publicly available dataset of historical documents and it provided promising results.

International Conference on Frontiers in Handwriting Recognition | 2014

Document Writer Analysis with Rejection for Historical Arabic Manuscripts

Daniel Fecker; Abedelkadir Asi; Werner Pantke; Volker Märgner; Jihad El-Sana; Tim Fingscheidt

Determining the individuality of handwriting in ancient manuscripts is an important aspect of the manuscript analysis process. Automatic identification of writers in historical manuscripts can help historians gain insights into manuscripts with missing metadata such as writer name, period, and origin. In this paper, writer classification and retrieval approaches for multi-page documents in the context of historical manuscripts are presented. The main contribution is a learning-based rejection strategy which utilizes writer retrieval and support vector machines for rejecting a decision if no corresponding writer can be found for a query manuscript. Experiments using different feature extraction methods demonstrate the abilities of our proposed methods. A dedicated data set based on a publicly available database of historical Arabic manuscripts was used and the experiments show promising results.
