Adnan Ul-Hasan
Kaiserslautern University of Technology
Publications
Featured research published by Adnan Ul-Hasan.
international conference on document analysis and recognition | 2013
Thomas M. Breuel; Adnan Ul-Hasan; Mayce Ali Al-Azawi; Faisal Shafait
Long Short-Term Memory (LSTM) networks have yielded excellent results on handwriting recognition. This paper describes an application of bidirectional LSTM networks to the problem of machine-printed Latin and Fraktur recognition. Latin and Fraktur recognition differs significantly from handwriting recognition both in the statistical properties of the data and in the much higher levels of accuracy required. Applications of LSTM networks to handwriting recognition use two-dimensional recurrent networks, since the exact position and baseline of handwritten characters are variable. In contrast, for printed OCR, we used a one-dimensional recurrent network combined with a novel algorithm for baseline and x-height normalization. A number of databases were used for training and testing, including the UW3 database, artificially generated and degraded Fraktur text, and scanned pages from a book digitization project. The LSTM architecture achieved a 0.6% character-level test-set error on English text. When the artificially degraded Fraktur data set is divided into training and test sets, the system achieves an error rate of 1.64%. On specific books printed in Fraktur (not part of the training set), the system achieves error rates of 0.15% (Fontane) and 1.47% (Ersch-Gruber). These recognition accuracies were achieved without any language modelling or other post-processing techniques.
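The baseline and x-height normalization mentioned above can be pictured as a simple affine mapping of pixel rows. The sketch below is only an illustration of the idea, not the authors' algorithm; it assumes the baseline row and x-height of a text line have already been estimated, and the target values are hypothetical defaults for a 48-pixel-tall normalized line.

```python
def normalization_params(baseline, x_height, target_x_height=16, target_baseline=32):
    """Return (scale, offset) of the affine map y' = scale * y + offset that
    rescales a text line so its x-height and baseline land at fixed positions.

    `baseline` is the detected baseline row and `x_height` the detected
    x-height in pixels (both assumed already estimated)."""
    scale = target_x_height / x_height           # make the x-height a fixed size
    offset = target_baseline - baseline * scale  # pin the baseline to a fixed row
    return scale, offset


# Example: a line with its baseline at row 100 and a 32-pixel x-height.
scale, offset = normalization_params(baseline=100, x_height=32)
# The detected baseline row maps exactly onto the target baseline row.
assert abs(100 * scale + offset - 32) < 1e-9
```

Applying the same (scale, offset) to every row coordinate of the line image yields inputs of a uniform geometry, which is what makes a one-dimensional recurrent network sufficient for printed text.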
international conference on document analysis and recognition | 2013
Adnan Ul-Hasan; Saad Bin Ahmed; Faisal Rashid; Faisal Shafait; Thomas M. Breuel
Recurrent neural networks (RNNs) have been successfully applied to the recognition of cursive handwritten documents in both English and Arabic scripts. The ability of RNNs to model context in sequence data such as speech and text makes them a suitable candidate for developing OCR systems for printed Nabataean scripts (including Nastaleeq, for which no OCR system is available to date). In this work, we present the results of applying RNNs to printed Urdu text in the Nastaleeq script. A Bidirectional Long Short-Term Memory (BLSTM) architecture with a Connectionist Temporal Classification (CTC) output layer was employed to recognize the printed Urdu text. We evaluated BLSTM networks for two cases: one ignoring character shape variations and the other taking them into account. The character-level recognition error rate is 5.15% for the first case and 13.6% for the second. These results were obtained on the synthetically generated UPTI dataset, which contains clean images along with artificially degraded images that reflect real-world scanning artifacts. A comparison with a shape-matching-based method is also presented.
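A CTC output layer like the one described above emits one label per time frame, and a decoder collapses those frames into a character sequence. A minimal greedy decoder (an illustration of standard CTC decoding, not the paper's code) first merges consecutive repeats and then drops the blank label:

```python
def ctc_greedy_decode(frame_labels, blank=0):
    """Collapse a per-frame CTC label sequence into an output sequence:
    merge consecutive repeated labels, then drop blanks."""
    decoded = []
    prev = None
    for label in frame_labels:
        if label != prev and label != blank:
            decoded.append(label)
        prev = label
    return decoded


# Frames "a a - a b b -" (0 is the blank) decode to "a a b":
print(ctc_greedy_decode([1, 1, 0, 1, 2, 2, 0]))  # [1, 1, 2]
```

The blank label is what lets the network output the same character twice in a row: a repeated character must be separated by a blank frame, while repeats without a blank are collapsed.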
Proceedings of the 4th International Workshop on Multilingual OCR | 2013
Adnan Ul-Hasan; Thomas M. Breuel
Language models or recognition dictionaries are usually considered an essential step in OCR. However, using a language model complicates the training of OCR systems and narrows the range of texts that an OCR system can be used with. Recent results have shown that Long Short-Term Memory (LSTM) based OCR yields low error rates even without language modeling. In this paper, we explore to what extent LSTM models can be used for multilingual OCR without language models. To do this, we measure the cross-language performance of LSTM models trained on different languages. LSTM models show good promise for language-independent OCR: the recognition errors are very low (around 1%) without any language model or dictionary correction.
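The recognition error rates cited throughout these papers are conventionally character error rates: edit distance between hypothesis and reference, divided by the reference length. A self-contained sketch of that standard metric (not the authors' evaluation code):

```python
def edit_distance(ref, hyp):
    """Levenshtein distance via dynamic programming, row by row."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, 1):
        cur = [i]
        for j, h in enumerate(hyp, 1):
            cur.append(min(prev[j] + 1,              # deletion
                           cur[j - 1] + 1,           # insertion
                           prev[j - 1] + (r != h)))  # substitution (0 if equal)
        prev = cur
    return prev[-1]


def character_error_rate(reference, hypothesis):
    """Edit distance normalized by the reference length."""
    return edit_distance(reference, hypothesis) / len(reference)


# The classic example: "kitten" -> "sitting" needs 3 edits.
print(edit_distance("kitten", "sitting"))  # 3
```

Note that the rate can exceed 100% when the hypothesis contains many spurious insertions, which is why it is an error rate rather than an accuracy complement.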
document analysis systems | 2016
Adnan Ul-Hasan; Syed Saqib Bukhari; Andreas Dengel
Digitizing historical documents is crucial to preserving literary heritage. With the availability of low-cost capture devices, libraries and institutes all over the world have preserved old literature in the form of scanned documents. However, these scanned images remain unsearchable without recognized text. Contemporary machine learning approaches have been applied successfully to recognize both printed and handwritten text; however, they require a large amount of transcribed training data to reach satisfactory performance. Transcribing documents manually is a laborious and costly task, requiring many man-hours and language-specific expertise. This paper presents a generic iterative training framework to address this issue. The proposed framework is applicable not only to historical documents but to present-day documents as well, wherever manually transcribed training data is unavailable. Starting with the minimal information available, the proposed approach iteratively corrects the training and generalization errors. Specifically, we use a segmentation-based OCR method to train on individual symbols and then use the semi-corrected recognized text lines as ground-truth data for segmentation-free sequence learning, which learns to correct the errors in the ground truth through context-aware processing. The approach is applied to a collection of 15th-century Latin documents. The iterative procedure using segmentation-free OCR reduced the initial character error of about 23% (obtained from segmentation-based OCR) to less than 7% in a few iterations.
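The iterative scheme described above can be pictured as a control loop that re-trains on its own semi-corrected output until successive transcriptions stabilize. The skeleton below is an illustrative sketch, not the authors' implementation; `train` and `recognize` are caller-supplied stand-ins for the segmentation-free OCR engine, and the stopping criterion (disagreement between successive transcriptions) is an assumption for the example.

```python
import difflib


def disagreement(lines_a, lines_b):
    """Mean per-line disagreement (1 - similarity ratio) between two transcriptions."""
    ratios = [difflib.SequenceMatcher(None, a, b).ratio()
              for a, b in zip(lines_a, lines_b)]
    return 1.0 - sum(ratios) / len(ratios)


def iterative_training(images, initial_gt, train, recognize, max_iters=10, tol=1e-3):
    """Repeatedly train on the current (noisy) ground truth, then replace it
    with the model's own output, until successive transcriptions agree."""
    gt = initial_gt
    for _ in range(max_iters):
        model = train(images, gt)               # train on possibly noisy labels
        predictions = recognize(model, images)  # context-aware re-recognition
        if disagreement(gt, predictions) < tol: # transcriptions have stabilized
            return predictions
        gt = predictions                        # output becomes next ground truth
    return gt
```

With a toy recognizer that deterministically fixes a single confusion (say, `0` mistaken for `o`), the loop converges in two iterations: one to correct the labels and one to observe that nothing changed.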
international conference on document analysis and recognition | 2015
Tushar Karayil; Adnan Ul-Hasan; Thomas M. Breuel
Long Short-Term Memory (LSTM) networks are a suitable candidate for segmentation-free Optical Character Recognition (OCR) tasks due to their context-aware processing. In this paper, we report the results of applying LSTM networks to Devanagari script, in which consonant-consonant conjuncts and consonant-vowel combinations take different forms based on their position in the word. We also introduce a new, freely available database of Devanagari script, Deva-DB, to aid research towards a robust Devanagari OCR system. On this database, the LSTM-based OCRopus system yields error rates ranging from 1.2% to 9.0%, depending on the complexity of the training and test data. A comparison with the open-source Tesseract system is also presented for the same database.
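The positional forms mentioned above arise because Devanagari consonants combine with the virama (U+094D) and dependent vowel signs into single visual units. A rough sketch of grouping a Unicode string into such units (a simplification for illustration, not the paper's method, and far short of full Unicode grapheme-cluster segmentation):

```python
import unicodedata

VIRAMA = "\u094d"  # Devanagari sign virama, which joins consonants into conjuncts


def split_clusters(text):
    """Group a Devanagari string into approximate consonant-conjunct /
    consonant-vowel clusters: combining marks, and any character following
    a virama, attach to the current cluster."""
    clusters = []
    for ch in text:
        joins = bool(clusters) and (
            unicodedata.category(ch) in ("Mn", "Mc")  # vowel signs, virama, nasalization
            or clusters[-1][-1] == VIRAMA             # consonant joining a conjunct
        )
        if joins:
            clusters[-1] += ch
        else:
            clusters.append(ch)
    return clusters


# "kshatriya" splits into the units a reader perceives: क्ष, त्रि, य
print(split_clusters("\u0915\u094d\u0937\u0924\u094d\u0930\u093f\u092f"))
```

Each such cluster, rather than each code point, is what takes a distinct visual form, which is what makes segmentation-free sequence learning attractive for this script.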
document analysis systems | 2012
Adnan Ul-Hasan; Syed Saqib Bukhari; Faisal Shafait; Thomas M. Breuel
The Table of Contents (ToC) is an integral part of multi-page documents such as books and magazines. Most existing techniques use textual similarity to automatically detect ToC pages. However, such techniques cannot be applied in situations where OCR technology is not available, as is indeed the case for historical documents and many modern Nabataean (Arabic) and Indic scripts. It is therefore necessary to develop tools to navigate such documents without the use of OCR. This paper reports a preliminary effort to address this challenge. The proposed algorithm has been applied to find ToC pages in Urdu books, and an overall initial accuracy of 88% has been achieved.
international conference on document analysis and recognition | 2015
Adnan Ul-Hasan; Faisal Shafait; Marcus Liwicki
This paper introduces a novel curriculum learning strategy for ligature-based scripts. When trained for sequence transcription, Long Short-Term Memory networks require thousands or even millions of iterations over target symbols to converge, depending on the complexity of the target data, because they must localize the individual symbols in addition to recognizing them. Curriculum learning reduces the number of target symbols that must be visited before the network converges. In this paper, we propose a ligature-based complexity measure to define the sampling order of the training data. Experiments performed on the UPTI database show that curriculum learning with our strategy can reduce the total number of target symbols required before convergence for the printed Urdu Nastaleeq OCR task.
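As an illustration of what a complexity-ordered curriculum looks like mechanically, the toy below sorts training lines from easy to hard. The complexity function here (longest whitespace-delimited token) is a crude hypothetical stand-in for the paper's ligature-based measure, chosen only to keep the example self-contained:

```python
def line_complexity(line):
    """Crude proxy for ligature complexity: the longest connected run of
    letters, approximated by the longest whitespace-delimited token."""
    return max((len(token) for token in line.split()), default=0)


def curriculum_order(lines):
    """Present low-complexity lines first, high-complexity ones last."""
    return sorted(lines, key=line_complexity)


print(curriculum_order(["abc defgh", "ab", "abcd"]))  # ['ab', 'abcd', 'abc defgh']
```

In an actual training schedule, batches would be drawn from successively larger prefixes of this ordering rather than from the whole sorted list at once.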
document analysis systems | 2016
Fallak Asad; Adnan Ul-Hasan; Faisal Shafait; Andreas Dengel
Documents are routinely captured by digital cameras in today's age owing to the availability of high-quality cameras in smartphones. However, recognition of camera-captured documents is substantially more challenging than for traditional flatbed-scanned documents due to the distortions introduced by the cameras. One of the major performance-limiting artifacts is the motion and out-of-focus blur that is often induced in the document during the capturing process. Existing approaches try to detect the presence of blur in the document to prompt the user to re-capture the image. This paper reports, for the first time, an Optical Character Recognition (OCR) system that can directly recognize blurred documents on which state-of-the-art OCR systems are unable to provide usable results. Our system is based on Long Short-Term Memory (LSTM) networks and has shown promising character recognition results on both motion-blurred and out-of-focus-blurred images. One important feature of this work is that the LSTM networks are applied directly to the gray-scale document images, avoiding the error-prone binarization of blurred documents. Experiments are conducted on the publicly available SmartDoc-QA dataset, which contains a wide variety of image blur degradations. Our system achieves a 12.3% character error rate on the test documents, an over three-fold reduction compared with the error rate (38.9%) of the best-performing contemporary OCR system (ABBYY FineReader) on the same data.
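Feeding gray-scale line images to a 1D recurrent network without binarization amounts to treating each pixel column as one input vector per time step. A minimal sketch of that representation (an assumed illustration, not the paper's preprocessing code):

```python
def line_to_sequence(image):
    """Convert a gray-scale text-line image (list of rows, pixel values
    0-255) into a sequence of per-column feature vectors scaled to [0, 1],
    with no binarization step in between."""
    height, width = len(image), len(image[0])
    return [[image[row][col] / 255.0 for row in range(height)]
            for col in range(width)]


# A 2x2 toy image becomes a sequence of two 2-dimensional column vectors.
print(line_to_sequence([[0, 255], [255, 0]]))  # [[0.0, 1.0], [1.0, 0.0]]
```

Keeping the raw intensities preserves the gradual gray transitions that blur produces, which a thresholding step would destroy along with the information the network needs.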
international conference on document analysis and recognition | 2015
Adnan Ul-Hasan; Muhammad Zeshan Afzal; Faisal Shafait; Marcus Liwicki; Thomas M. Breuel
international conference on pattern recognition | 2012
Adnan Ul-Hasan; Syed Saqib Bukhari; Sheikh Faisal Rashid; Faisal Shafait; Thomas M. Breuel