Network


Latest external collaborations at the country level. Click on the dots to dive into the details.

Hotspot


Dive into the research topics where Nibal Nayef is active.

Publication


Featured research published by Nibal Nayef.


international conference on document analysis and recognition | 2015

ICDAR2015 competition on smartphone document capture and OCR (SmartDoc)

Jean-Christophe Burie; Joseph Chazalon; Mickaël Coustaty; Sébastien Eskenazi; Muhammad Muzzamil Luqman; Maroua Mehri; Nibal Nayef; Jean-Marc Ogier; Sophea Prum; Marçal Rusiñol

Smartphones are enabling new ways of document capture, and hence the need arises for seamless and reliable acquisition and digitization of documents, in order to convert them to an editable, searchable and more human-readable format. Current state-of-the-art work lacks databases and baseline benchmarks for digitizing mobile-captured documents. We have organized a competition for mobile document capture and OCR in order to address this issue. The competition is structured into two independent challenges: smartphone document capture, and smartphone OCR. This report describes the datasets for both challenges along with their ground truth, details the performance evaluation protocols we used, and presents the final results of the participating methods. In total, we received 13 submissions: 8 for challenge 1, and 5 for challenge 2.


international conference on pattern recognition | 2014

Document Retrieval Based on Logo Spotting Using Key-Point Matching

Viet Phuong Le; Nibal Nayef; Muriel Visani; Jean-Marc Ogier; Cao De Tran

In this paper, we present an approach to retrieving documents based on logo spotting and recognition. The proposed document retrieval system builds on our previous method for logo spotting and recognition. First, key-points from both the query logo images and a given set of document images are extracted, described with the SIFT descriptor and matched in the SIFT feature space. The matches are filtered by the nearest-neighbor matching rule based on the two nearest neighbors, and then post-filtered with the BRIEF descriptor. Secondly, logo segmentation is performed using spatial density-based clustering, and a homography is used to filter the matched key-points as a post-processing step. Finally, for ranking, we use two measures that are calculated from the number of matched key-points. Tested on Tobacco-800, a well-known benchmark database of real-world documents containing logos, our approach achieves better performance than state-of-the-art methods.
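The filtering step described above, keeping a match only when the nearest neighbor is clearly closer than the second-nearest, is commonly known as Lowe's ratio test. The following is a rough pure-Python illustration of that idea (not the authors' implementation), using toy 2-D points in place of 128-D SIFT descriptors:

```python
import math

def ratio_test_matches(query_descs, doc_descs, ratio=0.75):
    """Keep a query descriptor only if its nearest neighbour in the
    document is clearly closer than the second-nearest (Lowe's ratio test)."""
    matches = []
    for qi, q in enumerate(query_descs):
        dists = sorted((math.dist(q, d), di) for di, d in enumerate(doc_descs))
        if len(dists) >= 2 and dists[0][0] < ratio * dists[1][0]:
            matches.append((qi, dists[0][1]))  # (query index, doc index)
    return matches

# Toy 2-D "descriptors": the first query point has one clear best match,
# the second is ambiguous (two near-identical neighbours) and is filtered out.
query = [(0.0, 0.0), (5.0, 5.0)]
doc = [(0.1, 0.0), (3.0, 4.0), (5.1, 5.0), (5.0, 5.1)]
print(ratio_test_matches(query, doc))  # → [(0, 0)]
```

In the paper's pipeline, surviving matches are further post-filtered with BRIEF descriptors and a homography check, which this sketch omits.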


international conference on document analysis and recognition | 2011

Statistical Grouping for Segmenting Symbols Parts from Line Drawings, with Application to Symbol Spotting

Nibal Nayef; Thomas M. Breuel

In this work, we describe the use of statistical grouping for partitioning line drawings into shapes, those shapes represent meaningful parts of the symbols that constitute the line drawings. This grouping method converts a complete line drawing into a set of isolated shapes. This conversion has two effects: (1) making isolated recognition methods applicable for spotting symbols in context, (2) identifying potential regions of interest for symbol spotting methods, hence making them perform faster and more accurately. Our grouping is based on finding salient convex groups of geometric primitives, followed by combining certain found convex groups together. Additionally, we show how such grouping can be used for symbol spotting. When applied on a dataset of architectural line drawings the grouping method achieved above 98.8% recall and 97.3% precision for finding symbols parts. Using the grouping information, the spotting method achieved 99.3% recall and 99.9% precision. Compared to the performance of the same method without grouping information, an overall speed-up factor of 3.2 is achieved with the same -- or better -- recall and precision values.
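The notion of a convex group of primitives can be illustrated with a toy orientation check: a chain of points is convex when every consecutive turn bends the same way. This is only a hypothetical sketch of the convexity criterion, not the paper's statistical grouping algorithm (which also scores saliency and merges groups):

```python
def is_convex_chain(points):
    """Toy convexity check: a chain of 2-D points forms a convex group if
    every consecutive turn has the same orientation (cross-product sign)."""
    def cross(o, a, b):
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])
    signs = [cross(points[i], points[i + 1], points[i + 2])
             for i in range(len(points) - 2)]
    signs = [s for s in signs if s != 0]  # ignore collinear triples
    return all(s > 0 for s in signs) or all(s < 0 for s in signs)

convex = [(0, 0), (2, 0), (3, 1), (3, 3)]   # turns consistently one way
zigzag = [(0, 0), (2, 0), (2, 2), (4, 0)]   # changes turn direction
print(is_convex_chain(convex), is_convex_chain(zigzag))  # → True False
```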


international conference on document analysis and recognition | 2015

SmartDoc-QA: A dataset for quality assessment of smartphone captured document images - single and multiple distortions

Nibal Nayef; Muhammad Muzzamil Luqman; Sophea Prum; Sébastien Eskenazi; Joseph Chazalon; Jean-Marc Ogier

Smartphones are enabling new ways of document capture, and hence the need arises for seamless and reliable acquisition and digitization of documents. The quality assessment step is an important part of both the acquisition and digitization processes. Assessing document quality can aid users during the capture process or help improve image enhancement methods after a document has been captured. Current state-of-the-art work lacks databases in the field of document image quality assessment. In order to provide a baseline benchmark for quality assessment methods for mobile-captured documents, we present in this paper a dataset for quality assessment that contains both singly- and multiply-distorted document images. The proposed dataset can be used for benchmarking quality assessment methods against the objective measure of OCR accuracy, and can also be used to benchmark quality enhancement methods. There are three types of documents in the dataset: modern documents, old administrative letters and receipts. The document images of the dataset are captured under varying capture conditions (light, different types of blur and perspective angles). This causes geometric and photometric distortions that hinder the OCR process. The ground truth of the dataset images consists of the text transcriptions of the documents, the OCR results of the captured documents and the values of the different capture parameters used for each image. We also present how the dataset can be used for evaluation in the field of no-reference quality assessment. The dataset is freely and publicly available for use by the research community at http://navidomass.univ-lr.fr/SmartDoc-QA.


international conference on document analysis and recognition | 2015

Text and non-text segmentation based on connected component features

Viet Phuong Le; Nibal Nayef; Muriel Visani; Jean-Marc Ogier; Cao De Tran

Document image segmentation is crucial to OCR and other digitization processes. In this paper, we present a learning-based approach for text and non-text separation in document images. The training features are extracted at the level of connected components, a mid-level between the slow, noise-sensitive pixel level and the segmentation-dependent zone level. Given all types, shapes and sizes of connected components, we extract a powerful set of features based on the size, shape, stroke width and position of each connected component. AdaBoost with decision trees is used for labeling connected components. Finally, the classification of connected components into text and non-text is corrected based on classification probabilities, as well as size and stroke-width analysis of the nearest neighbors of each connected component. The performance of our approach has been evaluated on two standard datasets: UW-III and the ICDAR-2009 document layout analysis competition. Our results demonstrate that the proposed approach achieves competitive performance for segmenting text and non-text in document images of variable content and degradation.
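To make the "connected-component level" concrete, here is a minimal stdlib-only sketch of extracting 4-connected components from a toy binary grid and computing the kind of size/shape features mentioned above. The feature set is illustrative; the paper's actual features (stroke width, position, etc.) and classifier are richer:

```python
from collections import deque

def connected_components(grid):
    """4-connected components of a binary grid; returns lists of (row, col)."""
    h, w = len(grid), len(grid[0])
    seen, comps = set(), []
    for r in range(h):
        for c in range(w):
            if grid[r][c] and (r, c) not in seen:
                comp, queue = [], deque([(r, c)])
                seen.add((r, c))
                while queue:
                    cr, cc = queue.popleft()
                    comp.append((cr, cc))
                    for nr, nc in ((cr+1, cc), (cr-1, cc), (cr, cc+1), (cr, cc-1)):
                        if 0 <= nr < h and 0 <= nc < w and grid[nr][nc] \
                                and (nr, nc) not in seen:
                            seen.add((nr, nc))
                            queue.append((nr, nc))
                comps.append(comp)
    return comps

def cc_features(comp):
    """Size, aspect ratio and fill density of a component's bounding box --
    the sort of mid-level cues a text/non-text classifier can be trained on."""
    rows = [r for r, _ in comp]
    cols = [c for _, c in comp]
    hgt = max(rows) - min(rows) + 1
    wid = max(cols) - min(cols) + 1
    return {"size": len(comp), "aspect": wid / hgt,
            "density": len(comp) / (hgt * wid)}

grid = [
    [1, 1, 0, 0, 1],
    [1, 1, 0, 0, 1],
    [0, 0, 0, 0, 1],
]
comps = connected_components(grid)
print(len(comps), [cc_features(c)["size"] for c in comps])  # → 2 [4, 3]
```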


document recognition and retrieval | 2015

Metric-based no-reference quality assessment of heterogeneous document images

Nibal Nayef; Jean-Marc Ogier

No-reference image quality assessment (NR-IQA) aims at computing an image quality score that best correlates with either human-perceived image quality or an objective quality measure, without any prior knowledge of reference images. Although learning-based NR-IQA methods have achieved the best state-of-the-art results so far, those methods perform well only on the datasets on which they were trained. Such datasets usually contain homogeneous documents, whereas in reality, document images come from different sources. It is unrealistic to collect training samples of images from every possible capturing device and every document type. Hence, we argue that a metric-based IQA method is more suitable for heterogeneous documents. We propose an NR-IQA method whose objective quality measure is OCR accuracy. The method combines distortion-specific quality metrics; the final quality score is calculated taking into account the proportions of, and the dependency among, the different distortions. Experimental results show that the method achieves results competitive with learning-based NR-IQA methods on standard datasets, and performs better on heterogeneous documents.
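The combination of distortion-specific metrics can be pictured as a proportion-weighted average. This sketch is an assumption-laden simplification: it only shows the proportion weighting, not the paper's treatment of dependency among distortions, and the metric names are hypothetical:

```python
def combine_quality(metric_scores, proportions):
    """Combine distortion-specific quality scores (each in [0, 1]) into a
    single score, weighting each metric by the estimated proportion of the
    image affected by that distortion. Unaffected area counts as perfect."""
    assert sum(proportions.values()) <= 1.0 + 1e-9
    clean = 1.0 - sum(proportions.values())
    score = clean * 1.0  # distortion-free regions score 1.0
    for name, p in proportions.items():
        score += p * metric_scores[name]
    return score

# Hypothetical per-distortion metric outputs and estimated proportions.
scores = {"blur": 0.4, "low_contrast": 0.8}
props = {"blur": 0.5, "low_contrast": 0.2}
print(round(combine_quality(scores, props), 3))  # → 0.66
```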


document analysis systems | 2014

Efficient Example-Based Super-Resolution of Single Text Images Based on Selective Patch Processing

Nibal Nayef; Joseph Chazalon; Petra Gomez-Krämer; Jean-Marc Ogier

Example-based super-resolution (SR) methods learn the correspondences between low-resolution (LR) and high-resolution (HR) image patches, where the patches are extracted from a training database. To reconstruct a single LR image into an HR one, each LR image patch is processed by the previously trained model to recover its corresponding HR patch. Because every patch must pass through this model, such methods are computationally inefficient. We propose the use of a selective patch processing technique to carry out the super-resolution step more efficiently while maintaining the output quality. In this technique, only patches of high variance are processed by the costly reconstruction steps, while the remaining patches are processed by fast bicubic interpolation. We have applied the proposed improvement to representative example-based SR methods to super-resolve text images. The results show a significant speed-up for text SR without a drop in OCR accuracy. In order to carry out an extensive and solid performance evaluation, we also present a public database of text images for training and testing example-based SR methods.
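The routing idea, costly reconstruction only for detail-rich patches, can be sketched in a few lines. The threshold value and the stand-in SR/interpolation callables below are illustrative placeholders, not the paper's models:

```python
def variance(patch):
    """Intensity variance of a flat list of pixel values."""
    mean = sum(patch) / len(patch)
    return sum((v - mean) ** 2 for v in patch) / len(patch)

def selective_sr(patches, threshold, costly_sr, cheap_interp):
    """Route only high-variance (detail-rich) patches through the costly
    SR model; smooth patches get fast interpolation instead."""
    return [costly_sr(p) if variance(p) > threshold else cheap_interp(p)
            for p in patches]

flat = [10, 10, 10, 10]      # smooth background patch
edge = [0, 0, 255, 255]      # high-contrast text-edge patch
result = selective_sr([flat, edge], threshold=100.0,
                      costly_sr=lambda p: ("SR", p),
                      cheap_interp=lambda p: ("bicubic", p))
print([tag for tag, _ in result])  # → ['bicubic', 'SR']
```

Since text pixels are a minority of most document images, skipping the learned model on flat patches is where the reported speed-up comes from.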


international conference on image processing | 2014

A multi-layer approach for camera-based complex map image retrieval and spotting system

Quoc Bao Dang; Muhammad Muzzamil Luqman; Mickaël Coustaty; Nibal Nayef; Cao De Tran; Jean-Marc Ogier

In this paper, we present a method of camera-based document image retrieval for heterogeneous-content documents using different types of features from different layers of information. We use two kinds of features: Locally Likely Arrangement Hashing (LLAH), and SIFT descriptors whose dimensionality is reduced using PCA. A single hash table is then used for indexing these multiple kinds of feature vectors. In addition, we employ a technique for reducing the memory required for indexing the key-points in the hash table. Experimental results show that the multi-layer hashing gives high accuracy and outperforms classical methods based on a single layer.
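One simple way to index several feature layers in a single hash table is to prefix each quantized feature key with a layer identifier and let the layers vote jointly at query time. This is a hypothetical sketch of that indexing pattern, not the paper's LLAH/PCA pipeline; the layer names and quantized feature IDs are made up:

```python
from collections import defaultdict

def build_index(documents):
    """Index several feature layers in ONE hash table by keying each
    quantized feature on (layer_id, feature)."""
    index = defaultdict(set)
    for doc_id, layers in documents.items():
        for layer_id, features in layers.items():
            for f in features:
                index[(layer_id, f)].add(doc_id)
    return index

def retrieve(index, layers):
    """Accumulate votes for documents across all layers; return the best."""
    votes = defaultdict(int)
    for layer_id, features in layers.items():
        for f in features:
            for doc_id in index.get((layer_id, f), ()):
                votes[doc_id] += 1
    return max(votes, key=votes.get) if votes else None

docs = {
    "map_A": {"llah": {101, 102}, "sift_pca": {7, 8}},
    "map_B": {"llah": {201}, "sift_pca": {9}},
}
idx = build_index(docs)
print(retrieve(idx, {"llah": {101}, "sift_pca": {8, 9}}))  # → map_A
```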


graphics recognition | 2011

Building a symbol library from technical drawings by identifying repeating patterns

Nibal Nayef; Thomas M. Breuel

This paper describes a novel approach for extracting a library of symbols from a large collection of line drawings. This symbol library is a compact and indexable representation of the line drawings. Such a representation is important for efficient symbol retrieval. The proposed approach first identifies the candidate patterns in all images, and then it clusters the similar ones together to create a set of clusters. A representative pattern is chosen from each cluster, and these representative patterns form a library of symbols. We have tested our approach on a database of line drawings, and it achieved high accuracy in capturing and representing the contents of the line drawings.
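The cluster-then-pick-a-representative step can be sketched with a greedy threshold clustering followed by medoid selection. This is a toy stand-in (1-D "signatures" and a made-up distance threshold) for illustration only; the paper clusters actual repeating graphical patterns:

```python
def cluster_patterns(patterns, distance, threshold):
    """Greedy clustering: each pattern joins the first cluster whose seed
    is within `threshold`, otherwise it starts a new cluster."""
    clusters = []  # each cluster is a list of patterns; [0] is its seed
    for p in patterns:
        for cluster in clusters:
            if distance(p, cluster[0]) <= threshold:
                cluster.append(p)
                break
        else:
            clusters.append([p])
    return clusters

def symbol_library(clusters, distance):
    """Pick each cluster's medoid (smallest total distance to the rest)
    as the representative symbol."""
    return [min(c, key=lambda p: sum(distance(p, q) for q in c))
            for c in clusters]

dist = lambda a, b: abs(a - b)       # toy distance on 1-D "signatures"
pats = [1.0, 1.1, 0.9, 5.0, 5.2]
clusters = cluster_patterns(pats, dist, threshold=0.5)
print(symbol_library(clusters, dist))  # → [1.0, 5.0]
```

The resulting representatives form a compact, indexable library, which is what makes later symbol retrieval efficient.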


international conference on pattern recognition | 2014

Deblurring of Document Images Based on Sparse Representations Enhanced by Non-local Means

Nibal Nayef; Petra Gomez-Krämer; Jean-Marc Ogier

Blur is one of the most difficult distortions in camera-captured documents. It degrades the visual quality of an image and makes the image difficult to read, whether by humans or by OCR systems. This paper presents a novel non-blind deblurring method that combines the well-known and effective techniques of sparse representations and non-local image similarity. The presented problem formulation enables the use of standard sparse coding methods for solving sparse coding-based deblurring enhanced by a non-local means prior. The method has been tested on both synthetic and real document images degraded with a variety of blur kernels. The resulting deblurred images have high quality in terms of both signal-to-noise ratio and OCR accuracy.
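A standard sparse coding solver of the kind the formulation above can reuse is iterative soft-thresholding (ISTA) for min ||y - Da||^2 + lam*||a||_1. The stripped-down sketch below shows only that core solver on a toy orthonormal dictionary; the paper's method additionally involves the blur operator and the non-local means prior, which are omitted here:

```python
def soft_threshold(x, t):
    """Proximal operator of the L1 norm: shrink x toward zero by t."""
    if x > t:
        return x - t
    if x < -t:
        return x + t
    return 0.0

def ista(y, D, lam, step, iters=200):
    """Iterative soft-thresholding for a sparse code a of signal y over
    dictionary D (a list of atom vectors)."""
    n_atoms = len(D)
    a = [0.0] * n_atoms
    for _ in range(iters):
        # residual r = y - D a
        r = [yi - sum(D[j][i] * a[j] for j in range(n_atoms))
             for i, yi in enumerate(y)]
        # gradient step then shrinkage, atom by atom
        a = [soft_threshold(a[j] + step * sum(D[j][i] * r[i]
                                              for i in range(len(y))),
                            step * lam)
             for j in range(n_atoms)]
    return a

# Two orthonormal atoms; the signal is 2 * atom0, so the recovered code
# should be sparse: large on atom0, exactly zero on atom1.
D = [[1.0, 0.0], [0.0, 1.0]]
y = [2.0, 0.0]
print([round(c, 2) for c in ista(y, D, lam=0.1, step=0.5)])  # → [1.9, 0.0]
```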

Collaboration


Dive into Nibal Nayef's collaborations.

Top Co-Authors

Jean-Marc Ogier, University of La Rochelle
Joseph Chazalon, University of La Rochelle
Imen Bizid, University of La Rochelle
Muriel Visani, University of La Rochelle
Viet Phuong Le, University of La Rochelle
Thomas M. Breuel, Kaiserslautern University of Technology