
Publication


Featured research published by Sophea Prum.


international conference on document analysis and recognition | 2015

ICDAR2015 competition on smartphone document capture and OCR (SmartDoc)

Jean-Christophe Burie; Joseph Chazalon; Mickaël Coustaty; Sébastien Eskenazi; Muhammad Muzzamil Luqman; Maroua Mehri; Nibal Nayef; Jean-Marc Ogier; Sophea Prum; Marçal Rusiñol

Smartphones are enabling new ways of capturing documents, which creates a need for seamless and reliable acquisition and digitization in order to convert documents to an editable, searchable, and more human-readable format. Current state-of-the-art works lack databases and baseline benchmarks for digitizing mobile-captured documents. We have organized a competition for mobile document capture and OCR in order to address this issue. The competition is structured into two independent challenges: smartphone document capture and smartphone OCR. This report describes the datasets for both challenges along with their ground truth, details the performance evaluation protocols that we used, and presents the final results of the participating methods. In total, we received 13 submissions: 8 for challenge 1 and 5 for challenge 2.


international conference on document analysis and recognition | 2011

Writer Identification Using TF-IDF for Cursive Handwritten Word Recognition

Quang Anh Bui; Muriel Visani; Sophea Prum; Jean-Marc Ogier

In this paper, we present two text-independent writer identification methods in a closed-world context. Both methods use on-line and off-line features jointly with a classifier inspired by information retrieval methods. Both methods are local: one operates at the character level and the other at the grapheme level. This writer identification engine may be used to personalize our cursive word recognition engine (ICFHR 2010) to the handwriting style of the writer, resulting in an adaptive cursive word recognizer. Experiments assess the effectiveness of the proposed approaches both for writer identification alone and when integrated into our cursive word recognizer to make it adaptive.
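To illustrate the information-retrieval idea behind this abstract, here is a minimal sketch of TF-IDF matching applied to writer identification. It is not the authors' method: the abstract does not specify the features or the classifier, so the sketch assumes each writer's handwriting has already been quantized into discrete grapheme labels (the labels `g1`, `g2`, ... and the function names are hypothetical) and uses plain cosine similarity between TF-IDF vectors.

```python
import math
from collections import Counter
from typing import Dict, List


def idf_table(writers: Dict[str, List[str]]) -> Dict[str, float]:
    """Smoothed inverse document frequency; each writer is one 'document'."""
    n = len(writers)
    df = Counter()
    for tokens in writers.values():
        df.update(set(tokens))  # count each label once per writer
    return {t: math.log((1 + n) / (1 + d)) + 1.0 for t, d in df.items()}


def tfidf(tokens: List[str], idf: Dict[str, float]) -> Dict[str, float]:
    """Relative term frequency weighted by IDF; unseen labels get weight 0."""
    tf = Counter(tokens)
    total = sum(tf.values()) or 1
    return {t: (c / total) * idf.get(t, 0.0) for t, c in tf.items()}


def cosine(a: Dict[str, float], b: Dict[str, float]) -> float:
    dot = sum(v * b.get(t, 0.0) for t, v in a.items())
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def identify(sample: List[str], writers: Dict[str, List[str]]) -> str:
    """Closed-world identification: return the enrolled writer closest to the sample."""
    idf = idf_table(writers)
    query = tfidf(sample, idf)
    return max(sorted(writers), key=lambda w: cosine(query, tfidf(writers[w], idf)))


# Hypothetical enrolled writers, described by quantized grapheme labels.
writers = {
    "alice": ["g1", "g1", "g2", "g5"],
    "bob": ["g3", "g3", "g4", "g5"],
}
print(identify(["g1", "g2", "g2"], writers))
```

Labels shared by every writer (here `g5`) receive a lower IDF weight, so the decision is driven by the shapes that are distinctive for a given writer.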


international conference on document analysis and recognition | 2015

SmartDoc-QA: A dataset for quality assessment of smartphone captured document images - single and multiple distortions

Nibal Nayef; Muhammad Muzzamil Luqman; Sophea Prum; Sébastien Eskenazi; Joseph Chazalon; Jean-Marc Ogier

Smartphones are enabling new ways of capturing documents, which creates a need for seamless and reliable acquisition and digitization of documents. The quality assessment step is an important part of both the acquisition and the digitization processes. Assessing document quality could aid users during the capture process or help improve image enhancement methods after a document has been captured. Current state-of-the-art works lack databases in the field of document image quality assessment. In order to provide a baseline benchmark for quality assessment methods for mobile-captured documents, we present in this paper a dataset for quality assessment that contains both singly- and multiply-distorted document images. The proposed dataset could be used for benchmarking quality assessment methods by the objective measure of OCR accuracy, and could also be used to benchmark quality enhancement methods. There are three types of documents in the dataset: modern documents, old administrative letters and receipts. The document images of the dataset are captured under varying capture conditions (light, different types of blur and perspective angles). This causes geometric and photometric distortions that hinder the OCR process. The ground truth of the dataset images consists of the text transcriptions of the documents, the OCR results of the captured documents and the values of the different capture parameters used for each image. We also present how the dataset could be used for evaluation in the field of no-reference quality assessment. The dataset is freely and publicly available for use by the research community at http://navidomass.univ-lr.fr/SmartDoc-QA.


international conference on pattern recognition | 2010

On-Line Handwriting Word Recognition Using a Bi-character Model

Sophea Prum; Muriel Visani; Jean-Marc Ogier

This paper deals with on-line handwriting recognition. Analytic approaches have attracted increasing interest during the last ten years. These approaches rely on a preliminary segmentation stage, which remains one of the most difficult problems and may strongly affect the quality of the global recognition process. In order to circumvent this problem, this paper introduces a bi-character model, where each character is recognized jointly with its neighboring characters. This model yields two main advantages. First, it reduces the number of confusions due to connections between characters during the character recognition step. Second, it avoids some possible confusions at the character recognition level during the word recognition stage. Our experiments on significant databases show that the bi-character strategy increases the recognition rate from 65% to 83%.


international conference on image processing | 2015

An analysis of ground truth binarized image variability of palm leaf manuscripts

Made Windu Antara Kesiman; Sophea Prum; I Made Gede Sunarya; Jean-Christophe Burie; Jean-Marc Ogier

As a very valuable cultural heritage, palm leaf manuscripts pose a new challenge for document analysis systems because of the specific characteristics of their physical support. With the aim of finding an optimal binarization method for palm leaf manuscript images, creating new ground truth binarized images is a necessary step in the document analysis of palm leaf manuscripts. However, because the ground truthing process involves human intervention, the effect of subjectivity on the construction of ground truth binarized images has been analysed and reported. In this paper, we present an experiment under real conditions to analyse the existence of human subjectivity in the construction of ground truth binarized images of palm leaf manuscripts and to measure the ground truth variability quantitatively with several binarization evaluation metrics.


international conference on document analysis and recognition | 2015

An initial study on the construction of ground truth binarized images of ancient palm leaf manuscripts

Made Windu Antara Kesiman; Sophea Prum; Jean-Christophe Burie; Jean-Marc Ogier

Ancient palm leaf manuscripts are a very valuable cultural heritage that stores various forms of knowledge and historical records of social life in Southeast Asia. The automatic analysis of these documents, in order to extract relevant information, is a real challenge. However, to evaluate the developed extraction algorithms, a ground truth is absolutely necessary. In this paper, we present some of the challenges faced by state-of-the-art binarization methods as an initial study for the construction of ground truth binarized images of palm leaf manuscripts. We propose and analyze the need for a specific scheme for the construction of the ground truth of binarized images. The aim of this scheme is to achieve a better ground truth for low-quality palm leaf manuscripts. We experimentally tested and evaluated our proposed scheme and obtained promising results. This scheme adapts and performs better in constructing the ground truth of binarized images for palm leaf manuscripts.


international conference on document analysis and recognition | 2013

A Discriminative Approach to On-Line Handwriting Recognition Using Bi-character Models

Sophea Prum; Muriel Visani; Andreas Fischer; Jean-Marc Ogier

Unconstrained on-line handwriting recognition is typically approached within the framework of generative HMM-based classifiers. In this paper, we introduce a novel discriminative method that relies, in contrast, on explicit grapheme segmentation and SVM-based character recognition. In addition to single character recognition with rejection, bi-characters are recognized in order to refine the recognition hypotheses. In particular, bi-character recognition is able to cope with the problem of shared character parts. Whole word recognition is achieved with an efficient dynamic programming method similar to the Viterbi algorithm. In an experimental evaluation on the Unipen-ICROW-03 database, we demonstrate improvements in recognition accuracy of up to 8% for a lexicon of 20,000 words with the proposed method when compared with an HMM-based baseline system. The computational speed is on par with the baseline system.
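The abstract mentions whole-word recognition by "an efficient dynamic programming method similar to the Viterbi algorithm" over an explicit grapheme segmentation. The following is a simplified sketch of that general idea, not the paper's algorithm: it aligns a lexicon word to a sequence of graphemes, letting each character cover one or more consecutive graphemes. The scoring function, the `max_span` limit, and the toy scores are assumptions, and the bi-character refinement and rejection steps are omitted.

```python
from typing import Callable, Iterable

NEG_INF = float("-inf")

# Hypothetical signature: log-score of character `ch` covering graphemes [i, j).
ScoreFn = Callable[[str, int, int], float]


def align_word(word: str, n_graphemes: int, score: ScoreFn, max_span: int = 3) -> float:
    """Best total log-score of segmenting n_graphemes into len(word) characters."""
    # best[c][j]: best score of the first c characters covering the first j graphemes
    best = [[NEG_INF] * (n_graphemes + 1) for _ in range(len(word) + 1)]
    best[0][0] = 0.0
    for c, ch in enumerate(word, start=1):
        for j in range(1, n_graphemes + 1):
            for span in range(1, min(max_span, j) + 1):
                prev = best[c - 1][j - span]
                if prev > NEG_INF:
                    best[c][j] = max(best[c][j], prev + score(ch, j - span, j))
    return best[len(word)][n_graphemes]


def recognize(lexicon: Iterable[str], n_graphemes: int, score: ScoreFn) -> str:
    """Return the lexicon word with the best alignment score."""
    return max(sorted(lexicon), key=lambda w: align_word(w, n_graphemes, score))


# Toy character scores standing in for a real classifier (an SVM in the paper).
table = {("a", 0, 1): -0.1, ("n", 1, 3): -0.2, ("m", 1, 3): -1.5}


def toy_score(ch: str, i: int, j: int) -> float:
    return table.get((ch, i, j), -10.0)  # unlikely by default


print(recognize(["an", "am", "a"], 3, toy_score))
```

Scoring every lexicon word independently is the simplest formulation; with a 20,000-word lexicon, sharing computation across words with common prefixes would be a natural optimization.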


international conference on frontiers in handwriting recognition | 2010

Cursive On-line Handwriting Word Recognition Using a Bi-character Model for Large Lexicon Applications

Sophea Prum; Muriel Visani; Jean-Marc Ogier

This paper deals with on-line handwriting recognition in a closed-world environment with a large lexicon. Several applications using handwriting recognition have been developed, but most of them consider a lexicon of limited size. Many difficulties, in particular confusions during the segmentation stage, are linked to the use of a large lexicon, with large writing variations and an increased complexity of the connections between characters. In order to circumvent these problems, we introduce in this paper an original method based on a new analytical approach using two levels of recognition models: an isolated character recognizer and an original bi-character recognition model. The idea behind the bi-character model is to recognize two neighboring characters jointly. The objective is to reduce the confusions between characters occurring during the segmentation step. Experiments show that introducing the bi-character model improves the recognition rate by 7.2% for a 1,000-word lexicon, by 9.1% for a 2,000-word lexicon, and by up to 15% for a 10,000-word lexicon.


international conference on frontiers in handwriting recognition | 2016

Fusion of Explicit Segmentation Based System and Segmentation-Free Based System for On-Line Arabic Handwritten Word Recognition

Hanen Khlif; Sophea Prum; Yousri Kessentini; Slim Kanoun; Jean-Marc Ogier

The complexity and variability of Arabic handwriting make it difficult to implement an efficient recognition system with a single recognition engine. In this paper, two handwritten word recognition systems are combined in order to take advantage of their complementarity. The first is a segmentation-free system that uses a generative HMM classifier. The second is discriminative: relying on an analytical approach, it explicitly segments words into graphemes. Different combination strategies are compared, including the sum, product, Borda count and Dempster-Shafer rules. Experiments on the ADAB database demonstrate a significant improvement in recognition accuracy: 5% compared to the segmentation-free system and 9% compared to the analytical system.
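Three of the combination rules named in this abstract (sum, product and Borda count) can be sketched generically as follows. This is an illustration under assumptions, not the paper's implementation: each recognizer is assumed to output a normalized score per candidate word, the candidate words and scores are made up, and the Dempster-Shafer rule is left out.

```python
from typing import Dict, List

Scores = Dict[str, float]  # hypothetical format: candidate word -> normalized score


def sum_rule(systems: List[Scores]) -> Scores:
    """Add the scores each system assigns to a word (missing words count as 0)."""
    words = set().union(*systems)
    return {w: sum(s.get(w, 0.0) for s in systems) for w in words}


def product_rule(systems: List[Scores], floor: float = 1e-6) -> Scores:
    """Multiply the scores; a small floor keeps one missing word from zeroing the product."""
    words = set().union(*systems)
    fused = {}
    for w in words:
        p = 1.0
        for s in systems:
            p *= s.get(w, floor)
        fused[w] = p
    return fused


def borda_count(systems: List[Scores]) -> Scores:
    """Each system gives n-1 points to its top word, n-2 to the next, and so on."""
    words = set().union(*systems)
    n = len(words)
    fused = {w: 0.0 for w in words}
    for s in systems:
        ranked = sorted(words, key=lambda w: (-s.get(w, 0.0), w))
        for rank, w in enumerate(ranked):
            fused[w] += n - 1 - rank
    return fused


def best(fused: Scores) -> str:
    """Highest fused score; ties are broken alphabetically."""
    return max(sorted(fused), key=fused.get)


# Made-up outputs of a segmentation-free system and an explicit-segmentation system.
hmm_scores = {"kitab": 0.5, "katib": 0.4, "kutub": 0.1}
seg_scores = {"kitab": 0.2, "katib": 0.7, "kutub": 0.1}

for rule in (sum_rule, product_rule, borda_count):
    print(rule.__name__, best(rule([hmm_scores, seg_scores])))
```

The sum and product rules use the score magnitudes, whereas the Borda count uses only each system's ranking, which makes it less sensitive to score scales that are not comparable across recognizers.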


international conference on document analysis and recognition | 2015

Competition on Smartphone Document Capture and OCR (SmartDoc)

Jean-Christophe Burie; Joseph Chazalon; Mickaël Coustaty; Sébastien Eskenazi; Muhammad Muzzamil Luqman; Maroua Mehri; Nibal Nayef; Jean-Marc Ogier; Sophea Prum; Marçal Rusiñol

Collaboration


Dive into Sophea Prum's collaborations.

Top Co-Authors

Jean-Marc Ogier
University of La Rochelle

Joseph Chazalon
University of La Rochelle

Muriel Visani
University of La Rochelle

Nibal Nayef
University of La Rochelle

Marçal Rusiñol
Autonomous University of Barcelona