Publication


Featured research published by Praveen Krishnan.


document analysis systems | 2014

Towards a Robust OCR System for Indic Scripts

Praveen Krishnan; Naveen Sankaran; Ajeet Kumar Singh; C. V. Jawahar

Current Optical Character Recognition (OCR) systems for Indic scripts are not robust enough to recognize arbitrary collections of printed documents. Reasons for this limitation include the lack of resources (e.g. too few examples with natural variations, and little documentation of the possible font/style variations) and an architecture that requires hard segmentation of word images followed by isolated symbol recognition. Variations among scripts, latent symbol to UNICODE conversion rules, non-standard fonts/styles and large degradations are some of the major reasons for the unavailability of robust solutions. In this paper, we propose a web-based OCR system which (i) follows a unified architecture for seven Indian languages, (ii) is robust against popular degradations, (iii) follows a segmentation-free approach, (iv) addresses the UNICODE re-ordering issues, and (v) can enable continuous learning with user inputs and feedback. Our system is designed to aid continuous learning while remaining usable, i.e., we capture the user inputs (say, example images) to further improve the OCRs. We use the popular BLSTM-based transcription scheme to achieve our target. This also enables incremental training and refinement in a seamless manner. We report superior accuracy rates in comparison with the available OCRs for the seven Indian languages.
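The UNICODE re-ordering issue mentioned in the abstract can be illustrated with Devanagari: the vowel sign I (U+093F) is drawn to the left of its consonant but stored after it in Unicode, so symbols read left to right from an image need re-ordering. The sketch below is a simplified illustration, not the paper's conversion rules; the function name `visual_to_logical` and the restriction to a single pre-base vowel sign are assumptions made here.

```python
# Simplified sketch: move a pre-base vowel sign after the consonant (cluster)
# it visually precedes. Real Indic re-ordering rules are considerably richer.
PRE_BASE_MATRAS = {"\u093f"}  # DEVANAGARI VOWEL SIGN I (assumed only case handled)
VIRAMA = "\u094d"             # DEVANAGARI SIGN VIRAMA


def is_consonant(ch):
    """True for Devanagari consonants KA (U+0915) through HA (U+0939)."""
    return "\u0915" <= ch <= "\u0939"


def visual_to_logical(symbols):
    """Convert symbols in visual (left-to-right) order to Unicode logical order."""
    out = []
    pending = None  # pre-base vowel sign waiting for its consonant
    i, n = 0, len(symbols)
    while i < n:
        s = symbols[i]
        if s in PRE_BASE_MATRAS and pending is None:
            pending = s
            i += 1
            continue
        out.append(s)
        i += 1
        if pending is not None and is_consonant(s):
            # Extend over a conjunct: consonant (+ virama + consonant)*
            while i + 1 < n and symbols[i] == VIRAMA and is_consonant(symbols[i + 1]):
                out.extend(symbols[i:i + 2])
                i += 2
            out.append(pending)
            pending = None
    if pending is not None:  # no consonant followed; keep the sign rather than drop it
        out.append(pending)
    return "".join(out)
```

For instance, the visual sequence vowel sign I followed by KA becomes KA followed by vowel sign I, which is how the syllable is stored in Unicode text.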


international conference on frontiers in handwriting recognition | 2016

Deep Feature Embedding for Accurate Recognition and Retrieval of Handwritten Text

Praveen Krishnan; Kartik Dutta; C. V. Jawahar

We propose a deep convolutional feature representation that achieves superior performance for word spotting and recognition for handwritten images. We focus on (i) enhancing the discriminative ability of the convolutional features using a reduced feature representation that can scale to large datasets, and (ii) enabling query-by-string by learning a common subspace for image and text using the embedded attribute framework. We present our results on popular datasets such as the IAM corpus and historical document collections from the Bentham and George Washington pages. On the challenging IAM dataset, we achieve a state-of-the-art mAP of 91.58% on word spotting using textual queries and a mean word error rate of 6.69% for the word recognition task.
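The two metrics quoted above can be sketched as follows. This is a generic illustration, not the paper's evaluation code: it assumes each query's ranked list contains every relevant item, and it assumes word error rate means the fraction of word images whose predicted transcription differs from the ground truth. All function names are chosen here for illustration.

```python
def average_precision(relevance):
    """AP for one query; `relevance` is a ranked list of 0/1 relevance flags."""
    hits = 0
    precisions = []
    for rank, rel in enumerate(relevance, start=1):
        if rel:
            hits += 1
            precisions.append(hits / rank)  # precision at each relevant rank
    return sum(precisions) / len(precisions) if precisions else 0.0


def mean_average_precision(ranked_lists):
    """mAP: mean of the per-query average precision values."""
    return sum(average_precision(r) for r in ranked_lists) / len(ranked_lists)


def word_error_rate(predictions, targets):
    """Fraction of words whose predicted transcription is not an exact match."""
    errors = sum(p != t for p, t in zip(predictions, targets))
    return errors / len(targets)
```

A ranked list `[1, 0, 1]` has precision 1/1 at the first relevant item and 2/3 at the second, so its average precision is 5/6.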


european conference on computer vision | 2016

Matching Handwritten Document Images

Praveen Krishnan; C. V. Jawahar

We address the problem of predicting similarity between a pair of handwritten document images written by potentially different individuals. This has applications related to matching and mining in image collections containing handwritten content. A similarity score is computed by detecting patterns of text reuse between document images irrespective of minor variations in word morphology, word ordering, layout and paraphrasing of the content. Our method does not depend on an accurate segmentation of words and lines. We formulate the document matching problem as a structured comparison of the word distributions across two document images. To match two word images, we propose a convolutional neural network (CNN) based feature descriptor. Performance of this representation surpasses the state-of-the-art on handwritten word spotting. Finally, we demonstrate the applicability of our method on a practical problem of matching handwritten assignments.
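One simple way to picture comparing two documents through their word images is sketched below. It is not the structured comparison proposed in the paper: it assumes each document is already a list of word descriptor vectors, and the symmetric matched-fraction score, the cosine measure, and the `threshold` value are all assumptions introduced here.

```python
import math


def cosine(u, v):
    """Cosine similarity between two descriptor vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0


def document_similarity(doc_a, doc_b, threshold=0.8):
    """Average, over both directions, of the fraction of words with a close match."""
    if not doc_a or not doc_b:
        return 0.0

    def matched_fraction(src, dst):
        matched = sum(
            1 for u in src if max(cosine(u, v) for v in dst) >= threshold
        )
        return matched / len(src)

    return 0.5 * (matched_fraction(doc_a, doc_b) + matched_fraction(doc_b, doc_a))
```

Because each word only needs some close counterpart anywhere in the other document, this toy score ignores word order and layout, in the spirit of the abstract's description.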


international conference on document analysis and recognition | 2013

Bringing Semantics in Word Image Retrieval

Praveen Krishnan; C. V. Jawahar

Performance of recognition-free approaches for document retrieval depends heavily on the exact or approximate matching of images (in some feature space) to retrieve documents containing the same word. However, the harder problem in information retrieval is to effectively bring semantics into the retrieval pipeline. This is even more challenging when the matching is based on visual features. In this work, we investigate this problem, and suggest a solution by directly transferring the semantics from the textual domain. Our retrieval framework uses (i) language resources like WordNet and (ii) an annotated corpus of document images, to retrieve semantically relevant words from a large word image database. We demonstrate the method on two languages, English and Hindi, and quantitatively evaluate the performance on annotated word image databases of more than a million images.


indian conference on computer vision, graphics and image processing | 2012

Content level access to digital library of India pages

Praveen Krishnan; Ravi Shekhar; C. V. Jawahar

In this paper, we propose a framework for content level access to the scanned pages of the Digital Library of India (DLI). The current Optical Character Recognition (OCR) systems are not robust and reliable enough for generating accurate text from DLI pages. We propose a search scheme which fuses noisy OCR output and holistic visual features for content level access to the DLI pages. Visual content is captured using a Bag of Visual Words (BoVW) approach. We show that our fusion scheme improves over the individual methods in terms of mean Average Precision (mAP) and mean precision at 10 (mPrec@10). We exploit the fact that OCR has a high precision while BoVW has a high recall. We use a modified edit distance to improve the order of results ranked by BoVW. Experiments are carried out on large datasets of DLI pages in Hindi and Telugu. We validate our method on more than 10,000 pages and 4 million words, and report a mAP of around 0.8 and mPrec@10 of more than 0.9. We show improvements over BoVW by introducing query expansion. We also demonstrate a textual query interface for the search system.
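The idea of re-ordering BoVW results with an edit distance over noisy OCR text can be sketched as follows. The abstract does not specify the modified edit distance, so the standard Levenshtein distance is used here instead; the candidate tuple layout and the function name `rerank` are hypothetical.

```python
def levenshtein(a, b):
    """Standard edit distance (insert, delete, substitute), one row at a time."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def rerank(query, candidates):
    """Re-order candidates retrieved by visual similarity.

    `candidates` is a list of (image_id, visual_score, ocr_text) tuples.
    Results whose OCR text is closest to the query come first; ties are
    broken by the higher visual score.
    """
    return sorted(candidates, key=lambda c: (levenshtein(query, c[2]), -c[1]))
```

In this arrangement the high-recall visual search supplies the candidate set, and the higher-precision OCR text decides the final order.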


computer vision and pattern recognition | 2017

Towards Accurate Handwritten Word Recognition for Hindi and Bangla

Kartik Dutta; Praveen Krishnan; Minesh Mathew; C. V. Jawahar

Building accurate lexicon-free handwritten text recognizers for Indic languages is a challenging task, mostly due to the inherent complexities of Indic scripts in addition to the cursive nature of handwriting. In this work, we demonstrate an end-to-end trainable CNN-RNN hybrid architecture which takes inspiration from recent advances in using residual blocks for training convolutional layers, along with the inclusion of a spatial transformer layer to learn a model invariant to geometric distortions present in handwriting. We focus on building state-of-the-art handwritten word recognizers for two popular Indic scripts: Devanagari and Bangla. To address the need for large-scale training data for such low-resource languages, we utilize synthetically rendered data for pre-training the network and later fine-tune it on the real data. We outperform the previous lexicon-based, state-of-the-art methods on the test set of the Devanagari and Bangla tracks of RoyDB by a significant margin.


international conference on frontiers in handwriting recognition | 2016

Visual Aesthetic Analysis for Handwritten Document Images

Anshuman Majumdar; Praveen Krishnan; C. V. Jawahar

We present an approach for analyzing the visual aesthetic property of a handwritten document page in a way that matches human perception. We formulate the problem at two independent levels: (i) a coarse level, which deals with the overall layout and the use of space between lines, words and margins, and (ii) a fine level, which analyses the construction of each word and deals with the aesthetic properties of writing styles. We present our observations on multiple local and global features which can extract the aesthetic cues present in handwritten documents.
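As a rough illustration of what coarse-level layout cues might look like, the sketch below derives a few statistics from text-line bounding boxes. The specific features (left-margin spread, line-gap mean and spread) are assumptions made here and are not taken from the paper; the input format and the name `layout_features` are likewise hypothetical.

```python
from statistics import mean, pstdev


def layout_features(lines):
    """Coarse layout statistics from text lines listed top to bottom.

    Each line is a (left_x, top_y, bottom_y) tuple in pixels.
    """
    if not lines:
        raise ValueError("at least one text line is required")
    lefts = [left for left, _, _ in lines]
    # Vertical gap between the bottom of one line and the top of the next
    gaps = [lines[i + 1][1] - lines[i][2] for i in range(len(lines) - 1)]
    return {
        "left_margin_std": pstdev(lefts),             # ragged vs. aligned margin
        "mean_line_gap": mean(gaps) if gaps else 0.0,
        "line_gap_std": pstdev(gaps) if gaps else 0.0,  # evenness of line spacing
    }
```

A page with a perfectly aligned left margin and evenly spaced lines gives zero for both spread values; larger values would indicate a less regular layout.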


arXiv: Computer Vision and Pattern Recognition | 2016

Generating Synthetic Data for Text Recognition

Praveen Krishnan; C. V. Jawahar


document analysis systems | 2018

Offline Handwriting Recognition on Devanagari Using a New Benchmark Dataset

Kartik Dutta; Praveen Krishnan; Minesh Mathew; C. V. Jawahar


document analysis systems | 2018

Word Spotting and Recognition Using Deep Embedding

Praveen Krishnan; Kartik Dutta; C. V. Jawahar

Collaboration


Top Co-Authors

C. V. Jawahar, International Institute of Information Technology

Kartik Dutta, International Institute of Information Technology

Minesh Mathew, International Institute of Information Technology

Ajeet Kumar Singh, International Institute of Information Technology

Naveen Sankaran, International Institute of Information Technology

Ravi Shekhar, International Institute of Information Technology