Publication


Featured research published by Gurpreet Singh Lehal.


international conference on pattern recognition | 2000

A Gurmukhi script recognition system

Gurpreet Singh Lehal; Chandan Singh

A system for recognition of machine-printed Gurmukhi script is presented. The system operates at the sub-character level. The segmentation process breaks a word into sub-characters, and the recognition phase classifies these sub-characters and combines them to form Gurmukhi characters. A set of very simple, easy-to-compute features is used, and a hybrid classification scheme consisting of binary decision trees and nearest neighbours is employed. A recognition rate of 96.6% at a processing speed of 175 characters per second was achieved on clean images of text without employing any post-processing technique.


document recognition and retrieval | 2000

Text segmentation of machine-printed Gurmukhi script

Gurpreet Singh Lehal; Chandan Singh

This paper describes a scheme for text segmentation of machine-printed Gurmukhi script documents. There has been a tremendous amount of research on text segmentation of machine-printed Roman script documents. In contrast, there has been very little reported research on text segmentation of Indian language scripts in general and Gurmukhi script in particular. Research on text segmentation of Gurmukhi script faces major problems, mainly related to the unique characteristics of the script: connectivity of characters on the headline, two or more characters in a word having minimum bounding rectangles that intersect along the horizontal direction, multi-component characters, touching characters (present even in clean documents) and horizontally overlapping text segments. In our proposed method we use the horizontal projection profile to successively divide the text area into small sub-areas, or horizontal strips, each of which contains (1) a set of text lines, (2) a single text line or (3) sub-parts of text lines. Using the vertical projection profile, the horizontal strips are physically split into smaller units such as words, characters or sub-characters, depending on the type of the strip. Finally, each of these units is segmented into a set of connected components. The classifier is trained to recognize these connected components, which are later merged to form character(s).
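
The projection-profile splitting described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a binary image stored as a list of rows (1 = ink), and the helper names `projection_profile`, `runs_of_ink` and `split_page` are hypothetical. It also ignores the strip types (1) to (3) and the connected-component step, and only shows how blank rows and blank columns in the two profiles yield split points.

```python
def projection_profile(image, axis):
    """Count foreground (1) pixels per row (axis=0) or per column (axis=1)."""
    if axis == 0:
        return [sum(row) for row in image]
    return [sum(col) for col in zip(*image)]


def runs_of_ink(profile):
    """Return (start, end) pairs, end exclusive, of consecutive non-zero entries."""
    runs, start = [], None
    for i, value in enumerate(profile):
        if value and start is None:
            start = i                      # a run of ink begins
        elif not value and start is not None:
            runs.append((start, i))        # a blank entry closes the run
            start = None
    if start is not None:
        runs.append((start, len(profile)))
    return runs


def split_page(image):
    """Split into horizontal strips, then split each strip at blank columns."""
    result = []
    for top, bottom in runs_of_ink(projection_profile(image, axis=0)):
        strip = image[top:bottom]
        units = runs_of_ink(projection_profile(strip, axis=1))
        result.append(((top, bottom), units))
    return result


# Toy 4x4 "page" (assumed data): two strips separated by one blank row.
page = [
    [1, 1, 0, 1],
    [1, 0, 0, 1],
    [0, 0, 0, 0],
    [0, 1, 1, 0],
]
print(split_page(page))
# [((0, 2), [(0, 2), (3, 4)]), ((3, 4), [(1, 3)])]
```

In Gurmukhi the headline joins the characters of a word, so blank columns in a line strip would typically separate words rather than characters, which is one reason the abstract goes on to connected components.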


international conference on document analysis and recognition | 2001

A shape based post processor for Gurmukhi OCR

Gurpreet Singh Lehal; Chandan Singh; Ritu Lehal

A shape-based post-processing system for an OCR of Gurmukhi script has been developed. Based on the size and shape of a word, the Punjabi corpus has been split into different partitions. Statistical information on Punjabi language syllable combinations, corpus look-up and holistic recognition of the most commonly occurring words have been combined to design the post-processor. An improvement of 3% in recognition rate, from 94.35% to 97.34%, has been reported on machine-printed images using the post-processing techniques.


international conference on pattern recognition | 2006

An Iterative Algorithm for Segmentation of Isolated Handwritten Words in Gurmukhi Script

Dharam Veer Sharma; Gurpreet Singh Lehal

Segmentation of handwritten text in Gurmukhi script is an uphill task, primarily because of the structural features of the script and varied writing styles. The presence of a horizontal line connecting the characters of a word (i.e. the headline), half characters and the overlapping of some vowels between the middle and lower zones of a word make the task even more difficult. Handwritten text is also prone to the problem of overlapped, connected and merged characters within a word. Structural features are helpful in segmentation of machine-printed text, but they are of little help for segmentation of handwritten words. The proposed technique segments words in an iterative manner by focusing on the presence of the headline, the aspect ratio of characters and the vertical and horizontal projection profiles. The proposed approach can be used for handwritten text in Indian language scripts such as Devnagri and Bangla, which have structural features similar to Gurmukhi script.


Sadhana-academy Proceedings in Engineering Sciences | 2002

A post-processor for Gurmukhi OCR

Gurpreet Singh Lehal; Chandan Singh

A post-processing system for OCR of Gurmukhi script has been developed. Statistical information of Punjabi language syllable combinations, corpora look-up and certain heuristics based on Punjabi grammar rules have been combined to design the post-processor. An improvement of 3% in recognition rate, from 94.35% to 97.34%, has been reported on clean images using the post-processing techniques.


international conference on multimodal interfaces | 2000

A Recognition System for Devnagri and English Handwritten Numerals

Gurpreet Singh Lehal; Nivedan Bhatt

A system is proposed to recognize handwritten numerals in both Devnagri (Hindi) and English. It is assumed that, at any one time, the numerals belong to one of these two scripts and that an input string contains no mixed-script numerals. A set of global and local features, derived from the right and left projection profiles of the numeral image, is used. During experiments it was found that the Devnagri numeral set had a much better recognition and rejection rate than the English character set, so the input numeral is first tested by the Devnagri module. A correct recognition enables the system to be set to the appropriate context (Devnagri/English numeral set). Subsequent identification of the other numerals is carried out in that context only.
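
The abstract does not define its left and right projection profiles or the features derived from them. The sketch below assumes one common definition: for each row, the distance from the image edge to the first ink pixel on that side. The function names and the toy numeral are hypothetical, and no feature extraction or classification is attempted.

```python
def left_profile(image):
    """Per row: column index of the first ink pixel (row width if the row is blank)."""
    width = len(image[0]) if image else 0
    return [next((x for x, p in enumerate(row) if p), width) for row in image]


def right_profile(image):
    """Per row: distance from the right edge to the last ink pixel."""
    # Mirroring each row turns the right profile into a left profile.
    return left_profile([row[::-1] for row in image])


# Toy 5x4 binary numeral image (assumed data, 1 = ink).
digit = [
    [0, 1, 1, 0],
    [0, 0, 0, 1],
    [0, 0, 1, 0],
    [0, 1, 0, 0],
    [1, 1, 1, 1],
]
print(left_profile(digit))   # [1, 3, 2, 1, 0]
print(right_profile(digit))  # [1, 0, 1, 2, 0]
```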


International Journal of Image and Graphics | 2009

On Segmentation of Touching Characters and Overlapping Lines in Degraded Printed Gurmukhi Script

M. K. Jindal; Gurpreet Singh Lehal; R. K. Sharma

Character segmentation plays a very important role in a text recognition system. The simple technique of using the inter-character gap for segmentation is useful for finely printed documents, but it fails to give satisfactory results if the input text contains touching characters. In this paper, we propose two algorithms to segment touching characters and one algorithm to segment overlapping lines in degraded printed Gurmukhi documents. Various categories of touching characters in different zones, along with their solutions, are presented. The solution methodology makes extensive use of the structural properties of Gurmukhi script. The algorithm proposed for segmenting horizontally overlapping lines uses a heuristic based on the height of a character. The problem of multiple horizontally overlapping lines may occur in a number of situations, such as printed newspapers, old magazines and books. The similarity among Indian scripts allows these algorithms to be used for solving segmentation problems in other Indian languages as well.


Proceeding of the workshop on Document Analysis and Recognition | 2012

Choice of recognizable units for URDU OCR

Gurpreet Singh Lehal

There has been considerable work on Arabic OCR. However, all of that work is based on the Naskh style. Urdu script is based on the Arabic alphabet but uses the Nastalique style. The Nastalique style makes OCR in general, and character segmentation in particular, a highly challenging task, so most researchers avoid the character segmentation phase and opt for a higher unit of recognition. For Urdu, the next higher recognition unit considered by researchers is the ligature, which lies between character and word. A ligature is a connected component of one or more characters, and an Urdu word is usually composed of 1 to 8 ligatures. A related issue is the identification of all possible ligatures for recognition purposes. To this end, we have performed a statistical analysis of an Urdu corpus to collect and organise the Urdu ligatures. The number of unique ligatures comes to more than 26,000, and recognition of such a large number of classes is again a Herculean task. It becomes necessary to reduce the class count and look for an alternative recognition unit. From an OCR point of view, a ligature can be further segmented into one primary connected component and zero or more secondary connected components. The primary component represents the basic shape of the ligature, while the secondary connected components correspond to the dots, diacritic marks and special symbols associated with the ligature. To reduce the class count, ligatures with similar primary components are clubbed together. Further statistical analysis is performed to count the primary components and arrange them in descending order of frequency, and a manageable class of around 2300 recognition units has been generated, which covers 99% of the Urdu corpus.
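
The final step of this abstract, counting primary components, sorting them by frequency and keeping enough of them to reach a coverage target, can be sketched as follows. This is an illustration under assumptions, not the paper's procedure: the function name `units_for_coverage` is hypothetical, and the letter labels stand in for primary-component classes that the real analysis would extract from an Urdu corpus.

```python
from collections import Counter


def units_for_coverage(primary_components, target):
    """Pick the most frequent units until their share of the corpus reaches `target`."""
    counts = Counter(primary_components)
    total = sum(counts.values())
    selected, covered = [], 0
    for unit, count in counts.most_common():   # descending frequency
        if covered >= target * total:
            break                              # coverage target already met
        selected.append(unit)
        covered += count
    coverage = covered / total if total else 0.0
    return selected, coverage


# Toy corpus (assumed data): one label per ligature occurrence.
corpus = ["A"] * 4 + ["B"] * 2 + ["C", "D"]
units, coverage = units_for_coverage(corpus, target=0.75)
print(units, coverage)  # ['A', 'B'] 0.75
```

With a long-tailed frequency distribution, a small fraction of the classes accounts for most occurrences, which is the effect the abstract reports when around 2300 units cover 99% of the corpus.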


international conference on computational linguistics | 2008

A Punjabi To Hindi Machine Translation System

Gurpreet Singh Josan; Gurpreet Singh Lehal

Punjabi and Hindi are two closely related languages: both share a common origin and have many syntactic and semantic similarities. These similarities make the direct translation methodology an obvious choice for the Punjabi-Hindi language pair. The proposed system for Punjabi to Hindi translation has been implemented with various research techniques based on the Direct MT architecture and a language corpus. The output is evaluated by previously prescribed methods in order to assess the suitability of the system for the Punjabi-Hindi language pair.


bangalore annual compute conference | 2009

Segmentation of touching characters in upper zone in printed Gurmukhi script

M. K. Jindal; R. K. Sharma; Gurpreet Singh Lehal

A new technique for segmenting touching characters in the upper zone of printed Gurmukhi script is presented in this paper. The technique is based on the structural properties of Gurmukhi script characters. The concavity and convexity of the characters have been studied and, using top profile projections, the touching characters in the upper zone have been segmented. A recognition rate of 91% has been achieved for segmenting the touching characters in the upper zone.

Collaboration


Top co-authors of Gurpreet Singh Lehal.

Renu Dhir

Dr. B. R. Ambedkar National Institute of Technology Jalandhar

Parminder Singh

Guru Nanak Dev Engineering College

Rajneesh Rani

Dr. B. R. Ambedkar National Institute of Technology Jalandhar
