
Publication


Featured research published by Chandranath Adak.


International Conference on Document Analysis and Recognition | 2015

Writer Identification from offline isolated Bangla characters and numerals

Chandranath Adak; B. B. Chaudhuri

Writer identification is an essential component of computational forensics. In this paper, we attempt this task based only on isolated characters and numerals. First, some points of interest (keypoints) on the image are detected by structural analysis and a SIFT-based detector. Then we compute a set of features within a certain neighborhood of each keypoint and employ a fusion rule on the outputs of multiple probabilistic SVM classifiers for writer identification. For experimental analysis, a database containing 212,300 isolated Bangla orthosyllabic characters and numerals is generated with the help of 100 writers. We obtain fairly good results in identifying a writer. We also try to find a small set of highly discriminative characters that store extra information about the writing style of an individual.
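
The pipeline can be illustrated with a minimal sketch. This is not the paper's implementation: the per-keypoint features are assumed here to be raw SIFT descriptors, and the fusion rule is assumed to be a simple sum rule over classifier posteriors.

```python
import cv2
import numpy as np
from sklearn.svm import SVC

def keypoint_features(gray_img):
    """Detect SIFT keypoints and return their 128-D descriptors."""
    sift = cv2.SIFT_create()
    _, descriptors = sift.detectAndCompute(gray_img, None)
    return descriptors  # shape: (num_keypoints, 128), or None if no keypoints

def fuse_writer_scores(classifiers, X):
    """Sum-rule fusion over the posteriors of several probabilistic SVMs."""
    probs = [clf.predict_proba(X) for clf in classifiers]  # each (n, n_writers)
    fused = np.mean(probs, axis=0)      # average posteriors across classifiers
    return fused.mean(axis=0).argmax()  # aggregate over keypoints -> writer id

# Usage sketch (hypothetical training data):
# clfs = [SVC(probability=True).fit(X_train, writer_ids) for _ in range(3)]
# writer = fuse_writer_scores(clfs, keypoint_features(char_image))
```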


International Conference on Human-Computer Interaction | 2013

Gabor filter and rough clustering based edge detection

Chandranath Adak

This paper introduces an efficient edge detection method based on Gabor filtering and rough clustering. The input image is smoothed by a Gabor function, and the concept of rough clustering is used to focus on edge detection with a soft computing approach. Hysteresis thresholding is used to obtain the final output, i.e., the edges of the input image. To show its effectiveness, the proposed technique is compared with several other edge detection methods.
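
A minimal sketch of the Gabor-smoothing and hysteresis stages follows. The rough-clustering step is the paper's own contribution and is not reproduced here; a plain gradient magnitude stands in for it, and the kernel parameters are assumptions.

```python
import cv2
import numpy as np
from skimage.filters import apply_hysteresis_threshold

def gabor_edges(gray, low=0.1, high=0.3):
    # Smooth with a small bank of Gabor kernels at four orientations.
    smoothed = np.zeros(gray.shape, dtype=np.float64)
    for theta in np.arange(0, np.pi, np.pi / 4):
        # Args: ksize, sigma, theta, lambda (wavelength), gamma (aspect ratio).
        kernel = cv2.getGaborKernel((21, 21), 4.0, theta, 10.0, 0.5)
        smoothed = np.maximum(smoothed, cv2.filter2D(gray, cv2.CV_64F, kernel))
    # Gradient magnitude of the smoothed image (stand-in for rough clustering).
    gx = cv2.Sobel(smoothed, cv2.CV_64F, 1, 0)
    gy = cv2.Sobel(smoothed, cv2.CV_64F, 0, 1)
    mag = np.hypot(gx, gy)
    mag /= mag.max() + 1e-9
    # Hysteresis keeps weak edges only where they connect to strong ones.
    return apply_hysteresis_threshold(mag, low, high)
```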


Pattern Recognition and Machine Intelligence | 2013

Extraction of Doodles and Drawings from Manuscripts

Chandranath Adak; B. B. Chaudhuri

In this paper we propose an approach to separate non-text from text in a manuscript. The non-text is mainly in the form of doodles and drawings by some exceptional thinkers and writers. These have enormous historical value for the study of such writers' subconscious as well as productive minds. We also propose a computational approach to recover struck-out text, reducing human effort. The proposed technique has a preprocessing stage, which removes noise using a median filter and segments the object region using fuzzy c-means clustering. Connected component analysis then finds the major portions of non-text, and window examination eliminates the partially attached text. The struck-out text is extracted by eliminating straight lines, measuring the degree of continuity, and applying some morphological operations.
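
The preprocessing and connected-component stages can be sketched as below. For brevity, Otsu thresholding stands in for the paper's fuzzy c-means segmentation, and the size rule separating doodles/drawings from text strokes is an assumed heuristic.

```python
import cv2

def candidate_nontext_components(gray, min_area=5000):
    denoised = cv2.medianBlur(gray, 5)  # suppress salt-and-pepper noise
    # Otsu stands in here for the paper's fuzzy c-means segmentation.
    _, binary = cv2.threshold(denoised, 0, 255,
                              cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)
    n, labels, stats, _ = cv2.connectedComponentsWithStats(binary)
    # Large components are more likely doodles/drawings than text strokes.
    return [i for i in range(1, n) if stats[i, cv2.CC_STAT_AREA] >= min_area]
```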


International Conference on Frontiers in Handwriting Recognition | 2014

An Approach of Strike-Through Text Identification from Handwritten Documents

Chandranath Adak; B. B. Chaudhuri

A handwritten document may contain strike-through text. If such text is fed into an OCR system, the output will be garbage. In this paper, we propose a scheme to detect such strike-through texts/words. Using a graph-based model, we represent a textual connected component as a graph. The start/end and intersection points of the ink-strokes of a component are marked as graph nodes. An edge exists between two nodes if they are connected by object (ink) pixels. By eliminating parallel edges and self-loops we obtain a simple, undirected, edge-weighted graph of the text component. The edge weight is found by adding horizontal/vertical moves weighted by 1 and diagonal moves weighted by √2. In this graph, we find the shortest path that is nearly as long as the width of the text component and maintains a reasonable degree of straightness. This path, if it exists, is identified as the strike-through line. Here we deal with handwritten documents in the English, Bengali and Devanagari scripts. Our approach delivers fairly good results.
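
The graph construction lends itself to a short sketch: skeleton pixels become nodes, horizontal/vertical moves carry weight 1, diagonal moves carry weight √2, and a weighted shortest path across the component is the candidate strike-through line. The straightness and length checks from the paper are omitted, and `networkx` is used here purely for illustration.

```python
import math
import networkx as nx

def skeleton_graph(skeleton):
    """skeleton: 2-D boolean array of ink-skeleton pixels."""
    g = nx.Graph()
    h, w = skeleton.shape
    for y in range(h):
        for x in range(w):
            if not skeleton[y, x]:
                continue
            # Forward neighbours only, so each edge is added once.
            for dy, dx in ((0, 1), (1, 0), (1, 1), (1, -1)):
                ny, mx = y + dy, x + dx
                if 0 <= ny < h and 0 <= mx < w and skeleton[ny, mx]:
                    wgt = math.sqrt(2) if dy != 0 and dx != 0 else 1.0
                    g.add_edge((y, x), (ny, mx), weight=wgt)
    return g

def candidate_strike_path(g, left_node, right_node):
    # Weighted shortest path spanning the component's width.
    return nx.shortest_path(g, left_node, right_node, weight="weight")
```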


Document Analysis Systems | 2016

Named Entity Recognition from Unstructured Handwritten Document Images

Chandranath Adak; B. B. Chaudhuri; Michael Myer Blumenstein

Named entity recognition is an important topic in the field of natural language processing, whereas in document image processing, such recognition is quite challenging without employing any linguistic knowledge. In this paper we propose an approach to detect named entities (NEs) directly from offline handwritten unstructured document images without explicit character/word recognition, and with very little aid from natural language and script rules. At the preprocessing stage, the document image is binarized, and then the text is segmented into words. Slant/skew/baseline corrections of the words are also performed. After preprocessing, the words are sent for NE recognition. We analyze the structural and positional characteristics of NEs and extract some relevant features from the word image. Then a BLSTM (bidirectional long short-term memory) neural network is used for NE recognition. Our system also contains a post-processing stage to reduce the true-NE rejection rate. The proposed approach produces encouraging results on both historical and modern document images, including those from an Australian archive, which are reported here for the very first time.
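
A minimal BLSTM tagger of the kind described could look like the sketch below; the feature dimensionality, hidden size and two-class output (NE vs. non-NE) are assumptions, not the paper's configuration.

```python
import torch
import torch.nn as nn

class BLSTMTagger(nn.Module):
    def __init__(self, feat_dim=64, hidden=128, n_classes=2):
        super().__init__()
        self.lstm = nn.LSTM(feat_dim, hidden, batch_first=True,
                            bidirectional=True)
        self.fc = nn.Linear(2 * hidden, n_classes)  # NE vs. non-NE

    def forward(self, x):      # x: (batch, words_in_sequence, feat_dim)
        out, _ = self.lstm(x)  # (batch, words_in_sequence, 2 * hidden)
        return self.fc(out)    # per-word NE scores

# Usage sketch: one feature vector per segmented word, in reading order.
logits = BLSTMTagger()(torch.randn(1, 30, 64))
```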


International Symposium on Neural Networks | 2017

Impact of struck-out text on writer identification

Chandranath Adak; B. B. Chaudhuri; Michael Myer Blumenstein

The presence of struck-out text in handwritten manuscripts may affect the accuracy of automated writer identification. This paper presents a study of such effects. Here we consider offline English and Bengali handwritten document images. First, the struck-out text is detected using a hybrid classifier combining a CNN (Convolutional Neural Network) and an SVM (Support Vector Machine). Then the writer identification process is run on normal and struck-out text separately, to ascertain the impact of struck-out text. For writer identification, we use two methods: (a) a hand-crafted feature-based SVM classifier, and (b) CNN-extracted auto-derived features with a recurrent neural model. For the experimental analysis, we have generated a database from 100 English and 100 Bengali writers. The performance of our system is very encouraging.
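
The hybrid detector can be sketched as a small CNN feature extractor feeding a two-class SVM; the architecture and feature size below are assumptions.

```python
import torch.nn as nn

class StrokeFeatureCNN(nn.Module):
    """Small CNN whose pooled activations serve as features for an SVM."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten())  # -> 32-D feature vector

    def forward(self, x):  # x: (batch, 1, H, W) word images
        return self.net(x)

# Usage sketch: extract features, then fit a two-class SVM
# (normal vs. struck-out), e.g. sklearn.svm.SVC, on them:
# feats = StrokeFeatureCNN()(images).detach().numpy()
# clf = SVC().fit(feats, labels)
```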


Pattern Recognition | 2017

An approach for detecting and cleaning of struck-out handwritten text

B. B. Chaudhuri; Chandranath Adak

This paper deals with the identification and processing of struck-out text in unconstrained offline handwritten document images. If run through an OCR engine, such text will produce nonsense character-string output. Here we present a combined (a) pattern classification and (b) graph-based method for identifying such text. In case (a), a feature-based two-class (normal vs. struck-out text) SVM classifier is used to detect moderate-sized struck-out components. In case (b), the skeleton of the text component is treated as a graph and the strike-out stroke is identified using a constrained shortest-path algorithm. To identify zigzag or wavy struck-outs, all paths are found and some properties of zigzag and wavy lines are utilized. Some other types of strike-out stroke are also detected by modifying the above method. Large multi-word and multi-line struck-outs are segmented into smaller components and treated as above. The detected struck-out text can then be blocked from entering the OCR engine. In another kind of application involving historical documents, page images along with their annotated ground-truth are to be generated. In this case the strike-out strokes can be deleted from the words before they are fed to the OCR engine. For this purpose an inpainting-based cleaning approach is employed. We worked on 500 pages of documents and obtained an overall F-measure of 91.56% (91.06%) in English (Bengali) script for struck-out text detection. For strike-out stroke identification and deletion, the F-measures obtained were 89.65% (89.31%) and 91.16% (89.29%), respectively.
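
The final cleaning step maps naturally onto standard image inpainting. A minimal sketch, assuming the stroke mask comes from the shortest-path detector described above:

```python
import cv2

def clean_struck_out_word(word_img, stroke_mask):
    """word_img: 8-bit grayscale word image; stroke_mask: 8-bit mask with
    255 on the detected strike-out stroke pixels."""
    # Telea inpainting fills the stroke region from its surroundings.
    return cv2.inpaint(word_img, stroke_mask, 3, cv2.INPAINT_TELEA)
```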


International Conference on Frontiers in Handwriting Recognition | 2016

Offline Cursive Bengali Word Recognition Using CNNs with a Recurrent Model

Chandranath Adak; B. B. Chaudhuri; Michael Myer Blumenstein

This paper deals with offline handwritten word recognition for a major Indic script: Bengali. Due to the structure of this script, the characters (mostly ortho-syllables) frequently overlap and are hard to segment, especially when the writing is cursive. Individual character recognition and the combination of outputs can increase the likelihood of errors. Instead, a better approach is to send the whole word to a suitable recognizer. Here we use a Convolutional Neural Network (CNN) integrated with a recurrent model for this purpose. Long short-term memory blocks are used as hidden units, and the CNN-derived features are employed in a recurrent model with a CTC (Connectionist Temporal Classification) layer to obtain the output. We have tested our method on three datasets: (a) a publicly available dataset, (b) a new dataset generated by our research group and (c) an unconstrained dataset. Dataset (a) contains 17,091 words, our dataset (b) contains 107,550 words in total, and dataset (c) comprises 5,223 words. We have compared our results with those of some earlier work in the area and have found improved performance, which is due to the novel integration of CNNs with the recurrent model.
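
A minimal sketch of the CNN-plus-recurrent-CTC arrangement is given below; the layer sizes and alphabet size are assumptions, not the paper's configuration.

```python
import torch
import torch.nn as nn

class CNNRecurrentCTC(nn.Module):
    def __init__(self, n_classes=100):  # n_classes includes the CTC blank
        super().__init__()
        self.cnn = nn.Sequential(
            nn.Conv2d(1, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2))
        self.lstm = nn.LSTM(64, 128, batch_first=True, bidirectional=True)
        self.fc = nn.Linear(256, n_classes)

    def forward(self, x):                    # x: (batch, 1, H, W) word image
        f = self.cnn(x)                      # (batch, 64, H/4, W/4)
        f = f.mean(dim=2).permute(0, 2, 1)   # column-wise sequence: (batch, W/4, 64)
        out, _ = self.lstm(f)
        return self.fc(out).log_softmax(-1)  # per-timestep log-probabilities

# Training would apply nn.CTCLoss to these log-probabilities
# (transposed to (T, batch, classes)) against the target label sequence.
```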


IEEE Region 10 Conference | 2015

Binarization of old halftone text documents

Chandranath Adak; Prantik Maitra; B. B. Chaudhuri; Michael Myer Blumenstein

A degraded document image should be cleaned before being subjected to Optical Character Recognition (OCR); otherwise the result may be erroneous. Though major studies have been conducted on degraded document image cleaning, halftone documents have received less attention. Since halftone documents contain halftone dot patterns, classical binarization techniques do not produce proper output for feeding into an OCR engine. In this paper, old halftone documents are considered for text-area cleaning and binarization. First, the zone of interest (the text area) is found using local binary patterns and contour analysis. Unreasonably small zones are filtered out as noise. Then the foreground pixels are separated using background estimation. After this, an automated spatial smoothing technique is applied to the foreground. Finally, a local binarization technique is used to produce the binary image. The proposed method is tested on various old and degraded halftone documents and produces fairly good results.
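
The zone analysis and local binarization stages can be sketched with standard tools; Sauvola thresholding stands in for the paper's (unnamed) local binarization technique, and the LBP/contour zone filtering is reduced to a bare LBP texture map.

```python
from skimage.feature import local_binary_pattern
from skimage.filters import threshold_sauvola

def binarize_halftone(gray):
    # LBP texture map; the paper pairs LBP with contour analysis to locate
    # the text zone, which is omitted in this sketch.
    lbp = local_binary_pattern(gray, P=8, R=1, method="uniform")
    # Local (Sauvola) thresholding as the final binarization step.
    thresh = threshold_sauvola(gray, window_size=25)
    return gray < thresh, lbp  # boolean text mask, texture map
```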


2014 First International Conference on Automation, Control, Energy and Systems (ACES) | 2014

A bilingual machine translation system: English & Bengali

Chandranath Adak

Natural language is fundamental to human society for communicating and interacting with one another. In this era of globalization, we interact with people from different regions according to our interests in the social, cultural, economic, educational and professional domains. Thousands of natural languages exist in the world, and it is quite tough, indeed practically impossible, to know them all. So we need a computerized approach to convert one natural language to another as needed. This computerized conversion among multiple languages is known as multilingual machine translation. In this paper, however, we work with a bilingual model concerned with two languages: English and Bengali. We use a soft computing approach in which fuzzy if-then rules are applied to choose a lemma from prior knowledge; Penn TreeBank PoS tags and an HMM tagger are used as lexical class markers for each word in the corpora.
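
The HMM-tagging stage can be sketched with NLTK's HMM trainer on the Penn TreeBank sample corpus; the fuzzy if-then lemma selection is the paper's own component and is not reproduced here.

```python
import nltk
from nltk.tag import hmm

nltk.download("treebank", quiet=True)
train_sents = nltk.corpus.treebank.tagged_sents()  # Penn TreeBank tags

# Supervised HMM training: states are PoS tags, emissions are words.
tagger = hmm.HiddenMarkovModelTrainer().train_supervised(train_sents)

print(tagger.tag("the cat sat on the mat".split()))
# Each word receives a Penn TreeBank tag, e.g. ('cat', 'NN').
```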

Collaboration


Dive into Chandranath Adak's collaborations.

Top Co-Authors

B. B. Chaudhuri
Indian Statistical Institute

Prantik Maitra
Indian Statistical Institute