Prateek Sarkar | Researchain

Publication

Featured research published by Prateek Sarkar.

IEEE Transactions on Pattern Analysis and Machine Intelligence | 2005

Style consistent classification of isogenous patterns

Prateek Sarkar; George Nagy

In many applications of pattern recognition, patterns appear together in groups (fields) that have a common origin. For example, a printed word is usually a field of character patterns printed in the same font. A common origin induces consistency of style in features measured on patterns. The features of patterns co-occurring in a field are statistically dependent because they share the same, albeit unknown, style. Style constrained classifiers achieve higher classification accuracy by modeling such dependence among patterns in a field. Effects of style consistency on the distributions of field-features (concatenation of pattern features) can be modeled by hierarchical mixtures. Each field derives from a mixture of styles, while, within a field, a pattern derives from a class-style conditional mixture of Gaussians. Based on this model, an optimal style constrained classifier processes entire fields of patterns rendered in a consistent but unknown style. In a laboratory experiment, style constrained classification reduced errors on fields of printed digits by nearly 25 percent over singlet classifiers. Longer fields favor our classification method because they furnish more information about the underlying style.

International Conference on Document Analysis and Recognition | 2009

PixLabeler: User Interface for Pixel-Level Labeling of Elements in Document Images

Eric Saund; Jing Lin; Prateek Sarkar

We present a user interface design for labeling elements in document images at a pixel level. Labels are represented by overlay color, which might map to such terms as “handwriting”, “machine print”, “graphics”, etc. The primary purpose is to streamline processes for manual production of ground truth data, which is necessary for training algorithms and evaluating performance. Unlike general paint-type programs, the UI design is targeted specifically toward selection of collections of foreground pixels that are likely to be meaningful elements in a document image analysis context. Our implementation, called PixLabeler, is available for download and allows customized plug-ins for bootstrapping according to the labeling task.

IEEE Transactions on Pattern Analysis and Machine Intelligence | 1998

Spatial sampling of printed patterns

Prateek Sarkar; George Nagy; Jiangying Zhou; Daniel P. Lopresti

The bitmap obtained by scanning a printed pattern depends on the exact location of the scanning grid relative to the pattern. We consider ideal sampling with a regular lattice of delta functions. The displacement of the lattice relative to the pattern is random and obeys a uniform probability density function defined over a unit cell of the lattice. Random-phase sampling affects the edge-pixels of sampled patterns. The resulting number of distinct bitmaps and their relative frequencies can be predicted from a mapping of the original pattern boundary to the unit cell (called a module-grid diagram). The theory is supported by both simulated and experimental results. The module-grid diagram may be useful in helping to understand the effects of edge-pixel variation on optical character recognition.
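
As a rough illustration of random-phase sampling, the sketch below point-samples a disk on a unit lattice whose offset is drawn uniformly from the unit cell, then tallies how often each distinct bitmap occurs. The disk, the grid size, and the function names are assumptions made for this example; it estimates frequencies by simulation and does not construct the module-grid diagram described in the abstract.

```python
import random
from collections import Counter


def sample_disk(radius, dx, dy, size=6):
    """Point-sample a disk centred at (size/2, size/2).

    Lattice points sit at (i + dx, j + dy), so (dx, dy) is the displacement
    of the sampling grid relative to the pattern.
    """
    cx = cy = size / 2
    rows = []
    for j in range(size):
        row = "".join(
            "#" if (i + dx - cx) ** 2 + (j + dy - cy) ** 2 <= radius ** 2 else "."
            for i in range(size)
        )
        rows.append(row)
    return "\n".join(rows)


def bitmap_frequencies(radius, trials=20000, seed=0):
    """Estimate relative frequencies of distinct bitmaps under uniform random offsets."""
    rng = random.Random(seed)
    counts = Counter(
        sample_disk(radius, rng.random(), rng.random()) for _ in range(trials)
    )
    return {bitmap: n / trials for bitmap, n in counts.items()}
```

Calling bitmap_frequencies(1.2) returns a dictionary that maps each observed bitmap to its estimated relative frequency. Bitmaps that differ only by a one-pixel translation are counted separately in this sketch.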

International Conference on Document Analysis and Recognition | 2003

Training on severely degraded text-line images

Prateek Sarkar; Henry S. Baird; Xiaohu Zhang

We show that document image decoding (DID) supervised training algorithms, as a result of recent refinements, achieve high accuracy with low manual effort even under conditions of severe image degradation in both training and test data. We describe improvements in DID training of character template, set-width, and channel (noise) models. Large-scale experimental trials, using synthetically degraded images of text, have established two new and practically important advantages of DID algorithms: 1) high accuracy (> 99% characters correct) in decoding using models trained on even severely degraded images from the same distribution; and 2) greatly improved accuracy (< 1/10 the error rate) across a wide range of image degradations compared to untrained (idealized) models. This ability to train reliably on low-quality images that suffer from massive fragmentation and merging of characters, without the need for manual segmentation and labeling of character images, significantly reduces the manual effort of DID training.

International Conference on Pattern Recognition | 2006

Image classification: Classifying distributions of visual features

Prateek Sarkar

We classify an image by generating a list of salient visual features present in the luminance channel, and matching the resulting variable-length feature list to category-specific generative models for such features. To facilitate quick computation, we use thresholded Viola-Jones rectangular features, each represented by a five-dimensional descriptor. For each image category, a probability distribution for feature-lists is given by a latent conditional independence (LCI) model and classification is by maximum likelihood. On the NIST tax forms database (Dimmick et al., 1991), where intra-category variations include variable scan-lightness, skew, noise, and machine-printed form-filling, our method improves performance over published results, while requiring very little training data, and without relying on an extensive set of handcrafted features.

International Conference on Pattern Recognition | 2000

Classification of style-constrained pattern-fields

Prateek Sarkar; George Nagy

In some classification tasks, all patterns in a field, such as digits in a ZIP-code image, originate from the same, but unknown, source (writer/print style). The class-conditional feature distributions depend on the source of the patterns. Several sources may share the same distribution, or style. The style-conditional distributions are estimated from the training set. The optimal field-classifier computes the class-conditional field-feature-probabilities as the sum of class-and-style-conditional field-feature-probabilities, weighted by the prior probabilities of the styles. We compare the decision regions and error rates of style-weighted classification with both conventional singlet and top-style classification in a minimal family of examples, and discuss some related practical considerations.
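
The style-weighted rule can be sketched on a toy problem. The example below assumes one-dimensional Gaussian features, two classes, two equally likely styles, uniform class priors, and features that are independent given the class and style. All names and numbers are hypothetical and chosen only for illustration.

```python
import itertools
import math

# Hypothetical toy model: MEANS[style][class] is the mean of a 1-D feature.
MEANS = {"s1": {"A": 0.0, "B": 2.0}, "s2": {"A": 2.0, "B": 4.0}}
STYLE_PRIOR = {"s1": 0.5, "s2": 0.5}
SIGMA = 0.5
CLASSES = ("A", "B")


def gauss(x, mu, sigma=SIGMA):
    """Density of a 1-D Gaussian with mean mu and standard deviation sigma."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2 * math.pi))


def field_likelihood(xs, labels):
    """Sum over styles of P(style) * prod_i p(x_i | label_i, style)."""
    total = 0.0
    for style, prior in STYLE_PRIOR.items():
        term = prior
        for x, c in zip(xs, labels):
            term *= gauss(x, MEANS[style][c])
        total += term
    return total


def classify_field(xs):
    """Style-weighted rule: pick the field-label with the largest likelihood."""
    candidates = itertools.product(CLASSES, repeat=len(xs))
    return max(candidates, key=lambda labels: field_likelihood(xs, labels))


def classify_singlets(xs):
    """Singlet rule: classify each pattern on its own, ignoring the shared style."""
    return tuple(classify_field([x])[0] for x in xs)
```

For the field (2.1, 3.9), classify_singlets labels both patterns B, while classify_field returns (A, B), because only the second style explains both features well at once.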

Document Analysis Systems | 2010

Information extraction by finding repeated structure

Evgeniy Bart; Prateek Sarkar

Repetition of layout structure is prevalent in document images. In document design, such repetition conveys the underlying logical and functional structure of the data. For example, in invoices, the names, unit prices, quantities and other descriptors of every line item are laid out in a consistent spatial structure. We propose a general method for extracting such repeated structure from documents. After receiving a single example of the structure to be found, the proposed method localizes additional instances of this structure in the same document and in additional documents. A wide variety of perceptually motivated cues (such as alignment and saliency) is used for this purpose. These cues are combined in a probabilistic model, and a novel algorithm for exact inference in this model is proposed and used. We demonstrate that this method can cope with complex instances of repeated structure and generalizes successfully across a wide range of structure variations.

International Conference on Pattern Recognition | 2002

An iterative algorithm for optimal style conscious field classification

Prateek Sarkar

Modeling consistency of style in isogenous fields of patterns (such as character patterns in a word from the same font or writer) can improve classification accuracy. Since such patterns are interdependent, the Bayes classifier requires maximization of a probability score over all field-labels, which are exponentially more numerous with increasing field length. The iterative field classification algorithm prioritizes field-labels, for computation of probability scores, according to an upper bound on the score. Factorizability of the upper bound score allows dynamic prioritization of field-labels. Experiments on classification of numeral field patterns demonstrate computational efficiency of the algorithm.
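
One way to picture the prioritization idea is a best-first search over field-labels. The sketch below is an illustration under stated assumptions, not the paper's algorithm: it takes the score of a field-label to be a style-weighted sum of products of per-pattern likelihoods, and it uses the product over patterns of the largest per-style likelihood as a factorizable upper bound. The function names and the input layout (lik[i][label][style]) are hypothetical.

```python
import heapq
import math


def exact_score(lik, prior, labels):
    """sum_s P(s) * prod_i lik[i][labels[i]][s]."""
    return sum(
        p * math.prod(lik[i][c][s] for i, c in enumerate(labels))
        for s, p in prior.items()
    )


def best_field_label(lik, prior):
    """Visit field-labels in decreasing order of an upper bound on their score.

    Assumed bound: prod_i max_s lik[i][c_i][s]. It dominates the exact score
    when the style priors sum to 1 and all likelihoods are non-negative.
    """
    # Per-position candidates sorted by their bound factor, largest first.
    ranked = [
        sorted(((max(by_style.values()), c) for c, by_style in pos.items()), reverse=True)
        for pos in lik
    ]

    def bound(idx):
        return math.prod(ranked[i][k][0] for i, k in enumerate(idx))

    start = (0,) * len(lik)
    heap = [(-bound(start), start)]  # max-heap via negated bounds
    seen = {start}
    best_score, best_labels, evaluated = -1.0, None, 0
    while heap:
        neg_bound, idx = heapq.heappop(heap)
        if -neg_bound <= best_score:
            break  # no remaining field-label can beat the best exact score
        labels = tuple(ranked[i][k][1] for i, k in enumerate(idx))
        score = exact_score(lik, prior, labels)
        evaluated += 1
        if score > best_score:
            best_score, best_labels = score, labels
        # Successors lower the rank at one position, so their bound cannot grow.
        for i in range(len(idx)):
            if idx[i] + 1 < len(ranked[i]):
                nxt = idx[:i] + (idx[i] + 1,) + idx[i + 1:]
                if nxt not in seen:
                    seen.add(nxt)
                    heapq.heappush(heap, (-bound(nxt), nxt))
    return best_labels, best_score, evaluated
```

The third return value counts how many exact scores were computed, which is the quantity such a search tries to keep well below the total number of field-labels.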

Document Analysis Systems | 2008

On the Reading of Tables of Contents

Prateek Sarkar; Eric Saund

This paper presents a framework for understanding tables of contents (TOC) of books, journals, and magazines. We propose a universal logical structure representation in terms of a hierarchy of entries, each of which may contain a descriptor and a locator. We enumerate graphical and perceptual cues that guide the parsing of tables of contents in terms of this formalism. We make initial suggestions about the form of evaluation metrics for comparing ground truthed tables of contents with the output of recognition algorithms. Typical and atypical tables of contents are used throughout to illustrate significant phenomena that must be dealt with in principled ways in any general TOC interpretation scheme. Finally, we discuss implications of our observations on the design of recognition algorithms.

International Conference on Document Analysis and Recognition | 2007

A Shared Parts Model for Document Image Recognition

M. Das Gupta; Prateek Sarkar

We address document image classification by visual appearance. An image is represented by a variable-length list of visually salient features. A hierarchical Bayesian network is used to model the joint density of these features. This model promotes generalization from a few samples by sharing component probability distributions among different categories, and by factoring out a common displacement vector shared by all features within an image. The Bayesian network is implemented as a factor graph, and parameter estimation and inference are both done by loopy belief propagation. We explain and illustrate our model on a simple shape classification task. We obtain close to 90% accuracy on distinguishing journal articles from memos in the UWASH-II dataset, as well as on other classification tasks on a home-grown data set of technical articles.