
Publications


Featured research published by Maya R. Gupta.


Pattern Recognition | 2007

OCR binarization and image pre-processing for searching historical documents

Maya R. Gupta; Nathaniel P. Jacobson; Eric K. Garcia

We consider the problem of document binarization as a pre-processing step for optical character recognition (OCR) for the purpose of keyword search of historical printed documents. A number of promising techniques from the literature for binarization, pre-filtering, and post-binarization denoising were implemented along with newly developed methods for binarization: an error diffusion binarization, a multiresolution version of Otsu's binarization, and denoising by despeckling. The OCR in the ABBYY FineReader 7.1 SDK is used as a black-box metric to compare methods. Results for 12 pages from six newspapers of differing quality show that performance varies widely by image, but that the classic Otsu method and Otsu-based methods perform best on average.
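The classic Otsu method that performs best in this comparison picks the threshold maximizing the between-class variance of the split intensities. Below is a minimal numpy sketch of that baseline (not the paper's error-diffusion or multiresolution variants), assuming an 8-bit grayscale page image; the function names are ours.

```python
import numpy as np

def otsu_threshold(gray):
    """Otsu's method: choose the threshold that maximizes the
    between-class variance of the foreground/background split.
    `gray` is a 2-D uint8 array of intensities in [0, 255]."""
    hist = np.bincount(gray.ravel(), minlength=256).astype(float)
    prob = hist / hist.sum()
    levels = np.arange(256)

    omega = np.cumsum(prob)            # class-0 mass up to each threshold
    mu = np.cumsum(prob * levels)      # cumulative intensity mean
    mu_total = mu[-1]

    # Between-class variance for every candidate threshold; the 0/0
    # cases at degenerate thresholds are mapped to 0.
    with np.errstate(divide='ignore', invalid='ignore'):
        sigma_b = (mu_total * omega - mu) ** 2 / (omega * (1.0 - omega))
    sigma_b = np.nan_to_num(sigma_b, nan=0.0, posinf=0.0)
    return int(np.argmax(sigma_b))

def binarize(gray):
    """1 = background (white), 0 = ink, per the Otsu threshold."""
    return (gray > otsu_threshold(gray)).astype(np.uint8)
```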


IEEE Transactions on Geoscience and Remote Sensing | 2007

Linear Fusion of Image Sets for Display

Nathaniel P. Jacobson; Maya R. Gupta; Jeffrey B. Cole

Many remote-sensing applications produce large sets of images, such as hyperspectral images or time-indexed image sequences. We explore methods to display such image sets by linearly projecting them onto basis functions designed for the red, green, and blue (RGB) primaries of a standard tristimulus display, for the human visual system, and for the signal-to-noise ratio of the dataset, creating a single color image. Projecting the data onto three basis functions reduces the information but allows each datapoint to be rendered by a single color. Principal components analysis is perhaps the most commonly used linear projection method, but it is data adaptive and, thus, yields inconsistent visualizations that may be difficult to interpret. Instead, we focus on designing fixed basis functions based on optimizing criteria in the perceptual colorspace CIELab and the standardized device colorspace sRGB. This approach yields visualizations with rich meaning that users can readily extract. Example visualizations are shown for passive radar video and Airborne Visible/Infrared Imaging Spectrometer hyperspectral imagery. Additionally, we show how probabilistic classification information can be layered on top of the visualization to create a customized nonlinear representation of an image set.
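A minimal sketch of the projection step, assuming a data cube of shape (H, W, B); the Gaussian-bump basis below is a placeholder, whereas the paper designs its fixed basis functions by optimizing criteria in CIELab and sRGB.

```python
import numpy as np

def fuse_to_rgb(cube, basis):
    """Project an image set onto three basis functions for display.
    cube:  (H, W, B) array of B bands or frames.
    basis: (B, 3) array; column j weights the band contributions to the
           j-th display primary.  Returns an (H, W, 3) image in [0, 1]."""
    rgb = np.einsum('hwb,bc->hwc', cube, basis)
    rgb -= rgb.min()
    if rgb.max() > 0:
        rgb /= rgb.max()
    return rgb

# Placeholder fixed basis: three overlapping Gaussian bumps over the
# band index (a stand-in for the paper's perceptually optimized basis).
B = 64
idx = np.arange(B)
basis = np.stack([np.exp(-(idx - c) ** 2 / (2 * (B / 8) ** 2))
                  for c in (0.2 * B, 0.5 * B, 0.8 * B)], axis=1)

cube = np.random.rand(128, 128, B)   # stand-in for hyperspectral data
rgb = fuse_to_rgb(cube, basis)
```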


IEEE Transactions on Information Theory | 2008

Functional Bregman Divergence and Bayesian Estimation of Distributions

Béla András Frigyik; Santosh Srivastava; Maya R. Gupta

A class of distortions termed functional Bregman divergences is defined, which includes squared error and relative entropy. A functional Bregman divergence acts on functions or distributions, and generalizes the standard Bregman divergence for vectors and a previous pointwise Bregman divergence that was defined for functions. A recent result showed that the mean minimizes the expected Bregman divergence. The new functional definition enables the extension of this result to the continuous case to show that the mean minimizes the expected functional Bregman divergence over a set of functions or distributions. It is shown how this theorem applies to the Bayesian estimation of distributions. Estimation of the uniform distribution from independent and identically drawn samples is presented as a case study.
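For vectors, the standard Bregman divergence is d_phi(x, y) = phi(x) - phi(y) - <grad phi(y), x - y>; taking phi(x) = ||x||^2 recovers squared error, and the mean-minimizer property the paper extends can be checked numerically. A small sketch under those assumptions:

```python
import numpy as np

def bregman(phi, grad_phi, x, y):
    """Standard (vector) Bregman divergence:
    d_phi(x, y) = phi(x) - phi(y) - <grad phi(y), x - y>."""
    return phi(x) - phi(y) - np.dot(grad_phi(y), x - y)

# phi(x) = ||x||^2 makes d_phi the squared Euclidean error.
phi = lambda x: np.dot(x, x)
grad_phi = lambda x: 2 * x

rng = np.random.default_rng(0)
samples = rng.normal(size=(500, 3))
mean = samples.mean(axis=0)

def expected_div(y):
    return np.mean([bregman(phi, grad_phi, x, y) for x in samples])

# The sample mean should beat any perturbed candidate.
assert expected_div(mean) < expected_div(mean + 0.1)
```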


International Conference on Machine Learning | 2009

Learning kernels from indefinite similarities

Yihua Chen; Maya R. Gupta; Benjamin Recht

Similarity measures in many real applications generate indefinite similarity matrices. In this paper, we consider the problem of classification based on such indefinite similarities. These indefinite kernels can be problematic for standard kernel-based algorithms as the optimization problems become non-convex and the underlying theory is invalidated. In order to adapt kernel methods for similarity-based learning, we introduce a method that aims to simultaneously find a reproducing kernel Hilbert space based on the given similarities and train a classifier with good generalization in that space. The method is formulated as a convex optimization problem. We propose a simplified version that can reduce overfitting and whose associated convex conic program can be solved efficiently. We compare the proposed simplified version with six other methods on a collection of real data sets.
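The paper's joint convex program is not reproduced here. As a simpler point of reference, one standard baseline for indefinite similarities is spectrum clipping, which replaces the symmetrized similarity matrix with its nearest positive semidefinite matrix in Frobenius norm:

```python
import numpy as np

def clip_spectrum(S):
    """Spectrum clip: symmetrize, then zero out negative eigenvalues.
    A common baseline for indefinite similarities, not the paper's
    joint kernel-and-classifier optimization."""
    S = 0.5 * (S + S.T)
    w, V = np.linalg.eigh(S)
    return (V * np.maximum(w, 0.0)) @ V.T

# A symmetric but indefinite similarity matrix (negative determinant).
S = np.array([[ 1.0, 0.9, -0.8],
              [ 0.9, 1.0,  0.4],
              [-0.8, 0.4,  1.0]])
K = clip_spectrum(S)
print(np.linalg.eigvalsh(K))   # all eigenvalues now >= 0
```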


IEEE Transactions on Image Processing | 2008

Adaptive Local Linear Regression With Application to Printer Color Management

Maya R. Gupta; Eric K. Garcia; Erika Chin

Local learning methods, such as local linear regression and nearest neighbor classifiers, base estimates on nearby training samples, called neighbors. Usually, the number of neighbors used in estimation is fixed to a global "optimal" value chosen by cross validation. This paper proposes adapting the number of neighbors used for estimation to the local geometry of the data, without need for cross validation. The term enclosing neighborhood is introduced to describe a set of neighbors whose convex hull contains the test point when possible. It is proven that enclosing neighborhoods yield bounded estimation variance under some assumptions. Three such enclosing neighborhood definitions are presented: natural neighbors, natural neighbors inclusive, and enclosing k-NN. The effectiveness of these neighborhood definitions with local linear regression is tested for estimating lookup tables for color management. Significant improvements in error metrics are shown, indicating that enclosing neighborhoods may be a promising adaptive neighborhood definition for other local learning tasks as well, depending on the density of training samples.
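The enclosing-neighborhood constructions involve computational geometry not sketched here; below is the fixed-k local linear regression they adapt, with the k parameter standing in for the neighborhood the paper would select automatically.

```python
import numpy as np

def local_linear_regress(X, y, x0, k=10):
    """Fit an affine model to the k nearest training samples and evaluate
    it at x0.  The paper replaces this fixed k with an enclosing
    neighborhood whose convex hull contains x0 when possible."""
    d = np.linalg.norm(X - x0, axis=1)
    nn = np.argsort(d)[:k]
    A = np.hstack([X[nn], np.ones((k, 1))])     # affine design matrix
    coef, *_ = np.linalg.lstsq(A, y[nn], rcond=None)
    return np.append(x0, 1.0) @ coef

rng = np.random.default_rng(1)
X = rng.uniform(size=(200, 2))
y = np.sin(3 * X[:, 0]) + X[:, 1] ** 2          # smooth toy target
print(local_linear_regress(X, y, np.array([0.5, 0.5])))
```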


IEEE Transactions on Knowledge and Data Engineering | 2010

Completely Lazy Learning

Eric K. Garcia; Sergey Feldman; Maya R. Gupta; Santosh Srivastava

Local classifiers are sometimes called lazy learners because they do not train a classifier until presented with a test sample. However, such methods are generally not completely lazy because the neighborhood size k (or other locality parameter) is usually chosen by cross validation on the training set, which can require significant preprocessing and risks overfitting. We propose a simple alternative to cross validation of the neighborhood size that requires no preprocessing: instead of committing to one neighborhood size, average the discriminants for multiple neighborhoods. We show that this forms an expected estimated posterior that minimizes the expected Bregman loss with respect to the uncertainty about the neighborhood choice. We analyze this approach for six standard and state-of-the-art local classifiers, including discriminative adaptive metric kNN (DANN), a local support vector machine (SVM-KNN), hyperplane distance nearest neighbor (HKNN), and a new local Bayesian quadratic discriminant analysis (local BDA). The empirical effectiveness of this technique versus cross validation is confirmed with experiments on seven benchmark data sets, showing that similar classification performance can be attained without any training.
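A minimal sketch of the averaging idea, with plain kNN class-frequency estimates standing in for the paper's six local discriminants (the neighborhood sizes below are illustrative):

```python
import numpy as np

def lazy_posterior(X, labels, x0, ks=(1, 3, 5, 9, 15), n_classes=2):
    """Average kNN posterior estimates over several neighborhood sizes
    rather than cross-validating a single k."""
    order = np.argsort(np.linalg.norm(X - x0, axis=1))
    post = np.zeros(n_classes)
    for k in ks:
        counts = np.bincount(labels[order[:k]], minlength=n_classes)
        post += counts / k
    return post / len(ks)

rng = np.random.default_rng(2)
X = np.vstack([rng.normal(0, 1, (50, 2)), rng.normal(2, 1, (50, 2))])
labels = np.repeat([0, 1], 50)
print(lazy_posterior(X, labels, np.array([1.0, 1.0])))  # soft posterior
```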


IEEE Automatic Speech Recognition and Understanding Workshop | 2001

Robust speech recognition using wavelet coefficient features

Maya R. Gupta; Anna C. Gilbert

We propose a new vein of feature vectors for robust speech recognition that use denoised wavelet coefficients; greater robustness to unexpected additive noise or spectrum distortions begins with more robust acoustic features. The use of wavelet coefficients is motivated by human acoustic process modelling and by the ability of wavelet coefficients to capture important time and frequency features. Wavelet denoising accentuates the most salient information about the speech signal and adds robustness. We show encouraging results using denoised cosine packet features on small-scale experiments with the TIMIT database, its NTIMIT counterpart, and low-pass filter distortions.
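The paper's features come from denoised cosine packet coefficients; as a rough sketch of the denoising step using an ordinary wavelet basis instead, soft-threshold the detail coefficients at the universal threshold with PyWavelets (the wavelet, decomposition level, and threshold rule are illustrative choices):

```python
import numpy as np
import pywt

def denoised_wavelet_features(frame, wavelet='db4', level=4):
    """Decompose a speech frame, soft-threshold the detail coefficients,
    and return the denoised coefficients as a feature vector."""
    coeffs = pywt.wavedec(frame, wavelet, level=level)
    # Noise scale from the finest detail band (median absolute deviation),
    # then the universal threshold sigma * sqrt(2 log n).
    sigma = np.median(np.abs(coeffs[-1])) / 0.6745
    thresh = sigma * np.sqrt(2 * np.log(len(frame)))
    denoised = [coeffs[0]] + [pywt.threshold(c, thresh, mode='soft')
                              for c in coeffs[1:]]
    return np.concatenate(denoised)

frame = np.random.randn(512)   # stand-in for a windowed speech frame
feat = denoised_wavelet_features(frame)
```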


International Conference on Machine Learning | 2007

Local similarity discriminant analysis

Luca Cazzanti; Maya R. Gupta

We propose a local, generative model for similarity-based classification. The method is applicable to the case that only pairwise similarities between samples are available. The classifier models the local class-conditional distribution using a maximum entropy estimate and empirical moment constraints. The resulting exponential class-conditional distributions are combined with class prior probabilities and misclassification costs to form the local similarity discriminant analysis (local SDA) classifier. We compare the performance of local SDA to a non-local version, to the local nearest centroid classifier, the nearest centroid classifier, k-NN, and to the recently-developed potential support vector machine (PSVM). Results show that local SDA is competitive with k-NN and the computationally-demanding PSVM while offering the advantages of a generative classifier.


International Conference on Image Processing | 2006

Wavelet Principal Component Analysis and its Application to Hyperspectral Images

Maya R. Gupta; Nathaniel P. Jacobson

We investigate reducing the dimensionality of image sets by using principal component analysis on wavelet coefficients to maximize edge energy in the reduced dimension images. Large image sets, such as those produced with hyperspectral imaging, are often projected into a lower dimensionality space for image processing tasks. Spatial information is important for certain classification and detection tasks, but popular dimensionality reduction techniques do not take spatial information into account. Dimensionality reduction using principal components analysis on wavelet coefficients is investigated. Equivalences and differences to conventional principal components analysis are shown, and an efficient workflow is given. Experiments on AVIRIS images show that the wavelet energy in any given subband of the reduced dimensionality images can be increased with this method.
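A small sketch of the workflow under stated assumptions: a single-level orthonormal 2-D DWT per band with PyWavelets. Up to boundary handling and centering details, the orthonormal transform preserves inner products, so the spectral covariance, and hence the principal components, match conventional PCA on the pixels.

```python
import numpy as np
import pywt

def wavelet_pca(cube, wavelet='haar', n_components=3):
    """PCA across the band dimension computed on per-band 2-D wavelet
    coefficients of a (H, W, B) image cube."""
    H, W, B = cube.shape
    bands = []
    for b in range(B):
        cA, (cH, cV, cD) = pywt.dwt2(cube[:, :, b], wavelet)
        bands.append(np.concatenate([c.ravel() for c in (cA, cH, cV, cD)]))
    Z = np.stack(bands, axis=1)          # (n_coeffs, B)
    Z -= Z.mean(axis=0)
    cov = Z.T @ Z / Z.shape[0]           # (B, B) spectral covariance
    w, V = np.linalg.eigh(cov)
    top = V[:, np.argsort(w)[::-1][:n_components]]
    pix = cube.reshape(-1, B)
    return (pix - pix.mean(axis=0)) @ top   # (H*W, n_components)

cube = np.random.rand(64, 64, 32)        # stand-in for an AVIRIS cube
reduced = wavelet_pca(cube)
```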


Pattern Recognition | 2008

Generative models for similarity-based classification

Luca Cazzanti; Maya R. Gupta; Anjali J. Koppal

A maximum-entropy approach to generative similarity-based classification is proposed. First, a descriptive set of similarity statistics is assumed to be sufficient for classification. Then the class-conditional distributions of these descriptive statistics are estimated as the maximum-entropy distributions subject to empirical moment constraints. The resulting exponential class-conditional distributions are used in a maximum a posteriori decision rule, forming the similarity discriminant analysis (SDA) classifier. Simulated and real data experiments compare performance to the k-nearest neighbor classifier, the nearest-centroid classifier, and the potential support vector machine (PSVM).
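A toy illustration of the construction (it covers the local SDA paper above as well): take a sample's mean similarity to each class as the descriptive statistics; the maximum-entropy density on [0, inf) under a mean constraint is exponential, so moment matching fits each class-conditional. The statistic choice, the independence across statistics, and the unbounded support are our simplifications, not the papers' exact recipe.

```python
import numpy as np

def fit_toy_sda(S, labels, n_classes):
    """mus[j, c]: mean of statistic j (mean similarity to class j) among
    class-c training samples; each class-conditional is the Exponential
    with that mean (max-entropy under a mean constraint on [0, inf))."""
    stats = np.stack([S[:, labels == j].mean(axis=1)
                      for j in range(n_classes)], axis=1)   # (n, J)
    mus = np.stack([stats[labels == c].mean(axis=0)
                    for c in range(n_classes)], axis=1)     # (J, C)
    priors = np.bincount(labels, minlength=n_classes) / len(labels)
    return mus, priors

def classify_toy_sda(s, mus, priors):
    """MAP rule with independent Exponential class-conditionals;
    s[j] is the test sample's mean similarity to class j."""
    loglik = (-np.log(mus) - s[:, None] / mus).sum(axis=0)
    return int(np.argmax(loglik + np.log(priors)))

# Toy data with a Gaussian-kernel similarity (nonnegative, as assumed).
rng = np.random.default_rng(3)
X = np.vstack([rng.normal(0, 1, (40, 2)), rng.normal(2, 1, (40, 2))])
labels = np.repeat([0, 1], 40)
S = np.exp(-((X[:, None, :] - X[None, :, :]) ** 2).sum(-1))
mus, priors = fit_toy_sda(S, labels, 2)

x_new = np.array([2.0, 2.0])
s_new = np.array([np.exp(-((X[labels == j] - x_new) ** 2).sum(-1)).mean()
                  for j in range(2)])
print(classify_toy_sda(s_new, mus, priors))   # expect class 1
```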

Collaboration


Dive into Maya R. Gupta's collaborations.

Top Co-Authors

Luca Cazzanti (University of Washington)
Eric K. Garcia (University of Washington)
Andrew Cotter (Toyota Technological Institute at Chicago)
Sergey Feldman (University of Washington)
Nathan Parrish (University of Washington)
Yihua Chen (University of Washington)