
Publication


Featured research published by Subhransu Maji.


Computer Vision and Pattern Recognition | 2008

Classification using intersection kernel support vector machines is efficient

Subhransu Maji; Alexander C. Berg; Jitendra Malik

Straightforward classification using kernelized SVMs requires evaluating the kernel for a test vector and each of the support vectors. For a class of kernels we show that one can do this much more efficiently. In particular we show that one can build histogram intersection kernel SVMs (IKSVMs) with runtime complexity of the classifier logarithmic in the number of support vectors as opposed to linear for the standard approach. We further show that by precomputing auxiliary tables we can construct an approximate classifier with constant runtime and space requirements, independent of the number of support vectors, with negligible loss in classification accuracy on various tasks. This approximation also applies to the 1-χ² and other kernels of similar form. We also introduce novel features based on multi-level histograms of oriented edge energy and present experiments on various detection datasets. On the INRIA pedestrian dataset an approximate IKSVM classifier based on these features has the current best performance, with a miss rate 13% lower at 10⁻⁶ false positives per window than the linear SVM detector of Dalal & Triggs. On the Daimler-Chrysler pedestrian dataset IKSVM gives comparable accuracy to the best results (based on quadratic SVM), while being 15× faster. In these experiments our approximate IKSVM is up to 2000× faster than a standard implementation and requires 200× less memory. Finally we show that a 50× speedup is possible using approximate IKSVM based on spatial pyramid features on the Caltech-101 dataset with negligible loss of accuracy.
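The constant-time approximation rests on the fact that the intersection-kernel decision function splits into independent one-dimensional functions, each of which can be tabulated once. Below is a minimal Python sketch of that idea; the function names, bin count, and grid-snapping lookup are illustrative assumptions, not the paper's exact implementation.

```python
import numpy as np

def build_tables(support_vectors, coeffs, n_bins=100, lo=0.0, hi=1.0):
    """Precompute per-dimension lookup tables for the additive decision
    function of a histogram intersection kernel SVM (illustrative helper).

    support_vectors : (m, d) array of support vectors z_i (histogram features)
    coeffs          : (m,) array of alpha_i * y_i
    Returns the sample grid and an (n_bins, d) table with
    table[b, j] = sum_i coeffs[i] * min(grid[b], z[i, j]).
    """
    grid = np.linspace(lo, hi, n_bins)
    # min(s, z_{i,j}) for every sample s, support vector i, dimension j
    mins = np.minimum(grid[:, None, None], support_vectors[None, :, :])
    table = np.einsum('i,bij->bj', coeffs, mins)   # sum out the support vectors
    return grid, table

def decision_function(x, grid, table, bias=0.0):
    """Approximate f(x) = sum_j f_j(x_j) + b by snapping each x_j to a grid
    point and reading the tabulated value for that dimension."""
    idx = np.clip(np.searchsorted(grid, x), 0, len(grid) - 1)
    return table[idx, np.arange(len(x))].sum() + bias

# toy usage with random histogram-like support vectors and dual coefficients
rng = np.random.default_rng(0)
Z = rng.random((50, 8)); Z /= Z.sum(axis=1, keepdims=True)
a = rng.normal(size=50)                            # alpha_i * y_i
grid, table = build_tables(Z, a)
x = rng.random(8); x /= x.sum()
approx = decision_function(x, grid, table)
exact = a @ np.minimum(x, Z).sum(axis=1)           # exact intersection-kernel evaluation
print(approx, exact)                               # the two values should be close
```

Interpolating linearly between adjacent table entries, rather than snapping to a grid point, would tighten the approximation at the same table size.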


European Conference on Computer Vision | 2010

Detecting people using mutually consistent poselet activations

Lubomir D. Bourdev; Subhransu Maji; Thomas Brox; Jitendra Malik

Bourdev and Malik (ICCV 09) introduced a new notion of parts, poselets, constructed to be tightly clustered both in the configuration space of keypoints, as well as in the appearance space of image patches. In this paper we develop a new algorithm for detecting people using poselets. Unlike that work which used 3D annotations of keypoints, we use only 2D annotations which are much easier for naive human annotators. The main algorithmic contribution is in how we use the pattern of poselet activations. Individual poselet activations are noisy, but considering the spatial context of each can provide vital disambiguating information, just as object detection can be improved by considering the detection scores of nearby objects in the scene. This can be done by training a two-layer feed-forward network with weights set using a max margin technique. The refined poselet activations are then clustered into mutually consistent hypotheses where consistency is based on empirically determined spatial keypoint distributions. Finally, bounding boxes are predicted for each person hypothesis and shape masks are aligned to edges in the image to provide a segmentation. To the best of our knowledge, the resulting system is the current best performer on the task of people detection and segmentation with an average precision of 47.8% and 40.5% respectively on PASCAL VOC 2009.
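As a heavily simplified illustration of the rescoring step, the sketch below builds a spatial-context feature for each poselet activation and passes it through a small two-layer feed-forward network. In the paper the weights are learned with a max-margin objective; here they are placeholders, and the neighborhood rule and all names are assumptions.

```python
import numpy as np

def context_features(activations, n_poselet_types, radius=64.0):
    """activations: list of (poselet_id, x, y, score) tuples.
    For each activation, collect the max score of every poselet type firing
    within `radius` pixels; this crude neighborhood rule stands in for the
    spatial context used to disambiguate noisy individual activations."""
    feats = []
    for pid, x, y, s in activations:
        ctx = np.zeros(n_poselet_types)
        for qid, xq, yq, sq in activations:
            if (x - xq) ** 2 + (y - yq) ** 2 <= radius ** 2:
                ctx[qid] = max(ctx[qid], sq)
        feats.append(ctx)
    return np.asarray(feats)

def rescore(feats, W1, b1, w2, b2):
    """Two-layer feed-forward rescoring: one hidden ReLU layer, then a linear score."""
    hidden = np.maximum(feats @ W1 + b1, 0.0)
    return hidden @ w2 + b2

# toy usage with three activations and random placeholder weights
acts = [(0, 10.0, 12.0, 0.8), (1, 14.0, 18.0, 0.3), (0, 200.0, 50.0, 0.2)]
F = context_features(acts, n_poselet_types=3)
rng = np.random.default_rng(0)
scores = rescore(F, rng.normal(size=(3, 5)), np.zeros(5), rng.normal(size=5), 0.0)
```

The rescored activations would then be clustered into mutually consistent person hypotheses as described in the abstract.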


International Conference on Computer Vision | 2011

Semantic contours from inverse detectors

Bharath Hariharan; Pablo Andrés Arbeláez; Lubomir D. Bourdev; Subhransu Maji; Jitendra Malik

We study the challenging problem of localizing and classifying category-specific object contours in real world images. For this purpose, we present a simple yet effective method for combining generic object detectors with bottom-up contours to identify object contours. We also provide a principled way of combining information from different part detectors and across categories. In order to study the problem and quantitatively evaluate our approach, we present a dataset of semantic exterior boundaries on more than 20,000 object instances belonging to 20 categories, using the images from the VOC2011 PASCAL challenge [7].
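One way to picture "combining generic object detectors with bottom-up contours" is to gate a generic boundary-strength map by where a category detector fires. The sketch below does exactly that as an illustration of the general idea only, not the paper's actual formulation; the box format, sigmoid squashing, and max-combination are assumptions.

```python
import numpy as np

def category_contours(contour_strength, detections, image_shape):
    """Gate a bottom-up contour map with top-down detector evidence.

    contour_strength : (H, W) generic boundary strength in [0, 1]
    detections       : list of (x0, y0, x1, y1, score) integer-pixel boxes
                       produced by a detector for one category
    Returns a (H, W) map where boundaries are kept only where the category
    detector fires, weighted by its sigmoid-squashed confidence.
    """
    H, W = image_shape
    support = np.zeros((H, W))
    for x0, y0, x1, y1, score in detections:
        conf = 1.0 / (1.0 + np.exp(-score))          # squash score into [0, 1]
        support[y0:y1, x0:x1] = np.maximum(support[y0:y1, x0:x1], conf)
    return contour_strength * support
```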


Computer Vision and Pattern Recognition | 2009

Object detection using a max-margin Hough transform

Subhransu Maji; Jitendra Malik

We present a discriminative Hough-transform-based object detector where each local part casts a weighted vote for the possible locations of the object center. We show that the weights can be learned in a max-margin framework which directly optimizes the classification performance. The discriminative training takes into account both the appearance of each codebook entry and the spatial distribution of its position with respect to the object center to derive its importance. On various datasets we show that the discriminative training improves the Hough detector. Combined with a verification step using an SVM-based classifier, our approach achieves a detection rate of 91.9% at 0.3 false positives per image on the ETHZ shape dataset, a significant improvement over the state of the art, while running the verification step on at least an order of magnitude fewer windows than in a sliding window approach.
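A bare-bones version of the weighted voting step might look like the following, where each local part casts votes for the object center displaced by offsets observed during training and scaled by a per-codeword weight. In the paper those weights are learned with the max-margin objective; the data structures and cell quantization below are illustrative assumptions.

```python
import numpy as np

def hough_vote(parts, weights, displacements, image_shape, cell=8):
    """Accumulate weighted votes for the object center in a Hough map.

    parts         : list of (codeword_id, x, y, score) local part detections
    weights       : per-codeword importance (e.g. max-margin-trained weights)
    displacements : dict codeword_id -> list of (dx, dy) offsets from the part
                    to the object center observed on training data
    """
    H, W = image_shape
    acc = np.zeros((H // cell, W // cell))
    for cid, x, y, score in parts:
        for dx, dy in displacements.get(cid, []):
            cx, cy = int((x + dx) // cell), int((y + dy) // cell)
            if 0 <= cy < acc.shape[0] and 0 <= cx < acc.shape[1]:
                acc[cy, cx] += weights[cid] * score
    return acc   # peaks in this map are candidate object centers
```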


International Conference on Computer Vision | 2015

Bilinear CNN Models for Fine-Grained Visual Recognition

Tsung-Yu Lin; Aruni RoyChowdhury; Subhransu Maji

We propose bilinear models, a recognition architecture that consists of two feature extractors whose outputs are multiplied using the outer product at each location of the image and pooled to obtain an image descriptor. This architecture can model local pairwise feature interactions in a translationally invariant manner, which is particularly useful for fine-grained categorization. It also generalizes various orderless texture descriptors such as the Fisher vector, VLAD and O2P. We present experiments with bilinear models where the feature extractors are based on convolutional neural networks. The bilinear form simplifies gradient computation and allows end-to-end training of both networks using image labels only. Using networks initialized from the ImageNet dataset followed by domain-specific fine-tuning we obtain 84.1% accuracy on the CUB-200-2011 dataset, requiring only category labels at training time. We present experiments and visualizations that analyze the effects of fine-tuning and the choice of the two networks on the speed and accuracy of the models. Results show that the architecture compares favorably to the existing state of the art on a number of fine-grained datasets while being substantially simpler and easier to train. Moreover, our most accurate model is fairly efficient, running at 8 frames/sec on an NVIDIA Tesla K40 GPU. The source code for the complete system will be made available at http://vis-www.cs.umass.edu/bcnn.
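The pooling step at the heart of the architecture is compact enough to write down directly. The numpy sketch below computes a bilinear descriptor for one image from the two streams' feature maps; the feature dimensions and toy inputs are arbitrary, and the signed square-root and L2 normalization are the post-processing commonly applied to such pooled descriptors.

```python
import numpy as np

def bilinear_pool(fa, fb):
    """Bilinear pooling of two feature maps extracted from the same image.

    fa : (H*W, Da) features from stream A, one row per spatial location
    fb : (H*W, Db) features from stream B at the same locations
    Returns a (Da*Db,) descriptor: the outer product at each location,
    sum-pooled over locations, then signed square-root and L2 normalized.
    """
    phi = fa.T @ fb                               # sum over locations of outer products
    phi = phi.reshape(-1)
    phi = np.sign(phi) * np.sqrt(np.abs(phi))     # signed square-root
    return phi / (np.linalg.norm(phi) + 1e-12)    # L2 normalization

# toy usage: a 14x14 spatial grid with 512- and 256-dim features (hypothetical sizes)
rng = np.random.default_rng(0)
fa = rng.normal(size=(14 * 14, 512))
fb = rng.normal(size=(14 * 14, 256))
desc = bilinear_pool(fa, fb)                      # 512*256-dimensional descriptor
```

Because the pooled descriptor is a simple product of the two streams, its gradient with respect to either stream is linear in the other, which is what makes end-to-end training straightforward.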


International Conference on Computer Vision | 2011

Describing people: A poselet-based approach to attribute classification

Lubomir D. Bourdev; Subhransu Maji; Jitendra Malik

We propose a method for recognizing attributes, such as the gender, hair style and types of clothes of people under large variation in viewpoint, pose, articulation and occlusion typical of personal photo album images. Robust attribute classifiers under such conditions must be invariant to pose, but inferring the pose in itself is a challenging problem. We use a part-based approach based on poselets. Our parts implicitly decompose the aspect (the pose and viewpoint). We train attribute classifiers for each such aspect and we combine them together in a discriminative model. We propose a new dataset of 8000 people with annotated attributes. Our method performs very well on this dataset, significantly outperforming a baseline built on the spatial pyramid match kernel method. On gender recognition we outperform a commercial face recognition system.


Computer Vision and Pattern Recognition | 2011

Action recognition from a distributed representation of pose and appearance

Subhransu Maji; Lubomir D. Bourdev; Jitendra Malik

We present a distributed representation of pose and appearance of people called the “poselet activation vector”. First we show that this representation can be used to estimate the pose of people defined by the 3D orientations of the head and torso in the challenging PASCAL VOC 2010 person detection dataset. Our method is robust to clutter, aspect and viewpoint variation and works even when body parts like faces and limbs are occluded or hard to localize. We combine this representation with other sources of information like interaction with objects and other people in the image and use it for action recognition. We report competitive results on the PASCAL VOC 2010 static image action classification challenge.
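A minimal sketch of how a poselet activation vector could be assembled for one person hypothesis is shown below. The overlap test and max-pooling of scores are simplifying assumptions, not the paper's exact assignment procedure.

```python
import numpy as np

def poselet_activation_vector(detections, n_poselets, person_box):
    """Build a fixed-length pose/appearance descriptor for one person hypothesis.

    detections : list of (poselet_id, score, box) poselet firings in the image
    person_box : (x0, y0, x1, y1) bounding box of the person hypothesis
    Entry p of the vector is the maximum score of poselet p among detections
    whose boxes overlap the person box (0 if that poselet never fires there).
    """
    def overlaps(box):
        x0, y0, x1, y1 = box
        px0, py0, px1, py1 = person_box
        return x0 < px1 and px0 < x1 and y0 < py1 and py0 < y1

    v = np.zeros(n_poselets)
    for pid, score, box in detections:
        if overlaps(box):
            v[pid] = max(v[pid], score)
    return v
```

A classifier for pose or action then operates on this fixed-length vector rather than on raw image features.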


International Conference on Computer Vision | 2009

Max-margin additive classifiers for detection

Subhransu Maji; Alexander C. Berg

We present methods for training high quality object detectors very quickly. The core contribution is a pair of fast training algorithms for piecewise-linear classifiers, which can approximate arbitrary additive models. The classifiers are trained in a max-margin framework and significantly outperform linear classifiers on a variety of vision datasets. We report experimental results quantifying training time and accuracy on image classification tasks and pedestrian detection, including detection results better than the best previously published results on the INRIA dataset, with faster training.
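Piecewise-linear additive classifiers of this kind can be fit with any standard linear solver once each feature dimension is encoded with interpolation weights over a fixed grid. The sketch below shows one such encoding; the grid size and interpolation scheme are assumptions rather than the paper's exact construction.

```python
import numpy as np

def piecewise_linear_encode(x, n_bins=10, lo=0.0, hi=1.0):
    """Encode each feature dimension with linear-interpolation weights over a
    fixed grid, so a plain linear classifier on the encoding realizes an
    arbitrary piecewise-linear additive function of the original features.

    x : (d,) feature vector with values in [lo, hi]
    Returns a (d * n_bins,) sparse-ish encoding.
    """
    grid = np.linspace(lo, hi, n_bins)
    enc = np.zeros((len(x), n_bins))
    for j, v in enumerate(np.clip(x, lo, hi)):
        k = min(np.searchsorted(grid, v), n_bins - 1)
        if k == 0:
            enc[j, 0] = 1.0
        else:
            t = (v - grid[k - 1]) / (grid[k] - grid[k - 1])
            enc[j, k - 1], enc[j, k] = 1.0 - t, t   # split weight between neighbors
    return enc.reshape(-1)

# a linear max-margin solver trained on these encodings then plays the role
# of the additive classifier described in the abstract.
```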


IEEE Transactions on Pattern Analysis and Machine Intelligence | 2013

Efficient Classification for Additive Kernel SVMs

Subhransu Maji; Alexander C. Berg; Jitendra Malik

We show that a class of nonlinear kernel SVMs admits approximate classifiers with runtime and memory complexity that is independent of the number of support vectors. This class of kernels, which we refer to as additive kernels, includes widely used kernels for histogram-based image comparison like intersection and chi-squared kernels. Additive kernel SVMs can offer significant improvements in accuracy over linear SVMs on a wide variety of tasks while having the same runtime, making them practical for large-scale recognition or real-time detection tasks. We present experiments on a variety of datasets, including the INRIA person, Daimler-Chrysler pedestrians, UIUC Cars, Caltech-101, MNIST, and USPS digits, to demonstrate the effectiveness of our method for efficient evaluation of SVMs with additive kernels. Since its introduction, our method has become integral to various state-of-the-art systems for PASCAL VOC object detection/image classification, ImageNet Challenge, TRECVID, etc. The techniques we propose can also be applied to settings where evaluation of weighted additive kernels is required, which include kernelized versions of PCA, LDA, regression, k-means, as well as speeding up the inner loop of SVM classifier training algorithms.
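The property being exploited can be stated in one line: for an additive kernel the SVM decision function decomposes into independent one-dimensional functions, each of which can be tabulated. Written out in standard notation for clarity:

```latex
% For an additive kernel K(x, z) = \sum_{j=1}^{d} k_j(x_j, z_j), e.g. the
% intersection kernel with k_j(x_j, z_j) = \min(x_j, z_j), the SVM decision
% function separates into d one-dimensional functions:
\[
  f(x) \;=\; \sum_{i=1}^{m} \alpha_i y_i \, K(x, z_i) + b
        \;=\; \sum_{j=1}^{d} \underbrace{\sum_{i=1}^{m} \alpha_i y_i \, k_j(x_j, z_{i,j})}_{f_j(x_j)} \;+\; b .
\]
% Each f_j depends only on the scalar x_j, so it can be precomputed on a grid
% and evaluated by table lookup, giving runtime and memory independent of the
% number of support vectors m.
```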


Computer Vision and Pattern Recognition | 2011

Object segmentation by alignment of poselet activations to image contours

Thomas Brox; Lubomir D. Bourdev; Subhransu Maji; Jitendra Malik

In this paper, we propose techniques to make use of two complementary bottom-up features, image edges and texture patches, to guide top-down object segmentation towards higher precision. We build upon the part-based poselet detector, which can predict masks for numerous parts of an object. For this purpose we extend poselets to 19 other categories apart from person. We non-rigidly align these part detections to potential object contours in the image, both to increase the precision of the predicted object mask and to sort out false positives. We spatially aggregate object information via a variational smoothing technique while ensuring that object regions do not overlap. Finally, we propose to refine the segmentation based on self-similarity defined on small image patches. We obtain competitive results on the challenging PASCAL VOC benchmark. On four classes we achieve the best numbers to date.

Collaboration


Dive into Subhransu Maji's collaborations.

Top Co-Authors

Jitendra Malik, University of California
Tsung-Yu Lin, University of Massachusetts Amherst
Evangelos Kalogerakis, University of Massachusetts Amherst
Erik G. Learned-Miller, University of Massachusetts Amherst
Matheus Gadelha, University of Massachusetts Amherst