M. Pawan Kumar | Researchain

Publication

Featured researches published by M. Pawan Kumar.

computer vision and pattern recognition | 2010

Efficiently selecting regions for scene understanding

M. Pawan Kumar; Daphne Koller

Recent advances in scene understanding and related tasks have highlighted the importance of using regions to reason about high-level scene structure. Typically, the regions are selected beforehand and then an energy function is defined over them. This two-step process suffers from the following deficiencies: (i) the regions may not match the boundaries of the scene entities, thereby introducing errors; and (ii) as the regions are obtained without any knowledge of the energy function, they may not be suitable for the task at hand. We address these problems by designing an efficient approach for obtaining the best set of regions in terms of the energy function itself. Each iteration of our algorithm selects regions from a large dictionary by solving an accurate linear programming relaxation via dual decomposition. The dictionary of regions is constructed by merging and intersecting segments obtained from multiple bottom-up over-segmentations. To demonstrate the usefulness of our algorithm, we consider the task of scene segmentation and show significant improvements over state-of-the-art methods.

british machine vision conference | 2004

Extending Pictorial Structures for Object Recognition

M. Pawan Kumar; Philip H. S. Torr; Andrew Zisserman

The goal of this paper is to recognize various deformable objects from images. To this end we extend the class of generative probabilistic models known as pictorial structures. This class of models is particularly suited to represent articulated structures, and has previously been used by Felzenszwalb and Huttenlocher for pose estimation of humans. We extend pictorial structures in three ways: (i) likelihoods are included for both the boundary and the enclosed texture of the animal; (ii) a complete graph is modelled (rather than a tree structure); (iii) it is demonstrated that the model can be fitted in polynomial time using belief propagation. We show examples for two types of quadrupeds, cows and horses. We achieve excellent recognition performance for cows with an equal error rate of 3% for 500 positive and 5000 negative images.

international conference on computer vision | 2009

Efficient discriminative learning of parts-based models

M. Pawan Kumar; Andrew Zisserman; Philip H. S. Torr

Supervised learning of a parts-based model can be formulated as an optimization problem with a large (exponential in the number of parts) set of constraints. We show how this seemingly difficult problem can be solved by (i) reducing it to an equivalent convex problem with a small, polynomial number of constraints (taking advantage of the fact that the model is tree-structured and the potentials have a special form); and (ii) obtaining the globally optimal model using an efficient dual decomposition strategy. Each component of the dual decomposition is solved by a modified version of the highly optimized SVM-Light algorithm. To demonstrate the effectiveness of our approach, we learn human upper body models using two challenging, publicly available datasets. Our model accounts for the articulation of humans as well as the occlusion of parts. We compare our method with a baseline iterative strategy as well as a state-of-the-art algorithm and show significant efficiency improvements.

computer vision and pattern recognition | 2010

Energy minimization for linear envelope MRFs

Pushmeet Kohli; M. Pawan Kumar

Markov random fields with higher order potentials have emerged as a powerful model for several problems in computer vision. In order to facilitate their use, we propose a new representation for higher order potentials as upper and lower envelopes of linear functions. Our representation concisely models several commonly used higher order potentials, thereby providing a unified framework for minimizing the corresponding Gibbs energy functions. We exploit this framework by converting lower envelope potentials to standard pairwise functions with the addition of a small number of auxiliary variables. This allows us to minimize energy functions with lower envelope potentials using conventional algorithms such as BP, TRW and α-expansion. Furthermore, we show how the minimization of energy functions with upper envelope potentials leads to a difficult minmax problem. We address this difficulty by proposing a new message passing algorithm that solves a linear programming relaxation of the problem. Although this is primarily a theoretical paper, we demonstrate the efficacy of our approach on the binary (fg/bg) segmentation problem.
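As a rough illustration of the lower envelope representation described above, the sketch below evaluates a clique potential as the minimum over a set of linear functions of the clique labeling. The function names, the data layout, and the choice of a truncated Potts-style potential as the example are assumptions made here for illustration; they are not taken from the paper, and the sketch only evaluates the potential rather than minimizing an energy.

```python
from typing import List, Sequence, Tuple

# A linear function of a clique labeling x: constant + sum_i weights[i][x_i].
# (Hypothetical representation chosen for this sketch.)
LinearFunction = Tuple[float, List[List[float]]]


def evaluate(function: LinearFunction, labeling: Sequence[int]) -> float:
    """Value of one linear function at the given clique labeling."""
    constant, weights = function
    return constant + sum(weights[i][x] for i, x in enumerate(labeling))


def lower_envelope(functions: Sequence[LinearFunction],
                   labeling: Sequence[int]) -> float:
    """Higher-order potential defined as the minimum of the linear functions."""
    return min(evaluate(f, labeling) for f in functions)


def truncated_potts_functions(num_variables: int, num_labels: int,
                              slope: float, truncation: float
                              ) -> List[LinearFunction]:
    """One linear function per label, plus a constant truncation function.

    The function for label l charges `slope` for every variable that does
    not take label l, so the envelope equals
    min(truncation, slope * (fewest deviations from any single label)).
    """
    functions: List[LinearFunction] = []
    for label in range(num_labels):
        weights = [[0.0 if x == label else slope for x in range(num_labels)]
                   for _ in range(num_variables)]
        functions.append((0.0, weights))
    zero_weights = [[0.0] * num_labels for _ in range(num_variables)]
    functions.append((truncation, zero_weights))
    return functions


functions = truncated_potts_functions(num_variables=4, num_labels=3,
                                      slope=1.0, truncation=2.0)
print(lower_envelope(functions, [0, 0, 0, 0]))  # all variables agree
print(lower_envelope(functions, [0, 0, 0, 1]))  # one variable deviates
print(lower_envelope(functions, [0, 1, 2, 1]))  # cost capped by truncation
```

Only the number of linear functions (here, one per label plus one constant) determines the size of the representation, which is what makes such a potential compact to store even for large cliques.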

international conference on computer vision | 2011

Learning specific-class segmentation from diverse data

M. Pawan Kumar; Haithem Turki; Dan Preston; Daphne Koller

We consider the task of learning the parameters of a segmentation model that assigns a specific semantic class to each pixel of a given image. The main problem we face is the lack of fully supervised data. We address this issue by developing a principled framework for learning the parameters of a specific-class segmentation model using diverse data. More precisely, we propose a latent structural support vector machine formulation, where the latent variables model any missing information in the human annotation. Of particular interest to us are three types of annotations: (i) images segmented using generic foreground or background classes; (ii) images with bounding boxes specified for objects; and (iii) images labeled to indicate the presence of a class. Using large, publicly available datasets we show that our approach is able to exploit the information present in different annotations to improve the accuracy of a state-of-the-art region-based model.

international conference on machine learning | 2008

Efficiently solving convex relaxations for MAP estimation

M. Pawan Kumar; Philip H. S. Torr

The problem of obtaining the maximum a posteriori (MAP) estimate of a discrete random field is of fundamental importance in many areas of Computer Science. In this work, we build on the tree reweighted message passing (TRW) framework of (Kolmogorov, 2006; Wainwright et al., 2005). TRW iteratively optimizes the Lagrangian dual of a linear programming relaxation for MAP estimation. We show how the dual formulation of TRW can be extended to include cycle inequalities (Barahona & Mahjoub, 1986) and some recently proposed second order cone (SOC) constraints (Kumar et al., 2007). We propose efficient iterative algorithms for solving the resulting duals. Similar to the method described in (Kolmogorov, 2006), these algorithms are guaranteed to converge. We test our approach on a large set of synthetic data, as well as real data. Our experiments show that the additional constraints (i.e. cycle inequalities and SOC constraints) provide better results in cases where the TRW framework fails (namely MAP estimation for non-submodular energy functions).

computer vision and pattern recognition | 2014

Optimizing Average Precision Using Weakly Supervised Data

Aseem Behl; C. V. Jawahar; M. Pawan Kumar

The performance of binary classification tasks, such as action classification and object detection, is often measured in terms of the average precision (AP). Yet it is common practice in computer vision to employ the support vector machine (SVM) classifier, which optimizes a surrogate 0-1 loss. The popularity of SVM can be attributed to its empirical performance. Specifically, in fully supervised settings, SVM tends to provide similar accuracy to the AP-SVM classifier, which directly optimizes an AP-based loss. However, we hypothesize that in the significantly more challenging and practically useful setting of weakly supervised learning, it becomes crucial to optimize the right accuracy measure. In order to test this hypothesis, we propose a novel latent AP-SVM that minimizes a carefully designed upper bound on the AP-based loss function over weakly supervised samples. Using publicly available datasets, we demonstrate the advantage of our approach over standard loss-based binary classifiers on two challenging problems: action classification and character recognition.
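To make the distinction between the two accuracy measures concrete, the following sketch computes average precision and thresholded 0-1 accuracy for two sets of scores. The helper names, the threshold of zero, and the toy scores are assumptions chosen for illustration; the sketch only evaluates the measures and does not implement AP-SVM or the latent AP-SVM.

```python
from typing import Sequence


def average_precision(scores: Sequence[float], labels: Sequence[int]) -> float:
    """AP of the ranking obtained by sorting samples by decreasing score.

    Labels are 1 for positive samples and 0 for negative samples.
    """
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    hits = 0
    precision_sum = 0.0
    for rank, i in enumerate(order, start=1):
        if labels[i] == 1:
            hits += 1
            precision_sum += hits / rank  # precision at this positive
    return precision_sum / hits if hits else 0.0


def zero_one_accuracy(scores: Sequence[float], labels: Sequence[int],
                      threshold: float = 0.0) -> float:
    """Fraction of samples classified correctly by thresholding the score."""
    predictions = [1 if s > threshold else 0 for s in scores]
    return sum(p == y for p, y in zip(predictions, labels)) / len(labels)


labels = [1, 1, 0, 0, 0, 0]
scores_a = [2.0, -1.0, 1.0, -2.0, -3.0, -4.0]  # best positive ranked first
scores_b = [1.0, -3.0, 2.0, -1.0, -2.0, -4.0]  # a negative ranked first

for name, scores in (("a", scores_a), ("b", scores_b)):
    print(name, zero_one_accuracy(scores, labels),
          average_precision(scores, labels))
```

Both score vectors misclassify the same two samples at the threshold, so their 0-1 accuracy is identical, yet their rankings (and hence their AP values) differ. This is the kind of gap that a ranking-based loss can see and a 0-1 loss cannot.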

medical image computing and computer-assisted intervention | 2013

Discriminative Parameter Estimation for Random Walks Segmentation

Pierre-Yves Baudin; Danny Goodman; Puneet Kumar; Noura Azzabou; Pierre G. Carlier; Nikos Paragios; M. Pawan Kumar

The Random Walks (RW) algorithm is one of the most efficient and easy-to-use probabilistic segmentation methods. By combining contrast terms with prior terms, it provides accurate segmentations of medical images in a fully automated manner. However, one of the main drawbacks of using the RW algorithm is that its parameters have to be hand-tuned. We propose a novel discriminative learning framework that estimates the parameters using a training dataset. The main challenge we face is that the training samples are not fully supervised. Specifically, they provide a hard segmentation of the images, instead of a probabilistic segmentation. We overcome this challenge by treating the optimal probabilistic segmentation that is compatible with the given hard segmentation as a latent variable. This allows us to employ the latent support vector machine formulation for parameter estimation. We show that our approach significantly outperforms the baseline methods on a challenging dataset consisting of real clinical 3D MRI volumes of skeletal muscles.

international conference on computer vision | 2015

Parsimonious Labeling

Puneet Kumar Dokania; M. Pawan Kumar

We propose a new family of discrete energy minimization problems, which we call parsimonious labeling. Our energy function consists of unary potentials and high-order clique potentials. While the unary potentials are arbitrary, the clique potentials are proportional to the diversity of the set of unique labels assigned to the clique. Intuitively, our energy function encourages the labeling to be parsimonious, that is, use as few labels as possible. This in turn allows us to capture useful cues for important computer vision applications such as stereo correspondence and image denoising. Furthermore, we propose an efficient graph-cuts based algorithm for the parsimonious labeling problem that provides strong theoretical guarantees on the quality of the solution. Our algorithm consists of three steps. First, we approximate a given diversity using a mixture of a novel hierarchical P^n Potts model. Second, we use a divide-and-conquer approach for each mixture component, where each subproblem is solved using an efficient alpha-expansion algorithm. This provides us with a small number of putative labelings, one for each mixture component. Third, we choose the best putative labeling in terms of the energy value. Using both synthetic and standard real datasets, we show that our algorithm significantly outperforms other graph-cuts based approaches.
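The energy described above can be sketched in a few lines: unary costs plus, for each clique, a weight times the diversity of the set of labels used in that clique. The sketch below assumes a diameter-style diversity (the largest pairwise distance between the labels present) and uses exhaustive search on a toy problem; both choices, along with all names and numbers, are illustrative assumptions and do not reproduce the graph-cuts based algorithm of the paper.

```python
from itertools import combinations, product
from typing import Callable, Iterable, List, Sequence, Tuple

Clique = Tuple[Tuple[int, ...], float]  # (member variable indices, weight)


def diameter_diversity(labels: Iterable[int],
                       distance: Callable[[int, int], float]) -> float:
    """Largest pairwise distance among the unique labels (0 for one label)."""
    unique = sorted(set(labels))
    return max((distance(a, b) for a, b in combinations(unique, 2)),
               default=0.0)


def energy(labeling: Sequence[int], unary: List[List[float]],
           cliques: Sequence[Clique],
           distance: Callable[[int, int], float]) -> float:
    """Unary costs plus weighted diversity of the labels in each clique."""
    unary_cost = sum(unary[i][x] for i, x in enumerate(labeling))
    clique_cost = sum(
        weight * diameter_diversity((labeling[i] for i in members), distance)
        for members, weight in cliques)
    return unary_cost + clique_cost


def brute_force_minimum(unary, cliques, distance):
    """Exhaustive search; feasible only for toy problems."""
    num_labels = len(unary[0])
    return min(product(range(num_labels), repeat=len(unary)),
               key=lambda x: energy(x, unary, cliques, distance))


# Toy problem: three variables, three labels, one clique over all variables.
unary = [[0.0, 1.0, 2.0], [1.0, 0.0, 1.0], [2.0, 1.0, 0.0]]
cliques = [((0, 1, 2), 1.5)]
distance = lambda a, b: float(abs(a - b))

print(brute_force_minimum(unary, [], distance))       # unary terms only
print(brute_force_minimum(unary, cliques, distance))  # with the clique term
```

In this toy problem the unary terms alone prefer a different label for each variable, while adding the clique term makes a single shared label cheaper overall, which is the parsimony effect the abstract describes.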

european conference on computer vision | 2014

Learning to Rank Using High-Order Information

Puneet Kumar Dokania; Aseem Behl; C. V. Jawahar; M. Pawan Kumar

The problem of ranking a set of visual samples according to their relevance to a query plays an important role in computer vision. The traditional approach for ranking is to train a binary classifier such as a support vector machine (SVM). Binary classifiers suffer from two main deficiencies: (i) they do not optimize a ranking-based loss function, for example, the average precision (AP) loss; and (ii) they cannot incorporate high-order information such as the a priori correlation between the relevance of two visual samples (for example, two persons in the same image tend to perform the same action). We propose two novel learning formulations that allow us to incorporate high-order information for ranking. The first framework, called high-order binary SVM (HOB-SVM), allows for a structured input. The parameters of HOB-SVM are learned by minimizing a convex upper bound on a surrogate 0-1 loss function. In order to obtain the ranking of the samples that form the structured input, HOB-SVM sorts the samples according to their max-marginals. The second framework, called high-order average precision SVM (HOAP-SVM), also allows for a structured input and uses the same ranking criterion. However, in contrast to HOB-SVM, the parameters of HOAP-SVM are learned by minimizing a difference-of-convex upper bound on the AP loss. Using a standard, publicly available dataset for the challenging problem of action classification, we show that both HOB-SVM and HOAP-SVM outperform the baselines that ignore high-order information.
