Publication


Featured research published by Manik Varma.


International Journal of Computer Vision | 2005

A Statistical Approach to Texture Classification from Single Images

Manik Varma; Andrew Zisserman

We investigate texture classification from single images obtained under unknown viewpoint and illumination. A statistical approach is developed where textures are modelled by the joint probability distribution of filter responses. This distribution is represented by the frequency histogram of filter response cluster centres (textons). Recognition proceeds from single, uncalibrated images and the novelty here is that rotationally invariant filters are used and the filter response space is low dimensional. Classification performance is compared with the filter banks and methods of Leung and Malik [IJCV, 2001], Schmid [CVPR, 2001] and Cula and Dana [IJCV, 2004] and it is demonstrated that superior performance is achieved here. Classification results are presented for all 61 materials in the Columbia-Utrecht texture database. We also discuss the effects of various parameters on our classification algorithm, such as the choice of filter bank and rotational invariance, the size of the texton dictionary as well as the number of training images used. Finally, we present a method of reliably measuring relative orientation co-occurrence statistics in a rotationally invariant manner, and discuss whether incorporating such information can enhance the classifier's performance.
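The pipeline described in this abstract (filter responses, texton clustering, frequency histograms) can be illustrated in a few lines. The sketch below is a minimal stand-in, not the authors' implementation: the tiny Gaussian and Laplacian-of-Gaussian bank, the texton count of 32, and the use of scikit-learn's KMeans are illustrative assumptions in place of the paper's rotationally invariant filter bank and clustering setup.

```python
import numpy as np
from scipy.ndimage import gaussian_filter, gaussian_laplace
from sklearn.cluster import KMeans

def filter_responses(img, sigmas=(1, 2, 4)):
    """Per-pixel responses to a small, rotationally symmetric filter bank."""
    img = img.astype(float)
    responses = [gaussian_filter(img, s) for s in sigmas]      # Gaussians
    responses += [gaussian_laplace(img, s) for s in sigmas]    # Laplacians of Gaussian
    return np.stack(responses, axis=-1).reshape(-1, 2 * len(sigmas))

def learn_textons(train_imgs, k=32, seed=0):
    """Cluster pooled filter responses into k texton centres."""
    data = np.vstack([filter_responses(im) for im in train_imgs])
    return KMeans(n_clusters=k, random_state=seed, n_init=10).fit(data).cluster_centers_

def texton_histogram(img, textons):
    """Model an image as the frequency histogram of its nearest-texton labels."""
    resp = filter_responses(img)
    d2 = ((resp[:, None, :] - textons[None, :, :]) ** 2).sum(axis=-1)
    hist = np.bincount(d2.argmin(axis=1), minlength=len(textons)).astype(float)
    return hist / hist.sum()
```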


International Conference on Computer Vision | 2009

Multiple kernels for object detection

Andrea Vedaldi; Varun Gulshan; Manik Varma; Andrew Zisserman

Our objective is to obtain a state-of-the-art object category detector by employing a state-of-the-art image classifier to search for the object in all possible image sub-windows. We use the multiple kernel learning of Varma and Ray (ICCV 2007) to learn an optimal combination of exponential χ² kernels, each of which captures a different feature channel. Our features include the distribution of edges, dense and sparse visual words, and feature descriptors at different levels of spatial organization.
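As a rough illustration of the kernel machinery mentioned above, the snippet below computes an exponential chi-squared kernel over histogram features and combines several such per-channel kernels. The uniform channel weights are a placeholder: in the paper the weights are learned with the MKL formulation of Varma and Ray rather than fixed by hand.

```python
import numpy as np

def exp_chi2_kernel(X, Y, gamma=1.0, eps=1e-10):
    """K[i, j] = exp(-gamma * chi2(X[i], Y[j])) for rows of histogram features."""
    d = 0.5 * (((X[:, None, :] - Y[None, :, :]) ** 2)
               / (X[:, None, :] + Y[None, :, :] + eps)).sum(axis=-1)
    return np.exp(-gamma * d)

def combined_kernel(channels_X, channels_Y, weights=None, gamma=1.0):
    """Convex combination of per-channel exponential chi-squared kernels."""
    if weights is None:                       # uniform weights as a stand-in for learned MKL weights
        weights = np.full(len(channels_X), 1.0 / len(channels_X))
    return sum(w * exp_chi2_kernel(X, Y, gamma)
               for w, X, Y in zip(weights, channels_X, channels_Y))
```

The resulting Gram matrix can then be passed to a support vector machine, for example scikit-learn's SVC(kernel="precomputed").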


International Conference on Computer Vision | 2007

Learning The Discriminative Power-Invariance Trade-Off

Manik Varma; Debajyoti Ray

We investigate the problem of learning optimal descriptors for a given classification task. Many hand-crafted descriptors have been proposed in the literature for measuring visual similarity. Looking past initial differences, what really distinguishes one descriptor from another is the trade-off that it achieves between discriminative power and invariance. Since this trade-off must vary from task to task, no single descriptor can be optimal in all situations. Our focus, in this paper, is on learning the optimal trade-off for classification given a particular training set and prior constraints. The problem is posed in the kernel learning framework. We learn the optimal, domain-specific kernel as a combination of base kernels corresponding to base features which achieve different levels of trade-off (such as no invariance, rotation invariance, scale invariance, affine invariance, etc.). This leads to a convex optimisation problem with a unique global optimum which can be solved efficiently. The method is shown to achieve state-of-the-art performance on the UIUC textures, Oxford flowers and Caltech 101 datasets.
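The convex optimisation described in the abstract is not reproduced here; as a hedged stand-in, the sketch below simply scores a few candidate convex combinations of precomputed base kernels (one per descriptor with a different invariance level) on a validation split and keeps the best. The grid resolution, the SVM regularisation constant C and the assumption of precomputed Gram matrices are all illustrative choices.

```python
import numpy as np
from itertools import product
from sklearn.svm import SVC

def best_convex_combination(base_K_train, base_K_val, y_train, y_val, steps=5):
    """Grid-search convex weights over a handful of base kernels.

    base_K_train: list of (n_train, n_train) Gram matrices.
    base_K_val:   list of (n_val, n_train) Gram matrices (validation vs. training).
    """
    grid = np.linspace(0.0, 1.0, steps)
    best_w, best_acc = None, -1.0
    for w in product(grid, repeat=len(base_K_train)):
        if not np.isclose(sum(w), 1.0):
            continue                                   # keep the weights on the simplex
        K_tr = sum(wi * K for wi, K in zip(w, base_K_train))
        K_va = sum(wi * K for wi, K in zip(w, base_K_val))
        clf = SVC(kernel="precomputed", C=10.0).fit(K_tr, y_train)
        acc = clf.score(K_va, y_val)
        if acc > best_acc:
            best_w, best_acc = np.array(w), acc
    return best_w, best_acc
```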


IEEE Transactions on Pattern Analysis and Machine Intelligence | 2009

A Statistical Approach to Material Classification Using Image Patch Exemplars

Manik Varma; Andrew Zisserman

In this paper, we investigate material classification from single images obtained under unknown viewpoint and illumination. It is demonstrated that materials can be classified using the joint distribution of intensity values over extremely compact neighborhoods (starting from as small as 3×3 pixels square) and that this can outperform classification using filter banks with large support. It is also shown that the performance of filter banks is inferior to that of image patches with equivalent neighborhoods. We develop novel texton-based representations which are suited to modeling this joint neighborhood distribution for Markov random fields. The representations are learned from training images and then used to classify novel images (with unknown viewpoint and lighting) into texture classes. Three such representations are proposed and their performance is assessed and compared to that of filter banks. The power of the method is demonstrated by classifying 2,806 images of all 61 materials present in the Columbia-Utrecht database. The classification performance surpasses that of recent state-of-the-art filter bank-based classifiers such as Leung and Malik (IJCV 01), Cula and Dana (IJCV 04), and Varma and Zisserman (IJCV 05). We also benchmark performance by classifying all of the textures present in the UIUC, Microsoft Textile, and San Francisco outdoor data sets. We conclude with discussions on why features based on compact neighborhoods can correctly discriminate between textures with large global structure and why the performance of filter banks is not superior to that of the source image patches from which they were derived.
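To illustrate the patch-exemplar representation, the snippet below extracts every n x n neighbourhood of a grayscale image as a flattened feature vector; these vectors can then be fed to the same texton clustering, histogram and chi-squared steps sketched for the earlier texture-classification papers. The patch size and the mean-removal plus unit-norm contrast normalisation are illustrative assumptions, not necessarily the exact normalisation used in the paper.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def patch_features(img, n=3):
    """Every n x n neighbourhood of a grayscale image, flattened to a vector."""
    patches = sliding_window_view(img.astype(float), (n, n)).reshape(-1, n * n)
    patches = patches - patches.mean(axis=1, keepdims=True)     # remove the local mean
    norms = np.linalg.norm(patches, axis=1, keepdims=True)
    return patches / np.maximum(norms, 1e-10)                   # simple contrast normalisation
```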


Computer Vision and Pattern Recognition | 2003

Texture classification: are filter banks necessary?

Manik Varma; Andrew Zisserman

We question the role that large scale filter banks have traditionally played in texture classification. It is demonstrated that textures can be classified using the joint distribution of intensity values over extremely compact neighborhoods (starting from as small as 3 × 3 pixels square), and that this outperforms classification using filter banks with large support. We develop a novel texton based representation, which is suited to modeling this joint neighborhood distribution for MRFs. The representation is learnt from training images, and then used to classify novel images (with unknown viewpoint and lighting) into texture classes. The power of the method is demonstrated by classifying over 2800 images of all 61 textures present in the Columbia-Utrecht database. The classification performance surpasses that of recent state-of-the-art filter bank based classifiers such as Leung & Malik, Cula & Dana, and Varma & Zisserman.
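The final classification step shared by these texture papers, assigning a novel image to the class of the nearest texton histogram under the chi-squared distance, can be sketched as follows. Representing each class by a set of model histograms and taking a plain nearest-neighbour decision is a simplification of the paper's full set of per-training-image models.

```python
import numpy as np

def chi_squared(h1, h2, eps=1e-10):
    """Chi-squared distance between two normalised histograms."""
    return 0.5 * np.sum((h1 - h2) ** 2 / (h1 + h2 + eps))

def classify(query_hist, model_hists, model_labels):
    """Assign the texture class of the nearest model histogram."""
    distances = [chi_squared(query_hist, m) for m in model_hists]
    return model_labels[int(np.argmin(distances))]
```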


European Conference on Computer Vision | 2002

Classifying Images of Materials: Achieving Viewpoint and Illumination Independence

Manik Varma; Andrew Zisserman

In this paper we present a new approach to material classification under unknown viewpoint and illumination. Our texture model is based on the statistical distribution of clustered filter responses. However, unlike previous 3D texton representations, we use rotationally invariant filters and cluster in an extremely low dimensional space. Having built a texton dictionary, we present a novel method of classifying a single image without requiring any a priori knowledge about the viewing or illumination conditions under which it was photographed. We argue that using rotationally invariant filters while clustering in such a low dimensional space improves classification performance and demonstrate this claim with results on all 61 textures in the Columbia-Utrecht database. We then proceed to show how texture models can be further extended by compensating for viewpoint changes using weak isotropy. The new clustering and classification methods are compared to those of Leung and Malik (ICCV 1999), Schmid (CVPR 2001) and Cula and Dana (CVPR 2001), which are the current state-of-the-art approaches.
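One simple way to see the rotation-invariance idea is to collapse oriented filter responses by taking the maximum magnitude over orientations, so the recorded response no longer depends on how the texture is rotated. The sketch below does this for derivative-of-Gaussian edge filters; the filter choice, scale and number of orientations are illustrative and not the paper's exact filter bank.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def max_over_orientations(img, sigma=2.0, n_orientations=6):
    """Max |oriented derivative-of-Gaussian response| over orientations, per pixel."""
    img = img.astype(float)
    gy = gaussian_filter(img, sigma, order=(1, 0))      # derivative along rows of the smoothed image
    gx = gaussian_filter(img, sigma, order=(0, 1))      # derivative along columns
    thetas = np.linspace(0.0, np.pi, n_orientations, endpoint=False)
    oriented = [np.abs(np.cos(t) * gx + np.sin(t) * gy) for t in thetas]
    return np.max(oriented, axis=0)                     # rotation-invariant edge response
```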


International Conference on Machine Learning | 2009

More generality in efficient multiple kernel learning

Manik Varma; Bodla Rakesh Babu

Recent advances in Multiple Kernel Learning (MKL) have positioned it as an attractive tool for tackling many supervised learning tasks. The development of efficient gradient descent based optimization schemes has made it possible to tackle large scale problems. Simultaneously, MKL based algorithms have achieved very good results on challenging real world applications. Yet, despite their successes, MKL approaches are limited in that they focus on learning a linear combination of given base kernels. In this paper, we observe that existing MKL formulations can be extended to learn general kernel combinations subject to general regularization. This can be achieved while retaining all the efficiency of existing large scale optimization algorithms. To highlight the advantages of generalized kernel learning, we tackle feature selection problems on benchmark vision and UCI databases. It is demonstrated that the proposed formulation can lead to better results not only as compared to traditional MKL but also as compared to state-of-the-art wrapper and filter methods for feature selection.
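A hedged sketch of the kind of generalised kernel parameterisation discussed above: a product of per-feature RBF kernels whose non-negative weights act as a soft feature selector, since driving a weight to zero removes that feature's influence. The paper learns such weights by gradient descent on a regularised SVM objective; that optimisation loop is omitted here.

```python
import numpy as np

def product_rbf_kernel(X, Y, w):
    """K[i, j] = exp(-sum_d w_d * (X[i, d] - Y[j, d])**2), with w_d >= 0."""
    diff2 = (X[:, None, :] - Y[None, :, :]) ** 2          # shape (n_x, n_y, d)
    return np.exp(-(diff2 * w[None, None, :]).sum(axis=-1))

# Example: zeroing a weight makes the kernel blind to that (irrelevant) feature.
X = np.random.default_rng(0).normal(size=(5, 3))
w = np.array([1.0, 0.5, 0.0])                              # feature 2 is effectively dropped
K = product_rbf_kernel(X, X, w)
```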


International Conference on Computer Vision | 2007

Locally Invariant Fractal Features for Statistical Texture Classification

Manik Varma; Rahul Garg

We address the problem of developing discriminative, yet invariant, features for texture classification. Texture variations due to changes in scale are amongst the hardest to handle. One of the most successful methods of dealing with such variations is based on choosing interest points and selecting their characteristic scales [Lazebnik et al. PAMI 2005]. However, selecting a characteristic scale can be unstable for many textures. Furthermore, the reliance on an interest point detector and the inability to evaluate features densely can be serious limitations. Fractals present a mathematically well founded alternative to dealing with the problem of scale. However, they have not become popular as texture features due to their lack of discriminative power. This is primarily because: (a) fractal based classification methods have avoided statistical characterisations of textures (which is essential for accurate analysis) by using global features; and (b) fractal dimension features are unable to distinguish between key texture primitives such as edges, corners and uniform regions. In this paper, we overcome these drawbacks and develop local fractal features that are evaluated densely. The features are robust as they do not depend on choosing interest points or characteristic scales. Furthermore, it is shown that the local fractal dimension is invariant to local bi-Lipschitz transformations whereas its extension is able to correctly distinguish between fundamental texture primitives. Textures are characterised statistically by modelling the full joint PDF of these features. This allows us to develop a texture classification framework which is discriminative, robust and achieves state-of-the-art performance as compared to affine invariant and fractal based methods.
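The notion of a densely evaluated local fractal dimension can be illustrated as the per-pixel slope of log(local intensity mass) against log(radius). The measure used here (a plain intensity sum over square windows), the set of radii and the least-squares fit are illustrative assumptions rather than the paper's exact construction.

```python
import numpy as np
from scipy.ndimage import uniform_filter

def local_fractal_dimension(img, radii=(1, 2, 4, 8)):
    """Per-pixel slope of log(local intensity mass) vs log(radius)."""
    img = img.astype(float) + 1e-10                       # assumes non-negative intensities
    log_r = np.log(np.asarray(radii, dtype=float))
    log_mu = []
    for r in radii:
        size = 2 * r + 1
        mass = uniform_filter(img, size=size) * size * size   # sum of intensities in the window
        log_mu.append(np.log(mass))
    log_mu = np.stack(log_mu, axis=0)                     # shape (n_radii, H, W)
    # Least-squares slope of log_mu against log_r at every pixel.
    lr = log_r - log_r.mean()
    slope = (lr[:, None, None] * (log_mu - log_mu.mean(axis=0))).sum(axis=0) / (lr ** 2).sum()
    return slope
```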


International World Wide Web Conferences | 2011

Learning to re-rank: query-dependent image re-ranking using click data

Vidit Jain; Manik Varma

Our objective is to improve the performance of keyword based image search engines by re-ranking their original results. To this end, we address three limitations of existing search engines in this paper. First, there is no straightforward, fully automated way of going from textual queries to visual features. Image search engines therefore primarily rely on static and textual features for ranking. Visual features are mainly used for secondary tasks such as finding similar images. Second, image rankers are trained on query-image pairs labeled with relevance judgments determined by human experts. Such labels are well known to be noisy due to various factors including ambiguous queries, unknown user intent and subjectivity in human judgments. This leads to learning a sub-optimal ranker. Finally, a static ranker is typically built to handle disparate user queries. The ranker is therefore unable to adapt its parameters to suit the query at hand which again leads to sub-optimal results. We demonstrate that all of these problems can be mitigated by employing a re-ranking algorithm that leverages aggregate user click data. We hypothesize that images clicked in response to a query are mostly relevant to the query. We therefore re-rank the original search results so as to promote images that are likely to be clicked to the top of the ranked list. Our re-ranking algorithm employs Gaussian Process regression to predict the normalized click count for each image, and combines it with the original ranking score. Our approach is shown to significantly boost the performance of the Bing image search engine on a wide range of tail queries.
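As a small, hedged sketch of the re-ranking idea, the function below fits a Gaussian Process regressor to the images of a query that have click data, predicts a normalised click count for every image, and blends that prediction with the original ranking score. The RBF-plus-noise kernel, the linear blend with weight alpha and the generic feature vectors are illustrative choices, not the paper's exact model or features.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel

def rerank(features, original_scores, clicked_idx, clicks, alpha=0.5):
    """Re-order one query's results by blending the original score with a GP click prediction."""
    gp = GaussianProcessRegressor(kernel=RBF(length_scale=1.0) + WhiteKernel(1e-2),
                                  normalize_y=True)
    gp.fit(features[clicked_idx], clicks)                 # train only on images with click data
    predicted = gp.predict(features)                      # predicted normalised click counts
    blended = alpha * original_scores + (1 - alpha) * predicted
    return np.argsort(-blended)                           # indices in the new ranked order
```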


International World Wide Web Conferences | 2013

Multi-label learning with millions of labels: recommending advertiser bid phrases for web pages

Rahul Agrawal; Archit Gupta; Yashoteja Prabhu; Manik Varma

Recommending phrases from web pages for advertisers to bid on against search engine queries is an important research problem with direct commercial impact. Most approaches have found it infeasible to determine the relevance of all possible queries to a given ad landing page and have focussed on making recommendations from a small set of phrases extracted (and expanded) from the page using NLP and ranking based techniques. In this paper, we eschew this paradigm, and demonstrate that it is possible to efficiently predict the relevant subset of queries from a large set of monetizable ones by posing the problem as a multi-label learning task with each query being represented by a separate label. We develop Multi-label Random Forests to tackle problems with millions of labels. Our proposed classifier has prediction costs that are logarithmic in the number of labels and can make predictions in a few milliseconds using 10 GB of RAM. We demonstrate that it is possible to generate training data for our classifier automatically from click logs without any human annotation or intervention. We train our classifier on tens of millions of labels, features and training points in less than two days on a thousand-node cluster. We develop a sparse semi-supervised multi-label learning formulation to deal with training set biases and noisy labels harvested automatically from the click logs. This formulation is used to infer a belief in the state of each label for each training ad and the random forest classifier is extended to train on these beliefs rather than the given labels. Experiments reveal significant gains over ranking and NLP based techniques on a large test set of 5 million ads using multiple metrics.
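At toy scale, the multi-label formulation (each candidate bid phrase as a separate binary label) can be exercised with an off-the-shelf forest, as in the hedged sketch below. This is only an illustration: scikit-learn's stock random forest has neither the logarithmic prediction cost nor the semi-supervised label beliefs of the paper's Multi-label Random Forests, and the random data stands in for real page features and click-log labels.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 20))                        # toy page feature vectors
Y = (rng.random(size=(200, 50)) < 0.05).astype(int)   # sparse binary label matrix (toy bid phrases)

forest = RandomForestClassifier(n_estimators=50, random_state=0)
forest.fit(X, Y)                                      # scikit-learn trains one forest over all labels
scores = forest.predict_proba(X[:1])                  # list of per-label probability arrays
```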

Collaboration


Dive into Manik Varma's collaborations.

Top Co-Authors

Ashish Kumar (University of California)

Kush Bhatia (University of California)

Himanshu Jain (Indian Institute of Technology Delhi)

Yashoteja Prabhu (Indian Institute of Technology Delhi)

Alex Efros (University of California)