Vinay P. Namboodiri
Indian Institute of Technology Kanpur
Network
Latest external collaboration on country level. Dive into details by clicking on the dots.
Publication
Featured researches published by Vinay P. Namboodiri.
indian conference on computer vision, graphics and image processing | 2007
Vinay P. Namboodiri; Subhasis Chaudhuri
An intrinsic property of real aperture imaging has been that the observations tend to be defocused. This artifact has been used in an innovative manner by researchers for depth estimation, since the amount of defocus varies with varying depth in the scene. There have been various methods to model the defocus blur. We model the defocus process using the model of diffusion of heat. The diffusion process has been traditionally used in low level vision problems like smoothing, segmentation and edge detection. In this paper a novel application of the diffusion principle is made for generating the defocus space of the scene. The defocus space is the set of all possible observations for a given scene that can be captured using a physical lens system. Using the notion of defocus space we estimate the depth in the scene and also generate the corresponding fully focused equivalent pin-hole image. The algorithm described here also brings out the equivalence of the two modalities, viz. depth from focus and depth from defocus for structure recovery.
computer vision and pattern recognition | 2008
Vinay P. Namboodiri; Subhasis Chaudhuri
In this paper we investigate the challenging problem of recovering the depth layers in a scene from a single defocused observation. The problem is definitely solvable if there are multiple observations. In this paper we show that one can perceive the depth in the scene even from a single observation. We use the inhomogeneous reverse heat equation to obtain an estimate of the blur, thereby preserving the depth information characterized by the defocus. However, the reverse heat equation, due to its parabolic nature, is divergent. We stabilize the reverse heat equation by considering the gradient degeneration as an effective stopping criterion. The amount of (inverse) diffusion is actually a measure of relative depth. Because of ill-posedness we propose a graph-cuts based method for inferring the depth in the scene using the amount of diffusion as a data likelihood and a smoothness condition on the depth in the scene. The method is verified experimentally on a varied set of test cases.
international conference on image processing | 2008
Vinay P. Namboodiri; Subhasis Chaudhuri; Sunil Hadap
In the area of depth estimation from images an interesting approach has been structure recovery from defocus cue. Towards this end, there have been a number of approaches [4,6]. Here we propose a technique to estimate the regularized depth from defocus using diffusion. The coefficient of the diffusion equation is modeled using a pair-wise Markov random field (MRF) ensuring spatial regularization to enhance the robustness of the depth estimated. This framework is solved efficiently using a graph-cuts based techniques. The MRF representation is enhanced by incorporating a smoothness prior that is obtained from a graph based segmentation of the input images. The method is demonstrated on a number of data sets and its performance is compared with state of the art techniques.
british machine vision conference | 2011
Hakan Bilen; Vinay P. Namboodiri; Luc Van Gool
In this paper we propose a generic framework to incorporate unobserved auxiliary information for classifying objects and actions. This framework allows us to explicitly account for localisation and alignment of representations for generic object and action classes as latent variables. We approach this problem in the discriminative setting as learning a max-margin classifier that infers the class label along with the latent variables. Through this paper we make the following contributions a) We provide a method for incorporating latent variables into object and action classification b) We specifically account for the presence of an explicit class related subregion which can include foreground and/or background. c) We explore a way to learn a better classifier by iterative expansion of the latent parameter space. We demonstrate the performance of our approach by rigorous experimental evaluation on a number of standard object and action recognition datasets.
International Journal of Computer Vision | 2014
Hakan Bilen; Vinay P. Namboodiri; Luc Van Gool
In this paper we propose a generic framework to incorporate unobserved auxiliary information for classifying objects and actions. This framework allows us to automatically select a bounding box and its quadrants from which best to extract features. These spatial subdivisions are learnt as latent variables. The paper is an extended version of our earlier work Bilen et al. (Proceedings of The British Machine Vision Conference, 2011), complemented with additional ideas, experiments and analysis. We approach the classification problem in a discriminative setting, as learning a max-margin classifier that infers the class label along with the latent variables. Through this paper we make the following contributions: (a) we provide a method for incorporating latent variables into object and action classification; (b) these variables determine the relative focus on foreground versus background information that is taken account of; (c) we design an objective function to more effectively learn in unbalanced data sets; (d) we learn a better classifier by iterative expansion of the latent parameter space. We demonstrate the performance of our approach through experimental evaluation on a number of standard object and action recognition data sets.
workshop on applications of computer vision | 2011
Hakan Bilen; Vinay P. Namboodiri; Luc Van Gool
We address the problem of recognizing actions in reallife videos. Space-time interest point-based approaches have been widely prevalent towards solving this problem. In contrast, more spatially extended features such as regions have not been so popular. The reason is, any local region based approach requires the motion flow information for a specific region to be collated temporally. This is challenging as the local regions are deformable and not well delineated from the surroundings. In this paper we address this issue by using robust tracking of regions and we show that it is possible to obtain region descriptors for classification of actions. This paper lays the groundwork for further investigation into region based approaches. Through this paper we make the following contributions a) We advocate identification of salient regions based on motion segmentation b) We adopt a state-of-the art tracker for robust tracking of the identified regions rather than using isolated space-time blocks c) We propose optical flow based region descriptors to encode the extracted trajectories in piece-wise blocks. We demonstrate the performance of our system on real-world data sets.
Pattern Recognition | 2007
Rajashekhar; Subhasis Chaudhuri; Vinay P. Namboodiri
In this paper we propose a geometry-based image retrieval scheme that makes use of projectively invariant features. Cross-ratio (CR) is an invariant feature under projective transformations for collinear points. We compute the CRs of point sets in quadruplets and the CR histogram is used as the feature for retrieval purposes. Being a geometric feature, it allows us to retrieve similar images irrespective of view point and illumination changes. We can retrieve the same building even if the facade has undergone a fresh coat of paints! Color and textural features can also be included, if desired. Experimental results show a favorably very good retrieval accuracy when tested on an image database of size 4000. The method is very effective in retrieving images having man-made objects rich in polygonal structures like buildings, rail tracks, etc.
Computer Graphics Forum | 2017
Rahul Arora; Ishan Darolia; Vinay P. Namboodiri; Karan Singh; Adrien Bousseau
A hallmark of early stage design is a number of quick‐and‐dirty sketches capturing design inspirations, model variations and alternate viewpoints of a visual concept. We present SketchSoup, a workflow that allows designers to explore the design space induced by such sketches. We take an unstructured collection of drawings as input, along with a small number of user‐provided correspondences as input. We register them using a multi‐image matching algorithm, and present them as a 2D interpolation space. By morphing sketches in this space, our approach produces plausible visualizations of shape and viewpoint variations despite the presence of sketch distortions that would prevent standard camera calibration and 3D reconstruction. In addition, our interpolated sketches can serve as inspiration for further drawings, which feed back into the design space as additional image inputs. SketchSoup thus fills a significant gap in the early ideation stage of conceptual design by allowing designers to make better informed choices before proceeding to more expensive 3D modelling and prototyping. From a technical standpoint, we describe an end‐to‐end system that judiciously combines and adapts various image processing techniques to the drawing domain—where the images are dominated not by colour, shading and texture, but by sketchy stroke contours.
computer vision and pattern recognition | 2014
Hakan Bilen; Marco Pedersoli; Vinay P. Namboodiri; Tinne Tuytelaars; Luc Van Gool
In classification of objects substantial work has gone into improving the low level representation of an image by considering various aspects such as different features, a number of feature pooling and coding techniques and considering different kernels. Unlike these works, in this paper, we propose to enhance the semantic representation of an image. We aim to learn the most important visual components of an image and how they interact in order to classify the objects correctly. To achieve our objective, we propose a new latent SVM model for category level object classification. Starting from image-level annotations, we jointly learn the object class and its context in terms of spatial location (where) and appearance (what). Furthermore, to regularize the complexity of the model we learn the spatial and co-occurrence relations between adjacent regions, such that unlikely configurations are penalized. Experimental results demonstrate that the proposed method can consistently enhance results on the challenging Pascal VOC dataset in terms of classification and weakly supervised detection. We also show how semantic representation can be exploited for finding similar content.
international conference on image processing | 2004
S.C. Rajashekahar; Vinay P. Namboodiri
We propose an image retrieval scheme based on projectively invariant features. Since cross-ratio is the fundamental invariant feature under projective transformations for points, we use that as the basic feature parameter. We compute the cross-ratios of point sets in quadruplets and a discrete representation of the distribution of the cross-ratio is obtained from the computed values. The distribution is used as the feature for retrieval purposes. The method is very effective in retrieving images, like buildings, having similar planar 3D structures.