Publication


Featured research published by Pushmeet Kohli.


User Interface Software and Technology | 2011

KinectFusion: real-time 3D reconstruction and interaction using a moving depth camera

Shahram Izadi; David Kim; Otmar Hilliges; David Molyneaux; Richard A. Newcombe; Pushmeet Kohli; Jamie Shotton; Steve Hodges; Dustin Freeman; Andrew J. Davison; Andrew W. Fitzgibbon

KinectFusion enables a user holding and moving a standard Kinect camera to rapidly create detailed 3D reconstructions of an indoor scene. Only the depth data from Kinect is used to track the 3D pose of the sensor and reconstruct geometrically precise 3D models of the physical scene in real time. The capabilities of KinectFusion, as well as the novel GPU-based pipeline, are described in full. Uses of the core system for low-cost handheld scanning, geometry-aware augmented reality, and physics-based interactions are shown. Novel extensions to the core GPU pipeline demonstrate object segmentation and user interaction directly in front of the sensor, without degrading camera tracking or reconstruction. These extensions are used to enable real-time multi-touch interactions anywhere, allowing any planar or non-planar reconstructed physical surface to be appropriated for touch.
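
At the heart of KinectFusion-style reconstruction is volumetric fusion of depth maps into a truncated signed distance function (TSDF): every voxel stores a truncated distance to the nearest surface plus a confidence weight, updated as a running weighted average over incoming frames. The CPU sketch below illustrates only that fusion step under simplifying assumptions (axis-aligned volume at the origin, camera pose and intrinsics given); the GPU pipeline and the camera tracking described in the paper are not shown, and all names here are illustrative:

    import numpy as np

    def fuse_depth_frame(tsdf, weight, depth, pose_w2c, K, voxel_size, trunc=0.03):
        """Integrate one depth frame into a TSDF volume via a weighted running average."""
        dims = tsdf.shape
        # World coordinates of every voxel centre (volume axis-aligned at the origin).
        ii, jj, kk = np.meshgrid(*(np.arange(d) for d in dims), indexing="ij")
        pts_w = np.stack([ii, jj, kk], axis=-1).reshape(-1, 3).astype(np.float64) * voxel_size
        # Transform voxel centres into the camera frame and project with intrinsics K.
        R, t = pose_w2c[:3, :3], pose_w2c[:3, 3]
        pts_c = pts_w @ R.T + t
        z = pts_c[:, 2]
        z_safe = np.where(z > 0, z, 1.0)
        u = np.rint(K[0, 0] * pts_c[:, 0] / z_safe + K[0, 2]).astype(int)
        v = np.rint(K[1, 1] * pts_c[:, 1] / z_safe + K[1, 2]).astype(int)
        h, w = depth.shape
        valid = (z > 0) & (u >= 0) & (u < w) & (v >= 0) & (v < h)
        d = np.zeros_like(z)
        d[valid] = depth[v[valid], u[valid]]
        valid &= d > 0
        # Truncated signed distance along the viewing ray, normalised to [-1, 1].
        sdf = np.clip(d - z, -trunc, trunc) / trunc
        # Weighted running-average update of the observed voxels (arrays assumed contiguous).
        f, wgt = tsdf.ravel(), weight.ravel()
        f[valid] = (f[valid] * wgt[valid] + sdf[valid]) / (wgt[valid] + 1.0)
        wgt[valid] += 1.0
        return tsdf, weight

For instance, a 2.56 m cube at 1 cm resolution would start from tsdf = np.ones((256, 256, 256)) and weight = np.zeros((256, 256, 256)), with one call per frame once the tracking stage has estimated the camera pose.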


European Conference on Computer Vision | 2012

Indoor segmentation and support inference from RGBD images

Nathan Silberman; Derek Hoiem; Pushmeet Kohli; Rob Fergus

We present an approach to interpret the major surfaces, objects, and support relations of an indoor scene from an RGBD image. Most existing work ignores physical interactions or is applied only to tidy rooms and hallways. Our goal is to parse typical, often messy, indoor scenes into floor, walls, supporting surfaces, and object regions, and to recover support relationships. One of our main interests is to better understand how 3D cues can best inform a structured 3D interpretation. We also contribute a novel integer programming formulation to infer physical support relations. We offer a new dataset of 1449 RGBD images, capturing 464 diverse indoor scenes, with detailed annotations. Our experiments demonstrate our ability to infer support relations in complex scenes and verify that our 3D scene cues and inferred support lead to better object segmentation.
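
As a rough illustration of the kind of integer program involved, support inference can be posed as an assignment problem: binary variables pick one supporter (another region or the floor) and one structure class for every region, subject to consistency constraints. This is an assumption-laden schematic, not the paper's exact formulation; the real costs and constraints are derived from the learned appearance and 3D cues described above:

    \begin{aligned}
    \min_{S,\, M} \quad & \sum_{i,j} c_{ij}\, S_{ij} \;+\; \sum_{i,k} d_{ik}\, M_{ik} \\
    \text{s.t.} \quad & \sum_j S_{ij} = 1 \ \ \forall i, \qquad \sum_k M_{ik} = 1 \ \ \forall i, \\
    & S_{ij},\, M_{ik} \in \{0, 1\}, \ \ \text{plus consistency constraints linking } S \text{ and } M,
    \end{aligned}

where S_{ij} = 1 means region i is supported by region j (or the floor) and M_{ik} = 1 assigns structure class k (floor, wall, supporting surface, or object) to region i.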


Computer Vision and Pattern Recognition | 2008

Robust higher order potentials for enforcing label consistency

Pushmeet Kohli; Lubor Ladicky; Philip H. S. Torr

This paper proposes a novel framework for labelling problems which is able to combine multiple segmentations in a principled manner. Our method is based on higher order conditional random fields and uses potentials defined on sets of pixels (image segments) generated using unsupervised segmentation algorithms. These potentials enforce label consistency in image regions and can be seen as a generalization of the commonly used pairwise contrast-sensitive smoothness potentials. The higher order potential functions used in our framework take the form of the Robust P^n model and are more general than the P^n Potts model recently proposed by Kohli et al. We prove that the optimal swap and expansion moves for energy functions composed of these potentials can be computed by solving an st-mincut problem. This enables the use of powerful graph cut based move making algorithms for performing inference in the framework. We test our method on the problem of multi-class object segmentation by augmenting the conventional CRF used for object segmentation with higher order potentials defined on image regions. Experiments on challenging data sets show that integration of higher order potentials quantitatively and qualitatively improves results, leading to much better definition of object boundaries. We believe that this method can be used to yield similar improvements for many other labelling problems.
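
For reference, the Robust P^n potential over a segment (clique) c is commonly quoted in roughly the following truncated-linear form (per-label weights and offsets omitted for brevity; the exact parameterisation is the one given in the paper):

    \psi_c(\mathbf{x}_c) =
    \begin{cases}
      N(\mathbf{x}_c)\,\dfrac{\gamma_{\max}}{Q} & \text{if } N(\mathbf{x}_c) \le Q, \\
      \gamma_{\max} & \text{otherwise,}
    \end{cases}

where N(\mathbf{x}_c) counts the pixels in segment c that disagree with its dominant label, Q controls how many disagreements are tolerated before the penalty saturates, and \gamma_{\max} is the maximum penalty. In the limit Q \to 0 this reduces to the strict P^n Potts model, in which any disagreement incurs the full cost.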


International Conference on Computer Vision | 2009

Associative hierarchical CRFs for object class image segmentation

Lubor Ladicky; Chris Russell; Pushmeet Kohli; Philip H. S. Torr

Most methods for object class segmentation are formulated as a labelling problem over a single choice of quantisation of an image space: pixels, segments, or groups of segments. It is well known that each quantisation has its fair share of pros and cons, and the existence of a common optimal quantisation level suitable for all object categories is highly unlikely. Motivated by this observation, we propose a hierarchical random field model that allows integration of features computed at different levels of the quantisation hierarchy. MAP inference in this model can be performed efficiently using powerful graph cut based move making algorithms. Our framework generalises much of the previous work based on pixels or segments. We evaluate its efficiency on some of the most challenging datasets for object class segmentation and show that it obtains state-of-the-art results.
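
Schematically, the associative hierarchical energy couples pixel variables to auxiliary variables defined over segments (and, by stacking, segments to groups of segments), so that each level of the quantisation hierarchy contributes its own terms. The two-level sketch below is only meant to convey the structure; the exact inter-layer potentials are those defined in the paper:

    E(\mathbf{x}, \mathbf{x}^{(1)}) \;=\; \sum_i \psi_i(x_i)
      \;+\; \sum_{(i,j)} \psi_{ij}(x_i, x_j)
      \;+\; \sum_c \phi_c\bigl(x^{(1)}_c\bigr)
      \;+\; \sum_c \sum_{i \in c} \phi_{ic}\bigl(x_i, x^{(1)}_c\bigr),

where x_i are pixel labels, x^{(1)}_c is the auxiliary label of segment c, and \phi_{ic} penalises disagreement between a segment variable and its constituent pixels. Because the inter-layer terms are associative, MAP inference remains amenable to graph cut based move making algorithms.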


IEEE Transactions on Pattern Analysis and Machine Intelligence | 2013

Efficient Human Pose Estimation from Single Depth Images

Jamie Shotton; Ross B. Girshick; Andrew W. Fitzgibbon; Toby Sharp; Mat Cook; Mark J. Finocchio; Richard Moore; Pushmeet Kohli; Antonio Criminisi; Alex Aben-Athar Kipman; Andrew Blake

We describe two new approaches to human pose estimation. Both can quickly and accurately predict the 3D positions of body joints from a single depth image without using any temporal information. The key to both approaches is the use of a large, realistic, and highly varied synthetic set of training images. This allows us to learn models that are largely invariant to factors such as pose, body shape, field-of-view cropping, and clothing. Our first approach employs an intermediate body parts representation, designed so that an accurate per-pixel classification of the parts will localize the joints of the body. The second approach instead directly regresses the positions of body joints. By using simple depth pixel comparison features and parallelizable decision forests, both approaches can run at super-real-time rates on consumer hardware. Our evaluation investigates many aspects of our methods and compares the approaches to each other and to the state of the art. Results on silhouettes suggest broader applicability to other imaging modalities.
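
The depth comparison features are simple enough to sketch directly: each feature compares the depth at two offsets around a pixel, with the offsets scaled by the inverse depth at that pixel so the response is roughly invariant to how far the person stands from the camera. A minimal NumPy version follows; the background handling and exact conventions here are illustrative rather than taken from the paper:

    import numpy as np

    def depth_feature(depth, x, y, u, v, background=1e6):
        """Depth-invariant comparison f(I, p) = d(p + u / d(p)) - d(p + v / d(p))."""
        d_p = float(depth[y, x])
        if d_p <= 0:  # invalid reading at the centre pixel
            return 0.0
        def probe(offset):
            # Scale the pixel offset by 1 / d(p); probes that leave the image or hit
            # missing depth read a large constant so they behave like background.
            px = int(round(x + offset[0] / d_p))
            py = int(round(y + offset[1] / d_p))
            if 0 <= px < depth.shape[1] and 0 <= py < depth.shape[0] and depth[py, px] > 0:
                return float(depth[py, px])
            return background
        return probe(u) - probe(v)

At each split node of a decision tree one such offset pair (u, v) and a threshold are selected during training; since every feature touches at most two pixels, evaluating a whole forest for every pixel of the image parallelises trivially, which is what enables super-real-time operation.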


International Conference on Computer Vision | 2011

Efficient regression of general-activity human poses from depth images

Ross B. Girshick; Jamie Shotton; Pushmeet Kohli; Antonio Criminisi; Andrew W. Fitzgibbon

We present a new approach to general-activity human pose estimation from depth images, building on Hough forests. We extend existing techniques in several ways: real-time prediction of multiple 3D joints, explicit learning of voting weights, vote compression to allow larger training sets, and a comparison of several decision-tree training objectives. Key aspects of our work include regression directly from the raw depth image, without the use of an arbitrary intermediate representation; applicability to general motions, not constrained to particular activities; and the ability to localize occluded as well as visible body joints. Experimental results demonstrate that our method produces state-of-the-art results on several data sets, including the challenging MSRC-5000 pose estimation test set, at a speed of about 200 frames per second. Results on silhouettes suggest broader applicability to other imaging modalities.
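
As an illustration of the voting stage, each pixel casts weighted 3D votes for every joint, and the votes are pooled into a joint hypothesis. The sketch below uses a few mean-shift-style iterations over the pooled votes; the learned weights, vote compression, and exact aggregation procedure in the paper differ in detail:

    import numpy as np

    def aggregate_joint_votes(votes, weights, bandwidth=0.05, iters=10):
        """Pool weighted 3D votes (shape (N, 3)) for one joint into a single estimate."""
        mode = np.average(votes, axis=0, weights=weights)  # start at the weighted mean
        for _ in range(iters):
            # Gaussian-weighted shift of the current mode towards dense, confident votes.
            d2 = np.sum((votes - mode) ** 2, axis=1)
            k = weights * np.exp(-d2 / (2.0 * bandwidth ** 2))
            if k.sum() <= 0:
                break
            mode = (votes * k[:, None]).sum(axis=0) / k.sum()
        return mode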


European Conference on Computer Vision | 2010

Graph cut based inference with co-occurrence statistics

Lubor Ladicky; Chris Russell; Pushmeet Kohli; Philip H. S. Torr

Markov and conditional random fields (CRFs) used in computer vision typically model only local interactions between variables, as this is computationally tractable. In this paper we consider a class of global potentials defined over all variables in the CRF. We show how they can be readily optimised using standard graph cut algorithms at little extra expense compared to a standard pairwise field. This result can be directly used for the problem of class-based image segmentation, which has seen increasing recent interest within computer vision. Here the aim is to assign a label to each pixel of a given image from a set of possible object classes. Typically these methods use random fields to model local interactions between pixels or super-pixels. One of the cues that helps recognition is global object co-occurrence statistics: a measure of which classes (such as chair or motorbike) are likely to occur in the same image together. Several approaches have been proposed to exploit this property, but all of them suffer from different limitations and typically carry a high computational cost, preventing their application on large images. We find that the new model we propose produces an improvement in the labelling compared to just using a pairwise model.
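
The key construction is a standard pairwise CRF energy augmented with a single global term that depends only on which labels appear in the labelling. The schematic below follows the idea in the abstract; the precise form of the co-occurrence cost C and the conditions under which graph cut moves remain applicable are as described in the paper:

    E(\mathbf{x}) \;=\; \sum_i \psi_i(x_i) \;+\; \sum_{(i,j)} \psi_{ij}(x_i, x_j) \;+\; C\bigl(L(\mathbf{x})\bigr),
    \qquad L(\mathbf{x}) = \{\, l \;:\; \exists\, i,\ x_i = l \,\},

where L(\mathbf{x}) is the set of object classes present in the labelling and C assigns a cost to each such set based on how often those classes co-occur in training images, so that implausible combinations of labels are discouraged image-wide.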


International Conference on Computer Vision | 2009

Image segmentation with a bounding box prior

Victor S. Lempitsky; Pushmeet Kohli; Carsten Rother; Toby Sharp

A user-provided object bounding box is a simple and popular interaction paradigm considered by many existing interactive image segmentation frameworks. However, these frameworks tend to exploit the provided bounding box merely to exclude its exterior from consideration and sometimes to initialize the energy minimization. In this paper, we discuss how the bounding box can be further used to impose a powerful topological prior, which prevents the solution from excessive shrinking and ensures that the user-provided box bounds the segmentation in a sufficiently tight way. The prior is expressed using hard constraints incorporated into the global energy minimization framework, leading to an NP-hard integer program. We then investigate possible optimization strategies, including linear relaxation as well as a new graph cut algorithm called pinpointing. The latter can be used either as a rounding method for the fractional LP solution, which is provably better than thresholding-based rounding, or as a fast standalone heuristic. We evaluate the proposed algorithms on a publicly available dataset and demonstrate the practical benefits of the new prior both qualitatively and quantitatively.
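
The tightness part of the prior can be expressed as hard linear constraints: along any pixel chain \Gamma that crosses the interior of the user box B from one side to the opposite side, at least one pixel must be labelled foreground, so the segmentation cannot collapse away from the box edges. Schematically (the exact family of crossing paths and the resulting optimisation problem are those defined in the paper):

    \sum_{p \in \Gamma} x_p \;\ge\; 1 \qquad \text{for every crossing path } \Gamma \subset B,

where x_p \in \{0, 1\} is the foreground indicator of pixel p (relaxed to [0, 1] in the LP). Adding these constraints to the usual segmentation energy yields the NP-hard integer program mentioned above, which motivates both the linear relaxation and the pinpointing rounding scheme.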


Human Factors in Computing Systems | 2012

Instructing people for training gestural interactive systems

Simon Fothergill; Helena M. Mentis; Pushmeet Kohli; Sebastian Nowozin

Entertainment and gaming systems such as the Wii and Xbox Kinect have brought touchless, body-movement-based interfaces to the masses. Systems like these enable the estimation of movements of various body parts from raw inertial motion or depth sensor data. However, the interface developer is still left with the challenging task of creating a system that recognizes these movements as embodying meaning. The machine learning approach to tackling this problem requires the collection of data sets that contain the relevant body movements and their associated semantic labels. These data sets directly impact the accuracy and performance of the gesture recognition system and should ideally contain all natural variations of the movements associated with a gesture. This paper addresses the problem of collecting such gesture datasets. In particular, we investigate which semiotic modality of instruction is most appropriate for conveying to human subjects the movements the system developer needs them to perform. The results of our qualitative and quantitative analysis indicate that the choice of modality has a significant impact on the performance of the learnt gesture recognition system, particularly in terms of correctness and coverage.


European Conference on Computer Vision | 2008

Learning CRFs Using Graph Cuts

Martin Szummer; Pushmeet Kohli; Derek Hoiem

Many computer vision problems are naturally formulated as random fields, specifically MRFs or CRFs. The introduction of graph cuts has enabled efficient and optimal inference in associative random fields, greatly advancing applications such as segmentation, stereo reconstruction and many others. However, while fast inference is now widespread, parameter learning in random fields has remained an intractable problem. This paper shows how to apply fast inference algorithms, in particular graph cuts, to learn parameters of random fields with similar efficiency. We find optimal parameter values under standard regularized objective functions that ensure good generalization. Our algorithm enables learning of many parameters in reasonable time, and we explore further speedup techniques. We also discuss extensions to non-associative and multi-class problems. We evaluate the method on image segmentation and geometry recognition.
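
The learning problem can be summarised in the familiar max-margin form, in which the ground-truth labelling must have lower energy than every other labelling by a margin scaling with its loss; the role of graph cuts is to find the most violated labelling at each iteration so that only a small working set of constraints is ever instantiated. This is a schematic of that general recipe rather than the paper's exact objective:

    \min_{\mathbf{w},\, \xi \ge 0} \ \tfrac{1}{2}\lVert \mathbf{w} \rVert^2 + C\,\xi
    \quad \text{s.t.} \quad
    E(\mathbf{y}; \mathbf{w}) - E(\mathbf{y}^{*}; \mathbf{w}) \;\ge\; \Delta(\mathbf{y}, \mathbf{y}^{*}) - \xi
    \quad \forall\, \mathbf{y} \ne \mathbf{y}^{*},

where \mathbf{y}^{*} is the ground-truth labelling, E(\cdot\,; \mathbf{w}) is the random field energy with parameters \mathbf{w}, and \Delta is a task loss such as the Hamming distance. The exponentially many constraints are handled by repeatedly running loss-augmented graph cut inference to find the currently most violated labelling and adding it to the working set.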

Collaboration


Dive into Pushmeet Kohli's collaborations.

Top Co-Authors

Carsten Rother (Dresden University of Technology)