Toby Sharp | Researchain

Publication

Featured research published by Toby Sharp.

Communications of the ACM | 2013

Real-time human pose recognition in parts from single depth images

Jamie Shotton; Toby Sharp; Alex Aben-Athar Kipman; Andrew W. Fitzgibbon; Mark J. Finocchio; Andrew Blake; Mat Cook; Richard Moore

We propose a new method to quickly and accurately predict 3D positions of body joints from a single depth image, using no temporal information. We take an object recognition approach, designing an intermediate body parts representation that maps the difficult pose estimation problem into a simpler per-pixel classification problem. Our large and highly varied training dataset allows the classifier to estimate body parts invariant to pose, body shape, clothing, etc. Finally we generate confidence-scored 3D proposals of several body joints by reprojecting the classification result and finding local modes. The system runs at 200 frames per second on consumer hardware. Our evaluation shows high accuracy on both synthetic and real test sets, and investigates the effect of several training parameters. We achieve state of the art accuracy in our comparison with related work and demonstrate improved generalization over exact whole-skeleton nearest neighbor matching.

IEEE Transactions on Pattern Analysis and Machine Intelligence | 2013

Efficient Human Pose Estimation from Single Depth Images

Jamie Shotton; Ross B. Girshick; Andrew W. Fitzgibbon; Toby Sharp; Mat Cook; Mark J. Finocchio; Richard Moore; Pushmeet Kohli; Antonio Criminisi; Alex Aben-Athar Kipman; Andrew Blake

We describe two new approaches to human pose estimation. Both can quickly and accurately predict the 3D positions of body joints from a single depth image without using any temporal information. The key to both approaches is the use of a large, realistic, and highly varied synthetic set of training images. This allows us to learn models that are largely invariant to factors such as pose, body shape, field-of-view cropping, and clothing. Our first approach employs an intermediate body parts representation, designed so that an accurate per-pixel classification of the parts will localize the joints of the body. The second approach instead directly regresses the positions of body joints. By using simple depth pixel comparison features and parallelizable decision forests, both approaches can run super-real time on consumer hardware. Our evaluation investigates many aspects of our methods, and compares the approaches to each other and to the state of the art. Results on silhouettes suggest broader applicability to other imaging modalities.
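As a rough illustration of what a "simple depth pixel comparison feature" can look like, the sketch below compares the depth at two probe positions around a pixel, with the probe offsets divided by the depth at that pixel so the pattern shrinks for distant bodies. The function name, the offset convention, and the `background` value are assumptions made for this example and are not taken from the abstract.

```python
def depth_feature(depth, x, y, u, v, background=1e6):
    """Depth-normalized two-probe comparison at pixel (x, y).

    depth: 2D list of positive depths, indexed as depth[y][x].
    u, v:  (dx, dy) probe offsets; they are divided by the depth at
           (x, y), so far-away pixels use a smaller probe pattern.
    """
    d = depth[y][x]  # assumed non-zero

    def probe(offset):
        px = x + int(round(offset[0] / d))
        py = y + int(round(offset[1] / d))
        if 0 <= py < len(depth) and 0 <= px < len(depth[0]):
            return depth[py][px]
        return background  # off-image probes read as very far away

    return probe(u) - probe(v)
```

A decision tree node would then compare this value against a learned threshold; because each pixel is evaluated independently, the work parallelizes naturally.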

International Conference on Computer Vision | 2009

Image segmentation with a bounding box prior

Victor S. Lempitsky; Pushmeet Kohli; Carsten Rother; Toby Sharp

A user-provided object bounding box is a simple and popular interaction paradigm considered by many existing interactive image segmentation frameworks. However, these frameworks tend to exploit the provided bounding box merely to exclude its exterior from consideration and sometimes to initialize the energy minimization. In this paper, we discuss how the bounding box can be further used to impose a powerful topological prior, which prevents the solution from excessive shrinking and ensures that the user-provided box bounds the segmentation in a sufficiently tight way. The prior is expressed using hard constraints incorporated into the global energy minimization framework leading to an NP-hard integer program. We then investigate the possible optimization strategies including linear relaxation as well as a new graph cut algorithm called pinpointing. The latter can be used either as a rounding method for the fractional LP solution, which is provably better than thresholding-based rounding, or as a fast standalone heuristic. We evaluate the proposed algorithms on a publicly available dataset, and demonstrate the practical benefits of the new prior both qualitatively and quantitatively.

Information Hiding | 2001

An Implementation of Key-Based Digital Signal Steganography

Toby Sharp

A real-life requirement motivated this case study of secure covert communication. An independently researched process is described in detail with an emphasis on implementation issues regarding digital images. A scheme using stego keys to create pseudo-random sample sequences is developed. Issues relating to using digital signals for steganography are explored. The terms modified remainder and unmodified remainder are defined. Possible attacks are considered in detail from passive wardens and methods of defeating such attacks are suggested. Software implementing the new ideas is introduced, which has been successfully developed, deployed and used for several years without detection.
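The idea of using a stego key to create a pseudo-random sample sequence can be sketched as follows. This is only an illustration under stated assumptions: the key seeds Python's `random.Random` (which is not cryptographically secure), and the payload is written with plain least-significant-bit replacement, which is a stand-in and not the scheme the paper develops. The names `embed` and `extract` are hypothetical.

```python
import random


def embed(samples, bits, key):
    """Write bits into the LSBs of key-selected samples; returns a new list."""
    rng = random.Random(key)  # the stego key seeds the generator
    positions = rng.sample(range(len(samples)), len(bits))
    out = list(samples)
    for pos, bit in zip(positions, bits):
        out[pos] = (out[pos] & ~1) | bit  # LSB replacement (illustrative only)
    return out


def extract(samples, n_bits, key):
    """Regenerate the same positions from the key and read the LSBs back."""
    rng = random.Random(key)
    positions = rng.sample(range(len(samples)), n_bits)
    return [samples[pos] & 1 for pos in positions]
```

Without the key, an observer does not know which samples carry payload or in what order, which is the property the key-based sequence is meant to provide.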

Computer Vision and Pattern Recognition | 2008

Bayesian color constancy revisited

Peter V. Gehler; Carsten Rother; Andrew Blake; Thomas P. Minka; Toby Sharp

Computational color constancy is the task of estimating the true reflectances of visible surfaces in an image. In this paper we follow a line of research that assumes uniform illumination of a scene, and that the principal step in estimating reflectances is the estimation of the scene illuminant. We review recent approaches to illuminant estimation: first, those based on formulae for normalisation of the reflectance distribution in an image (so-called grey-world algorithms), and second, those based on a Bayesian formulation of image formation. In evaluating these previous approaches we introduce a new tool in the form of a database of 568 high-quality, indoor and outdoor images, accurately labelled with illuminant, and preserved in their raw form, free of correction or normalisation. This has enabled us to establish several properties experimentally. Firstly, automatic selection of grey-world algorithms according to image properties is not nearly so effective as has been thought. Secondly, it is shown that Bayesian illuminant estimation is significantly improved by the improved accuracy of priors for illuminant and reflectance that are obtained from the new dataset.
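For readers unfamiliar with grey-world algorithms, a minimal sketch of the basic variant is given below: it assumes the average reflectance in a scene is achromatic, so the mean RGB of the image is taken as the illuminant estimate. The function names and the simple per-channel gain correction are assumptions for illustration; the paper evaluates several such algorithms and a Bayesian alternative, none of which is reproduced here.

```python
def grey_world_illuminant(pixels):
    """Estimate the illuminant as the normalized mean RGB of the image."""
    n = len(pixels)
    mean = [sum(p[c] for p in pixels) / n for c in range(3)]
    total = sum(mean)
    return [m / total for m in mean]


def correct(pixels, illuminant):
    """Scale each channel so the estimated illuminant becomes neutral."""
    gains = [(1.0 / 3.0) / e for e in illuminant]
    return [tuple(p[c] * gains[c] for c in range(3)) for p in pixels]
```

With `pixels = [(2.0, 1.0, 1.0), (4.0, 2.0, 2.0)]` the estimate is `[0.5, 0.25, 0.25]`, a reddish illuminant, and the corrected pixels have equal values in all three channels.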

European Conference on Computer Vision | 2008

GeoS: Geodesic Image Segmentation

Antonio Criminisi; Toby Sharp; Andrew Blake

This paper presents GeoS, a new algorithm for the efficient segmentation of n-dimensional image and video data. The segmentation problem is cast as approximate energy minimization in a conditional random field. A new, parallel filtering operator built upon efficient geodesic distance computation is used to propose a set of spatially smooth, contrast-sensitive segmentation hypotheses. An economical search algorithm finds the solution with minimum energy within a sensible and highly restricted subset of all possible labellings. Advantages include: i) computational efficiency with high segmentation accuracy; ii) the ability to estimate an approximation to the posterior over segmentations; iii) the ability to handle generally complex energy models. Comparison with max-flow indicates up to 60 times greater computational efficiency as well as greater memory efficiency. GeoS is validated quantitatively and qualitatively by thorough comparative experiments on existing and novel ground-truth data. Numerous results on interactive and automatic segmentation of photographs, video and volumetric medical image data are presented.
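To make "geodesic distance" concrete, the sketch below computes a contrast-sensitive distance map on a small 2D grayscale grid. It uses Dijkstra's algorithm over 4-connected pixels as a simple stand-in; it is not the parallel operator the paper describes. The step cost and the `gamma` weight are assumptions chosen for the example.

```python
import heapq
import math


def geodesic_distance(image, seeds, gamma=1.0):
    """Geodesic distance from seed pixels on a 2D grayscale grid.

    Each 4-connected step costs sqrt(1 + (gamma * intensity_difference)^2),
    so paths that cross strong edges are long.
    """
    h, w = len(image), len(image[0])
    dist = [[math.inf] * w for _ in range(h)]
    heap = []
    for (y, x) in seeds:
        dist[y][x] = 0.0
        heapq.heappush(heap, (0.0, y, x))
    while heap:
        d, y, x = heapq.heappop(heap)
        if d > dist[y][x]:
            continue  # stale queue entry
        for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            ny, nx = y + dy, x + dx
            if 0 <= ny < h and 0 <= nx < w:
                diff = image[ny][nx] - image[y][x]
                step = math.sqrt(1.0 + (gamma * diff) ** 2)
                if d + step < dist[ny][nx]:
                    dist[ny][nx] = d + step
                    heapq.heappush(heap, (d + step, ny, nx))
    return dist
```

On a flat image the result reduces to the city-block distance from the seeds, while an intensity edge between a pixel and the seed inflates the distance, which is what makes thresholded distance maps follow object boundaries.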

Computer Vision and Pattern Recognition | 2012

The Vitruvian manifold: Inferring dense correspondences for one-shot human pose estimation

Jonathan Taylor; Jamie Shotton; Toby Sharp; Andrew W. Fitzgibbon

Fitting an articulated model to image data is often approached as an optimization over both model pose and model-to-image correspondence. For complex models such as humans, previous work has required a good initialization, or an alternating minimization between correspondence and pose. In this paper we investigate one-shot pose estimation: can we directly infer correspondences using a regression function trained to be invariant to body size and shape, and then optimize the model pose just once? We evaluate on several challenging single-frame data sets containing a wide variety of body poses, shapes, torso rotations, and image cropping. Our experiments demonstrate that one-shot pose estimation achieves state of the art results and runs in real-time.

European Conference on Computer Vision | 2008

Implementing Decision Trees and Forests on a GPU

Toby Sharp

We describe a method for implementing the evaluation and training of decision trees and forests entirely on a GPU, and show how this method can be used in the context of object recognition.
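The abstract does not say how the trees are stored, so the following is only a sketch of one layout that suits data-parallel hardware: each tree is a flat array of fixed-size node records, and evaluation is a short loop with no recursion. The record layout `(feature_index, threshold, left, right, value)` and both function names are assumptions for illustration, written here in plain Python and not as GPU code.

```python
def evaluate_tree(nodes, features):
    """Walk a flat-array binary tree and return the reached leaf's value.

    Each node is (feature_index, threshold, left, right, value);
    leaves are marked with feature_index == -1.
    """
    i = 0
    while nodes[i][0] != -1:
        feature_index, threshold, left, right, _ = nodes[i]
        i = left if features[feature_index] < threshold else right
    return nodes[i][4]


def evaluate_forest(trees, features):
    """Average the leaf values over all trees in the forest."""
    return sum(evaluate_tree(t, features) for t in trees) / len(trees)
```

Because every input follows the same loop over the same arrays, many inputs (for example, all pixels of an image) can be evaluated independently and in parallel.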

Human Factors in Computing Systems | 2015

Accurate, Robust, and Flexible Real-time Hand Tracking

Toby Sharp; Cem Keskin; Jonathan Taylor; Jamie Shotton; David Kim; Christoph Rhemann; Ido Leichter; Alon Vinnikov; Yichen Wei; Daniel Freedman; Pushmeet Kohli; Eyal Krupka; Andrew W. Fitzgibbon; Shahram Izadi

We present a new real-time hand tracking system based on a single depth camera. The system can accurately reconstruct complex hand poses across a variety of subjects. It also allows for robust tracking, rapidly recovering from any temporary failures. Most uniquely, our tracker is highly flexible, dramatically improving upon previous approaches which have focused on front-facing close-range scenarios. This flexibility opens up new possibilities for human-computer interaction with examples including tracking at distances from tens of centimeters through to several meters (for controlling the TV at a distance), supporting tracking using a moving depth camera (for mobile scenarios), and arbitrary camera placements (for VR headsets). These features are achieved through a new pipeline that combines a multi-layered discriminative reinitialization strategy for per-frame pose estimation, followed by a generative model-fitting stage. We provide extensive technical details and a detailed qualitative and quantitative analysis.

ACM Transactions on Graphics | 2010

Geodesic image and video editing

Antonio Criminisi; Toby Sharp; Carsten Rother; Patrick Pérez

This article presents a new, unified technique to perform general edge-sensitive editing operations on n-dimensional images and videos efficiently. The first contribution of the article is the introduction of a Generalized Geodesic Distance Transform (GGDT), based on soft masks. This provides a unified framework to address several edge-aware editing operations. Diverse tasks such as denoising and nonphotorealistic rendering are all handled by fundamentally the same fast algorithm. Second, a new Geodesic Symmetric Filter (GSF) is presented which imposes contrast-sensitive spatial smoothness into segmentation and segmentation-based editing tasks (cutout, object highlighting, colorization, panorama stitching). The effect of the filter is controlled by two intuitive, geometric parameters. In contrast to existing techniques, the GSF is applied to real-valued pixel likelihoods (soft masks), thanks to GGDTs, and it can be used for both interactive and automatic editing. Complex object topologies are dealt with effortlessly. Finally, the parallelism of GGDTs enables us to exploit modern multicore CPU architectures as well as powerful new GPUs, thus providing great flexibility of implementation and deployment. Our technique operates on both images and videos, and generalizes naturally to n-dimensional data. The proposed algorithm is validated via quantitative and qualitative comparisons with existing, state-of-the-art approaches. Numerous results on a variety of image and video editing tasks further demonstrate the effectiveness of our method.
