Andrew W. Fitzgibbon | Researchain

Archive Network Publication Hotspot Collaboration

Network

Latest external collaboration on country level. Dive into details by clicking on the dots.

Explore More

Hotspot

Dive into the research topics where Andrew W. Fitzgibbon is active.

Explore More

Publication

Featured researches published by Andrew W. Fitzgibbon.

Communications of The ACM | 2013

Real-time human pose recognition in parts from single depth images

Jamie Shotton; Toby Sharp; Alex Aben-Athar Kipman; Andrew W. Fitzgibbon; Mark J. Finocchio; Andrew Blake; Mat Cook; Richard Moore

We propose a new method to quickly and accurately predict 3D positions of body joints from a single depth image, using no temporal information. We take an object recognition approach, designing an intermediate body parts representation that maps the difficult pose estimation problem into a simpler per-pixel classification problem. Our large and highly varied training dataset allows the classifier to estimate body parts invariant to pose, body shape, clothing, etc. Finally we generate confidence-scored 3D proposals of several body joints by reprojecting the classification result and finding local modes. The system runs at 200 frames per second on consumer hardware. Our evaluation shows high accuracy on both synthetic and real test sets, and investigates the effect of several training parameters. We achieve state of the art accuracy in our comparison with related work and demonstrate improved generalization over exact whole-skeleton nearest neighbor matching.

international conference on computer vision | 1999

Bundle Adjustment - A Modern Synthesis

Bill Triggs; Philip F. McLauchlan; Richard I. Hartley; Andrew W. Fitzgibbon

This paper is a survey of the theory and methods of photogrammetric bundle adjustment, aimed at potential implementors in the computer vision community. Bundle adjustment is the problem of refining a visual reconstruction to produce jointly optimal structure and viewing parameter estimates. Topics covered include: the choice of cost function and robustness; numerical optimization including sparse Newton methods, linearly convergent approximations, updating and recursive methods; gauge (datum) invariance; and quality control. The theory is developed for general robust cost functions rather than restricting attention to traditional nonlinear least squares.

IEEE Transactions on Pattern Analysis and Machine Intelligence | 1999

Direct least square fitting of ellipses

Andrew W. Fitzgibbon; Maurizio Pilu; Robert B. Fisher

This work presents a new efficient method for fitting ellipses to scattered data. Previous algorithms either fitted general conics or were computationally expensive. By minimizing the algebraic distance subject to the constraint 4ac-b/sup 2/=1, the new method incorporates the ellipticity constraint into the normalization factor. The proposed method combines several advantages: It is ellipse-specific, so that even bad data will always return an ellipse. It can be solved naturally by a generalized eigensystem. It is extremely robust, efficient, and easy to implement.

international symposium on mixed and augmented reality | 2011

KinectFusion: Real-time dense surface mapping and tracking

Richard A. Newcombe; Shahram Izadi; Otmar Hilliges; David Molyneaux; David Kim; Andrew J. Davison; Pushmeet Kohi; Jamie Shotton; Steve Hodges; Andrew W. Fitzgibbon

We present a system for accurate real-time mapping of complex and arbitrary indoor scenes in variable lighting conditions, using only a moving low-cost depth camera and commodity graphics hardware. We fuse all of the depth data streamed from a Kinect sensor into a single global implicit surface model of the observed scene in real-time. The current sensor pose is simultaneously obtained by tracking the live depth frame relative to the global model using a coarse-to-fine iterative closest point (ICP) algorithm, which uses all of the observed depth data available. We demonstrate the advantages of tracking against the growing full surface model compared with frame-to-frame tracking, obtaining tracking and mapping results in constant time within room sized scenes with limited drift and high accuracy. We also show both qualitative and quantitative results relating to various aspects of our tracking and mapping system. Modelling of natural scenes, in real-time with only commodity sensor and GPU hardware, promises an exciting step forward in augmented reality (AR), in particular, it allows dense surfaces to be reconstructed in real-time, with a level of detail and robustness beyond any solution yet presented using passive computer vision.

user interface software and technology | 2011

KinectFusion: real-time 3D reconstruction and interaction using a moving depth camera

Shahram Izadi; David Kim; Otmar Hilliges; David Molyneaux; Richard A. Newcombe; Pushmeet Kohli; Jamie Shotton; Steve Hodges; Dustin Freeman; Andrew J. Davison; Andrew W. Fitzgibbon

KinectFusion enables a user holding and moving a standard Kinect camera to rapidly create detailed 3D reconstructions of an indoor scene. Only the depth data from Kinect is used to track the 3D pose of the sensor and reconstruct, geometrically precise, 3D models of the physical scene in real-time. The capabilities of KinectFusion, as well as the novel GPU-based pipeline are described in full. Uses of the core system for low-cost handheld scanning, and geometry-aware augmented reality and physics-based interactions are shown. Novel extensions to the core GPU pipeline demonstrate object segmentation and user interaction directly in front of the sensor, without degrading camera tracking or reconstruction. These extensions are used to enable real-time multi-touch interactions anywhere, allowing any planar or non-planar reconstructed physical surface to be appropriated for touch.

IEEE Transactions on Pattern Analysis and Machine Intelligence | 1996

An experimental comparison of range image segmentation algorithms

Adam W. Hoover; Gillian Jean-Baptiste; Xiaoyi Jiang; Patrick J. Flynn; Horst Bunke; Dmitry B. Goldgof; Kevin W. Bowyer; David W. Eggert; Andrew W. Fitzgibbon; Robert B. Fisher

A methodology for evaluating range image segmentation algorithms is proposed. This methodology involves (1) a common set of 40 laser range finder images and 40 structured light scanner images that have manually specified ground truth and (2) a set of defined performance metrics for instances of correctly segmented, missed, and noise regions, over- and under-segmentation, and accuracy of the recovered geometry. A tool is used to objectively compare a machine generated segmentation against the specified ground truth. Four research groups have contributed to evaluate their own algorithm for segmenting a range image into planar patches.

Image and Vision Computing | 2003

Robust registration of 2D and 3D point sets

Andrew W. Fitzgibbon

Abstract This paper introduces a new method of registering point sets. The registration error is directly minimized using general-purpose non-linear optimization (the Levenberg–Marquardt algorithm). The surprising conclusion of the paper is that this technique is comparable in speed to the special-purpose Iterated Closest Point algorithm, which is most commonly used for this task. Because the routine directly minimizes an energy function, it is easy to extend it to incorporate robust estimation via a Huber kernel, yielding a basin of convergence that is many times wider than existing techniques. Finally, we introduce a data structure for the minimization based on the chamfer distance transform, which yields an algorithm that is both faster and more robust than previously described methods.

european conference on computer vision | 1998

Automatic Camera Recovery for Closed or Open Image Sequences

Andrew W. Fitzgibbon; Andrew Zisserman

We describe progress in completely automatically recovering 3D scene structure together with 3D camera positions from a sequence of images acquired by an unknown camera undergoing unknown movement.

computer vision and pattern recognition | 2001

Simultaneous linear estimation of multiple view geometry and lens distortion

Andrew W. Fitzgibbon

A problem in uncalibrated stereo reconstruction is that cameras which deviate from the pinhole model have to be pre-calibrated in order to correct for nonlinear lens distortion. If they are not, and point correspondence is attempted using the uncorrected images, the matching constraints provided by the fundamental matrix must be set so loose that point matching is significantly hampered. This paper shows how linear estimation of the fundamental matrix from two-view point correspondences may be augmented to include one term of radial lens distortion. This is achieved by (1) changing from the standard radial-lens model to another which (as we show) has equivalent power, but which takes a simpler form in homogeneous coordinates, and (2) expressing fundamental matrix estimation as a quadratic eigenvalue problem (QEP), for which efficient algorithms are well known. I derive the new estimator, and compare its performance against bundle-adjusted calibration-grid data. The new estimator is fast enough to be included in a RANSAC-based matching loop, and we show cases of matching being rendered possible by its use. I show how the same lens can be calibrated in a natural scene where the lack of straight lines precludes most previous techniques. The modification when the multi-view relation is a planar homography or trifocal tensor is described.

IEEE Transactions on Pattern Analysis and Machine Intelligence | 2013

Efficient Human Pose Estimation from Single Depth Images

Jamie Shotton; Ross B. Girshick; Andrew W. Fitzgibbon; Toby Sharp; Mat Cook; Mark J. Finocchio; Richard Moore; Pushmeet Kohli; Antonio Criminisi; Alex Aben-Athar Kipman; Andrew Blake

We describe two new approaches to human pose estimation. Both can quickly and accurately predict the 3D positions of body joints from a single depth image without using any temporal information. The key to both approaches is the use of a large, realistic, and highly varied synthetic set of training images. This allows us to learn models that are largely invariant to factors such as pose, body shape, field-of-view cropping, and clothing. Our first approach employs an intermediate body parts representation, designed so that an accurate per-pixel classification of the parts will localize the joints of the body. The second approach instead directly regresses the positions of body joints. By using simple depth pixel comparison features and parallelizable decision forests, both approaches can run super-real time on consumer hardware. Our evaluation investigates many aspects of our methods, and compares the approaches to each other and to the state of the art. Results on silhouettes suggest broader applicability to other imaging modalities.

Explore More