Network


Latest external collaborations at the country level.

Hotspot


Dive into the research topics where David J. Fleet is active.

Publication


Featured research published by David J. Fleet.


Computer Vision and Pattern Recognition | 1992

Performance of optical flow techniques

John L. Barron; David J. Fleet; Steven S. Beauchemin; T. A. Burkitt

While different optical flow techniques continue to appear, there has been a lack of quantitative evaluation of existing methods. For a common set of real and synthetic image sequences, we report the results of a number of regularly cited optical flow techniques, including instances of differential, matching, energy-based, and phase-based methods. Our comparisons are primarily empirical, and concentrate on the accuracy, reliability, and density of the velocity measurements; they show that performance can differ significantly among the techniques we implemented.
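
The accuracy comparisons in this line of work rest on an angular error measure between estimated and true velocities, computed on unit vectors in space-time (u, v, 1) so that both direction and speed errors are penalized. A minimal sketch of that metric (an illustration, not the authors' code):

```python
import numpy as np

def angular_error_deg(u_est, v_est, u_true, v_true):
    """Angular error (degrees) between estimated and true flow, with
    vectors embedded in space-time as (u, v, 1) and normalized, so that
    speed errors are penalized as well as direction errors."""
    est = np.stack([u_est, v_est, np.ones_like(u_est)], axis=-1)
    true = np.stack([u_true, v_true, np.ones_like(u_true)], axis=-1)
    est = est / np.linalg.norm(est, axis=-1, keepdims=True)
    true = true / np.linalg.norm(true, axis=-1, keepdims=True)
    cos = np.clip(np.sum(est * true, axis=-1), -1.0, 1.0)
    return np.degrees(np.arccos(cos))

# A perfect estimate has zero angular error:
err = angular_error_deg(np.array([1.0]), np.array([0.5]),
                        np.array([1.0]), np.array([0.5]))
```

Averaging this error over all pixels where a method reports a velocity, together with the density of those reports, gives the accuracy/density trade-off the comparison is built on.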


IEEE Transactions on Pattern Analysis and Machine Intelligence | 2003

Robust online appearance models for visual tracking

Allan D. Jepson; David J. Fleet; Thomas F. El-Maraghi

We propose a framework for learning robust, adaptive, appearance models to be used for motion-based tracking of natural objects. The model adapts to slowly changing appearance, and it maintains a natural measure of the stability of the observed image structure during tracking. By identifying stable properties of appearance, we can weight them more heavily for motion estimation, while less stable properties can be proportionately downweighted. The appearance model involves a mixture of stable image structure, learned over long time courses, along with two-frame motion information and an outlier process. An online EM algorithm is used to adapt the appearance model parameters over time. An implementation of this approach is developed for an appearance model based on the filter responses from a steerable pyramid. This model is used in a motion-based tracking algorithm to provide robustness in the face of image outliers, such as those caused by occlusions, while adapting to natural changes in appearance such as those due to facial expressions or variations in 3D pose.
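
The adaptation step can be sketched as a recursive (online) EM update with exponential forgetting. This toy version tracks one data stream with a two-component mixture (a stable Gaussian versus a uniform outlier process), whereas the paper's model also includes a two-frame "wandering" component and operates on steerable-pyramid responses:

```python
import numpy as np

def online_em_step(d, mu, var, w_s, alpha=0.05, outlier_density=0.1):
    """One online EM update for a toy appearance model: a 'stable'
    Gaussian component (mu, var) competing with a uniform outlier
    process. alpha is an exponential forgetting rate; w_s is the
    mixing weight of the stable component."""
    stable = np.exp(-0.5 * (d - mu) ** 2 / var) / np.sqrt(2 * np.pi * var)
    # E-step: ownership probability of the stable component
    o = w_s * stable / (w_s * stable + (1 - w_s) * outlier_density)
    # M-step with forgetting: recursive updates of weight, mean, variance
    w_s = (1 - alpha) * w_s + alpha * o
    mu = mu + alpha * o * (d - mu)
    var = var + alpha * o * ((d - mu) ** 2 - var)
    return mu, var, w_s

# Consistent observations pull the stable mean toward them and raise
# its mixing weight (the "stability" signal used to weight tracking):
mu, var, w = 0.0, 1.0, 0.5
for _ in range(200):
    mu, var, w = online_em_step(1.0, mu, var, w)
```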


International Journal of Computer Vision | 1990

Computation of component image velocity from local phase information

David J. Fleet; Allan D. Jepson

We present a technique for the computation of 2D component velocity from image sequences. Initially, the image sequence is represented by a family of spatiotemporal velocity-tuned linear filters. Component velocity, computed from spatiotemporal responses of identically tuned filters, is expressed in terms of the local first-order behavior of surfaces of constant phase. Justification for this definition is discussed from the perspectives of both 2D image translation and deviations from translation that are typical in perspective projections of 3D scenes. The resulting technique is predominantly linear, efficient, and suitable for parallel processing. Moreover, it is local in space-time, robust with respect to noise, and permits multiple estimates within a single neighborhood. Promising quantitative results are reported from experiments with realistic image sequences, including cases with sizeable perspective deformation.
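
The core relation is that component velocity is the velocity of level contours of phase; in 1D, v = -phi_t / phi_x. A toy sketch on a synthetic bandpass response (the paper's velocity-tuned filter family and 2D machinery are omitted):

```python
import numpy as np

# For a bandpass response r(x, t) = A * exp(i * phi(x, t)), constant-phase
# contours move with velocity v = -phi_t / phi_x. Here r is synthesized
# directly for a signal translating at v_true.
k, v_true = 0.8, 1.5                         # spatial frequency, true velocity
x = np.arange(128, dtype=float)
r_t0 = np.exp(1j * k * (x - v_true * 0.0))   # response at t = 0
r_t1 = np.exp(1j * k * (x - v_true * 1.0))   # response at t = 1

# Wrap-free phase derivatives via products of neighboring responses
phi_x = np.angle(r_t0[1:] * np.conj(r_t0[:-1]))   # d(phi)/dx
phi_t = np.angle(r_t1[:-1] * np.conj(r_t0[:-1]))  # d(phi)/dt
v_est = -np.mean(phi_t) / np.mean(phi_x)
```

Computing the derivatives from products of complex responses avoids explicit phase unwrapping, which is one reason phase-based schemes are convenient in practice.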


European Conference on Computer Vision | 2000

Stochastic Tracking of 3D Human Figures Using 2D Image Motion

Hedvig Sidenbladh; Michael J. Black; David J. Fleet

A probabilistic method for tracking 3D articulated human figures in monocular image sequences is presented. Within a Bayesian framework, we define a generative model of image appearance, a robust likelihood function based on image graylevel differences, and a prior probability distribution over pose and joint angles that models how humans move. The posterior probability distribution over model parameters is represented using a discrete set of samples and is propagated over time using particle filtering. The approach extends previous work on parameterized optical flow estimation to exploit a complex 3D articulated motion model. It also extends previous work on human motion tracking by including a perspective camera model, by modeling limb self-occlusion, and by recovering 3D motion from a monocular sequence. The explicit posterior probability distribution represents ambiguities due to image matching, model singularities, and perspective projection. The method relies only on a frame-to-frame assumption of brightness constancy and hence is able to track people under changing viewpoints, in grayscale image sequences, and with complex unknown backgrounds.
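
The sample-based propagation described above is particle filtering; a minimal bootstrap-filter sketch on a 1D stand-in state (the paper's state is full pose plus joint angles, with an image-based likelihood and a learned motion prior):

```python
import numpy as np

rng = np.random.default_rng(0)

def particle_filter_step(particles, weights, likelihood, dyn_noise=0.1):
    """One bootstrap particle-filter step: resample by weight, propagate
    through a temporal prior (here a trivial random walk), then reweight
    by the likelihood of the new observation."""
    n = len(particles)
    idx = rng.choice(n, size=n, p=weights)            # resample
    particles = particles[idx]
    particles = particles + rng.normal(0.0, dyn_noise, size=n)  # predict
    weights = likelihood(particles)                   # reweight
    weights = weights / weights.sum()
    return particles, weights

# Track a stationary 1D "target" at x = 2 with a Gaussian likelihood:
particles = rng.normal(0.0, 2.0, size=500)
weights = np.full(500, 1.0 / 500)
likelihood = lambda x: np.exp(-0.5 * (x - 2.0) ** 2 / 0.25)
for _ in range(30):
    particles, weights = particle_filter_step(particles, weights, likelihood)
estimate = np.sum(particles * weights)
```

The discrete sample set is what lets the posterior stay multi-modal, which is how the ambiguities from matching and projection mentioned above can be represented.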


IEEE Transactions on Pattern Analysis and Machine Intelligence | 2009

TurboPixels: Fast Superpixels Using Geometric Flows

Alex Levinshtein; Adrian Stere; Kiriakos N. Kutulakos; David J. Fleet; Sven J. Dickinson; Kaleem Siddiqi

We describe a geometric-flow-based algorithm for computing a dense oversegmentation of an image, often referred to as superpixels. It produces segments that respect local image boundaries while limiting undersegmentation through a compactness constraint. It is very fast, with complexity that is approximately linear in image size, and can be applied to megapixel-sized images with high superpixel densities in a matter of minutes. We show qualitative demonstrations of high-quality results on several complex images. The Berkeley database is used to quantitatively compare its performance to a number of oversegmentation algorithms, showing that it yields less undersegmentation than algorithms that lack a compactness constraint while offering a significant speedup over N-cuts, which does enforce compactness.
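
The underlying machinery is level-set evolution of seed boundaries under a speed function. A toy sketch of one explicit evolution step with a uniform speed (the paper's speed term depends on local image structure and the compactness constraint; this version only illustrates the flow itself):

```python
import numpy as np

def level_set_step(phi, speed, dt=0.5):
    """One explicit step of the geometric flow d(phi)/dt = -F |grad phi|,
    the basic evolution behind level-set region growing. phi is a signed
    distance function (negative inside the region)."""
    gy, gx = np.gradient(phi)
    return phi - dt * speed * np.sqrt(gx ** 2 + gy ** 2)

# Grow a small circular seed outward under uniform speed F = 1:
yy, xx = np.mgrid[0:64, 0:64]
phi = np.sqrt((xx - 32.0) ** 2 + (yy - 32.0) ** 2) - 3.0  # signed distance
area0 = int(np.sum(phi < 0))
for _ in range(20):
    phi = level_set_step(phi, speed=1.0)
area1 = int(np.sum(phi < 0))
```

In the full algorithm many such seeds evolve in parallel on a regular grid, and a boundary pixel is assigned where two fronts meet.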


IEEE Transactions on Pattern Analysis and Machine Intelligence | 2008

Gaussian Process Dynamical Models for Human Motion

Jack M. Wang; David J. Fleet; Aaron Hertzmann

We introduce Gaussian process dynamical models (GPDMs) for nonlinear time series analysis, with applications to learning models of human pose and motion from high-dimensional motion capture data. A GPDM is a latent variable model. It comprises a low-dimensional latent space with associated dynamics, as well as a map from the latent space to an observation space. We marginalize out the model parameters in closed form by using Gaussian process priors for both the dynamical and the observation mappings. This results in a nonparametric model for dynamical systems that accounts for uncertainty in the model. We demonstrate the approach and compare four learning algorithms on human motion capture data, in which each pose is 50-dimensional. Despite the use of small data sets, the GPDM learns an effective representation of the nonlinear dynamics in these spaces.
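
Both GPDM mappings (the latent dynamics and the latent-to-pose map) are Gaussian process regressions. A minimal sketch of that shared building block, GP posterior-mean prediction with an RBF kernel, on a synthetic circular latent trajectory standing in for a walk cycle (the paper additionally learns the latent coordinates and kernel hyperparameters, which this sketch fixes by hand):

```python
import numpy as np

def gp_predict(X_train, Y_train, X_test, lengthscale=1.0, noise=1e-3):
    """Posterior mean of a zero-mean GP with an RBF kernel: the component
    used twice in a GPDM, once for the dynamics x_t -> x_{t+1} and once
    for the latent-to-observation map x -> y."""
    def rbf(A, B):
        d2 = (np.sum(A ** 2, 1)[:, None] + np.sum(B ** 2, 1)[None, :]
              - 2.0 * A @ B.T)
        return np.exp(-0.5 * d2 / lengthscale ** 2)
    K = rbf(X_train, X_train) + noise * np.eye(len(X_train))
    return rbf(X_test, X_train) @ np.linalg.solve(K, Y_train)

# A circular latent trajectory and its one-step-ahead training pairs:
t = np.linspace(0, 2 * np.pi, 40, endpoint=False)
X = np.stack([np.cos(t), np.sin(t)], axis=1)
X_next = np.roll(X, -1, axis=0)

# Predict the next latent state from the held-out last point:
pred = gp_predict(X[:-1], X_next[:-1], X[-1:])
```

Marginalizing the mapping weights under GP priors, as the paper does, is what turns this pair of regressions into a nonparametric model with calibrated uncertainty.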


Computer Vision and Pattern Recognition | 2006

3D People Tracking with Gaussian Process Dynamical Models

Raquel Urtasun; David J. Fleet; Pascal Fua

We advocate the use of Gaussian Process Dynamical Models (GPDMs) for learning human pose and motion priors for 3D people tracking. A GPDM provides a low-dimensional embedding of human motion data, with a density function that gives higher probability to poses and motions close to the training data. With Bayesian model averaging a GPDM can be learned from relatively small amounts of data, and it generalizes gracefully to motions outside the training set. Here we modify the GPDM to permit learning from motions with significant stylistic variation. The resulting priors are effective for tracking a range of human walking styles, despite weak and noisy image measurements and significant occlusions.


Vision Research | 1996

Neural encoding of binocular disparity: Energy models, position shifts and phase shifts

David J. Fleet; Hermann Wagner; David J. Heeger

Neurophysiological data support two models for the disparity selectivity of binocular simple and complex cells in primary visual cortex. These involve binocular combinations of monocular receptive fields that are shifted in retinal position (the position-shift model) or in phase (the phase-shift model) between the two eyes. This article presents a formal description and analysis of a binocular energy model with these forms of disparity selectivity. We propose how one might measure the relative contributions of phase and position shifts in simple and complex cells. The analysis also reveals ambiguities in disparity encoding that are inherent in these model neurons, suggesting a need for a second stage of processing. We propose that linear pooling of the binocular responses across orientations and scales (spatial frequency) is capable of producing an unambiguous representation of disparity.
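
The binocular energy model can be sketched in 1D: a model complex cell sums left and right complex Gabor responses, with an interocular phase shift, and squares the magnitude; its preferred disparity is approximately dphi/k. (A toy sketch, not the paper's formal analysis; a position-shift cell would translate the right receptive field instead.)

```python
import numpy as np

def complex_cell_energy(left, right, k=0.5, sigma=8.0, dphi=0.0):
    """Binocular energy response of a model complex cell: squared
    magnitude of summed left/right Gabor responses, with an interocular
    phase shift dphi applied to the right receptive field."""
    n = len(left)
    x = np.arange(n, dtype=float) - n / 2
    gabor_l = np.exp(-x ** 2 / (2 * sigma ** 2)) * np.exp(1j * k * x)
    gabor_r = gabor_l * np.exp(1j * dphi)        # phase-shifted right RF
    return np.abs(left @ np.conj(gabor_l) + right @ np.conj(gabor_r)) ** 2

# A phase-shift cell of frequency k prefers disparity ~= dphi / k = 2:
x = np.arange(128, dtype=float)
left = np.cos(0.5 * x)
responses = {d: complex_cell_energy(left, np.cos(0.5 * (x + d)), dphi=1.0)
             for d in range(-6, 7)}
best = max(responses, key=responses.get)
```

Because the preferred disparity of a single such unit depends on its frequency tuning, a population of these responses at one image location is ambiguous, which is the motivation for the pooling stage proposed above.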


CVGIP: Image Understanding | 1991

Phase-based disparity measurement

David J. Fleet; Allan D. Jepson; Michael Jenkin

The measurement of image disparity is a fundamental precursor to binocular depth estimation. Recently, Jenkin and Jepson (in Computational Processes in Human Vision (V. Pylyshyn, Ed.), Ablex, New Jersey, 1988) and Sanger (Biol. Cybernet., 59, 1988, 405–418) described promising methods based on the output phase behavior of bandpass Gabor filters. Here we discuss further justification for such techniques based on the stability of bandpass phase behavior as a function of typical distortions that exist between left and right views. In addition, despite this general stability, we show that phase signals are occasionally very sensitive to spatial position and to variations in scale, in which cases incorrect measurements occur. We find that the primary cause for this instability is the existence of singularities in phase signals. With the aid of the local frequency of the filter output (provided by the phase derivative) and the local amplitude information, the regions of phase instability near the singularities are detected so that potentially incorrect measurements can be identified. In addition, we show how the local frequency can be used away from the singularity neighbourhoods to improve the accuracy of the disparity estimates. Some experimental results are reported.
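
The basic measurement can be sketched in 1D: disparity is the left/right phase difference of a bandpass (Gabor) filter output divided by the local frequency, with the local frequency taken from the phase derivative as described above. (A toy sketch under ideal sinusoidal input; the singularity detection discussed in the abstract is omitted.)

```python
import numpy as np

k, d_true, sigma = 0.4, 3.0, 12.0

# Complex Gabor kernel and a stereo pair related by a pure shift:
xs = np.arange(-48, 49, dtype=float)
gabor = np.exp(-xs ** 2 / (2 * sigma ** 2)) * np.exp(1j * k * xs)
x = np.arange(256, dtype=float)
left = np.cos(k * x)
right = np.cos(k * (x - d_true))       # right view shifted by the disparity

r_l = np.convolve(left, gabor, mode="same")
r_r = np.convolve(right, gabor, mode="same")

c = 128                                 # measure at the image center
phase_diff = np.angle(r_l[c] * np.conj(r_r[c]))
local_freq = np.angle(r_l[c + 1] * np.conj(r_l[c]))  # phase derivative
d_est = phase_diff / local_freq
```

Dividing by the measured local frequency, rather than the filter's nominal tuning frequency, is the accuracy improvement the abstract refers to; near phase singularities the local frequency and amplitude behave anomalously, which is how unreliable measurements are flagged.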


International Conference on Computer Vision | 2005

Priors for people tracking from small training sets

Raquel Urtasun; David J. Fleet; Aaron Hertzmann; Pascal Fua

We advocate the use of scaled Gaussian process latent variable models (SGPLVM) to learn prior models of 3D human pose for 3D people tracking. The SGPLVM simultaneously optimizes a low-dimensional embedding of the high-dimensional pose data and a density function that both gives higher probability to points close to training data and provides a nonlinear probabilistic mapping from the low-dimensional latent space to the full-dimensional pose space. The SGPLVM is a natural choice when only small amounts of training data are available. We demonstrate our approach with two distinct motions, golfing and walking. We show that the SGPLVM sufficiently constrains the problem such that tracking can be accomplished with straightforward deterministic optimization.
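
The constrained deterministic optimization can be sketched with a stand-in latent-to-pose map: once poses are confined to a low-dimensional latent parameterization, tracking reduces to a search over the latent variable rather than the full pose space. (The smooth map below is a hypothetical stand-in; an SGPLVM would learn both the map and a latent density from motion-capture data.)

```python
import numpy as np

def latent_to_pose(z):
    """Hypothetical smooth map from a 1D latent variable to a 12D 'pose',
    standing in for a learned SGPLVM mapping."""
    return np.stack([np.sin(z + 0.3 * j) for j in range(12)], axis=-1)

# A noisy "measurement" of the pose generated at the true latent value:
z_true = 1.1
observed = latent_to_pose(z_true) + 0.01

# Deterministic optimization over the 1D latent space (grid search here;
# gradient-based optimization would serve the same role):
zs = np.linspace(-np.pi, np.pi, 2001)
errors = np.sum((latent_to_pose(zs) - observed) ** 2, axis=-1)
z_hat = zs[np.argmin(errors)]
```

Searching one latent dimension instead of twelve pose dimensions is the sense in which the prior "sufficiently constrains the problem" for deterministic optimization.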

Collaboration


Dive into David J. Fleet's collaborations.

Top Co-Authors

Keith Langley (University College London)

David J. Heeger (Center for Neural Science)