Pedro F. Felzenszwalb

Publication

Featured research published by Pedro F. Felzenszwalb.

IEEE Transactions on Pattern Analysis and Machine Intelligence | 2010

Object Detection with Discriminatively Trained Part-Based Models

Pedro F. Felzenszwalb; Ross B. Girshick; David A. McAllester; Deva Ramanan

We describe an object detection system based on mixtures of multiscale deformable part models. Our system is able to represent highly variable object classes and achieves state-of-the-art results in the PASCAL object detection challenges. While deformable part models have become quite popular, their value had not been demonstrated on difficult benchmarks such as the PASCAL data sets. Our system relies on new methods for discriminative training with partially labeled data. We combine a margin-sensitive approach for data-mining hard negative examples with a formalism we call latent SVM. A latent SVM is a reformulation of MI-SVM in terms of latent variables. A latent SVM is semiconvex, and the training problem becomes convex once latent information is specified for the positive examples. This leads to an iterative training algorithm that alternates between fixing latent values for positive examples and optimizing the latent SVM objective function.
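
The semiconvexity claim can be made concrete with the objective itself. The abstract does not write it out, so the notation below (model parameters \beta, latent values z drawn from a set Z(x), feature map \Phi, regularization constant C) is an assumption taken from the standard formulation of latent SVM rather than from this listing.

```latex
% Score of an example: maximize over latent values z
f_\beta(x) = \max_{z \in Z(x)} \beta \cdot \Phi(x, z)

% Training objective over labeled examples (x_i, y_i), y_i \in \{-1, +1\}
L_D(\beta) = \tfrac{1}{2}\,\lVert \beta \rVert^2
           + C \sum_{i=1}^{n} \max\bigl(0,\; 1 - y_i f_\beta(x_i)\bigr)
```

Because f_\beta is a maximum of functions linear in \beta, it is convex in \beta. The hinge term is therefore convex for negative examples (y_i = -1) but not, in general, for positive ones. Fixing a single latent value z for each positive example makes f_\beta linear on those examples, and the whole objective becomes convex, which is what the alternating training procedure exploits.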

International Journal of Computer Vision | 2004

Efficient Graph-Based Image Segmentation

Pedro F. Felzenszwalb; Daniel P. Huttenlocher

This paper addresses the problem of segmenting an image into regions. We define a predicate for measuring the evidence for a boundary between two regions using a graph-based representation of the image. We then develop an efficient segmentation algorithm based on this predicate, and show that although this algorithm makes greedy decisions it produces segmentations that satisfy global properties. We apply the algorithm to image segmentation using two different kinds of local neighborhoods in constructing the graph, and illustrate the results with both real and synthetic images. The algorithm runs in time nearly linear in the number of graph edges and is also fast in practice. An important characteristic of the method is its ability to preserve detail in low-variability image regions while ignoring detail in high-variability regions.
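
A minimal sketch of the greedy merging step, assuming the usual form of the boundary predicate: two components are merged when the edge joining them is no heavier than the internal variation of either component plus a size-dependent slack k / |C|. The function name `segment_graph`, the edge format `(weight, u, v)`, and the parameter `k` are illustrative choices, not taken from the abstract.

```python
def segment_graph(num_vertices, edges, k):
    """Greedy graph-based segmentation sketch.

    edges: list of (weight, u, v) tuples; k: scale parameter (assumed name).
    Returns one component label per vertex.
    """
    parent = list(range(num_vertices))
    size = [1] * num_vertices
    # Largest edge weight used so far inside each component (0 for singletons).
    internal = [0.0] * num_vertices

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    # Edges are visited in order of nondecreasing weight.
    for w, u, v in sorted(edges):
        a, b = find(u), find(v)
        if a == b:
            continue
        # Merge only if the edge is light relative to both components.
        if w <= min(internal[a] + k / size[a], internal[b] + k / size[b]):
            if size[a] < size[b]:
                a, b = b, a
            parent[b] = a
            size[a] += size[b]
            # w is the heaviest edge seen so far, hence the new internal variation.
            internal[a] = w
    return [find(x) for x in range(num_vertices)]


# Usage: a 1-D "image" with two flat-but-noisy regions separated by a large jump.
pixels = [0, 1, 0, 1, 100, 101, 100, 101]
edges = [(abs(pixels[i] - pixels[i + 1]), i, i + 1) for i in range(len(pixels) - 1)]
labels = segment_graph(len(pixels), edges, k=2)
```

The slack term k / |C| shrinks as a component grows, so small components merge readily while large ones require stronger evidence. Sorting dominates the cost, which is consistent with the nearly linear running time mentioned in the abstract.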

International Journal of Computer Vision | 2005

Pictorial Structures for Object Recognition

Pedro F. Felzenszwalb; Daniel P. Huttenlocher

In this paper we present a computationally efficient framework for part-based modeling and recognition of objects. Our work is motivated by the pictorial structure models introduced by Fischler and Elschlager. The basic idea is to represent an object by a collection of parts arranged in a deformable configuration. The appearance of each part is modeled separately, and the deformable configuration is represented by spring-like connections between pairs of parts. These models allow for qualitative descriptions of visual appearance, and are suitable for generic recognition problems. We address the problem of using pictorial structure models to find instances of an object in an image as well as the problem of learning an object model from training examples, presenting efficient algorithms in both cases. We demonstrate the techniques by learning models that represent faces and human bodies and using the resulting models to locate the corresponding objects in novel images.
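
The matching problem described here is commonly written as an energy minimization. The symbols below are assumptions introduced for illustration: l_i is the location of part v_i, m_i(l_i) is the cost of placing that part at l_i given the image, d_{ij} is the deformation cost of the spring between connected parts, and E is the set of connections.

```latex
L^{*} = \arg\min_{L = (l_1, \dots, l_n)}
        \Biggl( \sum_{i=1}^{n} m_i(l_i)
              + \sum_{(v_i, v_j) \in E} d_{ij}(l_i, l_j) \Biggr)
```

The first sum measures how well each part's appearance model matches the image, and the second penalizes stretching the springs away from their rest configuration.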

Computer Vision and Pattern Recognition | 2008

A discriminatively trained, multiscale, deformable part model

Pedro F. Felzenszwalb; David A. McAllester; Deva Ramanan

This paper describes a discriminatively trained, multiscale, deformable part model for object detection. Our system achieves a two-fold improvement in average precision over the best performance in the 2006 PASCAL person detection challenge. It also outperforms the best results in the 2007 challenge in ten out of twenty categories. The system relies heavily on deformable parts. While deformable part models have become quite popular, their value had not been demonstrated on difficult benchmarks such as the PASCAL challenge. Our system also relies heavily on new methods for discriminative training. We combine a margin-sensitive approach for data mining hard negative examples with a formalism we call latent SVM. A latent SVM, like a hidden CRF, leads to a non-convex training problem. However, a latent SVM is semi-convex and the training problem becomes convex once latent information is specified for the positive examples. We believe that our training methods will eventually make possible the effective use of more latent information such as hierarchical (grammar) models and models involving latent three dimensional pose.

International Journal of Computer Vision | 2006

Efficient Belief Propagation for Early Vision

Pedro F. Felzenszwalb; Daniel P. Huttenlocher

Markov random field models provide a robust and unified framework for early vision problems such as stereo and image restoration. Inference algorithms based on graph cuts and belief propagation have been found to yield accurate results, but despite recent advances are often too slow for practical use. In this paper we present some algorithmic techniques that substantially improve the running time of the loopy belief propagation approach. One of the techniques reduces the complexity of the inference algorithm to be linear rather than quadratic in the number of possible labels for each pixel, which is important for problems such as image restoration that have a large label set. Another technique speeds up and reduces the memory requirements of belief propagation on grid graphs. A third technique is a multi-grid method that makes it possible to obtain good results with a small fixed number of message passing iterations, independent of the size of the input images. Taken together these techniques speed up the standard algorithm by several orders of magnitude. In practice we obtain results that are as accurate as those of other global methods (e.g., using the Middlebury stereo benchmark) while being nearly as fast as purely local methods.
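
The linear-time message computation can be sketched for one concrete case. The abstract does not name a cost model, so this example assumes a pairwise cost that grows linearly with label difference, c * |fp - fq|. Under that assumption, a message m(fq) = min over fp of (c * |fp - fq| + h(fp)) can be computed with one forward and one backward pass instead of a double loop. The names `message_linear`, `message_brute`, `h`, and `c` are illustrative.

```python
def message_linear(h, c):
    """Min-sum message for pairwise cost c * |fp - fq|, in time linear in the label count."""
    m = list(h)
    # Forward pass: propagate the best value from smaller labels.
    for f in range(1, len(m)):
        m[f] = min(m[f], m[f - 1] + c)
    # Backward pass: propagate the best value from larger labels.
    for f in range(len(m) - 2, -1, -1):
        m[f] = min(m[f], m[f + 1] + c)
    return m


def message_brute(h, c):
    """Quadratic-time reference implementation of the same message."""
    n = len(h)
    return [min(c * abs(fp - fq) + h[fp] for fp in range(n)) for fq in range(n)]


# Usage: both versions agree on a small example.
h = [3, 1, 4, 1, 5, 9, 2, 6]
fast = message_linear(h, 1)
slow = message_brute(h, 1)
```

The two-pass version touches each label a constant number of times, while the reference version compares every pair of labels.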

Computer Vision and Pattern Recognition | 2004

Efficient belief propagation for early vision

Pedro F. Felzenszwalb; Daniel P. Huttenlocher

Markov random field models provide a robust and unified framework for early vision problems such as stereo, optical flow and image restoration. Inference algorithms based on graph cuts and belief propagation yield accurate results, but despite recent advances are often still too slow for practical use. In this paper we present new algorithmic techniques that substantially improve the running time of the belief propagation approach. One of our techniques reduces the complexity of the inference algorithm to be linear rather than quadratic in the number of possible labels for each pixel, which is important for problems such as optical flow or image restoration that have a large label set. A second technique makes it possible to obtain good results with a small fixed number of message passing iterations, independent of the size of the input images. Taken together these techniques speed up the standard algorithm by several orders of magnitude. In practice we obtain stereo, optical flow and image restoration algorithms that are as accurate as other global methods (e.g., using the Middlebury stereo benchmark) while being as fast as local techniques.

Theory of Computing | 2004

Distance Transforms of Sampled Functions

Pedro F. Felzenszwalb; Daniel P. Huttenlocher

We describe linear-time algorithms for solving a class of problems that involve transforming a cost function on a grid using spatial information. These problems can be viewed as a generalization of classical distance transforms of binary images, where the binary image is replaced by an arbitrary function on a grid. Alternatively they can be viewed in terms of the minimum convolution of two functions, which is an important operation in grayscale morphology. A consequence of our techniques is a simple and fast method for computing the Euclidean distance transform of a binary image. Our algorithms are also applicable to Viterbi decoding, belief propagation, and optimal control.
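
As an illustration of the kind of algorithm described, here is a sketch of a one-dimensional distance transform of a sampled function under the squared Euclidean distance, D(p) = min over q of ((p - q)^2 + f(q)), computed from the lower envelope of parabolas rooted at each sample. The function name `dt1d` and the variable names are assumptions, and the sketch assumes finite input values (for a binary image, a large finite number can stand in for "infinity").

```python
import math


def dt1d(f):
    """Squared-Euclidean distance transform of a sampled 1-D function, in linear time."""
    n = len(f)
    d = [0.0] * n
    v = [0] * n            # sample positions of parabolas in the lower envelope
    z = [0.0] * (n + 1)    # boundaries between consecutive envelope parabolas
    k = 0                  # index of the rightmost parabola in the envelope
    z[0], z[1] = -math.inf, math.inf
    for q in range(1, n):
        while True:
            # Intersection of the parabola rooted at q with the rightmost one.
            s = ((f[q] + q * q) - (f[v[k]] + v[k] * v[k])) / (2 * q - 2 * v[k])
            if s <= z[k]:
                k -= 1     # rightmost parabola is hidden; discard it
            else:
                break
        k += 1
        v[k] = q
        z[k] = s
        z[k + 1] = math.inf
    # Read the envelope back out, left to right.
    k = 0
    for q in range(n):
        while z[k + 1] < q:
            k += 1
        d[q] = (q - v[k]) ** 2 + f[v[k]]
    return d


def dt1d_brute(f):
    """Quadratic-time reference for checking the result."""
    n = len(f)
    return [min((p - q) ** 2 + f[q] for q in range(n)) for p in range(n)]


# Usage: the fast and reference versions agree.
f = [4.0, 1.0, 7.0, 0.0, 9.0, 3.0]
fast = dt1d(f)
slow = dt1d_brute(f)
```

Each parabola is added to the envelope once and removed at most once, which is where the linear running time comes from. Applying the same routine along rows and then columns gives a two-dimensional transform.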

Computer Vision and Pattern Recognition | 2010

Cascade object detection with deformable part models

Pedro F. Felzenszwalb; Ross B. Girshick; David A. McAllester

We describe a general method for building cascade classifiers from part-based deformable models such as pictorial structures. We focus primarily on the case of star-structured models and show how a simple algorithm based on partial hypothesis pruning can speed up object detection by more than one order of magnitude without sacrificing detection accuracy. In our algorithm, partial hypotheses are pruned with a sequence of thresholds. In analogy to probably approximately correct (PAC) learning, we introduce the notion of probably approximately admissible (PAA) thresholds. Such thresholds provide theoretical guarantees on the performance of the cascade method and can be computed from a small sample of positive examples. Finally, we outline a cascade detection algorithm for a general class of models defined by a grammar formalism. This class includes not only tree-structured pictorial structures but also richer models that can represent each part recursively as a mixture of other parts.
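
The pruning idea can be sketched in a heavily simplified form. This example assumes each hypothesis is just a list of per-part scores evaluated in a fixed order, and it leaves out the search over part placements and deformation costs entirely. The threshold rule shown (the smallest partial score observed on a sample of positive examples) is one simple choice made for illustration, not necessarily the rule used in the paper. All function and parameter names are hypothetical.

```python
def thresholds_from_positives(positive_examples):
    """Pick one pruning threshold per stage from sample positives (assumed rule).

    positive_examples: list of per-part score lists, all of the same length.
    Stage i's threshold is the smallest partial sum over parts 0..i.
    """
    n = len(positive_examples[0])
    return [min(sum(ex[: i + 1]) for ex in positive_examples) for i in range(n)]


def cascade_score(part_scores, thresholds, global_threshold):
    """Accumulate part scores, abandoning the hypothesis as soon as a stage fails.

    Returns the total score, or None if the hypothesis was pruned or rejected.
    """
    total = 0.0
    for score, t in zip(part_scores, thresholds):
        total += score
        if total < t:
            return None  # pruned early; remaining parts are never evaluated
    return total if total >= global_threshold else None


# Usage with made-up scores.
positives = [[2.0, 1.0, 1.0], [1.0, 2.0, 0.5]]
th = thresholds_from_positives(positives)
kept = cascade_score([1.5, 2.0, 1.0], th, global_threshold=3.0)
pruned = cascade_score([0.5, 5.0, 5.0], th, global_threshold=3.0)
```

The speedup comes from the early return: most hypotheses in an image are poor and fail at the first stage or two, so the remaining parts are evaluated only for a small fraction of locations.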

Communications of the ACM | 2013

Visual object detection with deformable part models

Pedro F. Felzenszwalb; Ross B. Girshick; David A. McAllester; Deva Ramanan

We describe a state-of-the-art system for finding objects in cluttered images. Our system is based on deformable models that represent objects using local part templates and geometric constraints on the locations of parts. We reduce object detection to classification with latent variables. The latent variables introduce invariances that make it possible to detect objects with highly variable appearance. We use a generalization of support vector machines to incorporate latent information during training. This has led to a general framework for discriminative training of classifiers with latent variables. Discriminative training benefits from large training datasets. In practice we use an iterative algorithm that alternates between estimating latent values for positive examples and solving a large convex optimization problem. Practical optimization of this large convex problem can be done using active set techniques for adaptive subsampling of the training data.

Computer Vision and Pattern Recognition | 2000

Efficient matching of pictorial structures

Pedro F. Felzenszwalb; Daniel P. Huttenlocher

A pictorial structure is a collection of parts arranged in a deformable configuration. Each part is represented using a simple appearance model and the deformable configuration is represented by spring-like connections between pairs of parts. While pictorial structures were introduced a number of years ago, they have not been broadly applied to matching and recognition problems. This has been due in part to the computational difficulty of matching pictorial structures to images. In this paper we present an efficient algorithm for finding the best global match of a pictorial structure to an image. With this improved algorithm, pictorial structures provide a practical and powerful framework for quantitative descriptions of objects and scenes, and are suitable for many generic image recognition problems. We illustrate the approach using simple models of a person and a car.
