Marc Pollefeys
ETH Zurich
Publications
Featured research published by Marc Pollefeys.
International Conference on Computer Vision | 1998
Marc Pollefeys; Reinhard Koch; L. Van Gool
In this paper the feasibility of self-calibration in the presence of varying internal camera parameters is under investigation. A self-calibration method is presented which efficiently deals with all kinds of constraints on the internal camera parameters. Within this framework a practical method is proposed which can retrieve metric reconstruction from image sequences obtained with uncalibrated zooming/focusing cameras. The feasibility of the approach is illustrated on real and synthetic examples.
International Journal of Computer Vision | 2004
Marc Pollefeys; Luc Van Gool; Maarten Vergauwen; Frank Verbiest; Kurt Cornelis; Jan Tops; Reinhard Koch
In this paper a complete system to build visual models from camera images is presented. The system can deal with uncalibrated image sequences acquired with a hand-held camera. Based on tracked or matched features, the relations between multiple views are computed. From this, both the structure of the scene and the motion of the camera are retrieved. The ambiguity on the reconstruction is restricted from projective to metric through self-calibration. A flexible multi-view stereo matching scheme is used to obtain a dense estimation of the surface geometry. From the computed data, different types of visual models are constructed. Besides the traditional geometry- and image-based approaches, a combined approach with view-dependent geometry and texture is presented. As an application, the fusion of real and virtual scenes is also shown.
International Journal of Computer Vision | 2008
Marc Pollefeys; David Nistér; Jan Michael Frahm; Amir Akbarzadeh; Philippos Mordohai; Brian Clipp; Chris Engels; David Gallup; Seon Joo Kim; Paul Merrell; C. Salmi; Sudipta N. Sinha; B. Talton; Liang Wang; Qingxiong Yang; Henrik Stewenius; Ruigang Yang; Greg Welch; Herman Towles
The paper presents a system for automatic, geo-registered, real-time 3D reconstruction from video of urban scenes. The system collects video streams, as well as GPS and inertial measurements, in order to place the reconstructed models in geo-registered coordinates. It is designed using current state-of-the-art real-time modules for all processing steps. It employs commodity graphics hardware and standard CPUs to achieve real-time performance. We present the main considerations in designing the system and the steps of the processing pipeline. Our system extends existing algorithms to meet the robustness and variability necessary to operate out of the lab. To account for the large dynamic range of outdoor videos, the processing pipeline estimates global camera gain changes in the feature tracking stage and efficiently compensates for these in stereo estimation without impacting the real-time performance. The required accuracy for many applications is achieved with a two-step stereo reconstruction process exploiting the redundancy across frames. We show results on real video sequences comprising hundreds of thousands of frames.
International Journal of Computer Vision | 1999
Marc Pollefeys; Reinhard Koch; Luc Van Gool
In this paper the theoretical and practical feasibility of self-calibration in the presence of varying intrinsic camera parameters is under investigation. The paper's main contribution is to propose a self-calibration method which efficiently deals with all kinds of constraints on the intrinsic camera parameters. Within this framework a practical method is proposed which can retrieve metric reconstruction from image sequences obtained with uncalibrated zooming/focusing cameras. The feasibility of the approach is illustrated on real and synthetic examples. Besides this, a theoretical proof is given which shows that the absence of skew in the image plane is sufficient to allow for self-calibration. A counting argument is developed which, depending on the set of constraints, gives the minimum sequence length for self-calibration, and a method to detect critical motion sequences is proposed.
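The abstract does not spell out the counting argument. A commonly quoted form, assumed here, is n*n_k + (n-1)*n_f >= 8, where n is the number of views, n_k the number of intrinsic parameters known in every view, and n_f the number that are unknown but fixed across views. The 8 is the gap between the 15 degrees of freedom of a projective ambiguity and the 7 of a metric one. The function name and the `max_views` cutoff below are illustrative choices, not taken from the paper.

```python
def min_sequence_length(n_known, n_fixed, max_views=100):
    """Smallest number of views n with n*n_known + (n-1)*n_fixed >= 8.

    n_known: intrinsics known in every view (e.g. zero skew counts as 1).
    n_fixed: intrinsics unknown but constant over the sequence.
    Returns None if no n up to max_views satisfies the inequality.
    """
    if n_known < 0 or n_fixed < 0 or n_known + n_fixed > 5:
        raise ValueError("a pinhole camera has at most 5 intrinsic parameters")
    for n in range(2, max_views + 1):
        if n * n_known + (n - 1) * n_fixed >= 8:
            return n
    return None


# Zero skew only: each view contributes one constraint.
print(min_sequence_length(n_known=1, n_fixed=0))
# All five intrinsics unknown but constant.
print(min_sequence_length(n_known=0, n_fixed=5))
```

Under this assumed inequality, knowing only that skew is zero requires 8 views, while fully constant intrinsics require 3.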
European Conference on Computer Vision | 2006
Jingyu Yan; Marc Pollefeys
We cast the problem of motion segmentation of feature trajectories as linear manifold finding problems and propose a general framework for motion segmentation under affine projections which utilizes two properties of trajectory data: geometric constraint and locality. The geometric constraint states that the trajectories of the same motion lie in a low dimensional linear manifold and different motions result in different linear manifolds; locality, by which we mean that, in a transformed space, a data point and its neighbors tend to lie in the same linear manifold, provides a cue for efficient estimation of these manifolds. Our algorithm estimates a number of linear manifolds, whose dimensions are unknown beforehand, and segments the trajectories accordingly. It first transforms and normalizes the trajectories; secondly, for each trajectory it estimates a local linear manifold through local sampling; then it derives the affinity matrix based on principal subspace angles between these estimated linear manifolds; finally, spectral clustering is applied to the matrix and gives the segmentation result. Our algorithm is general without restriction on the number of linear manifolds and without prior knowledge of the dimensions of the linear manifolds. We demonstrate in our experiments that it can segment a wide range of motions including independent, articulated, rigid, non-rigid, degenerate, non-degenerate or any combination of them. In some highly challenging cases where other state-of-the-art motion segmentation algorithms may fail, our algorithm gives expected results.
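Two of the steps listed in this abstract, local manifold estimation and the affinity from principal angles, can be sketched with NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: the subspace dimension `dim` is passed in by hand (the abstract says dimensions are unknown beforehand), the affinity is taken to be exp(-sum of sin^2 of the principal angles), and both function names are hypothetical. The normalization and spectral clustering steps are omitted.

```python
import numpy as np


def local_subspace(X, i, n_neighbors, dim):
    """Orthonormal basis of a subspace fitted to column i of X and its
    n_neighbors nearest columns (Euclidean distance)."""
    dists = np.linalg.norm(X - X[:, [i]], axis=0)
    idx = np.argsort(dists)[: n_neighbors + 1]  # column i itself is included
    U, _, _ = np.linalg.svd(X[:, idx], full_matrices=False)
    return U[:, :dim]


def subspace_affinity(B1, B2):
    """exp(-sum sin^2(theta_k)) over the principal angles theta_k between
    the column spaces of the orthonormal bases B1 and B2."""
    # Singular values of B1^T B2 are the cosines of the principal angles.
    cosines = np.clip(np.linalg.svd(B1.T @ B2, compute_uv=False), 0.0, 1.0)
    return float(np.exp(-np.sum(1.0 - cosines ** 2)))


# Toy data: three points in span(e1, e2), three in span(e3, e4).
X = np.array([
    [1.0, 0.0, 1.0, 0.0, 0.0, 0.0],
    [0.0, 1.0, 1.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 3.0, 3.0, 4.0],
    [0.0, 0.0, 0.0, 0.0, 1.0, 1.0],
])
bases = [local_subspace(X, i, n_neighbors=2, dim=2) for i in range(X.shape[1])]
A = np.array([[subspace_affinity(p, q) for q in bases] for p in bases])
```

In this toy case the affinity matrix `A` is close to block-diagonal, which is the structure a spectral clustering step would then exploit.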
European Conference on Computer Vision | 2008
Rahul Raguram; Jan Michael Frahm; Marc Pollefeys
The Random Sample Consensus (RANSAC) algorithm is a popular tool for robust estimation problems in computer vision, primarily due to its ability to tolerate a tremendous fraction of outliers. There have been a number of recent efforts that aim to increase the efficiency of the standard RANSAC algorithm. Relatively fewer efforts, however, have been directed towards formulating RANSAC in a manner that is suitable for real-time implementation. The contributions of this work are two-fold: First, we provide a comparative analysis of the state-of-the-art RANSAC algorithms and categorize the various approaches. Second, we develop a powerful new framework for real-time robust estimation. The technique we develop is capable of efficiently adapting to the constraints presented by a fixed time budget, while at the same time providing accurate estimation over a wide range of inlier ratios. The method shows significant improvements in accuracy and speed over existing techniques.
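To make the idea of a fixed time budget concrete, here is a minimal sketch of plain RANSAC for 2D line fitting that stops when a wall-clock deadline passes. It is standard RANSAC with a deadline, not the framework proposed in the paper; the function names, the line model, and the parameters `threshold` and `time_budget_s` are all illustrative assumptions.

```python
import random
import time


def fit_line(p, q):
    """Line through p and q as (a, b, c) with a*x + b*y + c = 0 and
    a^2 + b^2 = 1. Returns None if the two points coincide."""
    (x1, y1), (x2, y2) = p, q
    a, b = y2 - y1, x1 - x2
    norm = (a * a + b * b) ** 0.5
    if norm == 0.0:
        return None
    a, b = a / norm, b / norm
    return a, b, -(a * x1 + b * y1)


def ransac_line(points, threshold, time_budget_s, seed=0):
    """Keep sampling 2-point hypotheses until the time budget is spent;
    return the model with the most inliers and those inliers."""
    rng = random.Random(seed)
    deadline = time.monotonic() + time_budget_s
    best_model, best_inliers = None, []
    while True:  # always runs at least one iteration
        model = fit_line(*rng.sample(points, 2))
        if model is not None:
            a, b, c = model
            inliers = [pt for pt in points
                       if abs(a * pt[0] + b * pt[1] + c) <= threshold]
            if len(inliers) > len(best_inliers):
                best_model, best_inliers = model, inliers
        if time.monotonic() >= deadline:
            break
    return best_model, best_inliers


# 20 points on y = 2x + 1 plus 10 outliers on a parabola.
line_points = [(float(x), 2.0 * x + 1.0) for x in range(20)]
outliers = [(float(i), -3.0 * i * i - 5.0) for i in range(10)]
model, inliers = ransac_line(line_points + outliers, threshold=0.01,
                             time_budget_s=0.05)
```

The number of hypotheses evaluated here depends on machine speed, which is the trade-off a real-time formulation has to manage more carefully than this loop does.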
Computer Vision and Pattern Recognition | 2003
Ruigang Yang; Marc Pollefeys
In this paper a stereo algorithm suitable for implementation on commodity graphics hardware is presented. This is important since it allows freeing up the main processor for other tasks including high-level interpretation of the stereo results. Our algorithm relies on the traditional sum-of-square-differences (SSD) dissimilarity measure between correlation windows. To achieve good results close to depth discontinuities as well as on low texture areas, a multi-resolution approach is used. The approach efficiently combines SSD measurements for windows of different sizes. Our implementation running on an NVIDIA GeForce4 graphics card achieves 50-70M disparity evaluations per second including all the overhead to download images and read-back the disparity map, which is equivalent to the fastest commercial CPU implementations available. An important advantage of our approach is that rectification is not necessary so that correspondences can just as easily be obtained for images that contain the epipoles. Another advantage is that this approach can easily be extended to multi-baseline stereo.
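The idea of combining SSD scores from windows of different sizes can be illustrated on a single scanline in plain Python. This is a CPU-side sketch only, not the graphics-hardware implementation described in the abstract. The window half-sizes, the per-window averaging, and the function names are assumptions made for the example.

```python
def ssd(left, right, x, d, half):
    """Mean squared difference between the window of `left` centred at x
    and the window of `right` centred at x - d. Samples falling outside
    either scanline are skipped."""
    total, count = 0.0, 0
    for k in range(-half, half + 1):
        xl, xr = x + k, x + k - d
        if 0 <= xl < len(left) and 0 <= xr < len(right):
            diff = left[xl] - right[xr]
            total += diff * diff
            count += 1
    return total / count if count else float("inf")


def disparity_scanline(left, right, max_disp, half_sizes=(1, 2, 4)):
    """Per-pixel disparity minimizing the SSD cost summed over several
    window sizes (ties go to the smallest disparity)."""
    disparities = []
    for x in range(len(left)):
        best_d, best_cost = 0, float("inf")
        for d in range(max_disp + 1):
            cost = sum(ssd(left, right, x, d, h) for h in half_sizes)
            if cost < best_cost:
                best_d, best_cost = d, cost
        disparities.append(best_d)
    return disparities


# Synthetic scanline pair: the left signal is the right one shifted by 3.
right_row = [float((7 * i * i + 3 * i) % 97) for i in range(40)]
left_row = [0.0] * 3 + right_row[:-3]
disp = disparity_scanline(left_row, right_row, max_disp=6)
```

Small windows keep the estimate sharp near depth discontinuities, while large windows stabilize it in low-texture regions; summing their costs is one simple way to get some of both.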
Computer Vision and Pattern Recognition | 2011
David M. Chen; Georges Baatz; Kevin Köser; Sam S. Tsai; Ramakrishna Vedantham; Timo Pylvänäinen; Kimmo Roimela; Xin Chen; Jeff Bach; Marc Pollefeys; Bernd Girod; Radek Grzeszczuk
With recent advances in mobile computing, the demand for visual localization or landmark identification on mobile devices is gaining interest. We advance the state of the art in this area by fusing two popular representations of street-level image data (facade-aligned and viewpoint-aligned) and show that they contain complementary information that can be exploited to significantly improve the recall rates on the city scale. We also improve feature detection in low contrast parts of the street-level data, and discuss how to incorporate priors on a user's position (e.g. given by noisy GPS readings or network cells), which previous approaches often ignore. Finally, and maybe most importantly, we present our results according to a carefully designed, repeatable evaluation scheme and make publicly available a set of 1.7 million images with ground truth labels, geotags, and calibration data, as well as a difficult set of cell phone query images. We provide these resources as a benchmark to facilitate further research in the area.
International Conference on Computer Vision | 2007
Paul Merrell; Amir Akbarzadeh; Liang Wang; Philippos Mordohai; Jan Michael Frahm; Ruigang Yang; David Nistér; Marc Pollefeys
We present a viewpoint-based approach for the quick fusion of multiple stereo depth maps. Our method selects depth estimates for each pixel that minimize violations of visibility constraints and thus remove errors and inconsistencies from the depth maps to produce a consistent surface. We advocate a two-stage process in which the first stage generates potentially noisy, overlapping depth maps from a set of calibrated images and the second stage fuses these depth maps to obtain an integrated surface with higher accuracy, suppressed noise, and reduced redundancy. We show that by dividing the processing into two stages we are able to achieve a very high throughput because we are able to use a computationally cheap stereo algorithm and because this architecture is amenable to hardware-accelerated (GPU) implementations. A rigorous formulation based on the notion of stability of a depth estimate is presented first. It aims to determine the validity of a depth estimate by rendering multiple depth maps into the reference view as well as rendering the reference depth map into the other views in order to detect occlusions and free-space violations. We also present an approximate alternative formulation that selects and validates only one hypothesis based on confidence. Both formulations enable us to perform video-based reconstruction at up to 25 frames per second. We show results on the multi-view stereo evaluation benchmark datasets and several outdoor video sequences. Extensive quantitative analysis is performed using an accurately surveyed model of a real building as ground truth.
International Conference on Computer Graphics and Interactive Techniques | 2008
Sudipta N. Sinha; Drew Steedly; Richard Szeliski; Maneesh Agrawala; Marc Pollefeys
We present an interactive system for generating photorealistic, textured, piecewise-planar 3D models of architectural structures and urban scenes from unordered sets of photographs. To reconstruct 3D geometry in our system, the user draws outlines overlaid on 2D photographs. The 3D structure is then automatically computed by combining the 2D interaction with the multi-view geometric information recovered by performing structure from motion analysis on the input photographs. We utilize vanishing point constraints at multiple stages during the reconstruction, which is particularly useful for architectural scenes where parallel lines are abundant. Our approach enables us to accurately model polygonal faces from 2D interactions in a single image. Our system also supports useful operations such as edge snapping and extrusions. Seamless texture maps are automatically generated by combining multiple input photographs using graph cut optimization and Poisson blending. The user can add brush strokes as hints during the texture generation stage to remove artifacts caused by unmodeled geometric structures. We build models for a variety of architectural scenes from collections of up to about a hundred photographs.