David Nistér | Researchain

Publication

Featured research published by David Nistér.

Computer Vision and Pattern Recognition | 2006

Scalable Recognition with a Vocabulary Tree

David Nistér; Henrik Stewenius

A recognition scheme that scales efficiently to a large number of objects is presented. The efficiency and quality are exhibited in a live demonstration that recognizes CD covers from a database of 40000 images of popular music CDs. The scheme builds upon popular techniques of indexing descriptors extracted from local regions, and is robust to background clutter and occlusion. The local region descriptors are hierarchically quantized in a vocabulary tree. The vocabulary tree allows a larger and more discriminatory vocabulary to be used efficiently, which we show experimentally leads to a dramatic improvement in retrieval quality. The most significant property of the scheme is that the tree directly defines the quantization. The quantization and the indexing are therefore fully integrated, essentially being one and the same. The recognition quality is evaluated through retrieval on a database with ground truth, showing the power of the vocabulary tree approach on databases of up to 1 million images.
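
The idea that "the tree directly defines the quantization" can be illustrated with a minimal sketch. It assumes hierarchical k-means with plain Lloyd iterations and a deterministic initialization; the names `build_tree`, `quantize`, `branch`, and `depth` are hypothetical and are not taken from the paper. A descriptor is quantized by descending the tree, choosing the nearest child center at each level, and the resulting path serves as its visual word.

```python
from typing import List, Optional, Sequence, Tuple

Vector = Sequence[float]


def sq_dist(a: Vector, b: Vector) -> float:
    """Squared Euclidean distance between two descriptors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(point: Vector, centers: List[Vector]) -> int:
    """Index of the center closest to the point (ties go to the lowest index)."""
    return min(range(len(centers)), key=lambda i: sq_dist(point, centers[i]))


def kmeans(points: List[Vector], k: int, iters: int = 10):
    """Plain Lloyd's k-means, initialized with the first k points (an assumption)."""
    centers = [list(p) for p in points[:k]]
    for _ in range(iters):
        clusters = [[] for _ in centers]
        for p in points:
            clusters[nearest(p, centers)].append(p)
        for i, members in enumerate(clusters):
            if members:  # keep the old center if a cluster went empty
                centers[i] = [sum(c) / len(members) for c in zip(*members)]
    # Final assignment against the final centers, so it matches quantize().
    clusters = [[] for _ in centers]
    for p in points:
        clusters[nearest(p, centers)].append(p)
    return centers, clusters


class Node:
    def __init__(self, center: Optional[Vector]):
        self.center = center
        self.children: List["Node"] = []


def build_tree(points: List[Vector], branch: int, depth: int,
               center: Optional[Vector] = None) -> Node:
    """Recursively split the descriptors into `branch` groups, `depth` levels deep."""
    node = Node(center)
    if depth == 0 or len(points) < branch:
        return node
    centers, clusters = kmeans(points, branch)
    for c, members in zip(centers, clusters):
        if members:
            node.children.append(build_tree(members, branch, depth - 1, c))
    return node


def quantize(root: Node, descriptor: Vector) -> Tuple[int, ...]:
    """Descend the tree; the path of child indices is the visual word."""
    path = []
    node = root
    while node.children:
        idx = nearest(descriptor, [child.center for child in node.children])
        path.append(idx)
        node = node.children[idx]
    return tuple(path)
```

With branching factor k and depth L, a lookup costs roughly k·L distance computations while the vocabulary can hold up to k^L leaves, which is why a much larger vocabulary stays affordable. An inverted file keyed by these paths would complete the indexing; that part is omitted here.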

International Journal of Computer Vision | 2008

Detailed Real-Time Urban 3D Reconstruction from Video

Marc Pollefeys; David Nistér; Jan Michael Frahm; Amir Akbarzadeh; Philippos Mordohai; Brian Clipp; Chris Engels; David Gallup; Seon Joo Kim; Paul Merrell; C. Salmi; Sudipta N. Sinha; B. Talton; Liang Wang; Qingxiong Yang; Henrik Stewenius; Ruigang Yang; Greg Welch; Herman Towles

The paper presents a system for automatic, geo-registered, real-time 3D reconstruction from video of urban scenes. The system collects video streams, as well as GPS and inertial measurements, in order to place the reconstructed models in geo-registered coordinates. It is designed using current state-of-the-art real-time modules for all processing steps. It employs commodity graphics hardware and standard CPUs to achieve real-time performance. We present the main considerations in designing the system and the steps of the processing pipeline. Our system extends existing algorithms to meet the robustness and variability necessary to operate out of the lab. To account for the large dynamic range of outdoor videos, the processing pipeline estimates global camera gain changes in the feature tracking stage and efficiently compensates for these in stereo estimation without impacting the real-time performance. The required accuracy for many applications is achieved with a two-step stereo reconstruction process exploiting the redundancy across frames. We show results on real video sequences comprising hundreds of thousands of frames.

Computer Vision and Pattern Recognition | 2007

Spatial-Depth Super Resolution for Range Images

Qingxiong Yang; Ruigang Yang; James Davis; David Nistér

We present a new post-processing step to enhance the resolution of range images. Using one or two registered and potentially high-resolution color images as reference, we iteratively refine the input low-resolution range image, in terms of both its spatial resolution and depth precision. Evaluation using the Middlebury benchmark shows across-the-board improvement for sub-pixel accuracy. We also demonstrate its effectiveness for spatial resolution enhancement up to 100 times with a single reference image.

Journal of Field Robotics | 2006

Visual odometry for ground vehicle applications

David Nistér; Oleg Naroditsky; James R. Bergen

We present a system that estimates the motion of a stereo head, or a single moving camera, based on video input. The system operates in real time with low delay, and the motion estimates are used for navigational purposes. The front end of the system is a feature tracker. Point features are matched between pairs of frames and linked into image trajectories at video rate. Robust estimates of the camera motion are then produced from the feature tracks using a geometric hypothesize-and-test architecture. This generates motion estimates from visual input alone. No prior knowledge of the scene or the motion is necessary. The visual estimates can also be used in conjunction with information from other sources, such as a global positioning system, inertial sensors, wheel encoders, etc. The pose estimation method has been applied successfully to video from aerial, automotive, and handheld platforms. We focus on results obtained with a stereo head mounted on an autonomous ground vehicle. We give examples of camera trajectories estimated in real time purely from images over previously unseen distances (600 m) and periods of time.
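
A camera trajectory is obtained by chaining the per-frame motion estimates. The sketch below shows only that accumulation step, under a loud simplification: poses are planar `(x, y, heading)` triples rather than the full 3D poses a real system would estimate, and the function names `compose` and `integrate` are hypothetical.

```python
import math
from typing import List, Sequence, Tuple

Pose = Tuple[float, float, float]  # (x, y, heading in radians), planar assumption


def compose(pose: Pose, motion: Pose) -> Pose:
    """Apply a motion expressed in the camera's own frame to a world-frame pose."""
    x, y, theta = pose
    dx, dy, dtheta = motion
    return (
        x + math.cos(theta) * dx - math.sin(theta) * dy,
        y + math.sin(theta) * dx + math.cos(theta) * dy,
        theta + dtheta,
    )


def integrate(motions: Sequence[Pose], start: Pose = (0.0, 0.0, 0.0)) -> List[Pose]:
    """Chain frame-to-frame motion estimates into a trajectory of poses."""
    trajectory = [start]
    for motion in motions:
        trajectory.append(compose(trajectory[-1], motion))
    return trajectory
```

Because each pose is built on the previous one, errors in individual motion estimates accumulate along the trajectory, which is one reason the estimates may be combined with other sensors.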

Machine Vision Applications | 2005

Preemptive RANSAC for live structure and motion estimation

David Nistér

A system capable of performing robust live ego-motion estimation for perspective cameras is presented. The system is powered by random sample consensus with preemptive scoring of the motion hypotheses. A general statement of the problem of efficient preemptive scoring is given. Then a theoretical investigation of preemptive scoring under a simple inlier–outlier model is performed. A practical preemption scheme is proposed and it is shown that the preemption is powerful enough to enable robust live structure and motion estimation.
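
Preemptive scoring can be sketched as follows. This is an illustration under stated assumptions, not the paper's scheme: all hypotheses are scored breadth-first on one block of observations at a time, and the weaker half is discarded after each block. The halving schedule, the inlier-count score, and the names `preemptive_select`, `score_fn`, and `block_size` are choices made for this example.

```python
from typing import Callable, List, Sequence, TypeVar

H = TypeVar("H")  # hypothesis type
O = TypeVar("O")  # observation type


def preemptive_select(
    hypotheses: Sequence[H],
    observations: Sequence[O],
    score_fn: Callable[[H, O], float],
    block_size: int,
) -> H:
    """Score hypotheses block by block, halving the surviving set each time."""
    scores: List[float] = [0.0] * len(hypotheses)
    alive = list(range(len(hypotheses)))
    for start in range(0, len(observations), block_size):
        if len(alive) == 1:
            break  # a single survivor needs no further scoring
        block = observations[start:start + block_size]
        for h in alive:
            scores[h] += sum(score_fn(hypotheses[h], o) for o in block)
        # Keep the better half (at least one), judged on the evidence so far.
        alive.sort(key=lambda h: scores[h], reverse=True)
        alive = alive[:max(1, len(alive) // 2)]
    best = max(alive, key=lambda h: scores[h])
    return hypotheses[best]


def inlier_score(offset: float, measurement: float, tol: float = 0.2) -> float:
    """Toy score: 1 if the measurement agrees with the hypothesized offset."""
    return 1.0 if abs(measurement - offset) < tol else 0.0
```

The total work is bounded in advance by the number of hypotheses and the block size, which suits a live system with a fixed time budget; the trade-off is that a good hypothesis can be dropped early if the first blocks happen to contain outliers.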

Computer Vision and Pattern Recognition | 2003

An efficient solution to the five-point relative pose problem

David Nistér

An efficient algorithmic solution to the classical five-point relative pose problem is presented. The problem is to find the possible solutions for relative camera motion between two calibrated views given five corresponding points. The algorithm consists of computing the coefficients of a tenth-degree polynomial and subsequently finding its roots. It is the first algorithm well suited for numerical implementation that also corresponds to the inherent complexity of the problem. The algorithm is used in a robust hypothesize-and-test framework to estimate structure and motion in real time.

British Machine Vision Conference | 2006

Real-time Global Stereo Matching Using Hierarchical Belief Propagation

Qingxiong Yang; Liang Wang; Ruigang Yang; Shengnan Wang; Miao Liao; David Nistér

In this paper, we present a belief propagation based global algorithm that generates high-quality results while maintaining real-time performance. To our knowledge, it is the first BP-based global method that runs at real-time speed. Our efficiency gains come mainly from the parallelism of graphics hardware, which leads to a 45-fold speedup compared to the CPU implementation. To assess the accuracy of our approach, the experimental results are evaluated on the Middlebury data sets, showing that our approach is among the best (ranked first in the new evaluation system) of all real-time approaches. In addition, since the running time of general BP is linear in the number of iterations, adopting a large number of iterations is not feasible for practical applications. Hence a novel approach is proposed to adaptively update pixel cost. Unlike general BP methods, the running time of our proposed algorithm converges dramatically.

International Conference on Computer Vision | 2007

Real-Time Visibility-Based Fusion of Depth Maps

Paul Merrell; Amir Akbarzadeh; Liang Wang; Philippos Mordohai; Jan Michael Frahm; Ruigang Yang; David Nistér; Marc Pollefeys

We present a viewpoint-based approach for the quick fusion of multiple stereo depth maps. Our method selects depth estimates for each pixel that minimize violations of visibility constraints and thus removes errors and inconsistencies from the depth maps to produce a consistent surface. We advocate a two-stage process in which the first stage generates potentially noisy, overlapping depth maps from a set of calibrated images and the second stage fuses these depth maps to obtain an integrated surface with higher accuracy, suppressed noise, and reduced redundancy. We show that by dividing the processing into two stages we are able to achieve a very high throughput because we are able to use a computationally cheap stereo algorithm and because this architecture is amenable to hardware-accelerated (GPU) implementations. A rigorous formulation based on the notion of stability of a depth estimate is presented first. It aims to determine the validity of a depth estimate by rendering multiple depth maps into the reference view as well as rendering the reference depth map into the other views in order to detect occlusions and free-space violations. We also present an approximate alternative formulation that selects and validates only one hypothesis based on confidence. Both formulations enable us to perform video-based reconstruction at up to 25 frames per second. We show results on the multi-view stereo evaluation benchmark datasets and several outdoor video sequences. Extensive quantitative analysis is performed using an accurately surveyed model of a real building as ground truth.

Computer Vision and Pattern Recognition | 2006

Stereo Matching with Color-Weighted Correlation, Hierarchical Belief Propagation and Occlusion Handling

Qingxiong Yang; Liang Wang; Ruigang Yang; Henrik Stewenius; David Nistér

In this paper, we formulate an algorithm for the stereo matching problem with careful handling of disparity, discontinuity and occlusion. The algorithm works with a global matching stereo model based on an energy-minimization framework. The global energy contains two terms, the data term and the smoothness term. The data term is first approximated by a color-weighted correlation, then refined in occluded and low-texture areas in a repeated application of a hierarchical loopy belief propagation algorithm. The experimental results are evaluated on the Middlebury data set, showing that our algorithm is the top performer.

International Symposium on 3D Data Processing Visualization and Transmission | 2006

Towards Urban 3D Reconstruction from Video

Amir Akbarzadeh; Jan Michael Frahm; Philippos Mordohai; Brian Clipp; Chris Engels; David Gallup; Paul Merrell; M. Phelps; Sudipta N. Sinha; B. Talton; Liang Wang; Qingxiong Yang; Henrik Stewenius; Ruigang Yang; Greg Welch; Herman Towles; David Nistér; Marc Pollefeys

The paper introduces a data collection system and a processing pipeline for automatic geo-registered 3D reconstruction of urban scenes from video. The system collects multiple video streams, as well as GPS and INS measurements in order to place the reconstructed models in geo-registered coordinates. Besides high quality in terms of both geometry and appearance, we aim at real-time performance. Even though our processing pipeline is currently far from being real-time, we select techniques and we design processing modules that can achieve fast performance on multiple CPUs and GPUs aiming at real-time performance in the near future. We present the main considerations in designing the system and the steps of the processing pipeline. We show results on real video sequences captured by our system.
