Henrik Stewenius
University of Kentucky
Publication
Featured research published by Henrik Stewenius.
Computer Vision and Pattern Recognition | 2006
David Nistér; Henrik Stewenius
A recognition scheme that scales efficiently to a large number of objects is presented. The efficiency and quality are exhibited in a live demonstration that recognizes CD covers from a database of 40,000 images of popular music CDs. The scheme builds upon popular techniques of indexing descriptors extracted from local regions, and is robust to background clutter and occlusion. The local region descriptors are hierarchically quantized in a vocabulary tree. The vocabulary tree allows a larger and more discriminatory vocabulary to be used efficiently, which we show experimentally leads to a dramatic improvement in retrieval quality. The most significant property of the scheme is that the tree directly defines the quantization. The quantization and the indexing are therefore fully integrated, essentially being one and the same. The recognition quality is evaluated through retrieval on a database with ground truth, showing the power of the vocabulary tree approach, scaling as high as 1 million images.
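The core idea of the abstract, quantizing a descriptor by descending a tree of cluster centers so that the tree itself defines the visual vocabulary, can be sketched in a few lines. This is an illustrative toy, not the authors' implementation: the hand-built two-level tree, the `nearest` helper, and the 2-D "descriptors" stand in for a tree learned by hierarchical k-means over high-dimensional local-region descriptors.

```python
# Toy sketch of hierarchical quantization in a vocabulary tree.
# The tree here is hand-built for illustration; in the paper it is
# learned by hierarchical k-means on training descriptors.

def nearest(centers, x):
    """Index of the center closest to descriptor x (squared L2)."""
    best, best_d = 0, float("inf")
    for i, c in enumerate(centers):
        d = sum((a - b) ** 2 for a, b in zip(c, x))
        if d < best_d:
            best, best_d = i, d
    return best

def quantize(tree, x):
    """Descend the tree, picking the nearest child center at each
    level. The returned path of branch indices identifies a leaf,
    i.e. a 'visual word'."""
    path = []
    node = tree
    while node["children"]:
        i = nearest([c["center"] for c in node["children"]], x)
        path.append(i)
        node = node["children"][i]
    return tuple(path)

# Toy 2-level tree with branch factor 2 over 2-D descriptors.
leaf = lambda c: {"center": c, "children": []}
tree = {
    "center": (0.0, 0.0),
    "children": [
        {"center": (0.0, 0.0),
         "children": [leaf((0.0, 0.0)), leaf((0.0, 1.0))]},
        {"center": (1.0, 1.0),
         "children": [leaf((1.0, 0.0)), leaf((1.0, 1.0))]},
    ],
}
```

Because quantization is just a root-to-leaf descent, looking up a visual word costs only branch-factor comparisons per level, which is what lets a very large vocabulary be used efficiently.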
International Journal of Computer Vision | 2008
Marc Pollefeys; David Nistér; Jan Michael Frahm; Amir Akbarzadeh; Philippos Mordohai; Brian Clipp; Chris Engels; David Gallup; Seon Joo Kim; Paul Merrell; C. Salmi; Sudipta N. Sinha; B. Talton; Liang Wang; Qingxiong Yang; Henrik Stewenius; Ruigang Yang; Greg Welch; Herman Towles
The paper presents a system for automatic, geo-registered, real-time 3D reconstruction from video of urban scenes. The system collects video streams, as well as GPS and inertial measurements, in order to place the reconstructed models in geo-registered coordinates. It is designed using current state-of-the-art real-time modules for all processing steps. It employs commodity graphics hardware and standard CPUs to achieve real-time performance. We present the main considerations in designing the system and the steps of the processing pipeline. Our system extends existing algorithms to meet the robustness and variability necessary to operate out of the lab. To account for the large dynamic range of outdoor videos, the processing pipeline estimates global camera gain changes in the feature tracking stage and efficiently compensates for these in stereo estimation without impacting the real-time performance. The required accuracy for many applications is achieved with a two-step stereo reconstruction process exploiting the redundancy across frames. We show results on real video sequences comprising hundreds of thousands of frames.
IEEE Transactions on Pattern Analysis and Machine Intelligence | 2009
Qingxiong Yang; Liang Wang; Ruigang Yang; Henrik Stewenius; David Nistér
In this paper, we formulate a stereo matching algorithm with careful handling of disparity, discontinuity, and occlusion. The algorithm works with a global matching stereo model based on an energy-minimization framework. The global energy contains two terms, the data term and the smoothness term. The data term is first approximated by a color-weighted correlation, then refined in occluded and low-texture areas in a repeated application of a hierarchical loopy belief propagation algorithm. The experimental results are evaluated on the Middlebury data sets, showing that our algorithm is the top performer among all the algorithms listed there.
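The color-weighted correlation used for the data term can be illustrated on a toy 1-D example. This is a hedged sketch of the general adaptive-support-weight idea, not the authors' exact formulation: the `gamma` parameter, the window radius, and the scalar "colors" are illustrative choices.

```python
import math

def color_weight(c1, c2, gamma=10.0):
    """Support weight: window pixels whose color is close to the
    window center get high weight (adaptive-support-weight idea)."""
    return math.exp(-abs(c1 - c2) / gamma)

def data_cost(left, right, x, d, radius=1):
    """Color-weighted correlation cost for matching left[x] against
    right[x - d], aggregated over a small window."""
    num = den = 0.0
    for o in range(-radius, radius + 1):
        xl, xr = x + o, x - d + o
        if 0 <= xl < len(left) and 0 <= xr < len(right):
            # Weight by color similarity to both window centers.
            w = (color_weight(left[x], left[xl])
                 * color_weight(right[x - d], right[xr]))
            num += w * abs(left[xl] - right[xr])
            den += w
    return num / den if den else float("inf")
```

The weighting lets support come mostly from window pixels likely to lie on the same surface as the center, which is why such costs behave well near depth discontinuities; in the paper this data term is then refined by hierarchical loopy belief propagation, which is beyond this sketch.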
Computer Vision and Pattern Recognition | 2006
Qingxiong Yang; Liang Wang; Ruigang Yang; Henrik Stewenius; David Nistér
In this paper, we formulate an algorithm for the stereo matching problem with careful handling of disparity, discontinuity and occlusion. The algorithm works with a global matching stereo model based on an energy-minimization framework. The global energy contains two terms, the data term and the smoothness term. The data term is first approximated by a color-weighted correlation, then refined in occluded and low-texture areas in a repeated application of a hierarchical loopy belief propagation algorithm. The experimental results are evaluated on the Middlebury data set, showing that our algorithm is the top performer.
International Symposium on 3D Data Processing, Visualization and Transmission | 2006
Amir Akbarzadeh; Jan Michael Frahm; Philippos Mordohai; Brian Clipp; Chris Engels; David Gallup; Paul Merrell; M. Phelps; Sudipta N. Sinha; B. Talton; Liang Wang; Qingxiong Yang; Henrik Stewenius; Ruigang Yang; Greg Welch; Herman Towles; David Nistér; Marc Pollefeys
The paper introduces a data collection system and a processing pipeline for automatic geo-registered 3D reconstruction of urban scenes from video. The system collects multiple video streams, as well as GPS and INS measurements, in order to place the reconstructed models in geo-registered coordinates. Besides high quality in terms of both geometry and appearance, we aim at real-time performance. Even though our processing pipeline is currently far from being real-time, we select techniques and we design processing modules that can achieve fast performance on multiple CPUs and GPUs, aiming at real-time performance in the near future. We present the main considerations in designing the system and the steps of the processing pipeline. We show results on real video sequences captured by our system.
European Conference on Computer Vision | 2008
David Nistér; Henrik Stewenius
In this paper we present a new algorithm for computing Maximally Stable Extremal Regions (MSER), as invented by Matas et al. The standard algorithm makes use of a union-find data structure and takes quasi-linear time in the number of pixels. The new algorithm provides exactly identical results in true worst-case linear time. Moreover, the new algorithm uses significantly less memory and has better cache-locality, resulting in faster execution. Our CPU implementation performs twice as fast as a state-of-the-art FPGA implementation based on the standard algorithm. The new algorithm is based on a different computational ordering of the pixels, suggested by a different immersion analogy from the one underlying the standard connected-component algorithm. With the new computational ordering, the pixels considered or visited at any point during computation consist of a single connected component of pixels in the image, resembling a flood-fill that adapts to the grey-level landscape. The computation only needs a priority queue of candidate pixels (the boundary of the single connected component), a single bit image masking visited pixels, and information for as many components as there are grey-levels in the image. This is substantially more compact in practice than the standard algorithm, where a large number of connected components must be considered in parallel. The new algorithm can also generate the component tree of the image in true linear time. The result shows that MSER detection is not tied to the union-find data structure, which may open more possibilities for parallelization.
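The flood-fill ordering the abstract describes, always extending a single connected component at its lowest-grey boundary pixel using a priority queue and a visited mask, can be sketched as follows. This toy shows only the pixel ordering; a full MSER detector would additionally maintain per-grey-level component records and a component stack, which are omitted here.

```python
import heapq

def flood_order(img):
    """Visit the pixels of a grey-level image (list of rows) in the
    water-flood order: at every step, grow the current connected
    component at its lowest-grey boundary pixel. Returns the visit
    order as (row, col) pairs. The visited pixels form a single
    connected component at all times."""
    h, w = len(img), len(img[0])
    visited = [[False] * w for _ in range(h)]
    boundary = [(img[0][0], 0, 0)]  # min-heap of (grey, row, col)
    order = []
    while boundary:
        g, r, c = heapq.heappop(boundary)
        if visited[r][c]:
            continue  # pixel may be pushed more than once
        visited[r][c] = True
        order.append((r, c))
        # Push 4-neighbours onto the boundary.
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < h and 0 <= nc < w and not visited[nr][nc]:
                heapq.heappush(boundary, (img[nr][nc], nr, nc))
    return order
```

Note how the only state is the heap of boundary pixels and the bit mask of visited pixels, which matches the abstract's claim about the compactness of the new ordering.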
International Conference on Computer Vision | 2005
Henrik Stewenius; Frederik Schaffalitzky; David Nistér
We present a solution for optimal triangulation in three views. The solution is guaranteed to find the optimal solution because it computes all the stationary points of the (maximum likelihood) objective function. Internally, the solution is found by computing roots of multivariate polynomial equations, directly solving the conditions for stationarity. The solver makes use of standard methods from computational commutative algebra to convert the root-finding problem into a 47 × 47 nonsymmetric eigenproblem. Although there are in general 47 roots, counting both real and complex ones, the number of real roots is usually much smaller. We also show experimentally that the number of stationary points that are local minima and lie in front of each camera is small but does depend on the scene geometry.
Computer Vision and Pattern Recognition | 2007
Friedrich Fraundorfer; Henrik Stewenius; David Nistér
In this paper we investigate how to scale a content based image retrieval approach beyond the RAM limits of a single computer and to make use of its hard drive to store the feature database. The feature vectors describing the images in the database are binned in multiple independent ways. Each bin contains images similar to a representative prototype. Each binning is considered through two stages of processing. First, the prototype closest to the query is found. Second, the bin corresponding to the closest prototype is fetched from disk and searched completely. The query process is repeatedly performing these two stages, each time with a binning independent of the previous ones. The scheme cuts down the hard drive access significantly and results in a major speed up. An experimental comparison between the binning scheme and a raw search shows competitive retrieval quality.
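The two-stage query described above can be sketched as follows. This is an illustrative in-memory toy: in the paper each bin is a block on the hard drive and the vectors are high-dimensional image descriptors, whereas here the 2-D vectors and the prototypes are made up, and "fetching a bin" is just a list lookup.

```python
def closest(prototypes, x):
    """Index of the prototype closest to x (squared L2)."""
    return min(range(len(prototypes)),
               key=lambda i: sum((a - b) ** 2
                                 for a, b in zip(prototypes[i], x)))

def build_binnings(db, binnings_prototypes):
    """Assign every database vector to its closest prototype, once
    per independent binning. In the paper each bin would be a
    contiguous block on disk; here it is a list of db indices."""
    binnings = []
    for prototypes in binnings_prototypes:
        bins = [[] for _ in prototypes]
        for idx, v in enumerate(db):
            bins[closest(prototypes, v)].append(idx)
        binnings.append(bins)
    return binnings

def query(db, binnings, binnings_prototypes, q):
    """Two-stage search, repeated per binning: (1) find the
    prototype closest to the query, (2) fetch and exhaustively
    search only that bin. Only the selected bins are touched,
    which is the source of the disk-access saving."""
    candidates = set()
    for prototypes, bins in zip(binnings_prototypes, binnings):
        candidates.update(bins[closest(prototypes, q)])
    # Rank the fetched candidates by true distance to the query.
    return sorted(candidates,
                  key=lambda i: sum((a - b) ** 2
                                    for a, b in zip(db[i], q)))
```

Using several independent binnings hedges against a relevant image falling just outside the single bin chosen by one binning, at the cost of one extra bin fetch per binning.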
Computer Vision and Pattern Recognition | 2007
Christopher Geyer; Henrik Stewenius
We present a minimal-point algorithm for finding fundamental matrices for catadioptric cameras of the parabolic type. Central catadioptric cameras, an optical combination of a mirror and a lens that yields an imaging device equivalent within hemispheres to perspective cameras, have found wide application in robotics, tele-immersion and providing enhanced situational awareness for remote operation. We use an uncalibrated structure-from-motion framework developed for these cameras to consider the problem of estimating the fundamental matrix for such cameras. We present a solution that can compute the para-catadioptric fundamental matrix with nine point correspondences, the smallest number possible. We compare this algorithm to alternatives and show some results of using the algorithm in conjunction with random sample consensus (RANSAC).
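The RANSAC use mentioned at the end can be sketched generically. The minimal solver below is a stand-in: instead of the paper's 9-point para-catadioptric fundamental-matrix solver, this toy fits a 2-D line from 2-point samples, but the hypothesize-and-verify loop has the same shape regardless of the model.

```python
import random

def ransac(data, fit, err, sample_size, iters, threshold, seed=0):
    """Generic RANSAC loop: repeatedly fit a model to a random
    minimal sample and keep the model with the most inliers. In
    the paper, `fit` would be the 9-point minimal solver and
    `err` a reprojection-style residual."""
    rng = random.Random(seed)
    best_model, best_inliers = None, []
    for _ in range(iters):
        sample = rng.sample(data, sample_size)
        model = fit(sample)
        inliers = [d for d in data if err(model, d) < threshold]
        if len(inliers) > len(best_inliers):
            best_model, best_inliers = model, inliers
    return best_model, best_inliers

# Toy minimal solver: a line y = a*x + b from two points.
def fit_line(pts):
    (x1, y1), (x2, y2) = pts
    a = (y2 - y1) / (x2 - x1)
    return a, y1 - a * x1

def line_err(model, pt):
    a, b = model
    return abs(pt[1] - (a * pt[0] + b))
```

A smaller minimal sample size directly reduces the number of iterations RANSAC needs for a given inlier ratio, which is why a 9-point solver (the smallest possible here) matters in practice.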
International Conference on Computer Vision | 2005
David Nistér; Henrik Stewenius; Etienne Grossmann
In this paper, we develop a theory of non-parametric self-calibration. Recently, schemes have been devised for non-parametric laboratory calibration, but not for self-calibration. We allow an arbitrary warp to model the intrinsic mapping, with the only restriction that the camera is central and that the intrinsic mapping has a well-defined non-singular matrix derivative at a finite number of points under study. We give a number of theoretical results, both for infinitesimal motion and finite motion, for a finite number of observations and when observing motion over a dense image, for rotation and translation. Our main result is that through observing the flow induced by three instantaneous rotations at a finite number of points of the distorted image, we can perform projective reconstruction of those image points on the undistorted image. We present some results with synthetic and real data.