Johannes L. Schönberger
University of North Carolina at Chapel Hill
Network
Latest external collaboration on country level. Dive into details by clicking on the dots.
Publication
Featured researches published by Johannes L. Schönberger.
computer vision and pattern recognition | 2016
Johannes L. Schönberger; Jan Michael Frahm
Incremental Structure-from-Motion is a prevalent strategy for 3D reconstruction from unordered image collections. While incremental reconstruction systems have tremendously advanced in all regards, robustness, accuracy, completeness, and scalability remain the key problems towards building a truly general-purpose pipeline. We propose a new SfM technique that improves upon the state of the art to make a further step towards this ultimate goal. The full reconstruction pipeline is released to the public as an open-source implementation.
computer vision and pattern recognition | 2015
Johannes L. Schönberger; Filip Radenovic; Ondrej Chum; Jan Michael Frahm
Structure-from-Motion for unordered image collections has significantly advanced in scale over the last decade. This impressive progress can be in part attributed to the introduction of efficient retrieval methods for those systems. While this boosts scalability, it also limits the amount of detail that the large-scale reconstruction systems are able to produce. In this paper, we propose a joint reconstruction and retrieval system that maintains the scalability of large-scale Structure-from-Motion systems while also recovering the often lost ability of reconstructing fine details of the scene. We demonstrate our proposed method on a large-scale dataset of 7.4 million images downloaded from the Internet.
european conference on computer vision | 2016
Johannes L. Schönberger; Enliang Zheng; Jan Michael Frahm; Marc Pollefeys
This work presents a Multi-View Stereo system for robust and efficient dense modeling from unstructured image collections. Our core contributions are the joint estimation of depth and normal information, pixelwise view selection using photometric and geometric priors, and a multi-view geometric consistency term for the simultaneous refinement and image-based depth and normal fusion. Experiments on benchmarks and large-scale Internet photo collections demonstrate state-of-the-art performance in terms of accuracy, completeness, and efficiency.
computer vision and pattern recognition | 2015
Jared Heinly; Johannes L. Schönberger; Enrique Dunn; Jan Michael Frahm
We propose a novel, large-scale, structure-from-motion framework that advances the state of the art in data scalability from city-scale modeling (millions of images) to world-scale modeling (several tens of millions of images) using just a single computer. The main enabling technology is the use of a streaming-based framework for connected component discovery. Moreover, our system employs an adaptive, online, iconic image clustering approach based on an augmented bag-of-words representation, in order to balance the goals of registration, comprehensiveness, and data compactness. We demonstrate our proposal by operating on a recent publicly available 100 million image crowd-sourced photo collection containing images geographically distributed throughout the entire world. Results illustrate that our streaming-based approach does not compromise model completeness, but achieves unprecedented levels of efficiency and scalability.
computer vision and pattern recognition | 2017
Johannes L. Schönberger; Hans Hardmeier; Torsten Sattler; Marc Pollefeys
Matching local image descriptors is a key step in many computer vision applications. For more than a decade, hand-crafted descriptors such as SIFT have been used for this task. Recently, multiple new descriptors learned from data have been proposed and shown to improve on SIFT in terms of discriminative power. This paper is dedicated to an extensive experimental evaluation of learned local features to establish a single evaluation protocol that ensures comparable results. In terms of matching performance, we evaluate the different descriptors regarding standard criteria. However, considering matching performance in isolation only provides an incomplete measure of a descriptors quality. For example, finding additional correct matches between similar images does not necessarily lead to a better performance when trying to match images under extreme viewpoint or illumination changes. Besides pure descriptor matching, we thus also evaluate the different descriptors in the context of image-based reconstruction. This enables us to study the descriptor performance on a set of more practical criteria including image retrieval, the ability to register images under strong viewpoint and illumination changes, and the accuracy and completeness of the reconstructed cameras and scenes. To facilitate future research, the full evaluation pipeline is made publicly available.
computer vision and pattern recognition | 2015
Johannes L. Schönberger; Alexander C. Berg; Jan Michael Frahm
Large-scale Structure-from-Motion systems typically spend major computational effort on pairwise image matching and geometric verification in order to discover connected components in large-scale, unordered image collections. In recent years, the research community has spent significant effort on improving the efficiency of this stage. In this paper, we present a comprehensive overview of various state-of-the-art methods, evaluating and analyzing their performance. Based on the insights of this evaluation, we propose a learning-based approach, the PAirwise Image Geometry Encoding (PAIGE), to efficiently identify image pairs with scene overlap without the need to perform exhaustive putative matching and geometric verification. PAIGE achieves state-of-the-art performance and integrates well into existing Structure-from-Motion pipelines.
computer vision and pattern recognition | 2017
Thomas Schöps; Johannes L. Schönberger; Silvano Galliani; Torsten Sattler; Konrad Schindler; Marc Pollefeys; Andreas Geiger
Motivated by the limitations of existing multi-view stereo benchmarks, we present a novel dataset for this task. Towards this goal, we recorded a variety of indoor and outdoor scenes using a high-precision laser scanner and captured both high-resolution DSLR imagery as well as synchronized low-resolution stereo videos with varying fields-of-view. To align the images with the laser scans, we propose a robust technique which minimizes photometric errors conditioned on the geometry. In contrast to previous datasets, our benchmark provides novel challenges and covers a diverse set of viewpoints and scene types, ranging from natural scenes to man-made indoor and outdoor environments. Furthermore, we provide data at significantly higher temporal and spatial resolution. Our benchmark is the first to cover the important use case of hand-held mobile devices while also providing high-resolution DSLR camera images. We make our datasets and an online evaluation server available at http://www.eth3d.net.
computer vision and pattern recognition | 2016
Filip Radenovic; Johannes L. Schönberger; Dinghuang Ji; Jan Michael Frahm; Ondrej Chum; Jiri Matas
Internet photo collections naturally contain a large variety of illumination conditions, with the largest difference between day and night images. Current modeling techniques do not embrace the broad illumination range often leading to reconstruction failure or severe artifacts. We present an algorithm that leverages the appearance variety to obtain more complete and accurate scene geometry along with consistent multi-illumination appearance information. The proposed method relies on automatic scene appearance grouping, which is used to obtain separate dense 3D models. Subsequent model fusion combines the separate models into a complete and accurate reconstruction of the scene. In addition, we propose a method to derive the appearance information for the model under the different illumination conditions, even for scene parts that are not observed under one illumination condition. To achieve this, we develop a cross-illumination color transfer technique. We evaluate our method on a large variety of landmarks from across Europe reconstructed from a database of 7.4M images.
asian conference on computer vision | 2016
Johannes L. Schönberger; True Price; Torsten Sattler; Jan Michael Frahm; Marc Pollefeys
Spatial verification is a crucial part of every image retrieval system, as it accounts for the fact that geometric feature configurations are typically ignored by the Bag-of-Words representation. Since spatial verification quickly becomes the bottleneck of the retrieval process, runtime efficiency is extremely important. At the same time, spatial verification should be able to reliably distinguish between related and unrelated images. While methods based on RANSAC’s hypothesize-and-verify framework achieve high accuracy, they are not particularly efficient. Conversely, verification approaches based on Hough voting are extremely efficient but not as accurate. In this paper, we develop a novel spatial verification approach that uses an efficient voting scheme to identify promising transformation hypotheses that are subsequently verified and refined. Through comprehensive experiments, we show that our method is able to achieve a verification accuracy similar to state-of-the-art hypothesize-and-verify approaches while providing faster runtimes than state-of-the-art voting-based methods.
german conference on pattern recognition | 2015
Johannes L. Schönberger; Alexander C. Berg; Jan Michael Frahm
Typical Structure-from-Motion systems spend major computational effort on geometric verification. Geometric verification recovers the epipolar geometry of two views for a moving camera by estimating a fundamental or essential matrix. The essential matrix describes the relative geometry for two views up to an unknown scale. Two-view triangulation or multi-model estimation approaches can reveal the relative geometric configuration of two views, e.g., small or large baseline and forward or sideward motion. Information about the relative configuration is essential for many problems in Structure-from-Motion. However, essential matrix estimation and assessment of the relative geometric configuration are computationally expensive. In this paper, we propose a learning-based approach for efficient two-view geometry classification, leveraging the by-products of feature matching. Our approach can predict whether two views have scene overlap and for overlapping views it can assess the relative geometric configuration. Experiments on several datasets demonstrate the performance of the proposed approach and its utility for Structure-from-Motion.