Network


Latest external collaborations at the country level. Dive into the details by clicking on the dots.

Hotspot


Dive into the research topics where Christoph Rhemann is active.

Publication


Featured research published by Christoph Rhemann.


Computer Vision and Pattern Recognition | 2011

Fast cost-volume filtering for visual correspondence and beyond

Christoph Rhemann; Asmaa Hosni; Michael Bleyer; Carsten Rother; Margrit Gelautz

Many computer vision tasks can be formulated as labeling problems. The desired solution is often a spatially smooth labeling where label transitions are aligned with color edges of the input image. We show that such solutions can be efficiently achieved by smoothing the label costs with a very fast edge-preserving filter. In this paper we propose a generic and simple framework comprising three steps: (i) constructing a cost volume, (ii) fast cost volume filtering, and (iii) winner-take-all label selection. Our main contribution is to show that with such a simple framework state-of-the-art results can be achieved for several computer vision applications. In particular, we achieve (i) disparity maps in real time, whose quality exceeds that of all other fast (local) approaches on the Middlebury stereo benchmark, and (ii) optical flow fields with very fine structures as well as large displacements. To demonstrate robustness, the few parameters of our framework are set to nearly identical values for both applications. Also, competitive results for interactive image segmentation are presented. With this work, we hope to inspire other researchers to apply this framework to other application areas.


IEEE Transactions on Pattern Analysis and Machine Intelligence | 2013

Fast Cost-Volume Filtering for Visual Correspondence and Beyond

Asmaa Hosni; Christoph Rhemann; Michael Bleyer; Carsten Rother; Margrit Gelautz

Many computer vision tasks can be formulated as labeling problems. The desired solution is often a spatially smooth labeling where label transitions are aligned with color edges of the input image. We show that such solutions can be efficiently achieved by smoothing the label costs with a very fast edge-preserving filter. In this paper, we propose a generic and simple framework comprising three steps: 1) constructing a cost volume, 2) fast cost volume filtering, and 3) Winner-Takes-All label selection. Our main contribution is to show that with such a simple framework state-of-the-art results can be achieved for several computer vision applications. In particular, we achieve 1) disparity maps in real time whose quality exceeds those of all other fast (local) approaches on the Middlebury stereo benchmark, and 2) optical flow fields which contain very fine structures as well as large displacements. To demonstrate robustness, the few parameters of our framework are set to nearly identical values for both applications. Also, competitive results for interactive image segmentation are presented. With this work, we hope to inspire other researchers to leverage this framework to other application areas.
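The three-step framework above maps almost directly onto code. The sketch below is an illustrative reconstruction, not the authors' implementation: a plain box filter stands in for the edge-preserving (guided) filter used in the paper, the matching cost is a simple absolute color difference, and grayscale float images are assumed.

```python
import numpy as np
from scipy.ndimage import uniform_filter

def cost_volume_stereo(left, right, max_disp, win=9):
    """Three-step sketch: (i) build a cost volume, (ii) filter each
    cost slice, (iii) winner-take-all label selection."""
    h, w = left.shape
    volume = np.empty((max_disp, h, w))
    for d in range(max_disp):
        # (i) absolute-difference cost between left pixel x and right pixel x - d
        cost = np.abs(left[:, d:] - right[:, :w - d])
        volume[d, :, d:] = cost
        volume[d, :, :d] = cost[:, :1]  # replicate leftmost valid column
    for d in range(max_disp):
        # (ii) smooth each disparity slice; a box filter stands in for
        # the edge-preserving guided filter from the paper
        volume[d] = uniform_filter(volume[d], size=win)
    # (iii) winner-take-all label selection
    return np.argmin(volume, axis=0)
```

Swapping the box filter for a guided or bilateral filter is what keeps label transitions aligned with color edges; the winner-take-all step itself stays unchanged.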


British Machine Vision Conference | 2011

PatchMatch Stereo - Stereo Matching with Slanted Support Windows

Michael Bleyer; Christoph Rhemann; Carsten Rother

Common local stereo methods match support windows at integer-valued disparities. The implicit assumption that pixels within the support region have constant disparity does not hold for slanted surfaces and leads to a bias towards reconstructing fronto-parallel surfaces. This work overcomes this bias by estimating an individual 3D plane at each pixel onto which the support region is projected. The major challenge of this approach is to find a pixel's optimal 3D plane among all possible planes, whose number is infinite. We show that an ideal algorithm to solve this problem is PatchMatch [1], which we extend to find an approximate nearest neighbor according to a plane. In addition to PatchMatch's spatial propagation scheme, we propose (1) view propagation, where planes are propagated between the left and right views of the stereo pair, and (2) temporal propagation, where planes are propagated from preceding and subsequent frames of a video when doing temporal stereo. Adaptive support weights are used in matching cost aggregation to improve results at disparity borders. We also show that our slanted support windows can be used to compute a cost volume for global stereo methods, which allows for explicit treatment of occlusions and can handle large untextured regions. In the results, we demonstrate that our method reconstructs highly slanted surfaces and achieves impressive disparity details with sub-pixel precision. In the Middlebury table, our method is currently the top performer among local methods and ranks second among approximately 110 competitors when sub-pixel precision is considered.
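The per-pixel plane parameterization at the heart of this idea can be sketched as follows. This is a hypothetical fragment covering only the plane representation, the random-initialization step, and the propagation acceptance test; the matching cost aggregation and the full PatchMatch loop from the paper are omitted.

```python
import numpy as np

def plane_disparity(plane, x, y):
    """Disparity induced at pixel (x, y) by a 3D plane hypothesis,
    stored as coefficients (a, b, c) with d(x, y) = a*x + b*y + c.
    A fronto-parallel surface is simply a = b = 0."""
    a, b, c = plane
    return a * x + b * y + c

def random_plane(rng, x, y, max_disp, max_slope=1.0):
    """Random initialization: draw a random disparity at (x, y) and
    random slopes, then solve for c so the plane passes through that
    disparity at this pixel."""
    d0 = rng.uniform(0.0, max_disp)
    a = rng.uniform(-max_slope, max_slope)
    b = rng.uniform(-max_slope, max_slope)
    return (a, b, d0 - a * x - b * y)

def try_improve(cost_fn, best_plane, best_cost, candidate, x, y):
    """Propagation step: adopt a neighbor's (or perturbed) plane if it
    lowers the aggregated matching cost at this pixel."""
    c = cost_fn(candidate, x, y)
    return (candidate, c) if c < best_cost else (best_plane, best_cost)
```

Spatial, view, and temporal propagation all reduce to calls to `try_improve` with candidate planes drawn from different sources (neighboring pixels, the other view, adjacent frames).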


International Conference on Image Processing | 2009

Local stereo matching using geodesic support weights

Asmaa Hosni; Michael Bleyer; Margrit Gelautz; Christoph Rhemann

Local stereo matching has recently experienced large progress by the introduction of new support aggregation schemes. These approaches estimate a pixel's support region via color segmentation. Our contribution lies in an improved method for accomplishing this segmentation. Inside a square support window, we compute the geodesic distance from all pixels to the window's center pixel. Pixels of low geodesic distance are given high support weights and therefore large influence in the matching process. In contrast to previous work, we enforce connectivity by using the geodesic distance transform. For obtaining a high support weight, a pixel must have a path to the center point along which the color does not change significantly. This connectivity property leads to improved segmentation results and consequently to improved disparity maps. The success of our geodesic approach is demonstrated on the Middlebury images. According to the Middlebury benchmark, the proposed algorithm is the top performer among local stereo methods at the current state of the art.
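A minimal sketch of the geodesic weighting idea, assuming an exponential weight fall-off with geodesic distance and approximating the distance transform with a few forward/backward raster sweeps; the exact weighting function and implementation in the paper may differ.

```python
import numpy as np

def geodesic_weights(window, gamma=10.0, iters=3):
    """Support weights for a square matching window: each pixel's
    weight decays with its geodesic distance to the window center,
    where path cost accumulates color differences between 4-connected
    neighbors. `window` is (h, w) gray or (h, w, 3) color, float."""
    if window.ndim == 2:
        window = window[..., None]
    h, w = window.shape[:2]
    dist = np.full((h, w), np.inf)
    dist[h // 2, w // 2] = 0.0  # the center pixel seeds the transform
    for _ in range(iters):
        # alternate forward and backward raster sweeps until stable
        for ys, xs in ((range(h), range(w)),
                       (range(h - 1, -1, -1), range(w - 1, -1, -1))):
            for y in ys:
                for x in xs:
                    for dy, dx in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                        ny, nx = y + dy, x + dx
                        if 0 <= ny < h and 0 <= nx < w:
                            step = np.linalg.norm(window[y, x] - window[ny, nx])
                            dist[y, x] = min(dist[y, x], dist[ny, nx] + step)
    # a pixel only gets high weight if some path to the center stays
    # within similar colors -- this is the connectivity property
    return np.exp(-dist / gamma)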


Computer Vision and Pattern Recognition | 2009

A perceptually motivated online benchmark for image matting

Christoph Rhemann; Carsten Rother; Jue Wang; Margrit Gelautz; Pushmeet Kohli; Pamela Rott

The availability of quantitative online benchmarks for low-level vision tasks such as stereo and optical flow has led to significant progress in the respective fields. This paper introduces such a benchmark for image matting. There are three key factors for a successful benchmarking system: (a) a challenging, high-quality ground truth test set; (b) an online evaluation repository that is dynamically updated with new results; (c) perceptually motivated error functions. Our new benchmark strives to meet all three criteria. We evaluated several matting methods with our benchmark and show that their performance varies depending on the error function. Also, our challenging test set reveals problems of existing algorithms, not reflected in previously reported results. We hope that our effort will lead to considerable progress in the field of image matting, and welcome the reader to visit our benchmark at www.alphamatting.com.


Computer Vision and Pattern Recognition | 2011

A global sampling method for alpha matting

Kaiming He; Christoph Rhemann; Carsten Rother; Xiaoou Tang; Jian Sun

Alpha matting refers to the problem of softly extracting the foreground from an image. Given a trimap (specifying known foreground/background and unknown pixels), a straightforward way to compute the alpha value is to sample some known foreground and background colors for each unknown pixel. Existing sampling-based matting methods often collect samples near the unknown pixels only. They fail if good samples cannot be found nearby. In this paper, we propose a global sampling method that uses all samples available in the image. Our global sample set avoids missing good samples. A simple but effective cost function is defined to tackle the ambiguity in the sample selection process. To handle the computational complexity introduced by the large number of samples, we pose the sampling task as a correspondence problem. The correspondence search is efficiently achieved by generalizing a randomized algorithm previously designed for patch matching [3]. A variety of experiments show that our global sampling method produces both visually and quantitatively high-quality matting results.
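The core per-sample computation, estimating alpha and a selection cost for one foreground/background pair, can be sketched as below. For clarity this scores all candidate pairs by brute force instead of the randomized PatchMatch-style correspondence search the paper uses; function names are illustrative.

```python
import numpy as np

def estimate_alpha(pixel, fg, bg):
    """Project a pixel color onto the line between a foreground and a
    background sample; the projection parameter is the alpha estimate
    under the compositing model C = alpha*F + (1 - alpha)*B."""
    diff = fg - bg
    denom = float(diff @ diff)
    if denom < 1e-12:
        return 0.5  # degenerate pair: F and B coincide
    return float(np.clip((pixel - bg) @ diff / denom, 0.0, 1.0))

def sample_cost(pixel, fg, bg):
    """Chromatic distortion: how well the best alpha for this (F, B)
    pair reconstructs the observed color. Lower is better."""
    a = estimate_alpha(pixel, fg, bg)
    return float(np.linalg.norm(pixel - (a * fg + (1 - a) * bg)))

def best_pair(pixel, fg_samples, bg_samples):
    """Brute-force stand-in for the randomized correspondence search:
    return indices of the (F, B) pair with minimal cost."""
    costs = [(sample_cost(pixel, f, b), i, j)
             for i, f in enumerate(fg_samples)
             for j, b in enumerate(bg_samples)]
    return min(costs)[1:]
```

With a global sample set the pair space is huge, which is exactly why the paper replaces this exhaustive loop with a randomized search.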


Human Factors in Computing Systems | 2015

Accurate, Robust, and Flexible Real-time Hand Tracking

Toby Sharp; Cem Keskin; Jonathan Taylor; Jamie Shotton; David Kim; Christoph Rhemann; Ido Leichter; Alon Vinnikov; Yichen Wei; Daniel Freedman; Pushmeet Kohli; Eyal Krupka; Andrew W. Fitzgibbon; Shahram Izadi

We present a new real-time hand tracking system based on a single depth camera. The system can accurately reconstruct complex hand poses across a variety of subjects. It also allows for robust tracking, rapidly recovering from any temporary failures. Most uniquely, our tracker is highly flexible, dramatically improving upon previous approaches which have focused on front-facing close-range scenarios. This flexibility opens up new possibilities for human-computer interaction with examples including tracking at distances from tens of centimeters through to several meters (for controlling the TV at a distance), supporting tracking using a moving depth camera (for mobile scenarios), and arbitrary camera placements (for VR headsets). These features are achieved through a new pipeline that combines a multi-layered discriminative reinitialization strategy for per-frame pose estimation, followed by a generative model-fitting stage. We provide extensive technical details and a detailed qualitative and quantitative analysis.


British Machine Vision Conference | 2008

Improving Color Modeling for Alpha Matting

Christoph Rhemann; Carsten Rother; Margrit Gelautz

This paper addresses the problem of extracting an alpha matte from a single photograph given a user-defined trimap. A crucial part of this task is the color modeling step, where for each pixel the optimal alpha value, together with its confidence, is estimated individually. This forms the data term of the objective function. It comprises three steps: (i) collecting a candidate set of potential fore- and background colors; (ii) selecting high-confidence samples from the candidate set; (iii) estimating a sparsity prior to remove blurry artifacts. We introduce novel ideas for each of these steps and show that our approach considerably improves over state-of-the-art techniques by evaluating it on a large database of 54 images with known high-quality ground truth.


International Conference on Computer Graphics and Interactive Techniques | 2016

Fusion4D: real-time performance capture of challenging scenes

Mingsong Dou; Sameh Khamis; Yury Degtyarev; Philip Lindsley Davidson; Sean Ryan Fanello; Adarsh Prakash Murthy Kowdle; Sergio Orts Escolano; Christoph Rhemann; David Kim; Jonathan Taylor; Pushmeet Kohli; Vladimir Tankovich; Shahram Izadi

We contribute a new pipeline for live multi-view performance capture, generating temporally coherent high-quality reconstructions in real-time. Our algorithm supports both incremental reconstruction, improving the surface estimation over time, as well as parameterizing the nonrigid scene motion. Our approach is highly robust to both large frame-to-frame motion and topology changes, allowing us to reconstruct extremely challenging scenes. We demonstrate advantages over related real-time techniques that either deform an online generated template or continually fuse depth data nonrigidly into a single reference model. Finally, we show geometric reconstruction results on par with offline methods which require orders of magnitude more processing time and many more RGBD cameras.


International Symposium on Mixed and Augmented Reality | 2013

MonoFusion: Real-time 3D reconstruction of small scenes with a single web camera

Vivek Pradeep; Christoph Rhemann; Shahram Izadi; Christopher Zach; Michael Bleyer; Steven Bathiche

MonoFusion allows a user to build dense 3D reconstructions of their environment in real time, utilizing only a single, off-the-shelf web camera as the input sensor. The camera could be one already available in a tablet, phone, or a standalone device. No additional input hardware is required. This removes the need for power-intensive active sensors that do not work robustly in natural outdoor lighting. Using the input stream of the camera, we first estimate the 6DoF camera pose using a sparse tracking method. These poses are then used for efficient dense stereo matching between the input frame and a key frame (extracted previously). The resulting dense depth maps are directly fused into a voxel-based implicit model (using a computationally inexpensive method) and surfaces are extracted per frame. The system is able to recover from tracking failures as well as filter out geometrically inconsistent noise from the 3D reconstruction. Our method is both simple to implement and efficient, making such systems even more accessible. This paper details the algorithmic components that make up our system and a GPU implementation of our approach. Qualitative results demonstrate high-quality reconstructions, even visually comparable to active depth sensor-based systems such as KinectFusion.
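The "computationally inexpensive" fusion of depth maps into a voxel-based implicit model can be illustrated with a toy weighted truncated-signed-distance (TSDF) update, as in KinectFusion-style pipelines. This 1D-along-a-ray sketch is an assumption about the general technique, not MonoFusion's actual code; all names are illustrative.

```python
import numpy as np

def fuse_depth(tsdf, weight, depth_along_ray, voxel_depths, trunc=0.05):
    """One fusion step of a depth measurement into voxels sampled along
    a single camera ray. Each voxel stores a truncated signed distance
    to the surface and a weight; fusing a new measurement is a running
    weighted average, which also filters out inconsistent noise."""
    # signed distance: positive in front of the observed surface,
    # negative behind it, truncated to +/- trunc
    sdf = np.clip(depth_along_ray - voxel_depths, -trunc, trunc)
    new_weight = weight + 1.0
    tsdf = (tsdf * weight + sdf) / new_weight
    return tsdf, new_weight
```

The reconstructed surface is the zero crossing of the fused TSDF, extracted per frame (e.g. via marching cubes in the full 3D case).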

Collaboration


Dive into Christoph Rhemann's collaborations.

Top Co-Authors


Carsten Rother

Dresden University of Technology

View shared research outputs

Margrit Gelautz

Vienna University of Technology

View shared research outputs