Michael Bleyer | Researchain

Publication

Featured research published by Michael Bleyer.

Computer Vision and Pattern Recognition | 2011

Fast cost-volume filtering for visual correspondence and beyond

Christoph Rhemann; Asmaa Hosni; Michael Bleyer; Carsten Rother; Margrit Gelautz

Many computer vision tasks can be formulated as labeling problems. The desired solution is often a spatially smooth labeling where label transitions are aligned with color edges of the input image. We show that such solutions can be efficiently achieved by smoothing the label costs with a very fast edge-preserving filter. In this paper we propose a generic and simple framework comprising three steps: (i) constructing a cost volume, (ii) fast cost volume filtering, and (iii) winner-take-all label selection. Our main contribution is to show that with such a simple framework state-of-the-art results can be achieved for several computer vision applications. In particular, we achieve (i) disparity maps in real time, whose quality exceeds those of all other fast (local) approaches on the Middlebury stereo benchmark, and (ii) optical flow fields with very fine structures as well as large displacements. To demonstrate robustness, the few parameters of our framework are set to nearly identical values for both applications. Also, competitive results for interactive image segmentation are presented. With this work, we hope to inspire other researchers to leverage this framework to other application areas.

IEEE Transactions on Pattern Analysis and Machine Intelligence | 2013

Fast Cost-Volume Filtering for Visual Correspondence and Beyond

Asmaa Hosni; Christoph Rhemann; Michael Bleyer; Carsten Rother; Margrit Gelautz

Many computer vision tasks can be formulated as labeling problems. The desired solution is often a spatially smooth labeling where label transitions are aligned with color edges of the input image. We show that such solutions can be efficiently achieved by smoothing the label costs with a very fast edge-preserving filter. In this paper, we propose a generic and simple framework comprising three steps: 1) constructing a cost volume, 2) fast cost volume filtering, and 3) Winner-Takes-All label selection. Our main contribution is to show that with such a simple framework state-of-the-art results can be achieved for several computer vision applications. In particular, we achieve 1) disparity maps in real time whose quality exceeds those of all other fast (local) approaches on the Middlebury stereo benchmark, and 2) optical flow fields which contain very fine structures as well as large displacements. To demonstrate robustness, the few parameters of our framework are set to nearly identical values for both applications. Also, competitive results for interactive image segmentation are presented. With this work, we hope to inspire other researchers to leverage this framework to other application areas.
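
The three steps named in the abstract can be sketched for stereo as follows. This is a minimal illustration, not the authors' implementation: it assumes grayscale, rectified images and an absolute-difference cost, and it swaps in a plain box filter where the paper uses a fast edge-preserving filter. The function names and the `radius` parameter are hypothetical.

```python
import numpy as np

def box_filter(img, radius):
    """Mean filter over a (2r+1) x (2r+1) window with replicated borders.
    Stand-in for the edge-preserving filter described in the abstract."""
    padded = np.pad(img, radius, mode="edge")
    size = 2 * radius + 1
    out = np.zeros(img.shape, dtype=np.float64)
    for dy in range(size):
        for dx in range(size):
            out += padded[dy:dy + img.shape[0], dx:dx + img.shape[1]]
    return out / (size * size)

def cost_volume_stereo(left, right, max_disp, radius=2):
    """Return an integer disparity map for the left image."""
    left = left.astype(np.float64)
    right = right.astype(np.float64)
    h, w = left.shape
    # Step 1: cost volume, one slice per disparity label.
    cost = np.empty((max_disp + 1, h, w))
    for d in range(max_disp + 1):
        shifted = np.empty_like(right)
        shifted[:, d:] = right[:, :w - d]   # left pixel x matches right pixel x - d
        if d > 0:
            shifted[:, :d] = right[:, :1]   # replicate the border column
        cost[d] = np.abs(left - shifted)
    # Step 2: smooth every cost slice independently.
    filtered = np.stack([box_filter(c, radius) for c in cost])
    # Step 3: winner-take-all label selection per pixel.
    return np.argmin(filtered, axis=0)
```

Because each disparity slice is filtered independently, the runtime of step 2 depends on the filter cost per slice and the number of labels, which is why a filter whose cost is independent of window size matters for this design.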

British Machine Vision Conference | 2011

PatchMatch Stereo - Stereo Matching with Slanted Support Windows.

Michael Bleyer; Christoph Rhemann; Carsten Rother

Common local stereo methods match support windows at integer-valued disparities. The implicit assumption that pixels within the support region have constant disparity does not hold for slanted surfaces and leads to a bias towards reconstructing frontoparallel surfaces. This work overcomes this bias by estimating an individual 3D plane at each pixel onto which the support region is projected. The major challenge of this approach is to find a pixel’s optimal 3D plane among all possible planes whose number is infinite. We show that an ideal algorithm to solve this problem is PatchMatch [1] that we extend to find an approximate nearest neighbor according to a plane. In addition to PatchMatch’s spatial propagation scheme, we propose (1) view propagation where planes are propagated among left and right views of the stereo pair and (2) temporal propagation where planes are propagated from preceding and consecutive frames of a video when doing temporal stereo. Adaptive support weights are used in matching cost aggregation to improve results at disparity borders. We also show that our slanted support windows can be used to compute a cost volume for global stereo methods, which allows for explicit treatment of occlusions and can handle large untextured regions. In the results we demonstrate that our method reconstructs highly slanted surfaces and achieves impressive disparity details with sub-pixel precision. In the Middlebury table, our method is currently top-performer among local methods and takes rank 2 among approximately 110 competitors if sub-pixel precision is considered.
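
To make the slanted-window idea concrete, the sketch below evaluates the matching cost of one support window under a disparity plane, where each pixel in the window gets its own sub-pixel disparity. It is an illustration under assumptions: the plane is parameterized as d = a*x + b*y + c, the images are grayscale, the cost is a plain sum of absolute differences, and the adaptive support weights and the PatchMatch propagation steps from the abstract are omitted. All function names are hypothetical.

```python
import numpy as np

def plane_disparity(plane, x, y):
    """Disparity at pixel (x, y) under the plane d = a*x + b*y + c."""
    a, b, c = plane
    return a * x + b * y + c

def sample_row(img, y, x):
    """Linearly interpolate img[y, x] at fractional x, clamped to the image."""
    x = min(max(x, 0.0), img.shape[1] - 1.0)
    x0 = int(np.floor(x))
    x1 = min(x0 + 1, img.shape[1] - 1)
    t = x - x0
    return (1.0 - t) * img[y, x0] + t * img[y, x1]

def slanted_window_cost(left, right, px, py, plane, radius=2):
    """Sum of absolute differences over the window centered at (px, py),
    with every window pixel displaced by its own plane-induced disparity."""
    h, w = left.shape
    total = 0.0
    for y in range(max(py - radius, 0), min(py + radius, h - 1) + 1):
        for x in range(max(px - radius, 0), min(px + radius, w - 1) + 1):
            d = plane_disparity(plane, x, y)
            total += abs(left[y, x] - sample_row(right, y, x - d))
    return total
```

A fronto-parallel window is the special case a = b = 0. On a slanted surface, such a plane can match the center pixel and still accumulate error at the rest of the window, which is the bias the abstract describes.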

International Conference on Image Processing | 2009

Local stereo matching using geodesic support weights

Asmaa Hosni; Michael Bleyer; Margrit Gelautz; Christoph Rhemann

Local stereo matching has recently experienced large progress by the introduction of new support aggregation schemes. These approaches estimate a pixel's support region via color segmentation. Our contribution lies in an improved method for accomplishing this segmentation. Inside a square support window, we compute the geodesic distance from all pixels to the window's center pixel. Pixels of low geodesic distance are given high support weights and therefore large influence in the matching process. In contrast to previous work, we enforce connectivity by using the geodesic distance transform. For obtaining a high support weight, a pixel must have a path to the center point along which the color does not change significantly. This connectivity property leads to improved segmentation results and consequently to improved disparity maps. The success of our geodesic approach is demonstrated on the Middlebury images. According to the Middlebury benchmark, the proposed algorithm is the top performer among local stereo methods at the current state-of-the-art.
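
The connectivity property can be illustrated with a small sketch. It assumes a grayscale window, 8-connected paths whose cost is the accumulated intensity change, and an exponential mapping from distance to weight with a hypothetical parameter `gamma`. Exact shortest paths via Dijkstra's algorithm are used here for clarity; this is a stand-in and not necessarily the distance transform used in the paper.

```python
import heapq
import numpy as np

def geodesic_weights(window, gamma=10.0):
    """Support weights exp(-D / gamma), where D is the geodesic distance
    from each pixel to the window's center pixel."""
    h, w = window.shape
    cy, cx = h // 2, w // 2
    dist = np.full((h, w), np.inf)
    dist[cy, cx] = 0.0
    heap = [(0.0, cy, cx)]
    while heap:
        d, y, x = heapq.heappop(heap)
        if d > dist[y, x]:
            continue  # stale queue entry
        for dy in (-1, 0, 1):
            for dx in (-1, 0, 1):
                ny, nx = y + dy, x + dx
                if (dy or dx) and 0 <= ny < h and 0 <= nx < w:
                    # Step cost is the intensity change between neighbors.
                    step = abs(float(window[ny, nx]) - float(window[y, x]))
                    nd = d + step
                    if nd < dist[ny, nx]:
                        dist[ny, nx] = nd
                        heapq.heappush(heap, (nd, ny, nx))
    return np.exp(-dist / gamma)
```

A pixel that has the same color as the center but is separated from it by a strong edge receives a low weight, because every path to the center has to cross that edge.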

International Conference on Image Processing | 2004

A layered stereo algorithm using image segmentation and global visibility constraints

Michael Bleyer; Margrit Gelautz

We propose a new stereo algorithm which uses colour segmentation to allow the handling of large untextured regions and precise localization of depth boundaries. Each segment is modelled as a plane. Robustness of the depth representation is achieved by the use of a layered model. Layers are extracted by mean-shift-based clustering of depth planes. For layer assignment a global cost function is defined. The quality of the disparity map is measured by warping the reference image to the second view and comparing it with the real image. Z-buffering enforces visibility and allows the explicit detection of occlusions. An efficient greedy algorithm searches for a local minimum of the cost function. Layer extraction and assignment are alternately applied. Results obtained for benchmark and self-recorded images indicate that the proposed algorithm can compete with the state-of-the-art.
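
The warping step with Z-buffering might look roughly like the sketch below. It assumes integer disparities, a rectified pair in which a reference pixel at column x maps to column x - d in the second view, and that a larger disparity means a closer surface. The function name and the returned validity mask are hypothetical conveniences, not the paper's interface.

```python
import numpy as np

def warp_with_zbuffer(ref, disp):
    """Warp the reference image into the second view.
    Returns the warped image and a mask of target pixels that received a value."""
    h, w = ref.shape
    warped = np.zeros_like(ref)
    zbuf = np.full((h, w), -1, dtype=np.int64)  # winning disparity, -1 = empty
    for y in range(h):
        for x in range(w):
            d = int(disp[y, x])
            tx = x - d
            # When two pixels land on the same target, keep the closer one.
            if 0 <= tx < w and d > zbuf[y, tx]:
                zbuf[y, tx] = d
                warped[y, tx] = ref[y, x]
    return warped, zbuf >= 0
```

Comparing `warped` with the real second image over the valid pixels gives a quality measure of the kind the abstract describes, and a reference pixel that loses the Z-buffer test is one that is occluded in the second view.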

Computer Vision and Pattern Recognition | 2011

Object stereo — Joint stereo matching and object segmentation

Michael Bleyer; Carsten Rother; Pushmeet Kohli; Daniel Scharstein; Sudipta N. Sinha

This paper presents a method for joint stereo matching and object segmentation. In our approach a 3D scene is represented as a collection of visually distinct and spatially coherent objects. Each object is characterized by three different aspects: a color model, a 3D plane that approximates the object's disparity distribution, and a novel 3D connectivity property. Inspired by Markov Random Field models of image segmentation, we employ object-level color models as a soft constraint, which can aid depth estimation in powerful ways. In particular, our method is able to recover the depth of regions that are fully occluded in one input view, which to our knowledge is new for stereo matching. Our model is formulated as an energy function that is optimized via fusion moves. We show high-quality disparity and object segmentation results on challenging image pairs as well as standard benchmarks. We believe our work not only demonstrates a novel synergy between the areas of image segmentation and stereo matching, but may also inspire new work in the domain of automatic and interactive object-level scene manipulation.

Computer Vision and Pattern Recognition | 2010

Surface stereo with soft segmentation

Michael Bleyer; Carsten Rother; Pushmeet Kohli

This paper proposes a new stereo model which encodes the simple assumption that the scene is composed of a few smooth surfaces. A key feature of our model is the surface-based representation, where each pixel is assigned to a 3D surface (planes or B-splines). This representation enables several important contributions: Firstly, we formulate a higher-order prior which states that pixels of similar appearance are likely to belong to the same 3D surface. This makes it possible to incorporate the very popular color segmentation constraint in a soft and principled way. Secondly, we use a global MDL prior to penalize the number of surfaces. Thirdly, we are able to incorporate, in a simple way, a prior which favors low-curvature surfaces. Fourthly, we improve the asymmetric occlusion model by disallowing pixels of the same surface to occlude each other. Finally, we use the known fusion move approach which enables a powerful optimization of our model, despite the infinite number of possible labelings (surfaces).

Signal Processing: Image Communication | 2007

Graph-cut-based stereo matching using image segmentation with symmetrical treatment of occlusions

Michael Bleyer; Margrit Gelautz

This paper describes a dense stereo matching algorithm for epipolar rectified images. The method applies colour segmentation on the reference image. Our basic assumptions are that disparity varies smoothly inside a segment, while disparity boundaries coincide with the segment borders. The use of these assumptions makes the algorithm capable of handling large untextured regions, estimating precise depth boundaries and propagating disparity information to occluded regions, which are challenging tasks for conventional stereo methods. We model disparity inside a segment by a planar equation. Initial disparity segments are clustered to form a set of disparity layers, which are planar surfaces that are likely to occur in the scene. Assignments of segments to disparity layers are then derived by minimization of a global cost function. This cost function is based on the observation that occlusions cannot be dealt with in the domain of segments. Therefore, we propose a novel cost function that is defined on two levels, one representing the segments and the other corresponding to pixels. The basic idea is that a pixel has to be assigned to the same disparity layer as its segment, but can as well be occluded. The cost function is then effectively minimized via graph-cuts. In the experimental results, we show that our method produces good-quality results, especially in regions of low texture and close to disparity boundaries. Results obtained for the Middlebury test set indicate that the proposed method is able to compete with the best-performing state-of-the-art algorithms.
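
Modelling the disparity inside a segment by a planar equation can be sketched as a fit of d = a*x + b*y + c to the segment's initial disparities. The version below uses an ordinary least-squares fit, which is an assumption made for illustration; the abstract does not say how the plane parameters are estimated, and a robust fit may be preferable when the initial disparities contain outliers. The function name is hypothetical.

```python
import numpy as np

def fit_disparity_plane(xs, ys, ds):
    """Least-squares fit of d = a*x + b*y + c to the pixels of one segment.
    xs, ys: pixel coordinates; ds: initial disparities at those pixels."""
    xs = np.asarray(xs, dtype=np.float64)
    ys = np.asarray(ys, dtype=np.float64)
    ds = np.asarray(ds, dtype=np.float64)
    design = np.column_stack([xs, ys, np.ones_like(xs)])
    coeffs, *_ = np.linalg.lstsq(design, ds, rcond=None)
    return coeffs  # array [a, b, c]
```

With one plane per segment, clustering segments into disparity layers amounts to grouping similar (a, b, c) triples.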

Computer Vision and Image Understanding | 2013

Secrets of adaptive support weight techniques for local stereo matching

Asmaa Hosni; Michael Bleyer; Margrit Gelautz

Highlights:
- Study of different strategies for computing adaptive support weights in local stereo matching.
- Our study sheds light on potential trade-offs between the accuracy and computational efficiency.
- The experiments are conducted on 35 stereo pairs of Middlebury with ground truth data.
- Our evaluation study is useful for practical applications.

In recent years, local stereo matching algorithms have again become very popular in the stereo community. This is mainly due to the introduction of adaptive support weight algorithms that can for the first time produce results that are on par with global stereo methods. The crux in these adaptive support weight methods is to assign an individual weight to each pixel within the support window. Adaptive support weight algorithms differ mainly in the manner in which this weight computation is carried out. In this paper we present an extensive evaluation study. We evaluate the performance of various methods for computing adaptive support weights including the original bilateral filter-based weights, as well as more recent approaches based on geodesic distances or on the guided filter. To obtain reliable findings, we test these different weight functions on a large set of 35 ground truth disparity pairs. We have implemented all approaches on the GPU, which allows for a fair comparison of run time on modern hardware platforms. Apart from the standard local matching using fronto-parallel windows, we also embed the competing weight functions into the recent PatchMatch Stereo approach, which uses slanted sub-pixel windows and represents a state-of-the-art local algorithm. In the final part of the paper, we aim at shedding light on general points of adaptive support weight matching, which, for example, includes a discussion about symmetric versus asymmetric support weight approaches.
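
As an illustration of what assigning an individual weight to each pixel means, the sketch below computes bilateral-filter-style support weights for one window. It assumes a grayscale window and the common form w = exp(-(color difference / gamma_c + spatial distance / gamma_p)); the parameter names and default values are hypothetical and not taken from the study.

```python
import numpy as np

def bilateral_support_weights(window, gamma_c=10.0, gamma_p=10.0):
    """Weight of every window pixel relative to the center pixel, decreasing
    with both intensity difference and Euclidean distance to the center."""
    window = window.astype(np.float64)
    h, w = window.shape
    cy, cx = h // 2, w // 2
    yy, xx = np.mgrid[0:h, 0:w]
    spatial = np.hypot(yy - cy, xx - cx)
    color = np.abs(window - window[cy, cx])
    return np.exp(-(color / gamma_c + spatial / gamma_p))
```

These weights depend only on the two endpoint pixels, so a pixel with the center's color keeps a high weight even when a strong edge lies between them. Geodesic weights differ on exactly this point.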

International Symposium on Mixed and Augmented Reality | 2013

MonoFusion: Real-time 3D reconstruction of small scenes with a single web camera

Vivek Pradeep; Christoph Rhemann; Shahram Izadi; Christopher Zach; Michael Bleyer; Steven Bathiche

MonoFusion allows a user to build dense 3D reconstructions of their environment in real-time, utilizing only a single, off-the-shelf web camera as the input sensor. The camera could be one already available in a tablet, phone, or a standalone device. No additional input hardware is required. This removes the need for power intensive active sensors that do not work robustly in natural outdoor lighting. Using the input stream of the camera we first estimate the 6DoF camera pose using a sparse tracking method. These poses are then used for efficient dense stereo matching between the input frame and a key frame (extracted previously). The resulting dense depth maps are directly fused into a voxel-based implicit model (using a computationally inexpensive method) and surfaces are extracted per frame. The system is able to recover from tracking failures as well as filter out geometrically inconsistent noise from the 3D reconstruction. Our method is both simple to implement and efficient, making such systems even more accessible. This paper details the algorithmic components that make up our system and a GPU implementation of our approach. Qualitative results demonstrate high quality reconstructions even visually comparable to active depth sensor-based systems such as KinectFusion.
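
Fusing depth maps into a voxel-based implicit model can be sketched in one dimension, for the voxels along a single camera ray. The sketch assumes a truncated signed distance representation updated by a running average, as popularized by KinectFusion-style systems; the abstract only says that a computationally inexpensive method is used, so the update rule, the truncation distance `trunc`, and the function name are assumptions.

```python
import numpy as np

def fuse_depth_along_ray(tsdf, weight, voxel_depths, measured_depth, trunc=0.1):
    """Fold one depth measurement into the voxels along a ray.
    tsdf, weight: current per-voxel values; voxel_depths: depth of each voxel."""
    sdf = measured_depth - voxel_depths        # positive in front of the surface
    update = sdf > -trunc                      # skip voxels far behind the surface
    value = np.clip(sdf / trunc, -1.0, 1.0)    # truncate and normalize
    new_weight = weight + update
    averaged = (tsdf * weight + value) / np.maximum(new_weight, 1.0)
    return np.where(update, averaged, tsdf), new_weight
```

The surface is then the zero crossing of the fused values. Averaging over frames is what suppresses noise from individual depth maps, since inconsistent measurements partly cancel.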
